Large files

This guide goes into more depth for storing very large files (over 1TiB), providing more insights into how data is stored and some best practices.

This guide assumes you are familiar with the normal storage deal workflow.

Maximizing storage per sector

In Filecoin, files must be smaller than the sector in which they are stored.

Usually sectors are incorrectly described as being 32 gigabytes or 1 megabyte, where in reality they are 32 gibibytes (GiB) and 1 mebibyte (MiB) respectively. Thus a 32GiB sector has 2^30 bytes (1,073,741,824 bytes) instead of 1,000,000,000 bytes.

However, not all the space in a sector can be used effectively. For each 256 bits, 2 bits are reserved for proof processes. Thus the usable size of the sector is:

sector-size * 254 / 256**

Here is a quick table:

Sector SizeUsable size
2KiB2,032 bytes
8MiB8,323,072 bytes
512MiB532,676,608 bytes
32GiB34,091,302,912 bytes
64GiB68,182,605,824 bytes

Deals with offline data transfer

Filecoin’s offline data transfer feature is recommended for petabyte-scale datasets and larger. This allows users with very large datasets to complete the data transfer step offline (e.g. by shipping hard drives from the client to the storage miner), while the storage deal continues to work as intended on-chain.

It is implemented via a flag on the storage deal command that tells the client not to transfer the data over the network, and provides a piece CID (a unique identifier describing the data) to the miner instead, which a miner must then match for the deal to go through. This gives the client node flexibility in how it can set up the deal — for example, passing miners a specific location on a hard drive for the data they can use to generate the piece CID.

Generate a unique piece CID

  1. Use the Lotus client to generate a CAR file of the input without importing:
lotus client generate-car <input_path/filename> <output_path/data.car>
  1. Use the Lotus client to generate the piece CID and size:
lotus client commP <inputCarFilePath>

Transfer the data to the miner offline

This can be done several ways, such as shipping hard drives from the client to the storage miner.

Finalizing: The miner’s role

The miner can import the data and deal manually with:

lotus-miner storage-deals import-data <dealCid> <carFilePath>

Once the first Proof of Spacetime (PoSt) hits the chain, the storage deal is considered active.

Edit this page on GitHub