Troubleshooting Sector Removal
Problem
Sector sealing can fail for a number of reasons and may cause various issues. Whilst it is always suggested that storage provider admin or system maintenance be performed with an empty sealing queue, this is not always possible. Disruptions to the sealing pipeline can make it difficult to remove or terminate sectors, which may in turn negatively impact windowPoSts.
Depending on the nature of the failure, you may find that the lotus-miner sectors remove or lotus-miner sectors terminate commands do not result in successful removal or termination of the sector in question.
Detailed below are a few methods that may help finalize the removal or termination of failed sectors. Please apply the methods in the order listed to minimize further disruption to active sealing tasks that you may wish to retain.
Environment
- Mainnet
- Calibnet
- Butterflynet
Resolution
Step 1: Initiate a Scheduler Update
It may initially appear that running the lotus-miner sectors remove or lotus-miner sectors terminate command fails to have any immediate effect.
Just like any other sealing step transition (e.g. PC1 to PC2), remove and terminate requests are queued by the lotus scheduler (FSM). If the sector in question is still actively trying to complete sealing, the remove/terminate command will not actually trigger until the current task has been completed.
Example
You experienced an unscheduled disruption to your sealing tasks and now sector 1237 is persistently failing to complete PC2. It is continuously looping back to PC1, which takes several hours to complete. In this example your miner ID is f01234.
lotus-miner sealing jobs
ID Sector Worker Hostname Task State Time
2e62c170 1234 1abc2abc SERVER-1 PC1 running 2h35.5s
e87a6d1c 1235 2cde3cde SERVER-1 PC2 running 23m48.9s
80509f7d 1236 2cde3cde SERVER-1 C2 running 22m42.4s
4680f2f0 1237 1abc2abc SERVER-1 PC1 running 30m54.6s
Rather than restarting your worker or miner, which would disrupt the other sealing sectors, you can issue the removal request and then abort the running job using its job ID (the ID column in the output above):
lotus-miner sectors remove --really-do-it 1237
lotus-miner sealing abort 4680f2f0
This will prompt the Lotus scheduler to cancel the current PC1 job and apply the lotus-miner sectors remove request that you have just issued.
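As far as we understand the CLI, lotus-miner sealing abort acts on a job rather than on a sector, so the value in the ID column is what identifies the task to cancel. The lookup can be scripted. The following is a minimal sketch, assuming the column order shown in the example output (ID, Sector, Worker, Hostname, Task, State, Time). The function name job_id_for_sector is hypothetical and not part of Lotus.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lotus): print the job ID of every
# running job that belongs to the given sector number.
# Reads `lotus-miner sealing jobs` output on stdin and ASSUMES the column
# order from the example: ID Sector Worker Hostname Task State Time.
job_id_for_sector() {
    awk -v sector="$1" '$2 == sector && $6 == "running" { print $1 }'
}

# Intended usage (shown as comments, not executed here):
#   lotus-miner sealing jobs | job_id_for_sector 1237
#   lotus-miner sealing abort <job ID printed above>
```

If the real output on your system has a different column layout, adjust the field numbers in the awk expression before relying on the result.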
Step 2: Terminate With Lotus Shed
Lotus Shed is not installed by default when installing Lotus. Please follow the steps detailed here.
Lotus Shed includes a function to terminate sectors. Using the example data above, the command to run would be:
lotus-shed sectors terminate --really-do-it 1237
Step 3: Create Dummy Sector Data
You may encounter a sector removal situation where the sealed, unsealed and/or cache sector files have been deleted but the sector remains in your lotus-miner sectors list and/or on-chain.
You can repopulate the missing data either by duplicating an existing sector’s data and renaming the folder/files, or by creating empty files/folders to act as placeholders using touch.
In the event that the sector successfully completed sealing, you will need to create the dummy data in your long-term storage folder as follows:
/long-term-storage-folder/sealed/s-t01234-1237
/long-term-storage-folder/unsealed/s-t01234-1237
/long-term-storage-folder/cache/s-t01234-1237 # s-t01234-1237 refers to a folder here
In the event that the sector failed to complete sealing, you will need to create the dummy data in your sealing folder as follows:
/sealing-folder/sealed/s-t01234-1237
/sealing-folder/unsealed/s-t01234-1237
/sealing-folder/cache/s-t01234-1237 # s-t01234-1237 refers to a folder here
NOTE: your miner ID is specified with a t prefix for sector asset naming convention and not the familiar f.
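The placeholder layout above can be created with mkdir and touch. The following is a minimal sketch, assuming the sealed and unsealed entries are plain files and the cache entry is a folder, as the listings indicate. The function name make_dummy_sector and its argument order are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lotus): create empty placeholder
# assets for one sector under a storage root.
#   $1 = storage root (long-term storage folder or sealing folder)
#   $2 = miner ID, e.g. f01234 (a t-prefixed ID is also accepted)
#   $3 = sector number, e.g. 1237
make_dummy_sector() {
    root=$1
    # Sector assets use a "t" prefix: strip a leading f or t, then add t.
    name="s-t${2#[ft]}-$3"
    # The cache entry is a folder; sealed and unsealed entries are files.
    mkdir -p "$root/sealed" "$root/unsealed" "$root/cache/$name" || return 1
    # touch creates empty files and leaves any existing file contents alone.
    touch "$root/sealed/$name" "$root/unsealed/$name"
}

# Intended usage (shown as a comment, not executed here):
#   make_dummy_sector /long-term-storage-folder f01234 1237
```

Because touch only updates timestamps on files that already exist, running the helper against a folder that still holds some of the sector's real data should not overwrite it.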
Having created the dummy data, you will then need to restart your miner (and worker if applicable) and then run the lotus-miner sectors remove or lotus-miner sectors terminate command again.
NOTE: restarting your miner will also restart all other sealing tasks that may be active.
Extras
If you are unable to remove/terminate your sector after following all three steps, please reach out to the Lotus Team in Slack.