Performance analysis shows that 16 workers is the best balance:
- Sufficient parallelism for I/O-bound proof operations
- Avoids excessive memory pressure (64MB of thread stacks vs 128MB at 32 workers)
- Reduces MDBX reader table contention
- Testing shows 16 workers achieve 145% of baseline throughput
- Throughput degrades beyond 32 workers due to I/O scheduler thrashing
The 2x CPU oversubscription for I/O-bound workloads is unchanged; only
the floor drops from 32 to 16 workers.
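
A minimal sketch of the sizing rule described above, assuming the worker
count is simply twice the CPU count with a floor of 16. The names
MIN_WORKERS, OVERSUBSCRIPTION, worker_count and default_worker_count are
hypothetical and are not taken from the codebase; no upper cap is shown
because none is stated here.

```rust
use std::thread;

// Hypothetical constant names; only the values (16 and 2x) come from the text.
const MIN_WORKERS: usize = 16;
const OVERSUBSCRIPTION: usize = 2;

/// Worker count for a given CPU count: 2x oversubscription, never below 16.
fn worker_count(cpus: usize) -> usize {
    cpus.saturating_mul(OVERSUBSCRIPTION).max(MIN_WORKERS)
}

/// Same rule applied to the parallelism reported by the OS.
/// Falls back to 1 CPU if the query fails, which still yields the floor of 16.
fn default_worker_count() -> usize {
    let cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    worker_count(cpus)
}
```

Under this rule, any machine with 8 or fewer CPUs gets exactly 16 workers,
and larger machines scale at 2x their CPU count.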