Difficulty And Calibration
CoreTex accepts work by measuring improvement over the current parent substrate. The parent and candidate are scored on the same gate and confirm packs, so the threshold is about marginal retrieval gain rather than absolute benchmark score.
The state-advance threshold is:
stateAdvanceThresholdPpm =
minImprovementPpm + baselineVariancePpm + replayTolerancePpm
The v16 launch profile pins replayTolerancePpm = 250 and
baselineParentScorePpm = 288438. baselineVariancePpm is present when the
current baseline was sampled broadly enough to measure variance. The difficulty
controller clamps minImprovementPpm between 2,500 ppm and 150,000 ppm.
Baseline state is tracked per live root. The epoch-start context pins the parent
root and baseline manifest hash. Each accepted state advance moves the live
root, and the coordinator records the accepted scoreAfterPpm as the effective
baseline for that new root. If a root appears without a known baseline, such as
after a foreign advance or rotation, submissions return awaiting_baseline_recompute
until the recalibrated baseline is installed.
The screener threshold is derived from the same live baseline. It combines:
| Signal | Effect |
|---|---|
| Remaining headroom | Higher parent scores leave less easy gain |
| Recent noise floor | Raises the gate above measured replay noise |
| State-advance floor | Keeps screeners tied to the real advance threshold |
| Probe pass rate | Adds an anti-gaming penalty when weak probes pass too often |
| Static floor | Operator floor from coordinator config, if set |
Epoch difficulty updates use nextMinImprovementPpm:
| Epoch signal | Controller response |
|---|---|
| More state advances than target | Ramp threshold upward |
| Zero advances with many quality attempts | Decay threshold |
| Some quality attempts below target advances | Ease through under_target_recovery |
| Zero advances and zero quality attempts | Drift toward the floor |
| Major corpus delta | Freeze for one epoch with major_delta_grace |
A major delta is detected with
isMajorDelta(nextEvalHiddenCount, previousEvalHiddenCount, majorDeltaThreshold).
The v16 profile pins majorDeltaThreshold = 220. During grace, the threshold
is held steady while the baseline is recomputed against the new corpus,
frontier, and query-pack context.
Calibration artifacts bind the pieces that matter for replay: corpus root,
active frontier root, query-pack policy, baseline score, variance, fixed-pack
repeatability, model pins, runtime pins, and profile hash. Corpus evolution also
checks that positive eval_hidden qrels still point to available documents
before publishing a rotation.