Difficulty And Calibration

CoreTex accepts work by measuring improvement over the current parent substrate. The parent and candidate are scored on the same gate and confirm packs, so the threshold applies to the marginal retrieval gain over the parent, not to the absolute benchmark score.

The state-advance threshold is:

stateAdvanceThresholdPpm =
  minImprovementPpm + baselineVariancePpm + replayTolerancePpm

The v16 launch profile pins replayTolerancePpm = 250 and baselineParentScorePpm = 288438 (a parent score of about 28.84%). The baselineVariancePpm term is present when the current baseline was sampled broadly enough to measure variance. The difficulty controller clamps minImprovementPpm to the range 2,500 ppm to 150,000 ppm (0.25% to 15%).
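As a rough illustration of the formula and the clamp, the sketch below computes the threshold in Python. The function names are hypothetical and not taken from CoreTex. Two points are assumptions: a missing baselineVariancePpm is treated as 0, and a candidate is taken to clear the threshold when its gain over the parent is greater than or equal to it.

```python
from typing import Optional

# Values stated in the text above.
MIN_IMPROVEMENT_FLOOR_PPM = 2_500
MIN_IMPROVEMENT_CEILING_PPM = 150_000
REPLAY_TOLERANCE_PPM = 250  # v16 launch profile


def clamp_min_improvement(min_improvement_ppm: int) -> int:
    """Clamp minImprovementPpm into the controller's allowed range."""
    return max(MIN_IMPROVEMENT_FLOOR_PPM,
               min(MIN_IMPROVEMENT_CEILING_PPM, min_improvement_ppm))


def state_advance_threshold_ppm(
    min_improvement_ppm: int,
    baseline_variance_ppm: Optional[int] = None,
    replay_tolerance_ppm: int = REPLAY_TOLERANCE_PPM,
) -> int:
    """minImprovementPpm + baselineVariancePpm + replayTolerancePpm."""
    # Assumption: an unmeasured variance contributes nothing to the sum.
    variance = 0 if baseline_variance_ppm is None else baseline_variance_ppm
    return (clamp_min_improvement(min_improvement_ppm)
            + variance + replay_tolerance_ppm)


def clears_threshold(parent_score_ppm: int, candidate_score_ppm: int,
                     threshold_ppm: int) -> bool:
    """Compare the marginal gain over the parent against the threshold."""
    # Assumption: the comparison is inclusive.
    return candidate_score_ppm - parent_score_ppm >= threshold_ppm
```

For example, with a minImprovementPpm of 10,000 and a measured variance of 1,200 ppm, the threshold is 10,000 + 1,200 + 250 = 11,450 ppm, so a candidate scored against the pinned 288438 ppm parent would need at least 299888 ppm under these assumptions.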

Baseline state is tracked per live root. The epoch-start context pins the parent root and baseline manifest hash. Each accepted state advance moves the live root, and the coordinator records the accepted scoreAfterPpm as the effective baseline for that new root. If a root appears without a known baseline, such as after a foreign advance or rotation, submissions return awaiting_baseline_recompute until the recalibrated baseline is installed.
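The per-root bookkeeping described above might look roughly like the following. This is a minimal sketch, not the coordinator's implementation: the class, its method names, and the "baseline_ready" status are hypothetical, and only awaiting_baseline_recompute and scoreAfterPpm come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict

AWAITING_BASELINE_RECOMPUTE = "awaiting_baseline_recompute"
BASELINE_READY = "baseline_ready"  # hypothetical label for the normal case


@dataclass
class BaselineRegistry:
    """Tracks the effective baseline score for each live root."""
    live_root: str
    baselines_ppm: Dict[str, int] = field(default_factory=dict)

    def record_advance(self, new_root: str, score_after_ppm: int) -> None:
        # An accepted advance moves the live root; its scoreAfterPpm
        # becomes the effective baseline for the new root.
        self.live_root = new_root
        self.baselines_ppm[new_root] = score_after_ppm

    def observe_root(self, new_root: str) -> None:
        # A root that arrives from elsewhere (foreign advance, rotation)
        # carries no baseline until one is recomputed.
        self.live_root = new_root

    def install_recalibrated(self, root: str, score_ppm: int) -> None:
        self.baselines_ppm[root] = score_ppm

    def submission_status(self) -> str:
        if self.live_root not in self.baselines_ppm:
            return AWAITING_BASELINE_RECOMPUTE
        return BASELINE_READY
```

A registry seeded with the epoch-start root stays ready across accepted advances, and only reports awaiting_baseline_recompute between observing an unknown root and installing its recalibrated baseline.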

The screener threshold is derived from the same live baseline. It combines:

| Signal | Effect |
| --- | --- |
| Remaining headroom | Higher parent scores leave less easy gain |
| Recent noise floor | Raises the gate above measured replay noise |
| State-advance floor | Keeps screeners tied to the real advance threshold |
| Probe pass rate | Adds an anti-gaming penalty when weak probes pass too often |
| Static floor | Operator floor from coordinator config, if set |

Epoch difficulty updates use nextMinImprovementPpm:

| Epoch signal | Controller response |
| --- | --- |
| More state advances than target | Ramp threshold upward |
| Zero advances with many quality attempts | Decay threshold |
| Some quality attempts below target advances | Ease through under_target_recovery |
| Zero advances and zero quality attempts | Drift toward the floor |
| Major corpus delta | Freeze for one epoch with major_delta_grace |
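The mapping from epoch signal to controller response can be sketched as a small classifier. The text does not give the ramp or decay magnitudes, the cutoff for "many" quality attempts, or the order in which the cases are checked, so the sketch returns only a label and treats all of those as assumptions. Apart from under_target_recovery and major_delta_grace, the label names are hypothetical, as is the "hold" case for an epoch that lands exactly on target.

```python
from enum import Enum


class ControllerResponse(Enum):
    RAMP = "ramp"                                    # hypothetical label
    DECAY = "decay"                                  # hypothetical label
    UNDER_TARGET_RECOVERY = "under_target_recovery"  # named in the text
    DRIFT_TO_FLOOR = "drift_to_floor"                # hypothetical label
    MAJOR_DELTA_GRACE = "major_delta_grace"          # named in the text
    HOLD = "hold"                                    # assumed fallback


def classify_epoch(advances: int, target_advances: int,
                   quality_attempts: int, many_attempts_cutoff: int,
                   major_delta: bool) -> ControllerResponse:
    """Pick the controller response for one finished epoch."""
    # Assumption: a major corpus delta overrides every other signal.
    if major_delta:
        return ControllerResponse.MAJOR_DELTA_GRACE
    if advances > target_advances:
        return ControllerResponse.RAMP
    if advances == 0 and quality_attempts == 0:
        return ControllerResponse.DRIFT_TO_FLOOR
    # Assumption: "many" means at or above a configured cutoff.
    if advances == 0 and quality_attempts >= many_attempts_cutoff:
        return ControllerResponse.DECAY
    if advances < target_advances and quality_attempts > 0:
        return ControllerResponse.UNDER_TARGET_RECOVERY
    return ControllerResponse.HOLD
```

Returning a label rather than a new threshold keeps the sketch within what the table states; whatever value the real controller assigns to nextMinImprovementPpm would still be subject to the 2,500 to 150,000 ppm clamp.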

A major delta is detected with isMajorDelta(nextEvalHiddenCount, previousEvalHiddenCount, majorDeltaThreshold). The v16 profile pins majorDeltaThreshold = 220. During grace, the threshold is held steady while the baseline is recomputed against the new corpus, frontier, and query-pack context.
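The text gives the argument list of isMajorDelta but not how the arguments are compared, nor the unit of the pinned value 220. Purely as an illustration, the sketch below assumes the check is an inclusive comparison of the absolute change in eval_hidden count against the threshold; the real rule may well differ. The grace helper is also hypothetical and only shows the "hold steady" behaviour.

```python
MAJOR_DELTA_THRESHOLD = 220  # v16 profile value; unit assumed to be a count


def is_major_delta(next_eval_hidden_count: int,
                   previous_eval_hidden_count: int,
                   major_delta_threshold: int = MAJOR_DELTA_THRESHOLD) -> bool:
    """Assumed rule: absolute change in count meets or exceeds the threshold."""
    change = abs(next_eval_hidden_count - previous_eval_hidden_count)
    return change >= major_delta_threshold


def min_improvement_after_grace(current_ppm: int, proposed_ppm: int,
                                major_delta: bool) -> int:
    """During grace the threshold is held; otherwise the proposal applies."""
    return current_ppm if major_delta else proposed_ppm
```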

Calibration artifacts bind the pieces that matter for replay: corpus root, active frontier root, query-pack policy, baseline score, variance, fixed-pack repeatability, model pins, runtime pins, and profile hash. Corpus evolution also checks that positive eval_hidden qrels still point to available documents before publishing a rotation.
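The qrel availability check mentioned above could be sketched as follows. The tuple layout, the function names, and the rule that a relevance above 0 counts as positive are assumptions made for illustration; the text only states that positive eval_hidden qrels must still point to available documents before a rotation is published.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical qrel shape: (query_id, doc_id, relevance).
Qrel = Tuple[str, str, int]


def dangling_positive_qrels(qrels: Iterable[Qrel],
                            available_doc_ids: Set[str]) -> List[Qrel]:
    """Return positive qrels whose document is missing from the corpus."""
    return [
        qrel for qrel in qrels
        # Assumption: relevance > 0 marks a positive judgment.
        if qrel[2] > 0 and qrel[1] not in available_doc_ids
    ]


def can_publish_rotation(qrels: Iterable[Qrel],
                         available_doc_ids: Set[str]) -> bool:
    """A rotation is publishable only when no positive qrel dangles."""
    return not dangling_positive_qrels(qrels, available_doc_ids)
```

Negative or zero-relevance judgments are ignored by this sketch, since a missing non-relevant document cannot turn a correct retrieval into a miss.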