Retrieval Evaluation

CoreTex evaluates each candidate patch against hidden query packs that are derived only after the patch is received. The seed used to derive each pack is bound to:

epochSecret + future Base blockhash + epochId + patchHash
+ parentRoot + minerAddress + corpusRoot + bundleHash

The future blockhash is not known when the patch arrives, so the coordinator cannot pre-test the patch against its actual hidden pack at receive time. Re-submitting the same (parentRoot, patchBytes) pair returns the cached verdict instead of rolling a fresh pack, so an identical patch cannot be retried for a luckier draw.
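A minimal sketch of the seed binding and the duplicate cache, under stated assumptions. The source lists which inputs the seed is bound to but not how they are combined, so the SHA-256 hash over length-prefixed fields, the 8-byte epochId encoding, and the names `derive_pack_seed`, `duplicate_key`, and `VerdictCache` are all hypothetical choices made for illustration.

```python
import hashlib


def _frame(field: bytes) -> bytes:
    # Length-prefix each field so different splits of the same bytes cannot collide.
    return len(field).to_bytes(4, "big") + field


def derive_pack_seed(epoch_secret: bytes, future_blockhash: bytes, epoch_id: int,
                     patch_hash: bytes, parent_root: bytes, miner_address: bytes,
                     corpus_root: bytes, bundle_hash: bytes) -> bytes:
    # Assumption: the "+" in the binding means "mixed into one hash".
    fields = [epoch_secret, future_blockhash, epoch_id.to_bytes(8, "big"),
              patch_hash, parent_root, miner_address, corpus_root, bundle_hash]
    h = hashlib.sha256()
    for field in fields:
        h.update(_frame(field))
    return h.digest()


def duplicate_key(parent_root: bytes, patch_bytes: bytes) -> bytes:
    # Assumption: the duplicate key is a hash of the (parentRoot, patchBytes) pair.
    return hashlib.sha256(_frame(parent_root) + _frame(patch_bytes)).digest()


class VerdictCache:
    """Returns the stored verdict for a repeated (parentRoot, patchBytes) pair."""

    def __init__(self):
        self._verdicts = {}

    def get_or_evaluate(self, parent_root: bytes, patch_bytes: bytes, evaluate):
        key = duplicate_key(parent_root, patch_bytes)
        if key not in self._verdicts:
            # Only the first submission triggers an evaluation.
            self._verdicts[key] = evaluate()
        return self._verdicts[key]
```

Because the future blockhash is one of the hashed fields, the seed cannot be computed until the target block exists, which matches the property described above.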

The coordinator records receivedAtBlock when a patch enters the eval queue. That value is part of the signed evaluation report, along with the target block, target blockhash, patch hash, and duplicate key. Replay watchers verify the blockhash against Base RPC data.
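The replay check can be sketched roughly as follows. The field names mirror the report contents listed above, but the `EvalReport` class, the `replay_check` function, and the `fetch_blockhash` callable (standing in for a Base RPC lookup) are hypothetical. Signature verification is left out, and the rule that the target block must come after receivedAtBlock is an assumption inferred from the "future blockhash" requirement.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalReport:
    received_at_block: int
    target_block: int
    target_blockhash: str
    patch_hash: str
    duplicate_key: str


def replay_check(report: EvalReport,
                 fetch_blockhash: Callable[[int], str]) -> list[str]:
    """Return a list of problems found; an empty list means the report replays cleanly."""
    problems = []
    # Assumption: the hidden pack must be seeded by a block after the patch arrived.
    if report.target_block <= report.received_at_block:
        problems.append("target block is not after receivedAtBlock")
    # Compare the reported hash with what the chain says (case-insensitive hex).
    chain_hash = fetch_blockhash(report.target_block)
    if chain_hash.lower() != report.target_blockhash.lower():
        problems.append("target blockhash does not match chain data")
    return problems
```

Passing the lookup as a callable keeps the check testable without a live RPC endpoint; a watcher would supply a function backed by its own Base node.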

For each hidden query:

  1. Decode active substrate slots.
  2. Compare the query embedding to active retrieval-key vectors.
  3. Take the top retrieval candidates.
  4. Resolve candidates to corpus documents through memory slots.
  5. Rerank (query, document) pairs with the pinned Qwen3 reranker.
  6. Score the ranked list against graded qrels.
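Steps 1 through 5 can be illustrated with a small, self-contained sketch. It assumes slots are already decoded into dictionaries with `active`, `key`, and `doc_id` fields, uses cosine similarity for the comparison, and takes the reranker as a plain callable; none of these details come from the source, and the pinned Qwen3 reranker is not invoked here. Step 6 (scoring against qrels) is left to the metric.

```python
import math
from typing import Callable


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_for_query(query_text: str, query_vec: list[float], slots: list[dict],
                       corpus: dict[str, str],
                       rerank: Callable[[str, str], float],
                       top_k: int = 3) -> list[str]:
    # Step 1: keep only active slots (decoding is assumed to have happened already).
    active = [s for s in slots if s["active"]]
    # Step 2: compare the query embedding to each active retrieval key.
    by_similarity = sorted(active, key=lambda s: cosine(query_vec, s["key"]),
                           reverse=True)
    # Step 3: take the top candidates.
    candidates = by_similarity[:top_k]
    # Step 4: resolve candidates to corpus documents, dropping duplicates.
    doc_ids = []
    for slot in candidates:
        doc_id = slot["doc_id"]
        if doc_id in corpus and doc_id not in doc_ids:
            doc_ids.append(doc_id)
    # Step 5: rerank (query, document) pairs; higher reranker score ranks first.
    return sorted(doc_ids, key=lambda d: rerank(query_text, corpus[d]), reverse=True)
```

The reranker can reorder candidates that the vector stage ranked differently, which is why the final list, not the raw similarity order, is what gets scored.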

Two independent packs are used:

| Pack | Purpose |
|---|---|
| Gate | First hidden-pack pass |
| Confirm | Second hidden-pack pass to filter pack-luck wins |

A state advance must clear threshold on both packs.
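As a small illustration of the two-pack rule, the check below requires both scores to clear their thresholds. Whether "clear" means greater-than-or-equal, and whether both packs share one threshold, is not stated in the source; the function name and the separate threshold parameters are assumptions.

```python
def state_advances(gate_score: float, confirm_score: float,
                   gate_threshold: float, confirm_threshold: float) -> bool:
    # Both packs must pass; a high Gate score cannot compensate for a failed Confirm.
    return gate_score >= gate_threshold and confirm_score >= confirm_threshold
```

A patch that scores well on Gate only by chance is unlikely to repeat that on an independently drawn Confirm pack, which is the filtering effect described above.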

The dominant metric is nDCG@10. It rewards ranking highly relevant, answer-bearing documents near the top and gives no credit to plausible but wrong documents that occupy those positions. The evaluator also tracks temporal current/stale correctness, multi-hop relation recall, abstention behavior, and structural validity.
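For reference, nDCG@10 over graded qrels can be computed as in the sketch below. It uses the linear-gain form (grade divided by log2 of rank plus one); the source does not say whether the evaluator uses linear or exponential gain, so treat that choice and the function names as assumptions.

```python
import math


def dcg(grades: list[float]) -> float:
    # Position i (0-based) is discounted by log2(i + 2), so rank 1 has discount 1.
    return sum(g / math.log2(i + 2) for i, g in enumerate(grades))


def ndcg_at_k(ranked_doc_ids: list[str], qrels: dict[str, float], k: int = 10) -> float:
    # Documents absent from the qrels count as grade 0.
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    # The ideal ranking places the highest-graded judged documents first.
    ideal = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

With `qrels = {"a": 3, "b": 1}`, the ranking `["a", "b"]` scores 1.0, while `["x", "a", "b"]` scores lower because an unjudged document holds the top position and pushes both relevant documents down.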

The default composite shape is retrieval-dominant:

| Component | Role |
|---|---|
| Retrieval nDCG@10 | Main signal for whether the substrate retrieves answer-bearing documents |
| Temporal score | Rewards current facts and stale-memory rejection |
| Relation recall | Rewards useful multi-hop routing through the relation region |
| Abstention | Penalizes surfacing irrelevant memories when no answer should be retrieved |
| Structural sanity | Ensures the substrate is well-formed and replayable |

The exact weights are bundle-profile values. The design requires retrieval to remain the dominant component and structural sanity to remain a small guardrail, not the reward law.
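One way to picture the composite is a normalized weighted sum with a check that retrieval carries the largest weight. The weights below are placeholders invented for illustration, not bundle-profile values; the linear form, the component keys, and the reading of "dominant" as "strictly largest weight" are likewise assumptions.

```python
# Hypothetical weights for illustration only; real values come from the bundle profile.
EXAMPLE_WEIGHTS = {
    "retrieval_ndcg10": 0.70,
    "temporal": 0.10,
    "relation_recall": 0.10,
    "abstention": 0.07,
    "structural_sanity": 0.03,
}


def composite_score(components: dict[str, float], weights: dict[str, float]) -> float:
    if set(components) != set(weights):
        raise ValueError("components and weights must name the same keys")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    retrieval_weight = weights["retrieval_ndcg10"]
    # Enforce the design constraint: no other component may match or exceed retrieval.
    for name, weight in weights.items():
        if name != "retrieval_ndcg10" and weight >= retrieval_weight:
            raise ValueError("retrieval must remain the dominant component")
    # Normalize so the result stays in [0, 1] when every component is in [0, 1].
    return sum(weights[name] * components[name] for name in weights) / total
```

Rejecting a profile at load time when the dominance check fails would keep a misconfigured bundle from turning a guardrail such as structural sanity into the main reward signal.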