Model And Reranker Calibration

Two models are pinned in the bundle. BGE-M3 converts text into vectors. Qwen3-Reranker-0.6B scores (query, document) pairs. Both are open-weight, and both run on CPU.

Calibration pins the file hashes, runtime versions, prompt template, score-to-relevance mapping, and the noise floor measured across multiple runs on different hosts. Pinning all of that matters because retrieval scores drift when anything in the stack changes: a different transformer version, a different BLAS, or a different tokenizer revision. Without pinning, honest verifiers would disagree with the coordinator over scores that were each technically correct for the stack that produced them. A verifier whose runtime fingerprint differs from the pinned values produces non-canonical scores, and the coordinator will refuse to treat its disagreements as proof of fraud.
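
The fingerprint rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the bundle's actual format or the coordinator's actual logic: the manifest fields, the placeholder values, the names fingerprint and counts_as_evidence, and the idea of using the measured noise floor as a plain absolute tolerance are all hypothetical.

```python
import hashlib
import json

# Hypothetical pinned manifest. Every field name and value here is a
# placeholder for illustration, not the real bundle contents.
PINNED = {
    "file_hashes": {
        "embedder_weights": hashlib.sha256(b"placeholder-embedder").hexdigest(),
        "reranker_weights": hashlib.sha256(b"placeholder-reranker").hexdigest(),
    },
    "runtime": {"transformer": "0.0.0", "blas": "example-blas-0.0"},
    "tokenizer_revision": "example-revision",
    "prompt_template": "example template: {query} / {document}",
}


def fingerprint(manifest: dict) -> str:
    """Hash a manifest in a canonical JSON form so equal manifests hash equally."""
    blob = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def counts_as_evidence(
    pinned: dict,
    local: dict,
    coordinator_score: float,
    verifier_score: float,
    noise_floor: float,
) -> bool:
    """Decide whether a score disagreement may be treated as evidence of fraud.

    Assumption: a disagreement only counts when the verifier's stack matches
    the pinned one AND the gap exceeds the measured noise floor.
    """
    if fingerprint(local) != fingerprint(pinned):
        # Non-canonical verifier: its scores are not comparable at all.
        return False
    return abs(coordinator_score - verifier_score) > noise_floor
```

With these definitions, a verifier running the pinned stack that reports 0.70 against a coordinator score of 0.80 would be taken seriously at a hypothetical noise floor of 0.001, while the same gap reported from a stack with a different tokenizer revision would be ignored, because the two scores were never expected to match.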