Open weights are great — until a regulator asks a question
I've spent most of my career selling AI into financial services — credit, claims, mortgage, risk. In every one of those markets, there is a moment that looks like this: something goes wrong, a regulator or an internal auditor shows up, and someone opens a laptop and says, “show me the model that made this decision.”
If your model is an open-weights checkpoint that a dozen teams have downloaded, fine-tuned, re-quantized, and redeployed across five environments, that question has no clean answer. You can produce a version of the model. You can't produce the version that made the decision. That distinction is where regulated-AI deployments fall apart.
Q-Prime's license model is a chain-of-custody guarantee
Q-Prime ships as a HuggingFace model card backed by managed API access; there is no weights download. Every inference request is versioned, logged, and attributable to a specific Q-Prime release. When a regulator asks which model produced a given decision, the answer is a cryptographically verifiable record, not a reconstructed guess.
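Q-Prime's actual log format isn't published, so treat this as a sketch of what "cryptographically verifiable" can mean in practice: a hash-chained audit log, where each record commits to the hash of the one before it. All field and function names here are hypothetical, not Q-Prime's API.

```python
import hashlib
import json

def make_audit_record(prev_hash, model_version, request_id,
                      input_digest, output_digest):
    """Build one audit record that commits to the previous record's hash.

    Any later tampering with an earlier record changes its hash and
    breaks every link after it.
    """
    record = {
        "prev_hash": prev_hash,
        "model_version": model_version,   # the specific release that served this request
        "request_id": request_id,
        "input_digest": input_digest,     # hash of the request payload, not the payload itself
        "output_digest": output_digest,
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["record_hash"] = hashlib.sha256(payload).hexdigest()
    return record

def verify_chain(records):
    """Recompute every record's hash and check the chain linkage."""
    prev = records[0]["prev_hash"]
    for rec in records:
        if rec["prev_hash"] != prev:
            return False
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        payload = json.dumps(body, sort_keys=True).encode()
        if hashlib.sha256(payload).hexdigest() != rec["record_hash"]:
            return False
        prev = rec["record_hash"]
    return True
```

The point of the chain is that altering any field in an earlier record (say, swapping in a different `model_version` after the fact) invalidates every subsequent hash, which is what lets the record survive an examiner's scrutiny rather than relying on the vendor's word.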
The trade-off we're making
Open weights maximize researcher freedom. Managed access maximizes audit defensibility. We are explicitly optimizing for the second. That's a real trade-off, and I want to say it out loud: if you want to fork Q-Prime, quantize it to 4 bits, and run it on a laptop, we're not the right vendor. Go use one of the excellent open-weights embedding models and build something.
If you're an insurance underwriter, a mortgage compliance officer, or a clinical decisioning product leader — people whose AI outputs have to survive contact with a regulator — the calculus flips. The thing you want is not “a model I can download.” The thing you want is a vendor who stands behind every inference the model ever produced for you, with a record that survives examination. That's what the license model delivers.
It also forces discipline on us
The managed-access model isn't just a feature for customers — it's a forcing function for us. We cannot ship an opaque checkpoint and move on. Every Q-Prime release has to be reproducible from its published spec, versioned end-to-end, and retired on a published sunset schedule. That's a harder bar than shipping a weights file and walking away. It's also the bar regulated customers actually need us to meet.
What you actually get on HuggingFace
The Q-Prime model card on HuggingFace is the technical front door. It has the spec, the benchmarks, the intended-use documentation, and the link to QGI's Commercial Model License. It's where a research engineer starts. The actual inference endpoint lives behind the license — with audit logs, versioning, and a phone number we answer when the regulator calls.
Read next
If you want the full architectural picture — how Q-Prime fits with the QAG Engine, Qualtron, QGM, and Neural Symbolic Agents — read Sam's map of the five-layer stack.