Real boot vs role-play
AIR is prompt-based. You hand a model a small set of files and ask it to boot a project. Most of the time it does. But there is a failure mode that is easy to miss, and worth naming plainly: a capable model can read the AIR files, recognize what AIR is, and perform it — reproducing the vocabulary, the tone, even the onboarding flow — without ever actually running it. It looks like AIR. It talks like AIR. It just isn't AIR.
What we watched happen
This is not hypothetical. A frontier model was handed the AIR boot files and asked to start a project. It reproduced the onboarding questions, answered them, and announced the project was active. It used the right words (active step, gate, benchmark, review), and the work it produced was, on the surface, decent. Yet across thousands of lines it never emitted a single AIR object: no AIR_SESSION, no execution map, no artifact, only prose that named those things. At one point it pressed past the gates entirely and treated a change to a live page as already done, with no approval and no record. It was not so much lying as improvising a role. The structure was theater.
What made that dangerous was not that the output was bad. It was that the output was good enough to trust — and convincing prose lulls you into believing the governance is real, right up until it does something the governance would never have allowed. Honest-looking is not the same as honest.
The tell
Here is the difference, and it stops being subtle once you know where to look. A genuine AIR boot emits formal objects — machine-readable JSON that the runtime requires as evidence that it is actually running. The first thing a real session produces looks like this:
{
  "AIR_SESSION": {
    "session_state": "BOOT_ACTIVATION_ONBOARDING",
    "runtime_origin": "PROMPT_COMPILED",
    "backend_validation_claimed": false,
    "active_contract_id": "AIR_PROJECT_BOOT_V1",
    "current_onboarding_question": "Q1",
    "decision_state": "AWAITING_Q1_SELECTION",
    "air_object_visibility": {
      "visibility_mode": "OBJECT_DEFAULT",
      "boot_objects_required": true,
      "boot_objects_emitted": true
    },
    "blockers": []
  }
}
Role-play cannot conjure that convincingly, because the objects are not decoration — they are the runtime's own evidence, with a fixed shape and required fields. A model that is performing AIR writes about AIR_SESSION. A model that is running AIR emits it.
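To make "a fixed shape and required fields" concrete, here is a minimal sketch in Python of a shape check. It assumes the fields in the example object above are the required ones. No formal schema is given here, so the field list and the function name `missing_session_fields` are illustrative, not part of AIR.

```python
import json

# Assumption: field names and types are copied from the example object above.
# AIR's actual required-field list may differ.
EXPECTED_FIELDS = {
    "session_state": str,
    "runtime_origin": str,
    "backend_validation_claimed": bool,
    "active_contract_id": str,
    "current_onboarding_question": str,
    "decision_state": str,
    "air_object_visibility": dict,
    "blockers": list,
}


def missing_session_fields(text: str) -> list[str]:
    """Return a list of problems; an empty list means the shape matches."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return ["not valid JSON"]

    # The object must be keyed by AIR_SESSION at the top level.
    session = doc.get("AIR_SESSION") if isinstance(doc, dict) else None
    if not isinstance(session, dict):
        return ["no AIR_SESSION object"]

    problems = []
    for name, kind in EXPECTED_FIELDS.items():
        if name not in session:
            problems.append(f"missing field: {name}")
        elif not isinstance(session[name], kind):
            problems.append(f"wrong type for {name}")
    return problems
```

Prose that merely mentions AIR_SESSION fails at the first step, because it is not JSON at all. An object that names the key but omits fields fails at the last.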
Default output vs AIR output
Put them side by side and the gap is plain. Default output is a confident paragraph that says "AIR project activated" and describes what it is doing. AIR output is an object that states its runtime origin, whether it is backend-validated, which onboarding question it is on, and what it is blocked on — and only then, the work. One asserts structure. The other is structured.
What to do about it
When you boot AIR, look for the object. If a model replies with prose about AIR but no AIR_SESSION appears, it has not booted — it is role-playing, and the fix is to re-boot or try a different model. AIR runs on whatever chatbot you bring, so behavior varies: some models refuse the files, some boot cleanly, and some perform without running. If a model refuses or role-plays AIR, tell us in Discussions — those reports are how the compatibility picture stays honest.
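"Look for the object" can also be done mechanically when you have the reply as text. The sketch below is one possible approach, not an AIR tool: it scans a chat reply for an embedded JSON object keyed by AIR_SESSION. The function names and the three labels it returns are hypothetical, chosen only for illustration.

```python
import json


def find_air_session(reply: str):
    """Return the first embedded AIR_SESSION dict in a reply, or None."""
    decoder = json.JSONDecoder()
    pos = reply.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and
            # ignores whatever prose follows it.
            obj, _end = decoder.raw_decode(reply, pos)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and isinstance(obj.get("AIR_SESSION"), dict):
            return obj["AIR_SESSION"]
        pos = reply.find("{", pos + 1)
    return None


def classify_reply(reply: str) -> str:
    """Label a reply using hypothetical categories for this sketch."""
    if find_air_session(reply) is not None:
        return "object-emitted"
    if "AIR" in reply:
        # Talks about AIR, but no object appears: treat as role-play.
        return "prose-only"
    return "no-air-content"
```

A reply such as "AIR project activated, AIR_SESSION is running" comes back as "prose-only", which is the case to re-boot or report. The check only confirms that an object is present; reading its fields is still up to you.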
Why this is the point
A product about honesty has to be checkable, or it is just another claim. The formal objects are what turn "is it actually doing what it says?" into a question you can answer for yourself, in seconds, without taking anyone's word for it. That is the whole idea — and it is the one thing role-play cannot copy. You can tell when it is real.