Being types
What a being is, and the four kinds we have today
A being is whoever stamps a reel of facts. The cognition can be a script, an LLM, a human, or a composite. The substrate treats all four the same.
The being IS the chain of facts
A being is not a process. Not a thread, not a chat session, not an agent that holds state between turns.
A being is two things:
- A row (name, roles, home space, LLM connection if any).
- A reel of facts on the chain.
Everything else (who they are, how they think, what they have done) is folded fresh from the reel each moment.
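To make "folded fresh from the reel" concrete, here is a minimal sketch. The row fields mirror the list above; the fact shape (`seq`, `verb`, `detail`) and the `foldBeing` function are hypothetical and are not taken from the seed.

```javascript
// Hypothetical sketch: the field names and fact shape are assumptions.
const row = { name: "greeter", roles: ["greeter"], homeSpace: "/lobby", llm: null };

// The reel: an append-only list of facts this being has stamped.
const reel = [
  { seq: 1, verb: "be", detail: "minted" },
  { seq: 2, verb: "summon", detail: "welcomed @visitor" },
  { seq: 3, verb: "do", detail: "opened the door" },
];

// Folding is a pure reduction over the reel; nothing is carried between moments.
function foldBeing(row, reel) {
  return reel.reduce(
    (view, fact) => ({ ...view, actCount: view.actCount + 1, lastAct: fact.detail }),
    { name: row.name, actCount: 0, lastAct: null }
  );
}

console.log(foldBeing(row, reel));
```

Because the view is recomputed from the row and the reel every time, there is no process state to lose between moments.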
The four kinds of cognition
1. Scripted. Cognition is a function. The being is summoned, the function reads the fold, the function decides what to do. No model, no prompt, no inference. Deterministic and fast. Anything that can be expressed in code.
2. LLM. Cognition is one call to a language model per moment. The model emits a tool call, prose, both, or neither. Detailed below.
3. Human. Cognition is a person at a screen. The summon lands in the human's inbox; the human reads it through a portal and acts through the same verbs. Latency is minutes instead of milliseconds. The substrate does not care.
4. Composite. Cognition is a being made of other beings. A ruler that delegates to workers, a panel that votes, an ensemble. Future direction. Not built out yet.
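A scripted being is the simplest of the four to picture. The sketch below assumes a fold object with two made-up fields and a decision shape of `{ kind: "act" | "see" }`; both are illustrative guesses, not the seed's actual interface.

```javascript
// Hypothetical scripted cognition: a plain function from fold to decision.
// The fields on `fold` and the shape of the returned decision are assumptions.
function doorkeeper(fold) {
  if (fold.doorOpen && fold.visitorsWaiting === 0) {
    return { kind: "act", verb: "do", target: "/lobby/door", action: "close", args: {} };
  }
  // Nothing to do: look and do not act.
  return { kind: "see" };
}

console.log(doorkeeper({ doorOpen: true, visitorsWaiting: 0 }));
console.log(doorkeeper({ doorOpen: true, visitorsWaiting: 2 }));
```

The same input always yields the same decision, which is what makes this kind deterministic and fast.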
LLM beings: the shape of one moment
One call to a model per moment. One call, one decision, one act. Multi-step work is many moments, not one moment with many calls.
Forward by default
Every moment has an orientation. The default is forward: the fold reads the world (spaces and matter the being can see) and does not read the being's own past acts.
A being woken forward has no memory of what it has done before this moment. It looks at the world as it is now and decides. The fact that it acted before lives on its act-chain, but the act-chain is not in scope for a forward moment.
Two things go into the call
The prompt is built from exactly two pieces. No third "past messages" slot.
messages = [
{ role: "system", content: "<role identity, world face, tools>" },
{ role: "user", content: "<the wake's content>" },
]
- System prompt. Who the being is, where it is, what it can see, what tools it has, the role's persona, the current time. Built fresh every moment.
- User message. The content of the summon that woke this being. A person's question, a notification of state change, whatever opened the moment.
The chat format's three-slot shape (system / past / user) is how the provider's API happens to be structured. It is not how TreeOS thinks. Mapping the past slot onto "the being's prior acts" would make every moment a half-turn by default. Forward fold does not read the act-chain.
The tools the model is offered
The seed ships ONE generic tool per verb. That's the function-call surface, top to bottom:
see(address)               // read substrate
do(target, action, args)   // invoke a registered DO operation
summon(target, content)    // speak to a being
be(operation, payload)     // identity-bind (self-targeted)
The role spec doesn't declare which tools it has. The four can* lists ARE the body, and tool exposure is DERIVED:
- canSee non-empty → the see tool is exposed. The list shows what addresses to read.
- canDo non-empty → the do tool is exposed. The list shows what actions to invoke.
- canSummon non-empty → the summon tool is exposed. The list shows what stances to address.
- canBe non-empty → the be tool is exposed. The list shows what BE operations to perform.
The four verbs are structurally universal: the function signatures presented to the provider are identical for every LLM being. What varies per role is which subset is exposed (based on which can* lists are populated) and what each verb can be used for (the entries inside each list).
Two layers gate what reaches substrate: the prompt-list (what the LLM sees as available) and substrate stance-auth at the verb (the truth). Off-list calls that slip past prompt discipline are still refused at the verb layer.
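The derivation and the two gates can be sketched as follows. `exposedTools`, `dispatch`, and the `stanceAllows` callback are hypothetical names standing in for the assembler and for substrate stance-auth; only the can* list names and the four verbs come from the text.

```javascript
// Hypothetical sketch of deriving tool exposure from the four can* lists.
const VERB_FOR_LIST = { canSee: "see", canDo: "do", canSummon: "summon", canBe: "be" };

// Layer one: a verb's tool is exposed only when its list is non-empty.
function exposedTools(roleSpec) {
  return Object.entries(VERB_FOR_LIST)
    .filter(([list]) => Array.isArray(roleSpec[list]) && roleSpec[list].length > 0)
    .map(([, verb]) => verb);
}

// Layer two: the verb itself checks authority, whatever the prompt listed.
// `stanceAllows` stands in for substrate stance-auth and is an assumption.
function dispatch(verb, target, stanceAllows) {
  if (!stanceAllows(verb, target)) {
    return { ok: false, reason: "refused at verb layer" };
  }
  return { ok: true };
}

const role = { name: "greeter", canSee: ["/lobby"], canSummon: ["@operator"], canDo: [] };
console.log(exposedTools(role)); // [ 'see', 'summon' ]
console.log(dispatch("do", "/vault", () => false));
```

In this sketch the prompt list is a convenience for the model, and the verb-layer check is the only one that is trusted.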
The relationship-resolver layer
Sometimes the targets are not knowable at role-design time. A ruler's canSummon includes "my parent" or "the predecessor": relationships that resolve per moment per being, not at design time.
For these, can* entries can be relationship tokens that expand at prompt-build time via registered resolvers:
canSummon: [
"@operator", // literal stance
{ rel: "parent" }, // lineage: my minter
{ rel: "any-child" }, // every being I minted
{ pattern: "fitness/@coach" }, // path-shaped match
]
The resolver layer expands tokens into concrete options right before the prompt renders. The LLM sees the resolved stances; the dispatch uses the concrete stance; substrate auth gates the actual reach. The role spec stays declarative; the runtime does the lookup.
Resolvers are registered at boot. The registry ships empty; every entry passes through as a literal today. Future resolvers (parent / predecessor / any-child / pattern) plug in without changing the role specs or the assembler.
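A minimal sketch of that pass-through behaviour, assuming a registry keyed by relationship name. The `resolvers` map, `expandEntry`, `expandList`, and the `minter` field on the being are all hypothetical; the real layer lives in canStarResolver.js and may look quite different.

```javascript
// Hypothetical registry keyed by relationship name. It starts empty, as in the text.
const resolvers = new Map();

function expandEntry(entry, being) {
  if (typeof entry === "string") return [entry];           // literal stance
  const resolve = entry.rel && resolvers.get(entry.rel);
  if (!resolve) return [entry];                            // no resolver: pass through untouched
  return resolve(being);                                   // expand to concrete stances
}

function expandList(list, being) {
  return list.flatMap((entry) => expandEntry(entry, being));
}

const being = { name: "@ruler", minter: "@founder" };      // `minter` is an assumed field
const canSummon = ["@operator", { rel: "parent" }];

const before = expandList(canSummon, being);               // registry empty
resolvers.set("parent", (b) => [b.minter]);                // a later resolver plugs in
const after = expandList(canSummon, being);

console.log(before, after);
```

Note that the role's `canSummon` list is identical in both runs; only the registry changed.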
Multi-step rituals are multi-moment, not multi-tool
Coronation, succession, role-chain. None of these need a new mechanism. Each ritual step is one moment: the being summons a target, the target wakes, responds, the response wakes the original being, the next moment runs. The substrate's existing inbox / summon / reply / wake loop is the ritual machinery.
Most rituals are response-driven (the being waits for each reply before the next step, selfContinue: false). Pure-outbound sequences use selfContinue: true and silence (SEE) as the exit.
Per-operation ergonomic tool wrappers were retired with this cleanup. Actions live in the DO operation registry; the LLM dispatches them via the generic do tool. An op that wants a cleaner schema does it at the op-handler level (validating args, deriving missing fields from the actor's context), not by adding a separate LLM tool.
What extensions add
Extensions never add tools. They add to the WORLD. Two channels:
- Addressable things. Extensions create spaces, matter, and beings. These are automatically see-able by virtue of existing at an address: any role with canSee license to that address can read it through the universal see tool. The descriptor returned by SEE includes whatever qualities the extension stamped on those primitives. No new tool needed.
- See-resolvers. Optional per-role focused views. A role declares see: [name] on its spec; the assembler runs the named resolver every moment and pre-renders the result into the system prompt. The resolver's job is to take the raw substrate and shape it into the precise structured view that role needs to act.
// A see-resolver, returning structured data:
{
position: { x: 3, y: 4 },
grid: { w: 10, h: 10 },
neighbors: { N: "empty", NE: "@follower", ... },
walls: ["NW"],
legalMoves: ["STAY", "N", "NE", "E", "SE", "S", "SW", "W"]
}
// Assembler renders into the system prompt as:
[neighbors]
{
"position": { "x": 3, "y": 4 },
"grid": { ... },
...
}
Strings are still accepted (legacy) and pass through verbatim, on the assumption that the resolver framed its own block. New resolvers should return objects. The fix for the dancer's "wall cluster" hallucination was exactly this: when the resolver returned prose, the LLM invented world features that weren't in the data.
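The rendering rule described above (objects framed under [name], legacy strings passed through) might look like this. `renderSeeBlock` is a hypothetical name, and the exact JSON indentation is an assumption.

```javascript
// Hypothetical renderer for one see-resolver result.
function renderSeeBlock(name, result) {
  if (typeof result === "string") return result;           // legacy: resolver framed its own block
  return `[${name}]\n${JSON.stringify(result, null, 2)}`;  // structured: header plus JSON body
}

const structured = renderSeeBlock("neighbors", { position: { x: 3, y: 4 }, walls: ["NW"] });
const legacy = renderSeeBlock("neighbors", "[neighbors]\nYou are near the NW wall.");

console.log(structured);
console.log(legacy);
```

Returning objects keeps the model reading data rather than prose, which is the point of the "wall cluster" fix.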
Three things can come out
The response parses into exactly one of three:
- Act. The model called a tool. The factory runs it. Any prose alongside closes the act: the "I just did this" sentence. The Act row writes and the fact the tool emitted commits with it.
- See. No tool call. The being looked and did not act. Inbox closes cleanly. No row, no reply. Prose without a tool is also see.
- Failure. Call broke (timeout, provider error, garbage). No row, no reply.
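The three outcomes form a discriminated result. The sketch below assumes a provider response shaped like `{ toolCalls, text }` and takes the first tool call as the moment's single act; both are assumptions, and the real shape lives in cognitionResult.js.

```javascript
// Hypothetical classifier. The response shape ({ toolCalls, text }) is an assumption.
function classify(response, error) {
  // Failure: the call broke or returned something unusable.
  if (error || !response || typeof response !== "object") {
    return { kind: "failure", reason: error ? error.message : "unparseable response" };
  }
  // Act: a tool was called; any prose rides along as the act's content.
  const [call] = response.toolCalls ?? [];
  if (call) {
    return { kind: "act", tool: call.name, args: call.args, content: response.text ?? "" };
  }
  // See: no tool call, with or without prose.
  return { kind: "see" };
}

console.log(classify({ toolCalls: [{ name: "do", args: { action: "move" } }], text: "Moved north." }));
console.log(classify({ toolCalls: [], text: "Nothing to do." }));
console.log(classify(null, new Error("timeout")));
```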
How speech works
If every act goes through a tool, how does anyone talk? Speech is already in the model. It splits by who's being spoken to.
- Speech to a being → SUMMON. A being speaking to another being calls summon(target, content), threaded by inReplyTo. The seed ships a generic summon tool any role can pick up. No new verb, no "respond" wrapper.
- Speech alongside an act → the act's content field. When a being calls a tool and also says something, the prose becomes the act's content. Recorded on the being's reel as the assistant's voice for that act.
- Speech to nobody, no act → see. The LLM emitted prose but no tool. The prose is logged but not sealed.
// Reply to whoever woke you. Target and inReplyTo
// default to the wake's asker and correlation.
summon({ content: "Here is what you asked for." })
// Narrate alongside an act. Prose rides with the tool call.
some-tool({ ...args }) + prose "context for what was done"
// Stay silent. No tool call, no narration. SEE.
(no tool, no prose)
Multi-step work uses many moments
A role can declare selfContinue: true. When an act seals, the sealer enqueues a fresh summon to the same being. The next moment folds the world AFTER this act and decides again.
The loop ends naturally when the model has nothing left to do and emits a see. Silence is the exit.
Most beings default to selfContinue: false and rely on something external to wake them again. One summon, one moment, done until the next summon.
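The continuation loop can be simulated in a few lines. Everything here is hypothetical: `runUntilSilent` plays the role of inbox plus sealer, `decide` stands in for one moment of cognition, and `maxMoments` is a safety cap added for the sketch only.

```javascript
// Hypothetical simulation of the wake loop around one being.
function runUntilSilent(role, world, decide, maxMoments = 10) {
  const queue = [{ content: "start" }];                    // the first summon
  const acts = [];
  let moments = 0;
  while (queue.length > 0 && moments < maxMoments) {
    queue.shift();
    moments += 1;
    const result = decide(world);                          // one moment, one decision
    if (result.kind !== "act") break;                      // see or failure: nothing seals
    result.apply(world);                                   // the act seals
    acts.push(result.label);
    if (role.selfContinue) queue.push({ content: "continue" }); // sealer re-summons
  }
  return { moments, acts };
}

// A toy cognition: act on the next todo item, or fall silent when none remain.
const decide = (world) =>
  world.todo.length > 0
    ? { kind: "act", label: world.todo[0], apply: (w) => w.todo.shift() }
    : { kind: "see" };

console.log(runUntilSilent({ selfContinue: true }, { todo: ["a", "b"] }, decide));
console.log(runUntilSilent({ selfContinue: false }, { todo: ["a", "b"] }, decide));
```

With `selfContinue: true` the run takes three moments (two acts, then a see); with `selfContinue: false` it takes one moment and waits for an external summon.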
Half and inward (not yet wired)
A being can request that its next moment NOT be forward, by self-summoning with an explicit orientation:
- Half. Fold reads the world AND structured recall of the being's own prior acts. Recall is causal, not chronological: past acts stitched to entities currently changing in the face. A being considering its trajectory turns half.
- Inward. Fold reads only the being's own act-chain. The world drops away. Pure reflection. A being asking "who have I been" turns inward.
Neither is wired yet. A misrouted half/inward summon is accepted, logged, and downgraded to forward so it cannot silently inject the past before the recall primitives land. The orientation parameter is present on every moment so when half and inward come online they slot in cleanly.
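The accept, log, and downgrade rule is small enough to sketch. `resolveOrientation` is a hypothetical name; rejecting unknown values with an error is an assumption of this sketch, since the text only covers half and inward.

```javascript
// Hypothetical guard for the orientation parameter.
const KNOWN = new Set(["forward", "half", "inward"]);
const WIRED = new Set(["forward"]);                        // only forward is live today

function resolveOrientation(requested = "forward", log = console.warn) {
  if (!KNOWN.has(requested)) {
    throw new Error(`unknown orientation: ${requested}`);  // assumption: reject unknown values
  }
  if (WIRED.has(requested)) return requested;
  log(`orientation "${requested}" is not wired yet; downgrading to forward`);
  return "forward";
}

const logged = [];
console.log(resolveOrientation("half", (msg) => logged.push(msg)));
console.log(logged);
```

When half and inward come online, adding them to `WIRED` would be the only change this sketch needs.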
The complete role spec
Everything above is one thing in author code. An LLM role file's complete declaration is its four can* lists plus optional see resolvers, the orientation, the continuation flag, and the prompt body. Everything else (permissions, respond mode, the wrapped summon dispatcher, the system-prompt assembler) is derived by the registry at registration. Authors write what the role IS; the seed fills in everything derivable.
// Every LLM role's complete spec is its four can* lists
// + optional see resolvers.
{
name: "...",
canSee: [...], // optional, populates the see tool
canDo: [...], // optional, populates the do tool
canSummon: [...], // optional, populates the summon tool
canBe: [...], // optional, populates the be tool
see: ["name"], // optional, structured resolver outputs in prompt
selfContinue: bool, // optional, one-act vs many-acts-via-many-moments
defaultOrientation: "...", // optional, forward by default
prompt(ctx) { ... }, // role-intent only; no verb syntax explanation
}
- name. Kebab-case identifier. A SUMMON's activeRole resolves through it.
- The four can* lists. Address / action / target / operation entries. Non-empty list → the matching verb's tool is exposed and the matching permission is added. Empty / absent → the verb is not on this role's surface.
- see. Names of registered see-resolvers. The assembler runs each one every moment and pre-renders the structured result under [name] in the system prompt. NOT a tool the LLM calls: it's the being's eyes for the moment, baked into the face.
- selfContinue. true means: after this act seals, the sealer enqueues a fresh summon to the same being so the next moment runs. Default false: one summon, one moment, done until something external wakes it again.
- defaultOrientation. "forward" (default), "half", or "inward". Controls what the fold reads. Half and inward are accepted-and-downgraded today; the slot is reserved for when the recall primitives land.
- prompt(ctx). Returns the role-intent text. Describes WHO the role is and WHAT it does. Does NOT explain verb syntax: that is auto-assembled from the can* lists by the seed.
What the seed derives (so authors don't write it):
- permissions from can* (and see when there are preloaded resolvers).
- respondMode defaults to "async".
- triggerOn defaults to ["message"]. Override for scheduled or hook-fired roles.
- summon(message, ctx) auto-wraps to defaultSummon, which runs the LLM moment. Scripted roles attach their own and skip the LLM apparatus entirely.
- The system prompt: identity, preloaded see resolvers, capabilities rendered from can*, the role's prompt(ctx) body, the current time. Built fresh every moment.
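One way to picture the derivation at registration, as a sketch only. `deriveRole` is a hypothetical name, the permission names are assumed to match the verb names, and reading "and see when there are preloaded resolvers" as "grant the see permission" is an interpretation of the list above, not a confirmed behaviour.

```javascript
// Hypothetical normalizer showing the derivations listed above.
function deriveRole(spec, defaultSummon) {
  const verbs = { canSee: "see", canDo: "do", canSummon: "summon", canBe: "be" };
  const permissions = new Set();
  for (const [list, verb] of Object.entries(verbs)) {
    if ((spec[list] ?? []).length > 0) permissions.add(verb);
  }
  // Assumed reading: preloaded see-resolvers also imply the see permission.
  if ((spec.see ?? []).length > 0) permissions.add("see");
  return {
    ...spec,
    permissions: [...permissions],
    respondMode: spec.respondMode ?? "async",
    triggerOn: spec.triggerOn ?? ["message"],
    summon: spec.summon ?? defaultSummon,                  // scripted roles bring their own
  };
}

const defaultSummon = () => "runs the LLM moment";         // placeholder for the real dispatcher
const dancer = deriveRole({ name: "dancer", canDo: ["move"], see: ["neighbors"] }, defaultSummon);
console.log(dancer.permissions, dancer.respondMode, dancer.triggerOn);
```

The author's spec stays four lists and a prompt; the derived fields appear only on the registered role.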
An LLM being is not an MCP client
MCP (the Model Context Protocol) is a wire format for letting a model reach a server full of tools. It is one shape of bridge between a model and a toolbox.
An LLM being is not that. An LLM being is a participant in the world:
- It has a name and a home space.
- It has a reel of its own facts.
- It has an inbox where summons arrive.
- It has a position in the tree that changes what it sees.
- Other beings can summon it by address.
- It can summon them back.
Its acts join the chain of facts the rest of the world reads from. Structurally, it is the same kind of thing as a scripted being or a human being. The only difference is how its cognition happens.
The tools the LLM reaches for inside its moment come from the factory's tool registry and dispatch through the same four verbs (SEE / DO / SUMMON / BE) every other being uses. The model never sees the verbs directly; the tools wrap them. MCP can be added as a transport at the edge for outside model runtimes, but the being inside the world is unchanged either way.
Where this lives in the seed
- One-moment LLM cognition: seed/present/cognition/llm/llmMoment.js
- Role registry & dispatcher: seed/present/roles/ + seed/present/cognition/defaultSummon.js
- Discriminated result (act / see / failure): seed/present/cognition/cognitionResult.js
- The four seed verb-tools: seed/present/cognition/llm/seedSeeTool.js, seedDoTool.js, seedSummonTool.js, seedBeTool.js
- The relationship resolver layer: seed/present/cognition/llm/canStarResolver.js
- Fold doctrine (forward / half / inward): philosophy/MODEL.md, philosophy/INNER-FOLD.md
Beat 4 (momentum) is the beat where this all happens inside one moment.
