Design¶
A reusable architecture for building prompts from modular state, templates, and runtime modes. Domain text, model provider, and output shape are left to the caller.
Prompt generation is a deterministic pipeline with controlled stochastic choices. The library separates what is known (registry sections), what's chosen for this call (selections + runtime modes), how the prompt is assembled (assembly_order tokens), and how outputs are cleaned and validated (output policy).
Goals¶
- Build prompts from composable parts rather than one large hardcoded string or a deep tree of decorators.
- Keep the JSON the editor produces literally the JSON the runtime loads — no codegen, no schema drift.
- Make conditional text behave: an unfilled template variable should
drop the surrounding sentence, not leave a
{var}in the output. - Make randomness visible and reproducible: every random pick (which item, which entries from an array, which slider value) is per-field, controllable, and deterministic when seeded.
One abstraction, repeated¶
A section is { required: bool, template_vars: [str], items: [obj] }.
That shape covers everything that used to be five different concepts:
| Old concept | Now |
|---|---|
| Routes | optional top-level routes map (overrides) |
| Overlays | sections + template_vars + fragments |
| Injections (stacked) | static_injections, runtime_injections |
| Persona presets | personas section, items have id, context |
| Few-shot pools | examples section, items have items[] |
The studio renders one card per section with the same affordances
(+ Add form, runtime-mode dropdown, value editor) regardless of which
section it is.
The token language¶
assembly_order is a list of tokens. Three forms:
section— bare. Renders the selected item's primary text field (textfor most sections,contextforpersonas/sentiment). Multi-select sections render each chosen item.section.field— dotted. Renders that field of the selected item. When the item doesn't have the field but does have anitems[]array, the resolver falls back to the array (soprompt_endings.<anything>Just Works for pool-shaped sections).section[expr]— bracket. Evaluatesexprrecursively, then looks up an item insectionwhosename/idmatches the result. Useful for cross-section indirection likeexamples[sentiment.example_pool].
Singular aliases (persona → personas, injections → static_injections,
ending → prompt_endings) read naturally inside an assembly_order.
Glue rules¶
- Consecutive same-section tokens join with
\n. - Different sections join with
\n\n. - Adjacent list tokens that share the same
pre_contextheading merge into one bulleted list — so a sentiment-specific examples list and a generic examples pool produce one heading + one combined bullet list. prompt_endingsis structurally trailing: it never absorbs into the preceding merged block.
Per-array runtime modes¶
Any array field on a selected item (nudges, examples, pool items,
base_directives) supports four modes:
all— render every entry.none— skip entirely. The token drops out of the assembly.index:N— render only the Nth entry.random:K— pick K fresh random entries every hydrate. Seedable.
Modes attach to (section, field) not (item, field) — when you switch
items, the section's modes still apply to the new selection.
A bullet list is rendered when pre_context is present or when there
are 2+ entries. A single entry without pre_context renders as a plain
line, so random:1 on sentiment.nudges produces a single sentence
instead of - one item.
Conditional fragments¶
base_context items (and any other section that wants them) can carry
a fragments array beside text:
{
"name": "scene",
"text": "You're watching a streamer.",
"fragments": [
{ "if_var": "location", "text": "Currently at {location}." },
{ "if_var": "sublocation", "text": "Specifically: {sublocation}." }
]
}
Each fragment renders with {var} substitution. If if_var is set and
the template-var has no value at runtime, the fragment is dropped — so
an unfilled {sublocation} doesn't leave a broken sentence behind.
Selections vs evaluation¶
The studio tracks two related but distinct things:
- Selections — what the user picked in the UI (dropdown values, checked boxes). Stable; doesn't move under you.
- Evaluated state — what hydrate actually uses. Folds in
section_randomtoggles (re-roll the item) andslider_randomtoggles (re-roll the slider).
Engine.hydrate() and Engine.run() both go through evaluation. The
inline editor reads only selections so what you see is what you set.
Runtime injections¶
Items in runtime_injections are special: each carries an
include_sections[] whitelist. When at least one injection is enabled
(checked in the studio), the assembled prompt is filtered to only
the sections in the union of those whitelists, plus the injection's own
text appended at the end.
Use case: a "raid" injection blanks out base context, sentiment, examples, etc., and substitutes one tight reaction prompt — without needing a separate route.
Output policy¶
OutputPolicy is unchanged from the previous library: min_length,
max_length, strip_prefixes, strip_patterns, forbidden_substrings,
forbidden_patterns, require_patterns, append_suffix,
collapse_whitespace. Lives at the registry top level and on each
optional route. Engine.run() cleans + validates the model response and
retries up to generation.retries times when validation fails.
Merge semantics¶
Generation config route overrides replace registry-level values
key-by-key. Output policy has two layers of behavior: the
OutputPolicy.merged_with helper is additive for sequence fields, but
Engine.run() currently applies route policy with a dictionary update
before constructing the policy object.
- Sequence fields (
strip_prefixes,strip_patterns,forbidden_substrings,forbidden_patterns,require_patterns) are additive when passed throughOutputPolicy.merged_withdirectly. InEngine.run()route policy replaces the registry value for the same field, because the route dictionary is applied first. - Scalar fields (
max_length,min_length,append_suffix,collapse_whitespace) replace.
GenerationConfig.merged_with is straight key-by-key replace; setting
temperature on a route just overrides it.
If you need registry-level sequence rules and route-level sequence rules
to both apply under Engine.run(), include both sets on the route.
Optional routes¶
A registry can declare named routes that override assembly_order
(and optionally generation / output_policy):
"routes": {
"raid": { "assembly_order": [...] },
"short": { "assembly_order": [...], "generation": { "max_tokens": 64 } }
}
Engine.run(state, route="raid") picks one. With no routes, the
top-level fields are the only path. Routes are optional structural
sugar — the same effect can be achieved by swapping which registry you
load.
Item metadata fields¶
Two fields appear on items in canonical exports but aren't consumed by the engine — they're authoring metadata for the studio:
runtime_variables: list[str]on items declares which template vars the item expects the caller to fill. Useful for documentation, for the studio to flag missing inputs, and for validators that want to check coverage.Engine.hydratedoesn't enforce it; missing vars just substitute as empty strings.pre_context/pre_context:— string heading that prefixes list-shaped content (items/examplesarray fields). The trailing colon spelling is tolerated for legacy data; canonical ispre_context.
What lives in the registry vs in the call¶
In the registry:
- All authored content (items, fragments, pre_context, template_vars)
- Default selections / array modes / sliders (when you
Export Model JSON) - Generation overrides + output policy
In the per-call state:
- Runtime template-var values (
{location}→ "the kitchen") - Selections and modes the caller wants to override
The same JSON your editor exports is loadable as a self-contained model; the studio also lets you save snapshots in the browser's localStorage for quick switching during authoring.
What the engine does NOT do¶
- It does not parse free-form natural language descriptions of prompts.
- It does not multiplex providers within a single call.
- It does not "chain" — that's a higher-level concern. Compose by
calling
Engine.runmultiple times with different states. - It does not validate semantic correctness of generated text beyond
what
OutputPolicyregex/length checks express.
Why this shape¶
The earlier iteration of the library shipped Routes, Overlays, Injections, Presets, and Pools as separate concepts. They each had their own merge/sample/select rule. Authoring a prompt meant learning four mental models and reasoning about how they composed. The registry collapses all of that into one shape (sections of items + an assembly order) with one set of orthogonal modifiers (runtime modes, fragments, sliders). Less to learn, less to round-trip, less to break.