
promptlibretto

A prompt-engineering library plus a browser studio for designing, tuning, and exporting prompts as portable JSON.

A registry is a flat list of named sections (personas, sentiment, examples, prompt_endings, and so on), each holding a list of items. An assembly_order of tokens (section, section.field, or section[expr]) tells the engine how to weave the selected items into one prompt.

The same JSON the studio exports is the JSON the library loads. No codegen, no schema drift between the editor and the runtime.

[Image: promptlibretto intro]

Install

pip install promptlibretto                # library only
pip install "promptlibretto[ollama]"      # adds httpx for OllamaProvider
pip install "promptlibretto[studio]"      # adds the browser studio
pip install "promptlibretto[dev]"         # pytest + pytest-asyncio

Quick start — library

from promptlibretto import Engine, MockProvider, Registry

reg = Registry.from_dict({
    "registry": {
        "assembly_order": ["output_prompt_directions", "personas.context"],
        "output_prompt_directions": {
            "required": True,
            "items": [{"name": "rules", "text": "Be brief."}],
        },
        "personas": {
            "required": True,
            "items": [
                {"id": "shy",  "context": "You're nervous about speaking."},
                {"id": "loud", "context": "You're full of energy."},
            ],
        },
    }
})

eng = Engine(reg, provider=MockProvider())
print(eng.hydrate(state={"selections": {"personas": "loud"}}))

Output:

Be brief.

You're full of energy.

Quick start — studio

pip install "promptlibretto[studio,ollama]"
promptlibretto-studio --port 8000

Open http://localhost:8000. From there:

  1. Import JSON… to paste a registry, or load one of the bundled examples as a starting point.
  2. Pick selections (dropdowns / checkboxes), set per-array runtime modes (all / none / index:N / random:K), fill template-var inputs, and use the sentiment-intensity slider.
  3. Pre-generate to render the prompt; Generate to send it browser-direct to your local Ollama. The browser-direct path uses the server to hydrate the prompt, but it does not apply Python-side output policy validation or retries.
  4. Export Model JSON copies the canonical registry — selections, runtime modes, slider, generation overrides, all baked in — to load in your app via load_registry() (sketched below).
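
To consume the export in your app, load it back with load_registry (the file name is illustrative; the empty-state fallback follows the defaults described under State shape below):

from promptlibretto import load_registry

eng = load_registry("model.json")   # str / Path / JSON string / dict all accepted
print(eng.hydrate(state={}))        # empty state falls back to the registry's defaults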

[Image: Studio Compose view]

[Image: Builder overview]

State shape

state = {
    "selections":     {section_key: id | [id, ...]},
    "array_modes":    {section_key: {field: "all" | "none" | "index:N" | "random:K"}},
    "section_random": {section_key: bool},
    "sliders":        {section_key: int},
    "slider_random":  {section_key: bool},
    "template_vars":  {"<section>::<var>": "value"},
}

All fields are optional. Selections default to the first item of each required section; array modes default to all.
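
For example, against the quick-start registry above:

state = {"selections": {"personas": "shy"}}
print(eng.hydrate(state=state))
# Be brief.
#
# You're nervous about speaking.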

Hydrate vs run

prompt = eng.hydrate(state=...)             # string, no LLM call
result = await eng.run(state=...)           # hydrate + provider + output policy
result = await eng.run(state=..., route="raid")    # optional route override
async for chunk in eng.stream(state=...):   # streaming providers only
    ...

Seed for reproducibility:

prompt = eng.hydrate(state=..., seed=42)
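
Put together, a complete async call looks like this (reusing the quick-start engine; MockProvider just echoes the prompt back):

import asyncio

async def main():
    result = await eng.run(state={"selections": {"personas": "loud"}}, seed=42)
    print(result.accepted, result.text)

asyncio.run(main())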

Optional routes

from promptlibretto import Route

reg.routes["raid"] = Route(
    assembly_order=["runtime_injections", "output_prompt_directions"],
    generation={"max_tokens": 96, "temperature": 0.4},
    output_policy={"max_length": 280},
)

result = await eng.run(state=..., route="raid")

If a registry has no routes, the top-level assembly_order is the only path.

API reference

Everything lives at the top of the package: from promptlibretto import ….

Model

Registry
Dataclass for the whole model. Registry.from_dict(d) accepts both wrapped ({"registry": {...}}) and bare dicts. reg.to_dict(wrap=True) serializes it back (round-trip sketch after this list). Attributes: version, title, description, assembly_order: list[str], sections: dict[str, Section], routes: dict[str, Route], generation: dict, output_policy: dict.
Section
One section. required: bool, template_vars: list[str], items: list[dict]. May also carry studio-state extras (selected, section_random, array_modes, slider, slider_random, scale_template, template_var_defaults) when round-tripping through Export Model JSON.
Route
Optional override bundle. assembly_order: list[str] | None, generation: dict, output_policy: dict. Only the fields you set override the registry-level defaults at run time.
SCHEMA_VERSION / SECTION_KEYS
Constants. SCHEMA_VERSION == 22 for the current schema. SECTION_KEYS is the canonical ordered tuple of recognized section names.
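
The round trip mentioned above, showing that wrapped and bare dicts are interchangeable:

from promptlibretto import Registry

d = {
    "assembly_order": ["output_prompt_directions"],
    "output_prompt_directions": {
        "required": True,
        "items": [{"name": "rules", "text": "Be brief."}],
    },
}

bare = Registry.from_dict(d)                   # bare dict
wrapped = Registry.from_dict({"registry": d})  # wrapped dict, same result
assert bare.to_dict(wrap=True) == wrapped.to_dict(wrap=True)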

Hydration

HydrateState
Per-call inputs. Either build it directly or use HydrateState.from_dict(d) with the state dict shape:
HydrateState(
    selections     = {"sentiment": "positive", "examples": ["normal"]},
    array_modes    = {"sentiment": {"nudges": "random:1"}},
    section_random = {"personas": True},
    sliders        = {"sentiment": 8},
    slider_random  = {"sentiment": False},
    template_vars  = {"base_context::location": "the kitchen"},
)
hydrate(reg, state, *, route=None, seed=None) -> str
Functional version of Engine.hydrate. Returns the assembled prompt. No provider call. Pass seed for deterministic random rolls.
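
The functional form in use (reg is any loaded Registry):

from promptlibretto import HydrateState, hydrate

state = HydrateState(selections={"personas": "shy"})
print(hydrate(reg, state, seed=42))   # assembled prompt, no provider involved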

Engine

class Engine:
    def __init__(
        self,
        registry: Registry | Mapping | None = None,
        provider: ProviderAdapter | str | None = None,
        output_processor: OutputProcessor | None = None,
    ): ...

    def hydrate(self, state, *, route=None, seed=None) -> str: ...
    async def run(self, state, *, route=None, seed=None) -> GenerationResult: ...
    async def stream(self, state, *, route=None, seed=None) -> AsyncIterator[GenerationChunk]: ...

provider="mock" builds a MockProvider. None does the same and defers — calling run without a provider is fine if you only ever hydrate.

Results

GenerationResult
Returned by Engine.run. Fields: text (cleaned + validated output), accepted: bool, prompt (the exact string sent to the LLM), route: Optional[str], reason: Optional[str] (why a non-accepted result was rejected), usage: Optional[dict], timing: Optional[dict], raw (provider-specific payload).
GenerationChunk
Yielded by Engine.stream. delta: str for incremental text; done: bool for the final chunk; result: Optional[GenerationResult] populated on the final chunk only.
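
A sketch of consuming both, assuming a streaming-capable provider for the second half:

result = await eng.run(state={})
if not result.accepted:
    print("rejected:", result.reason)   # validation failure, if any

async for chunk in eng.stream(state={}):
    print(chunk.delta, end="")
    if chunk.done:
        final = chunk.result            # full GenerationResult, last chunk only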

Generation config

GenerationConfig(
    provider="ollama",      # informational only
    model="default",
    temperature=0.8,
    top_p=None, top_k=None,
    max_tokens=256,
    repeat_penalty=None,
    timeout_ms=60_000,
    retries=1,
    max_prompt_chars=None,
)

Engine builds one of these per call from registry.generation merged with the active route's generation. You don't usually instantiate it directly — set fields on the registry/route JSON instead. max_prompt_chars is stored in the config object but is not enforced by the current engine.
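
In code, the merge looks like this (a sketch; a registry-level "generation" key follows from the Registry attributes listed above, and the values are illustrative):

from promptlibretto import Registry, Route

reg = Registry.from_dict({
    "registry": {
        "assembly_order": ["output_prompt_directions"],
        "output_prompt_directions": {
            "required": True,
            "items": [{"name": "rules", "text": "Be brief."}],
        },
        "generation": {"temperature": 0.8, "max_tokens": 256},  # registry-level defaults
    }
})
reg.routes["fast"] = Route(generation={"max_tokens": 96})
# run(..., route="fast") resolves to temperature=0.8 (registry) + max_tokens=96 (route)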

Output policy

OutputPolicy(
    min_length=None, max_length=None,
    strip_prefixes=(),         # tuple/list of strings
    strip_patterns=(),         # tuple/list of regex strings
    forbidden_substrings=(),
    forbidden_patterns=(),
    require_patterns=(),
    append_suffix=None,
    collapse_whitespace=True,
)

OutputProcessor.clean(text, ctx, policy) strips, collapses whitespace, truncates to max_length, and appends append_suffix. validate(text, ctx, policy) returns a ValidationResult(ok, reason).

Merge semantics on OutputPolicy.merged_with(overrides):

  • Sequence fields (strip_prefixes, strip_patterns, forbidden_substrings, forbidden_patterns, require_patterns) — additive: existing rules survive, overrides extend them.
  • Scalars (max_length, min_length, append_suffix, collapse_whitespace) — replace.

Engine.run() currently applies the route's output policy as a plain dictionary update before constructing OutputPolicy. In practice, that means a route-level sequence field replaces the registry-level value for that field rather than extending it.
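
The practical difference, sketched with plain dicts (values illustrative):

registry_policy = {"strip_prefixes": ["Sure,"], "max_length": 500}
route_policy    = {"strip_prefixes": ["Okay,"], "max_length": 280}

# what Engine.run() does today: a dict update, so sequences are replaced
effective = {**registry_policy, **route_policy}
# -> {"strip_prefixes": ["Okay,"], "max_length": 280}

# what merged_with() would give: sequences extended, scalars replaced
# strip_prefixes -> ("Sure,", "Okay,"), max_length -> 280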

Providers

ProviderAdapter (Protocol)
Anything with async def generate(request: ProviderRequest) -> ProviderResponse. Drop-in: pass an instance to Engine(...); a minimal example follows this list.
StreamingProviderAdapter (Protocol)
Adds def stream(request) -> AsyncIterator[ProviderStreamChunk]. The final chunk has done=True and a response carrying the aggregated text/usage/timing.
supports_streaming(provider) -> bool
True iff provider.stream is callable.
OllamaProvider(base_url, chat_path, payload_shape="auto", client=None)
Talks to Ollama or any OpenAI-compatible chat endpoint. payload_shape defaults to auto-detect from chat_path (presence of /v1/openai). Pass a custom httpx.AsyncClient to client to set timeouts or other transport options.
MockProvider(responder=None, latency_ms=5.0)
Echoes the user prompt back, optionally transformed by responder. Used in tests and demos.
ProviderRequest / ProviderResponse / ProviderStreamChunk
Wire-shape dataclasses. ProviderResponse carries text, usage (prompt_tokens / completion_tokens / total_tokens), timing (total_ms / load_ms / prompt_eval_ms / eval_ms), and raw.
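
A minimal custom adapter as a sketch; the request field name used below (request.prompt) is an assumption, so check it against the wire dataclasses in your installed version:

from promptlibretto import Engine, ProviderResponse

class ShoutProvider:
    # satisfies the ProviderAdapter protocol: any object with this method works
    async def generate(self, request):
        # request.prompt is assumed to hold the hydrated prompt string
        return ProviderResponse(text=request.prompt.upper())

eng = Engine(reg, provider=ShoutProvider())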

Serialization

load_registry(source, provider=None, output_processor=None) -> Engine
source may be a file path (str or Path), a JSON string, or an already-parsed dict. Returns a ready Engine.
export_json(engine_or_registry, *, indent=2) -> str
Pretty-printed JSON wrapped under "registry": { … }. Round-trips through Registry.from_dict.
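
The round trip in code (assuming an empty state dict is accepted, since all state fields are optional):

from promptlibretto import export_json, load_registry

text = export_json(eng)       # accepts an Engine or a Registry
eng2 = load_registry(text)    # JSON string back to a ready Engine
assert eng2.hydrate(state={}) == eng.hydrate(state={})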

Docs

  • Design rationale — how the registry model fits together and why it replaced routes/overlays/injections/presets.
  • Studio & Builder — designing prompts in the browser, the visual registry builder, exporting models, browser-direct LLM, and the registry HTTP API.