Mellea — build predictable AI without guesswork

Mellea helps you manage the unreliable part of every AI-powered pipeline: the LLM call itself. It replaces ad-hoc prompt chains and brittle agents with structured generative programs — Python code where LLM calls are first-class operations governed by type annotations, requirement verifiers, and principled repair loops.
uv pip install mellea
Install Mellea and run your first generative program in minutes.
Build a complete program with generation, validation, and repair.
Runnable examples: RAG, agents, sampling, MObjects, and more.
Full public API — backends, session, components, requirements, sampling.
How Mellea works
Mellea's design rests on three interlocking ideas.
@generative turns a typed function signature into an LLM-backed implementation. Docstrings become prompts. Type hints become output schemas. No DSL required.
Declare what good output looks like with req(). Mellea checks every response before it leaves the session, using LLM verifiers, programmatic checks, or domain-trained adapters.
When a requirement fails, Mellea feeds the failure back and tries again. Rejection sampling, majority voting, and SOFAI are built in.
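The three ideas above can be sketched together in plain Python, without calling Mellea at all. This is a minimal illustration under stated assumptions, not Mellea's implementation: `build_prompt`, `max_five_words`, `fake_llm`, and `generate_with_repair` are hypothetical names, and `fake_llm` is a deterministic stand-in for a real model call.

```python
import inspect
from typing import Callable, get_type_hints


def summarize(text: str) -> str:
    """Summarize the text in at most five words."""
    # No real body: in this sketch the stand-in "LLM" supplies the result.


def build_prompt(func: Callable, **kwargs) -> str:
    """Turn a docstring into a prompt and a return hint into an output type."""
    doc = inspect.getdoc(func)
    return_type = get_type_hints(func)["return"].__name__
    args = ", ".join(f"{key}={value!r}" for key, value in kwargs.items())
    return f"{doc}\nInputs: {args}\nReturn a value of type {return_type}."


def max_five_words(output: str) -> bool:
    """A programmatic requirement: the output has at most five words."""
    return len(output.split()) <= 5


def fake_llm(prompt: str) -> str:
    """Hypothetical model: verbose at first, concise after repair feedback."""
    if "Previous attempt failed" in prompt:
        return "Mellea structures LLM calls"
    return "Mellea is a library that structures calls to large language models"


def generate_with_repair(func, requirements, llm, budget=3, **kwargs):
    """Generate, check every requirement, and retry with failure feedback."""
    prompt = build_prompt(func, **kwargs)
    for attempt in range(1, budget + 1):
        output = llm(prompt)
        failed = [name for name, check in requirements if not check(output)]
        if not failed:
            return output, attempt
        # Feed the failed requirement names back into the next prompt.
        prompt += f"\nPrevious attempt failed: {', '.join(failed)}. Try again."
    raise RuntimeError(f"no valid output after {budget} attempts")


result, attempts = generate_with_repair(
    summarize,
    [("at most five words", max_five_words)],
    fake_llm,
    text="Mellea docs",
)
print(result, attempts)
```

The first attempt fails the word-count requirement, so the loop appends the failure to the prompt and the second attempt passes. A real session would swap `fake_llm` for a backend call and could use an LLM verifier instead of a Python predicate.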
Key patterns
Add @mify to any class to make it LLM-queryable and tool-accessible without rewriting your data model.
Explicit context threading with push/pop state keeps multi-turn workflows reproducible and debuggable.
ainstruct(), aact(), and token-by-token streaming for production throughput and responsive UIs.
Guardian Intrinsics detect harmful, off-topic, or hallucinated outputs before they reach downstream code.
Best-of-n, SOFAI, majority voting — swap strategies in one line.
@tool, MelleaTool, and the ReAct loop for goal-driven multi-step agents.
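To make the idea of swappable inference-time strategies concrete, here is a small sketch of majority voting and best-of-n as interchangeable selection functions. It does not use Mellea's sampling classes: `sample`, `majority_vote`, `best_of_n`, and `fake_llm` are hypothetical names, and scoring candidates by length is only a placeholder for a real scorer.

```python
import itertools
from collections import Counter
from typing import Callable


def majority_vote(candidates: list[str]) -> str:
    """Return the most frequent candidate; the first one seen wins ties."""
    return Counter(candidates).most_common(1)[0][0]


def best_of_n(candidates: list[str], score: Callable[[str], float] = len) -> str:
    """Return the highest-scoring candidate (length is a placeholder score)."""
    return max(candidates, key=score)


def sample(llm: Callable[[str], str], prompt: str, n: int, select) -> str:
    """Draw n candidates from the model, then let the strategy pick one."""
    candidates = [llm(prompt) for _ in range(n)]
    return select(candidates)


# Hypothetical model that cycles through three fixed answers.
_answers = itertools.cycle(["42", "forty-two", "42"])


def fake_llm(prompt: str) -> str:
    return next(_answers)


# Changing the strategy means changing only the `select` argument.
voted = sample(fake_llm, "What is 6 * 7?", n=3, select=majority_vote)
longest = sample(fake_llm, "What is 6 * 7?", n=3, select=best_of_n)
print(voted, longest)
```

Because the strategy is just a function from candidates to a winner, the calling code stays the same when the selection rule changes.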
Backends
Mellea is backend-agnostic. The same program runs on any inference engine.
Local inference, zero cloud costs.
GPT-4o, o3-mini, any OpenAI-compatible API.
AWS Bedrock via Bedrock Mantle or LiteLLM.
IBM WatsonX managed AI platform.
Local inference with Transformers — aLoRA and constrained decoding.
Google Vertex AI, Anthropic, and 100+ providers via LiteLLM.
Use LangChain tools in Mellea sessions or call Mellea from LangChain chains.
See Backends and configuration for the full list of supported backends and how to configure them.
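One common way to keep a program independent of its inference engine is to code against a narrow interface. The sketch below shows that pattern with `typing.Protocol`. It is an assumption about the general design idea, not Mellea's backend API: `Backend`, `EchoBackend`, `UpperBackend`, and `run_program` are hypothetical names, and both backends are toy stand-ins.

```python
from typing import Protocol


class Backend(Protocol):
    """The only thing the program needs from an inference engine."""

    def generate(self, prompt: str) -> str: ...


class EchoBackend:
    """Toy backend that returns the prompt with a prefix."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperBackend:
    """Toy backend that upper-cases the prompt."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


def run_program(backend: Backend, topic: str) -> str:
    """The program itself never mentions a concrete backend."""
    return backend.generate(f"Write a title about {topic}")


outputs = [run_program(backend, "tea") for backend in (EchoBackend(), UpperBackend())]
print(outputs)
```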
How-to guides
Pydantic models, Literal types, and @generative for guaranteed schemas.
Python functions, ValidationResult, and multi-field validation logic.
aact(), ainstruct(), and token-by-token streaming output.
ChatContext, explicit context threading, and multi-session workflows.
Temperature, seed, max tokens, system prompts — cross-backend with ModelOption.
Pass images to instruct() and chat() with any vision-capable backend.
Vector search, LLM relevance filtering, and grounded generation end-to-end.
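As a small illustration of how a Literal type can act as an output schema, the sketch below checks a raw model response against the values a Literal allows. It uses only the standard library and is not how Mellea enforces schemas, which may rely on other mechanisms such as constrained decoding. `Sentiment` and `parse_sentiment` are hypothetical names.

```python
from typing import Literal, get_args

# The set of outputs the caller is willing to accept.
Sentiment = Literal["positive", "negative", "neutral"]


def parse_sentiment(raw: str) -> str:
    """Normalize a raw response and reject anything outside the Literal."""
    value = raw.strip().lower()
    allowed = get_args(Sentiment)
    if value not in allowed:
        raise ValueError(f"{value!r} is not one of {allowed}")
    return value


label = parse_sentiment(" Positive\n")
print(label)

try:
    parse_sentiment("great")
except ValueError as error:
    # A failed check like this is what a repair loop would feed back.
    print("rejected:", error)
```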
GitHub · PyPI · Discussions