Skip to main content

CLI Reference

Mellea command-line tool for LLM-powered workflows.

Provides sub-commands for serving models (m serve), training and uploading adapters (m alora), decomposing tasks into subtasks (m decompose), running test-based evaluation pipelines (m eval), and applying automated code migrations (m fix).

m alora

Train or upload aLoRAs for requirement validation.

m alora add-readme

Generate and upload an INTRINSIC_README.md for a trained adapter.

Uses an LLM to auto-generate documentation for a trained adapter based on the training data and model configuration, then uploads it to the Hugging Face Hub repository.

Prerequisites:

  • Hugging Face CLI authenticated (huggingface-cli login).
  • An LLM backend available for README generation.
m alora add-readme <DATAFILE> --basemodel <value> [--promptfile] --name <value> [--hints] [--io-yaml]

Arguments:

NameTypeRequiredDescription
DATAFILEtextyesJSONL file with item/label pairs

Options:

FlagTypeDefaultDescription
--basemodeltextrequiredBase model ID or path
--promptfiletextPath to load the prompt format file
--nametextrequiredDestination model name (e.g., acme/carbchecker-alora)
--hintstextFile containing any additional hints.
--io-yamltextLocation of the io.yaml file that configures input and output processing if the model is invoked as an intrinsic.

Output: Generates a README.md file, displays it for confirmation, and uploads it to the Hugging Face Hub repository specified by --name.

Example:

m alora add-readme data.jsonl --basemodel ibm-granite/granite-3.3-2b-instruct --name acme/my-alora

See also: Lora and Alora Adapters

m alora train

Train an aLoRA or LoRA adapter on a labelled dataset.

Fine-tunes a base causal language model using a JSONL dataset of item/label pairs. Supports both aLoRA (asymmetric LoRA) and standard LoRA adapters.

Prerequisites:

  • Mellea installed with adapter extras (uv add mellea[adapters]).
  • A CUDA, MPS, or CPU device available for training.
m alora train <DATAFILE> --basemodel <value> --outfile <value> [--promptfile] [--adapter] [--device] [--epochs] [--learning-rate] [--batch-size] [--max-length] [--grad-accum]

Arguments:

NameTypeRequiredDescription
DATAFILEtextyesJSONL file with item/label pairs

Options:

FlagTypeDefaultDescription
--basemodeltextrequiredBase model ID or path
--outfiletextrequiredPath to save adapter weights
--promptfiletextPath to load the prompt format file
--adaptertextaloraAdapter type: alora or lora
--devicetextautoDevice: auto, cpu, cuda, or mps
--epochsinteger6Number of training epochs
--learning-ratefloat6e-06Learning rate
--batch-sizeinteger2Per-device batch size
--max-lengthinteger1024Max sequence length
--grad-accuminteger4Gradient accumulation steps

Output: Saves adapter weights to the path specified by --outfile. The output directory contains an adapter_config.json and the trained weight files, ready for upload or local inference.

Example:

m alora train data.jsonl --basemodel ibm-granite/granite-3.3-2b-instruct --outfile ./adapter

See also: Lora and Alora Adapters

m alora upload

Upload a trained adapter to a remote model registry.

Pushes adapter weights to Hugging Face Hub, optionally packaging the adapter as an intrinsic with an io.yaml configuration file.

Prerequisites: Hugging Face CLI authenticated (huggingface-cli login).

m alora upload <WEIGHT_PATH> --name <value> [--intrinsic] [--io-yaml]

Arguments:

NameTypeRequiredDescription
WEIGHT_PATHtextyesPath to saved adapter weights

Options:

FlagTypeDefaultDescription
--nametextrequiredDestination model name (e.g., acme/carbchecker-alora)
--intrinsicbooleanfalseTrue if the uploaded adapter implements an intrinsic. If true, the caller must provide an io.yaml file.
--io-yamltextLocation of the io.yaml file that configures input and output processing if the model is invoked as an intrinsic.

Output: Creates or updates a Hugging Face Hub repository at the name specified by --name and uploads the adapter weight files.

Example:

m alora upload ./adapter --name acme/my-alora

See also: Lora and Alora Adapters

m decompose

Utility pipeline for decomposing task prompts.

m decompose run

Break a complex task into ordered, executable subtasks.

Reads user queries from a file or interactive input, runs the LLM-driven decomposition pipeline for each task job, and writes one JSON file, one rendered Python script, and any generated validation modules under a per-job output directory.

Prerequisites:

  • Mellea installed (uv add mellea).
  • An Ollama instance running locally, or an OpenAI-compatible endpoint configured via --backend-endpoint.
m decompose run --out-dir <value> [--out-name] [--input-file] [--model-id] [--backend] [--backend-req-timeout] [--backend-endpoint] [--backend-api-key] [--version] [--input-var] [--log-mode] [--enable-script-run]

Options:

FlagTypeDefaultDescription
--out-dirpathrequiredPath to an existing directory to save the output files.
--out-nametextm_decomp_resultName for the output files. Defaults to "m_decomp_result".
--input-filetextPath to a text file containing user queries.
--model-idtextmistral-small3.2:latestModel name/id used to run the decomposition pipeline. Defaults to "mistral-small3.2:latest", valid for the "ollama" backend.
--backendollama | openaiollamaBackend used for inference. Options: "ollama" and "openai".
--backend-req-timeoutinteger300Timeout in seconds for backend requests. Defaults to "300".
--backend-endpointtextBackend endpoint / base URL. Required for "openai".
--backend-api-keytextBackend API key. Required for "openai".
--versionlatest | v1 | v2 | v3latestVersion of the mellea program generator template to use.
--input-vartextOptional user input variable names. You may pass this option multiple times. Each value must be a valid Python identifier.
--log-modedemo | debugdemoReadable logging mode. Options: "demo" or "debug".
--enable-script-runbooleanfalseWhen true, generated scripts expose argparse runtime options for backend, model, endpoint, and API key overrides.

Output: Creates a directory <out-dir>/<out-name>/ containing a JSON decomposition result file, a ready-to-run Python script, and any generated validation modules. One directory per task job.

Example:

m decompose run --out-dir ./output --input-file tasks.txt

See also: M Decompose, Refactor Prompts with Cli

m eval

LLM-as-a-judge evaluation pipelines.

m eval run

Run LLM-as-a-judge evaluation on one or more test files.

Loads test cases from JSON/JSONL files, generates candidate responses using the specified generation backend, scores them with a judge model, and writes aggregated results to a file.

Prerequisites:

  • Mellea installed (uv add mellea).
  • At least one inference backend available (Ollama by default).
  • A separate judge backend/model is recommended but optional (defaults to the generation backend).
m eval run <TEST_FILES> [--backend] [--model] [--max-gen-tokens] [--judge-backend] [--judge-model] [--max-judge-tokens] [--output-path] [--output-format] [--continue-on-error]

Arguments:

NameTypeRequiredDescription
TEST_FILEStextyesList of paths to json/jsonl files containing test cases

Options:

FlagTypeDefaultDescription
--backend, -btextollamaInference backend for generating candidate responses (e.g. ollama, openai)
--modeltextModel name/id for the generation backend; uses backend default if omitted
--max-gen-tokensinteger256Max tokens to generate for responses
--judge-backend, -jbtextInference backend for the judge model; reuses --backend if omitted
--judge-modeltextModel name/id for the judge; uses judge backend default if omitted
--max-judge-tokensinteger256Max tokens for the judge model's judgement.
--output-path, -otexteval_resultsOutput path for results
--output-formattextjsonEither json or jsonl format for results
--continue-on-errorbooleantrueSkip failed test cases instead of aborting the entire run

Output: Writes evaluation results to <output-path>.<output-format> (default eval_results.json). The file contains per-test-case scores, judge verdicts, and aggregate statistics.

Example:

m eval run tests.jsonl --backend ollama --model granite3.3:2b

See also: Evaluate with Llm as a Judge

m fix

Fix code for API changes.

m fix async

Fix async calls for the await_result default change.

Scans Python source files for aact, ainstruct, and aquery calls and applies an automated migration to restore blocking behaviour after the await_result default changed from True to False.

Prerequisites: Mellea installed (uv add mellea).

m fix async <PATH> [--mode] [--dry-run]

Arguments:

NameTypeRequiredDescription
PATHtextyesFile or directory to scan

Options:

FlagTypeDefaultDescription
--mode, -madd-await-result | add-stream-loopadd-await-resultFix strategy to apply
--dry-runbooleanfalseReport locations without modifying files

Modes:

  • add-await-result — (default) Adds await_result=True to each call so it blocks until the result is ready. Use this if you don't need to stream partial results.
  • add-stream-loop — Inserts a while not r.is_computed(): await r.astream() loop after each call. This only works if you passed a streaming model option (e.g. stream=True) to the call; otherwise the loop will finish immediately.

Best practices:

  • Run with --dry-run first to review what will be changed.
  • Only run a given mode once per file. The tool detects prior fixes and skips calls that already have await_result=True or a stream loop, but it is safest to treat it as a one-shot migration.
  • Do not run both modes on the same file. If a stream loop is already present, add-await-result will skip that call (and vice versa).

Detection notes:

  • Most import styles are detected: import mellea, from mellea import MelleaSession, from mellea.stdlib.functional import aact, module aliases, etc.
  • Calls that are already followed by await r.avalue(), await r.astream(), or a while not r.is_computed() loop are automatically skipped, even when nested inside if/try/for blocks.

Output: Modifies Python source files in place (unless --dry-run). Prints a summary of fixed call sites with file paths and line numbers.

Example:

m fix async src/ --dry-run

m fix genslots

Rewrite genslot imports and class names to genstub equivalents.

Scans Python source files and replaces deprecated GenerativeSlot imports and class references with their GenerativeStub replacements.

Prerequisites: Mellea installed (uv add mellea).

m fix genslots <PATH> [--dry-run]

Arguments:

NameTypeRequiredDescription
PATHtextyesFile or directory to scan

Options:

FlagTypeDefaultDescription
--dry-runbooleanfalseReport locations without modifying files

Rewrites:

  • mellea.stdlib.components.genslot → mellea.stdlib.components.genstub
  • GenerativeSlot → GenerativeStub
  • SyncGenerativeSlot → SyncGenerativeStub
  • AsyncGenerativeSlot → AsyncGenerativeStub

Best practices:

  • Run with --dry-run first to review what will be changed.
  • The tool is idempotent — running it twice on the same file is safe.

Output: Modifies Python source files in place (unless --dry-run). Prints a summary of rewritten references with file paths and line numbers.

Example:

m fix genslots src/ --dry-run

m serve

Serve a Mellea program as an OpenAI-compatible HTTP endpoint.

Loads a Python file containing a serve function and exposes it via a FastAPI server implementing the OpenAI chat completions API. The server accepts POST /v1/chat/completions requests.

Prerequisites:

  • Mellea installed with server dependency group (uv add 'mellea[server]').
  • The python file being loaded must have a serve function.
m serve [SCRIPT_PATH] [--host] [--port]

Arguments:

NameTypeRequiredDescription
SCRIPT_PATHtextnoPath to the Python script to import and serve

Options:

FlagTypeDefaultDescription
--hosttext0.0.0.0Host to bind to
--portinteger8080Port to bind to

Output: Starts a long-running HTTP server on the specified host and port. The /v1/chat/completions endpoint accepts OpenAI-format chat completion requests and returns ChatCompletion JSON responses.

Example:

m serve my_app.py --port 9000

See also: M Serve