Ollama

Ollama is the default backend for Mellea. It runs models locally with no API key, making it the fastest way to get started.

Prerequisites: Ollama installed and the Ollama server running, pip install mellea.

Install Ollama

Download the installer from ollama.ai or:

# macOS
brew install ollama

# Linux (one-line installer)
curl -fsSL https://ollama.ai/install.sh | sh

Start the server before running any Mellea code:

ollama serve

On macOS, installing via Homebrew or the .dmg starts the server automatically as a background service.

Default setup

start_session() connects to Ollama on localhost:11434 and uses IBM Granite 4 Micro (granite4.1:3b) by default. On first run, Mellea automatically pulls the model if it is not already downloaded:

# Returns: ModelOutputThunk
import mellea

m = mellea.start_session()
email = m.instruct("Write an email inviting the team to a meeting.")
print(str(email))
# Output will vary — LLM responses depend on model and temperature.

Note: The first run pulls granite4.1:3b (~2 GB). Subsequent runs start immediately from the local cache.

Switching models

Pass any model name that Ollama supports:

# Returns: MelleaSession
import mellea

m = mellea.start_session(model_id="llama3.2:3b")

Use model_ids constants for well-known models — they carry the correct Ollama model name automatically:

# Returns: MelleaSession
from mellea import start_session
from mellea.backends import model_ids

m = start_session(model_id=model_ids.IBM_GRANITE_3_3_8B)

Pull models before using them (or let Mellea pull on first use):

ollama pull granite4.1:3b
ollama pull llama3.2:3b
ollama pull mistral:7b

Recommended models

`model_ids` constant	Ollama name	Notes
`IBM_GRANITE_4_1_3B`	`granite4.1:3b`	Default. Fast, low memory (~2 GB).
`IBM_GRANITE_4_1_8B`	`granite4.1:8b`	Higher quality, ~5 GB.
`IBM_GRANITE_3_3_8B`	`granite3.3:8b`	Higher quality, ~5 GB.
`IBM_GRANITE_3_3_VISION_2B`	`ibm/granite3.3-vision:2b`	Vision model for image inputs.
`META_LLAMA_3_2_3B`	`llama3.2:3b`	Compact Llama model.
`MISTRALAI_MISTRAL_0_3_7B`	`mistral:7b`	Mistral 7B.
`QWEN3_8B`	`qwen3:8b`	Qwen3 8B.
`DEEPSEEK_R1_8B`	`deepseek-r1:8b`	Reasoning-capable model.

Run ollama list to see which models are already downloaded locally.

Direct backend construction

For full control, construct OllamaModelBackend directly:

# Returns: MelleaSession
from mellea import MelleaSession
from mellea.backends.ollama import OllamaModelBackend
from mellea.backends import model_ids
from mellea.stdlib.context import ChatContext

backend = OllamaModelBackend(
    model_id=model_ids.IBM_GRANITE_3_3_8B,
)
m = MelleaSession(backend=backend, ctx=ChatContext())

Custom host

Mellea reads the OLLAMA_HOST environment variable or accepts a base_url parameter. Use this to connect to Ollama running on a remote machine or a non-standard port:

# Environment variable
export OLLAMA_HOST=http://my-gpu-server:11434

# Requires: mellea
# Returns: MelleaSession
from mellea import MelleaSession
from mellea.backends.ollama import OllamaModelBackend

m = MelleaSession(
    OllamaModelBackend(
        model_id="granite4.1:3b",
        base_url="http://my-gpu-server:11434",
    )
)

base_url takes precedence over OLLAMA_HOST if both are set.

Model options

Pass generation parameters via ModelOption:

# Requires: mellea
# Returns: MelleaSession
from mellea import MelleaSession
from mellea.backends import ModelOption, model_ids
from mellea.backends.ollama import OllamaModelBackend

m = MelleaSession(
    OllamaModelBackend(
        model_id=model_ids.IBM_GRANITE_4_1_3B,
        model_options={
            ModelOption.TEMPERATURE: 0.1,
            ModelOption.SEED: 42,
        },
    )
)

Options set at construction time apply to all calls. Options passed to instruct() or chat() apply to that call only and take precedence.

Vision models

Ollama hosts vision-capable models. Use IBM_GRANITE_3_3_VISION_2B or any Ollama vision model via the OpenAI-compatible endpoint:

# Requires: mellea, pillow
# Returns: str
from PIL import Image
from mellea import MelleaSession
from mellea.backends.ollama import OllamaModelBackend
from mellea.backends import model_ids
from mellea.core import ImageBlock

backend = OllamaModelBackend(model_id=model_ids.IBM_GRANITE_3_3_VISION_2B)
m = MelleaSession(backend=backend)

pil_image = Image.open("photo.jpg")
img_block = ImageBlock.from_pil_image(pil_image)

response = m.instruct(
    "Describe what you see in this image.",
    images=[img_block],
)
print(str(response))
# Output will vary — LLM responses depend on model and temperature.

Backend note: Vision requires a model that supports image inputs. The default granite4.1:3b is text-only. Pull a vision model explicitly before using images: ollama pull ibm/granite3.3-vision:2b.

Ollama's OpenAI-compatible endpoint

Ollama exposes an OpenAI-compatible API at http://localhost:11434/v1. Use this with the OpenAIBackend to access any Ollama model with OpenAI-style tool calling or vision support:

# Requires: mellea[openai]
# Returns: MelleaSession
from mellea import MelleaSession
from mellea.backends.openai import OpenAIBackend

m = MelleaSession(
    OpenAIBackend(
        model_id="qwen2.5vl:7b",
        base_url="http://localhost:11434/v1",
        api_key="ollama",          # required by the client; value is ignored by Ollama
    )
)

See Backends and Configuration for the full OpenAIBackend reference.

Troubleshooting

Connection refused on port 11434

The Ollama server is not running. Start it with ollama serve, or on macOS, launch the Ollama app from Applications.

Model not found

The model has not been pulled. Run ollama pull <model-name> before using it, or let Mellea pull it automatically on first use.

Slow first run

Ollama loads the model into memory on the first request. Subsequent requests in the same session are much faster. On machines with less than 8 GB RAM, consider using granite4.1:3b or llama3.2:1b.

Intel Mac torch errors

Some dependencies require a Rosetta-compatible environment on Intel Macs. Create a conda environment and install torchvision before pip install mellea:

conda create -n mellea python=3.12
conda activate mellea
conda install 'torchvision>=0.22.0'
pip install mellea

Install Ollama​

Default setup​

Switching models​

Recommended models​

Direct backend construction​

Custom host​

Model options​

Vision models​

Ollama's OpenAI-compatible endpoint​

Troubleshooting​

Connection refused on port 11434​

Model not found​

Slow first run​

Intel Mac torch errors​