Run Large Language Models locally with Ollama

Ollama is a fantastic tool for running LLMs locally. Install Ollama and run a model using:

ollama run llama3
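
To confirm from Python that the local server and model respond, you can call the official ollama client directly. This is a minimal sketch, assuming the ollama Python package is installed and the llama3 model has been pulled:

import ollama

# Quick sanity check: ask the local model for a one-word reply
response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response["message"]["content"])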

Once the local model is running, use the Ollama LLM to access it.

Usage

from phi.assistant import Assistant
from phi.llm.ollama import Ollama

assistant = Assistant(
    llm=Ollama(model="llama3"),
    description="You help people with their health and fitness goals.",
)
assistant.print_response("Share a quick healthy breakfast recipe.", markdown=True)
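
The Ollama LLM also accepts the parameters documented below. As a sketch, you can point the assistant at a specific Ollama server and pass sampling options through to the model (the host URL shown is Ollama's default local address; adjust it for your setup):

from phi.assistant import Assistant
from phi.llm.ollama import Ollama

assistant = Assistant(
    llm=Ollama(
        model="llama3",
        host="http://localhost:11434",  # example host URL: Ollama's default local address
        options={"temperature": 0.1, "stop": ["\n"]},  # request options forwarded to Ollama
    ),
    description="You help people with their health and fitness goals.",
)
assistant.print_response("Share a quick healthy breakfast recipe.", markdown=True)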

Params

name (str, default: "Ollama")
The name of the LLM.

model (str, default: "openhermes")
The name of the model to be used.

host (str)
The host URL for making API requests to the service.

format (str)
The response format: an empty string for the default format or "json" for JSON responses.

timeout (Any)
The timeout duration for requests, specified in seconds.

options (Dict[str, Any])
A dictionary of options to include with the request, e.g., {"temperature": 0.1, "stop": ["\n"]}.

keep_alive (Union[float, str])
The keep-alive duration for maintaining persistent connections, specified in seconds or as a string.

client_kwargs (Dict[str, Any])
Additional keyword arguments provided as a dictionary when initializing the Ollama() client.

ollama_client (ollama.Client)
An instance of ollama.Client provided for making API requests.
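
If you need full control over the underlying client, you can construct an ollama.Client yourself and pass it in via ollama_client, or forward extra keyword arguments with client_kwargs. A sketch, assuming the ollama Python package is installed:

from ollama import Client
from phi.llm.ollama import Ollama

# Build the client explicitly, e.g. to customize or reuse it elsewhere
client = Client(host="http://localhost:11434")
llm = Ollama(model="llama3", ollama_client=client)

# Or let the LLM create the client and pass extra keyword arguments through
llm = Ollama(model="llama3", client_kwargs={"host": "http://localhost:11434"})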