# Run Large Language Models locally with Ollama

Ollama is a fantastic tool for running models locally. Install Ollama, then pull and run a model with:

```bash
ollama run llama3.1
```

Once the model is running locally, use the `Ollama` model class to access it.

## Example

```python
from phi.agent import Agent, RunResponse
from phi.model.ollama import Ollama

agent = Agent(
    model=Ollama(id="llama3.1"),
    markdown=True,
)

# Get the response in a variable
# run: RunResponse = agent.run("Share a 2 sentence horror story.")
# print(run.content)

# Print the response in the terminal
agent.print_response("Share a 2 sentence horror story.")
```
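You can also stream the response to the terminal as it is generated. The sketch below assumes the `stream` flag on `print_response`, which phi's `Agent` exposes for incremental output; everything else matches the example above.

```python
from phi.agent import Agent
from phi.model.ollama import Ollama

agent = Agent(
    model=Ollama(id="llama3.1"),
    markdown=True,
)

# Stream tokens to the terminal as they arrive instead of waiting for the
# full completion (assumes the `stream` flag on print_response).
agent.print_response("Share a 2 sentence horror story.", stream=True)
```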

## Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `str` | `"llama3.2"` | The ID of the model to use. |
| `name` | `str` | `"Ollama"` | The name of the model. |
| `provider` | `str` | `"Ollama llama3.2"` | The provider of the model. |
| `format` | `Optional[str]` | `None` | The format of the response. |
| `options` | `Optional[Any]` | `None` | Additional options to pass to the model. |
| `keep_alive` | `Optional[Union[float, str]]` | `None` | The keep-alive time for the model. |
| `request_params` | `Optional[Dict[str, Any]]` | `None` | Additional parameters to pass to the request. |
| `host` | `Optional[str]` | `None` | The host to connect to. |
| `timeout` | `Optional[Any]` | `None` | The timeout for the connection. |
| `client_params` | `Optional[Dict[str, Any]]` | `None` | Additional parameters to pass to the client. |
| `client` | `Optional[OllamaClient]` | `None` | A pre-configured instance of the Ollama client. |
| `async_client` | `Optional[AsyncOllamaClient]` | `None` | A pre-configured instance of the asynchronous Ollama client. |
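As a sketch of how some of these parameters might be combined, the snippet below points the model at an Ollama server and passes sampling options. The host URL, option values, and keep-alive duration are illustrative assumptions, not required settings; `host`, `options`, and `keep_alive` themselves come from the table above.

```python
from phi.agent import Agent
from phi.model.ollama import Ollama

agent = Agent(
    model=Ollama(
        id="llama3.1",
        # Default local Ollama endpoint; change this for a remote server (assumed value).
        host="http://localhost:11434",
        # Forwarded to Ollama as model options; values here are illustrative.
        options={"temperature": 0.7, "num_ctx": 4096},
        # Keep the model loaded in memory for 5 minutes after the request (assumed value).
        keep_alive="5m",
    ),
    markdown=True,
)

agent.print_response("Summarize what a context window is in one sentence.")
```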