Groq offers fast, low-latency API endpoints for large language models.

Authentication

Set the GROQ_API_KEY environment variable to your Groq API key.

export GROQ_API_KEY=***
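
If you want to fail early when the key is missing, a small check like the one below can run before you create the Assistant. This is only a sketch: require_api_key is a hypothetical helper, not part of phi or the Groq client.

```python
import os


def require_api_key(env=os.environ):
    """Return the Groq API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; `env` is injectable so the check
    can be exercised without touching the real environment.
    """
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; export it before creating the Assistant."
        )
    return key
```

Calling require_api_key() at startup turns a missing key into an immediate, readable error instead of a failed request later.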

Usage

Use Groq with your Assistant:

from phi.assistant import Assistant
from phi.llm.groq import Groq

assistant = Assistant(
    llm=Groq(model="mixtral-8x7b-32768"),
    description="You help people with their health and fitness goals.",
    # debug_mode=True,
)
assistant.print_response("Share a quick healthy breakfast recipe.", markdown=True)

Params

name
str
default: "Groq"

The name used to identify this LLM.

model
str
default: "mixtral-8x7b-32768"

The specific model ID used for generating responses.

frequency_penalty
float

A number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.

logit_bias
Any

A JSON object that modifies the likelihood of specified tokens appearing in the completion by mapping token IDs to bias values between -100 and 100.
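
As an illustration of the shape logit_bias expects, the sketch below builds and validates such a mapping. The token IDs are made up, and build_logit_bias is a hypothetical helper; real token IDs depend on the model's tokenizer.

```python
def build_logit_bias(biases):
    """Validate a {token_id: bias} mapping and return it with string keys.

    JSON object keys are always strings, so token IDs are converted here.
    Hypothetical helper for illustration only.
    """
    result = {}
    for token_id, bias in biases.items():
        if not -100 <= bias <= 100:
            raise ValueError(f"bias for token {token_id} must be in [-100, 100]")
        result[str(int(token_id))] = bias
    return result


# Made-up token IDs: strongly discourage one token, mildly encourage another.
logit_bias = build_logit_bias({1234: -100, 5678: 5})
```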

logprobs
int

The number of log probabilities to return for each generated token.

max_tokens
int

The maximum number of tokens to generate in the chat completion.

presence_penalty
float

A number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
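
To make the difference between frequency_penalty and presence_penalty concrete, here is a toy sketch of how such penalties are commonly applied to token scores: the frequency penalty scales with how often a token has already appeared, while the presence penalty is a flat, one-time deduction. That Groq applies exactly this adjustment is an assumption; the token names and scores are invented.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Return a copy of `logits` with repetition penalties applied.

    Toy model (assumed, not confirmed for Groq): each token that already
    appeared loses count * frequency_penalty plus a flat presence_penalty.
    """
    counts = Counter(generated_tokens)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            adjusted[token] -= count * frequency_penalty + presence_penalty
    return adjusted


# Invented scores: "oats" appeared twice, "eggs" once, "toast" never.
logits = {"oats": 2.0, "eggs": 1.5, "toast": 1.0}
adjusted = apply_penalties(
    logits, ["oats", "oats", "eggs"],
    frequency_penalty=0.5, presence_penalty=0.25,
)
```

In this example the twice-used token is penalized more than the once-used one, and the unused token is left untouched.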

response_format
Dict[str, Any]

Specifies the format that the model must output. Setting to { "type": "json_object" } enables JSON mode, ensuring the message generated is valid JSON.
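
A minimal sketch of working with JSON mode on the client side: the dict below is the value you would pass as response_format, and parse_json_reply is a hypothetical helper that parses the reply defensively (for example, a reply cut off by max_tokens may not be complete JSON). The reply text shown is invented.

```python
import json

# Value to pass as the response_format parameter to enable JSON mode.
response_format = {"type": "json_object"}


def parse_json_reply(reply_text):
    """Parse a model reply that is expected to be a JSON object.

    Hypothetical helper: raises ValueError with a readable message when the
    reply is not valid JSON or is not an object.
    """
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model reply was not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


# Invented reply text, standing in for the assistant's output.
recipe = parse_json_reply('{"dish": "overnight oats", "minutes": 5}')
```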

seed
int

A seed value for best-effort deterministic sampling: repeated requests with the same seed and parameters should return the same result.

stop
Union[str, List[str]]

Up to 4 sequences where the API will stop generating further tokens.
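
The effect of stop can be pictured with a client-side sketch: generation ends at the earliest occurrence of any stop sequence. The API does this server-side; truncate_at_stop is a hypothetical helper, and excluding the stop sequence itself from the result is an assumption here.

```python
def truncate_at_stop(text, stop):
    """Cut `text` at the earliest occurrence of any stop sequence.

    `stop` may be a single string or a list of up to 4 strings, mirroring
    the parameter's Union[str, List[str]] type. Illustration only.
    """
    sequences = [stop] if isinstance(stop, str) else list(stop)
    if len(sequences) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    cut = len(text)
    for seq in sequences:
        if not seq:
            continue  # ignore empty sequences, which would match everywhere
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


first_step = truncate_at_stop("Step 1: mix.\nStep 2: bake.", ["\nStep 2", "END"])
```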

temperature
float

The sampling temperature to use, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic.
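
To see why lower temperatures give more focused output, the sketch below applies the standard temperature-scaled softmax to a few invented scores. It illustrates the general technique, not Groq's internals, and it rejects a temperature of 0 (which APIs typically treat as greedy decoding) to avoid dividing by zero.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # invented scores for three candidate tokens
focused = softmax_with_temperature(logits, 0.2)
more_random = softmax_with_temperature(logits, 0.8)
```

At 0.2 nearly all of the probability lands on the top-scoring token, while at 0.8 the other candidates keep a meaningful share.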

top_logprobs
int

The number of top log probabilities to return for each generated token.

top_p
float

Nucleus sampling parameter. The model considers only the tokens that make up the top_p probability mass, so 0.1 means only the tokens comprising the top 10% of probability mass are considered.
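
Nucleus sampling can be sketched in a few lines: rank tokens by probability, keep the smallest set whose cumulative probability reaches top_p, and renormalize. The probabilities below are invented, and nucleus_filter is a hypothetical helper illustrating the general technique rather than Groq's implementation.

```python
def nucleus_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    # Renormalize so the surviving probabilities sum to 1 again.
    return {token: p / total for token, p in kept.items()}


probs = {"oats": 0.5, "eggs": 0.3, "toast": 0.15, "cake": 0.05}
nucleus = nucleus_filter(probs, 0.7)
```

With top_p at 0.7, only the two most likely tokens survive and sampling then happens among them.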

user
str

A unique identifier representing your end-user, helping to monitor and detect abuse.

extra_headers
Any

Additional headers to include in the API request.

extra_query
Any

Additional query parameters to include in the API request.

api_key
str

The API key for authenticating requests to the service.

organization
str

The organization associated with the API key.

base_url
str

The base URL for making API requests to the service.

timeout
float

The timeout duration for requests, specified in seconds.

max_retries
int

The maximum number of retry attempts for failed requests.
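
For intuition about what a retry limit means, here is a generic retry loop with exponential backoff. It is an illustration only: the Groq client's actual retry policy (which errors it retries and how long it waits) is not described here, and call_with_retries, base_delay, and the choice of ConnectionError are all assumptions.

```python
import time


def call_with_retries(request, max_retries=2, base_delay=0.5, sleep=time.sleep):
    """Call `request()` and retry on ConnectionError up to `max_retries` times.

    The wait doubles after each failed attempt (base_delay, 2x, 4x, ...).
    `sleep` is injectable so the loop can be exercised without real delays.
    """
    attempt = 0
    while True:
        try:
            return request()
        except ConnectionError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With max_retries=2 the request is attempted at most three times in total: the initial call plus two retries.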

default_headers
Any

Default headers to include in all API requests.

default_query
Any

Default query parameters to include in all API requests.

groq_client
GroqClient

An instance of GroqClient provided for making API requests.