Example

HuggingFace Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `str` | `"meta-llama/Meta-Llama-3-8B-Instruct"` | The id of the HuggingFace model to use. |
| `name` | `str` | `"HuggingFaceChat"` | The name of this chat model instance. |
| `provider` | `str` | `"HuggingFace"` | The provider of the model. |
| `store` | `Optional[bool]` | - | Whether to store the output of this chat completion request. |
| `frequency_penalty` | `Optional[float]` | - | Penalizes new tokens based on their frequency in the text so far. |
| `logit_bias` | `Optional[Any]` | - | Modifies the likelihood of specified tokens appearing in the completion. |
| `logprobs` | `Optional[bool]` | - | Whether to return log probabilities of the output tokens. |
| `max_tokens` | `Optional[int]` | - | The maximum number of tokens to generate in the chat completion. |
| `presence_penalty` | `Optional[float]` | - | Penalizes new tokens based on whether they appear in the text so far. |
| `response_format` | `Optional[Any]` | - | An object specifying the format that the model must output. |
| `seed` | `Optional[int]` | - | A seed for deterministic sampling. |
| `stop` | `Optional[Union[str, List[str]]]` | - | Up to 4 sequences where the API will stop generating further tokens. |
| `temperature` | `Optional[float]` | - | Controls randomness in the model's output. |
| `top_logprobs` | `Optional[int]` | - | How many of the most likely tokens to return log probabilities for at each token position. |
| `top_p` | `Optional[float]` | - | Controls diversity via nucleus sampling. |
| `request_params` | `Optional[Dict[str, Any]]` | - | Additional parameters to include in the request. |
| `api_key` | `Optional[str]` | - | The Access Token for authenticating with HuggingFace. |
| `base_url` | `Optional[Union[str, httpx.URL]]` | - | The base URL for API requests. |
| `timeout` | `Optional[float]` | - | The timeout for API requests. |
| `max_retries` | `Optional[int]` | - | The maximum number of retries for failed requests. |
| `default_headers` | `Optional[Any]` | - | Default headers to include in all requests. |
| `default_query` | `Optional[Any]` | - | Default query parameters to include in all requests. |
| `http_client` | `Optional[httpx.Client]` | - | An optional pre-configured HTTP client. |
| `client_params` | `Optional[Dict[str, Any]]` | - | Additional parameters for client configuration. |
| `client` | `Optional[InferenceClient]` | - | The HuggingFace Hub Inference client instance. |
| `async_client` | `Optional[AsyncInferenceClient]` | - | The asynchronous HuggingFace Hub client instance. |

HuggingFaceChat is a subclass of the Model class, so it also accepts every parameter that Model defines.
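
Most request parameters above have no default ("-"). One plausible reading, which this page does not confirm, is that unset parameters are left out of the request and that `request_params` is merged in last. The sketch below illustrates that pattern only. `HuggingFaceParams` and `to_request_kwargs` are hypothetical names invented for this example, not part of the library, and the merge order is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class HuggingFaceParams:
    """Hypothetical stand-in mirroring a few request fields from the table."""

    id: str = "meta-llama/Meta-Llama-3-8B-Instruct"
    max_tokens: Optional[int] = None
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    seed: Optional[int] = None
    stop: Optional[Union[str, List[str]]] = None
    request_params: Optional[Dict[str, Any]] = None

    def to_request_kwargs(self) -> Dict[str, Any]:
        # Collect the optional sampling fields; the model id is kept separate.
        candidates = {
            "max_tokens": self.max_tokens,
            "temperature": self.temperature,
            "top_p": self.top_p,
            "seed": self.seed,
            "stop": self.stop,
        }
        # Drop anything left unset so the API's own defaults apply.
        kwargs = {key: value for key, value in candidates.items() if value is not None}
        # Assumption: extra request_params are merged last and may override.
        if self.request_params:
            kwargs.update(self.request_params)
        return kwargs


params = HuggingFaceParams(
    max_tokens=256,
    temperature=0.2,
    request_params={"top_p": 0.9},
)
request_kwargs = params.to_request_kwargs()
print(request_kwargs)
```

Only the three values that were set end up in `request_kwargs`; `seed` and `stop` are omitted rather than sent as `None`.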