Completion
Completion Types
Data models and types for completion operations.
any_llm.types.completion
ChatCompletionMessageFunctionToolCall
Bases: ChatCompletionMessageFunctionToolCall
Extended tool call type that includes extra_content for provider-specific data.
The extra_content field is used to store provider-specific metadata that needs to be preserved across multi-turn conversations. For example, Gemini 3 models require thought_signature to be passed back with function calls.
Example extra_content structure for Gemini
{"google": {"thought_signature": "..."}}
Source code in src/any_llm/types/completion.py
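As a sketch of how this metadata travels, the example below builds a plain-dict tool call that carries extra_content and echoes it back on the next assistant turn. The function name and signature value are placeholders, not part of any_llm's API:

```python
# Hypothetical sketch: a tool call carrying provider-specific
# extra_content, replayed unchanged on the next assistant turn.
# The function name and signature value are placeholders.
tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    # Provider-specific metadata that must survive the round trip:
    "extra_content": {"google": {"thought_signature": "<opaque-signature>"}},
}

# When replaying conversation history, the assistant turn keeps
# extra_content intact so the provider (e.g. Gemini) can validate it.
assistant_turn = {"role": "assistant", "content": None, "tool_calls": [tool_call]}
```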
CompletionParams
Bases: BaseModel
Normalized parameters for chat completions.
This model is used internally to pass structured parameters from the public API layer to provider implementations, avoiding very long function signatures while keeping type safety.
Source code in src/any_llm/types/completion.py
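The shape of such a model can be sketched with a plain dataclass. This is illustrative only: the real CompletionParams is a Pydantic BaseModel, and only a handful of its fields are mirrored here:

```python
from dataclasses import dataclass
from typing import Any, Optional

# Illustrative sketch only: the real CompletionParams is a Pydantic
# BaseModel; this dataclass mirrors a few of its fields and defaults.
@dataclass
class CompletionParamsSketch:
    model_id: str
    messages: list[dict[str, Any]]
    temperature: Optional[float] = None
    max_tokens: Optional[int] = None
    reasoning_effort: Optional[str] = "auto"

params = CompletionParamsSketch(
    model_id="mistral-small-latest",
    messages=[{"role": "user", "content": "Hello"}],
)
```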
frequency_penalty = None
class-attribute
instance-attribute
Penalize new tokens based on frequency in text
logit_bias = None
class-attribute
instance-attribute
Bias the likelihood of specified tokens during generation
logprobs = None
class-attribute
instance-attribute
Include token-level log probabilities in the response
max_completion_tokens = None
class-attribute
instance-attribute
Maximum number of tokens for the completion (provider-dependent)
max_tokens = None
class-attribute
instance-attribute
Maximum number of tokens to generate
messages
instance-attribute
List of messages for the conversation
model_id
instance-attribute
Model identifier (e.g., 'mistral-small-latest')
n = None
class-attribute
instance-attribute
Number of completions to generate
parallel_tool_calls = None
class-attribute
instance-attribute
Whether to allow parallel tool calls
presence_penalty = None
class-attribute
instance-attribute
Penalize new tokens based on presence in text
reasoning_effort = 'auto'
class-attribute
instance-attribute
Reasoning effort level for models that support it. "auto" will map to each provider's default.
response_format = None
class-attribute
instance-attribute
Format specification for the response
seed = None
class-attribute
instance-attribute
Random seed for reproducible results
stop = None
class-attribute
instance-attribute
Stop sequences for generation
stream = None
class-attribute
instance-attribute
Whether to stream the response
stream_options = None
class-attribute
instance-attribute
Additional options controlling streaming behavior
temperature = None
class-attribute
instance-attribute
Controls randomness in the response (0.0 to 2.0)
tool_choice = None
class-attribute
instance-attribute
Controls which tools the model can call
tools = None
class-attribute
instance-attribute
List of tools for tool calling; these should be converted to OpenAI tool-format dicts.
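For reference, an OpenAI tool-format dict has the following shape. The function name and parameter schema here are illustrative, not part of any_llm:

```python
# Illustrative OpenAI tool-format dict, the shape expected by `tools`.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
tools = [tool]
```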
top_logprobs = None
class-attribute
instance-attribute
Number of top alternatives to return when logprobs are requested
top_p = None
class-attribute
instance-attribute
Controls diversity via nucleus sampling (0.0 to 1.0)
user = None
class-attribute
instance-attribute
Unique identifier for the end user
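Since most fields above default to None, a provider layer typically forwards only the parameters that were actually set. A minimal sketch of that filtering step, where the dict stands in for a dumped model and `request_kwargs` is a hypothetical name:

```python
# Hypothetical sketch: drop fields left at their None default before
# forwarding the normalized parameters to a provider SDK.
params = {
    "model_id": "mistral-small-latest",
    "messages": [{"role": "user", "content": "Hi"}],
    "temperature": 0.2,
    "max_tokens": None,
    "tools": None,
    "user": None,
}
request_kwargs = {k: v for k, v in params.items() if v is not None}
```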