Responses


Warning

This API is experimental and may change as we integrate additional providers. Use with caution.

any_llm.api.responses(model, input_data, *, provider=None, tools=None, tool_choice=None, max_output_tokens=None, temperature=None, top_p=None, stream=None, api_key=None, api_base=None, instructions=None, max_tool_calls=None, parallel_tool_calls=None, reasoning=None, text=None, client_args=None, **kwargs)

Create a response using the OpenAI-style Responses API.

This follows the OpenAI Responses API shape and returns the aliased any_llm.types.responses.Response type. If stream=True, an iterator of any_llm.types.responses.ResponseStreamEvent items is returned.
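A minimal non-streaming call might look like the sketch below. This assumes a valid API key for the chosen provider is available in the environment; the model and prompt are illustrative, and `output_text` is the convenience accessor on the aliased OpenAI `Response` type.

```python
from any_llm.api import responses

# Illustrative model and prompt; requires provider credentials to run.
response = responses(
    model="openai/gpt-4o",
    input_data="Summarize the Responses API in one sentence.",
)
print(response.output_text)
```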

Parameters:

Name Type Description Default
model str

Model identifier. If provider is given, pass the bare model name (e.g., 'gpt-4o'); otherwise include the provider prefix in 'provider/model' format (e.g., 'openai/gpt-4o').

required
provider str | LLMProvider | None

Provider to use for the request. If given, the model is treated as a bare model name; otherwise the provider is parsed from the model string (e.g., 'openai/gpt-4o').

None
input_data str | ResponseInputParam

The input payload accepted by the provider's Responses API. For OpenAI-compatible providers, this is typically a string, a list mixing text, images, and tool instructions, or a dict per the OpenAI spec.

required
tools list[dict[str, Any] | Callable[..., Any]] | None

Optional tools for tool calling (Python callables or OpenAI tool dicts)

None
tool_choice str | dict[str, Any] | None

Controls which tools the model can call

None
max_output_tokens int | None

Maximum number of output tokens to generate

None
temperature float | None

Controls randomness in the response (0.0 to 2.0)

None
top_p float | None

Controls diversity via nucleus sampling (0.0 to 1.0)

None
stream bool | None

Whether to stream response events

None
api_key str | None

API key for the provider

None
api_base str | None

Base URL for the provider API

None
instructions str | None

A system (or developer) message inserted into the model's context.

None
max_tool_calls int | None

The maximum number of total calls to built-in tools that can be processed in a response. This maximum number applies across all built-in tool calls, not per individual tool. Any further attempts to call a tool by the model will be ignored.

None
parallel_tool_calls int | None

Whether to allow the model to run tool calls in parallel.

None
reasoning Any | None

Configuration options for reasoning models.

None
text Any | None

Configuration options for a text response from the model. Can be plain text or structured JSON data.

None
client_args dict[str, Any] | None

Additional provider-specific arguments that will be passed to the provider's client instantiation.

None
**kwargs Any

Additional provider-specific arguments that will be passed to the provider's API call.

{}
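Passing a plain Python callable via tools can be sketched as follows. Here get_weather is a hypothetical helper, the model, prompt, and tool_choice value are illustrative, and a real call requires provider credentials.

```python
from any_llm.api import responses

def get_weather(city: str) -> str:
    """Return the current weather for a city (hypothetical stand-in tool)."""
    return f"Sunny in {city}"

# Python callables are accepted directly in `tools`; OpenAI-style tool
# dicts also work. `tool_choice="auto"` lets the model decide.
response = responses(
    model="openai/gpt-4o",
    input_data="What is the weather in Paris?",
    tools=[get_weather],
    tool_choice="auto",
)
```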

Returns:

Type Description
Response | Iterator[ResponseStreamEvent]

Either a Response object (non-streaming) or an iterator of ResponseStreamEvent (streaming).
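With stream=True, the return value can be consumed as an iterator of events, as in this sketch (model and prompt are illustrative; running it requires provider credentials):

```python
from any_llm.api import responses

# stream=True yields ResponseStreamEvent items instead of a single Response.
for event in responses(
    model="openai/gpt-4o",
    input_data="Write a haiku about streaming.",
    stream=True,
):
    print(event)
```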

Raises:

Type Description
NotImplementedError

If the selected provider does not support the Responses API.

Source code in src/any_llm/api.py
def responses(
    model: str,
    input_data: str | ResponseInputParam,
    *,
    provider: str | LLMProvider | None = None,
    tools: list[dict[str, Any] | Callable[..., Any]] | None = None,
    tool_choice: str | dict[str, Any] | None = None,
    max_output_tokens: int | None = None,
    temperature: float | None = None,
    top_p: float | None = None,
    stream: bool | None = None,
    api_key: str | None = None,
    api_base: str | None = None,
    instructions: str | None = None,
    max_tool_calls: int | None = None,
    parallel_tool_calls: int | None = None,
    reasoning: Any | None = None,
    text: Any | None = None,
    client_args: dict[str, Any] | None = None,
    **kwargs: Any,
) -> Response | Iterator[ResponseStreamEvent]:
    """Create a response using the OpenAI-style Responses API.

    This follows the OpenAI Responses API shape and returns the aliased
    `any_llm.types.responses.Response` type. If `stream=True`, an iterator of
    `any_llm.types.responses.ResponseStreamEvent` items is returned.

    Args:
        model: Model identifier in format 'provider/model' (e.g., 'openai/gpt-4o'). If provider is provided, we assume that the model does not contain the provider name. Otherwise, we assume that the model contains the provider name, like 'openai/gpt-4o'.
        provider: Provider name to use for the request. If provided, we assume that the model does not contain the provider name. Otherwise, we assume that the model contains the provider name, like 'openai/gpt-4o'.
        input_data: The input payload accepted by provider's Responses API.
            For OpenAI-compatible providers, this is typically a list mixing
            text, images, and tool instructions, or a dict per OpenAI spec.
        tools: Optional tools for tool calling (Python callables or OpenAI tool dicts)
        tool_choice: Controls which tools the model can call
        max_output_tokens: Maximum number of output tokens to generate
        temperature: Controls randomness in the response (0.0 to 2.0)
        top_p: Controls diversity via nucleus sampling (0.0 to 1.0)
        stream: Whether to stream response events
        api_key: API key for the provider
        api_base: Base URL for the provider API
        instructions: A system (or developer) message inserted into the model's context.
        max_tool_calls: The maximum number of total calls to built-in tools that can be processed in a response. This maximum number applies across all built-in tool calls, not per individual tool. Any further attempts to call a tool by the model will be ignored.
        parallel_tool_calls: Whether to allow the model to run tool calls in parallel.
        reasoning: Configuration options for reasoning models.
        text: Configuration options for a text response from the model. Can be plain text or structured JSON data.
        client_args: Additional provider-specific arguments that will be passed to the provider's client instantiation.
        **kwargs: Additional provider-specific arguments that will be passed to the provider's API call.

    Returns:
        Either a `Response` object (non-streaming) or an iterator of
        `ResponseStreamEvent` (streaming).

    Raises:
        NotImplementedError: If the selected provider does not support the Responses API.

    """
    all_args = locals()
    all_args.pop("provider")
    kwargs = all_args.pop("kwargs")

    model = all_args.pop("model")
    if provider is None:
        provider_key, model_id = AnyLLM.split_model_provider(model)
    else:
        provider_key = LLMProvider.from_string(provider)
        model_id = model
    all_args["model"] = model_id

    llm = AnyLLM.create(
        provider_key,
        api_key=all_args.pop("api_key"),
        api_base=all_args.pop("api_base"),
        **all_args.pop("client_args") or {},
    )
    return llm.responses(**all_args, **kwargs)
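The provider-resolution step above can be sketched standalone. This is a simplified re-implementation for illustration, not the library's actual AnyLLM.split_model_provider helper:

```python
def split_model_provider(model: str) -> tuple[str, str]:
    # Split "provider/model" on the first "/"; the model id itself may
    # contain further slashes (e.g., "openrouter/meta-llama/llama-3-8b").
    provider, _, model_id = model.partition("/")
    if not provider or not model_id:
        raise ValueError(f"Expected 'provider/model', got: {model!r}")
    return provider, model_id
```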

any_llm.api.aresponses(model, input_data, *, provider=None, tools=None, tool_choice=None, max_output_tokens=None, temperature=None, top_p=None, stream=None, api_key=None, api_base=None, instructions=None, max_tool_calls=None, parallel_tool_calls=None, reasoning=None, text=None, client_args=None, **kwargs) async

Create a response using the OpenAI-style Responses API.

This follows the OpenAI Responses API shape and returns the aliased any_llm.types.responses.Response type. If stream=True, an async iterator of any_llm.types.responses.ResponseStreamEvent items is returned.
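A minimal async call might look like this sketch (model and prompt are illustrative; running it requires provider credentials):

```python
import asyncio

from any_llm.api import aresponses

async def main() -> None:
    # Non-streaming async call; awaiting resolves to a Response object.
    response = await aresponses(
        model="openai/gpt-4o",
        input_data="Say hello.",
    )
    print(response.output_text)

asyncio.run(main())
```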

Parameters:

Name Type Description Default
model str

Model identifier. If provider is given, pass the bare model name (e.g., 'gpt-4o'); otherwise include the provider prefix in 'provider/model' format (e.g., 'openai/gpt-4o').

required
provider str | LLMProvider | None

Provider to use for the request. If given, the model is treated as a bare model name; otherwise the provider is parsed from the model string (e.g., 'openai/gpt-4o').

None
input_data str | ResponseInputParam

The input payload accepted by the provider's Responses API. For OpenAI-compatible providers, this is typically a string, a list mixing text, images, and tool instructions, or a dict per the OpenAI spec.

required
tools list[dict[str, Any] | Callable[..., Any]] | None

Optional tools for tool calling (Python callables or OpenAI tool dicts)

None
tool_choice str | dict[str, Any] | None

Controls which tools the model can call

None
max_output_tokens int | None

Maximum number of output tokens to generate

None
temperature float | None

Controls randomness in the response (0.0 to 2.0)

None
top_p float | None

Controls diversity via nucleus sampling (0.0 to 1.0)

None
stream bool | None

Whether to stream response events

None
api_key str | None

API key for the provider

None
api_base str | None

Base URL for the provider API

None
instructions str | None

A system (or developer) message inserted into the model's context.

None
max_tool_calls int | None

The maximum number of total calls to built-in tools that can be processed in a response. This maximum number applies across all built-in tool calls, not per individual tool. Any further attempts to call a tool by the model will be ignored.

None
parallel_tool_calls int | None

Whether to allow the model to run tool calls in parallel.

None
reasoning Any | None

Configuration options for reasoning models.

None
text Any | None

Configuration options for a text response from the model. Can be plain text or structured JSON data.

None
client_args dict[str, Any] | None

Additional provider-specific arguments that will be passed to the provider's client instantiation.

None
**kwargs Any

Additional provider-specific arguments that will be passed to the provider's API call.

{}

Returns:

Type Description
Response | AsyncIterator[ResponseStreamEvent]

Either a Response object (non-streaming) or an async iterator of ResponseStreamEvent (streaming).
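With stream=True, awaiting aresponses resolves to an async iterator that can be consumed with async for, as in this sketch (model and prompt are illustrative; running it requires provider credentials):

```python
import asyncio

from any_llm.api import aresponses

async def main() -> None:
    # stream=True: awaiting yields an AsyncIterator of ResponseStreamEvent.
    stream = await aresponses(
        model="openai/gpt-4o",
        input_data="Count to three.",
        stream=True,
    )
    async for event in stream:
        print(event)

asyncio.run(main())
```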

Raises:

Type Description
NotImplementedError

If the selected provider does not support the Responses API.

Source code in src/any_llm/api.py
async def aresponses(
    model: str,
    input_data: str | ResponseInputParam,
    *,
    provider: str | LLMProvider | None = None,
    tools: list[dict[str, Any] | Callable[..., Any]] | None = None,
    tool_choice: str | dict[str, Any] | None = None,
    max_output_tokens: int | None = None,
    temperature: float | None = None,
    top_p: float | None = None,
    stream: bool | None = None,
    api_key: str | None = None,
    api_base: str | None = None,
    instructions: str | None = None,
    max_tool_calls: int | None = None,
    parallel_tool_calls: int | None = None,
    reasoning: Any | None = None,
    text: Any | None = None,
    client_args: dict[str, Any] | None = None,
    **kwargs: Any,
) -> Response | AsyncIterator[ResponseStreamEvent]:
    """Create a response using the OpenAI-style Responses API.

    This follows the OpenAI Responses API shape and returns the aliased
    `any_llm.types.responses.Response` type. If `stream=True`, an iterator of
    `any_llm.types.responses.ResponseStreamEvent` items is returned.

    Args:
        model: Model identifier in format 'provider/model' (e.g., 'openai/gpt-4o'). If provider is provided, we assume that the model does not contain the provider name. Otherwise, we assume that the model contains the provider name, like 'openai/gpt-4o'.
        provider: Provider name to use for the request. If provided, we assume that the model does not contain the provider name. Otherwise, we assume that the model contains the provider name, like 'openai/gpt-4o'.
        input_data: The input payload accepted by provider's Responses API.
            For OpenAI-compatible providers, this is typically a list mixing
            text, images, and tool instructions, or a dict per OpenAI spec.
        tools: Optional tools for tool calling (Python callables or OpenAI tool dicts)
        tool_choice: Controls which tools the model can call
        max_output_tokens: Maximum number of output tokens to generate
        temperature: Controls randomness in the response (0.0 to 2.0)
        top_p: Controls diversity via nucleus sampling (0.0 to 1.0)
        stream: Whether to stream response events
        api_key: API key for the provider
        api_base: Base URL for the provider API
        instructions: A system (or developer) message inserted into the model's context.
        max_tool_calls: The maximum number of total calls to built-in tools that can be processed in a response. This maximum number applies across all built-in tool calls, not per individual tool. Any further attempts to call a tool by the model will be ignored.
        parallel_tool_calls: Whether to allow the model to run tool calls in parallel.
        reasoning: Configuration options for reasoning models.
        text: Configuration options for a text response from the model. Can be plain text or structured JSON data.
        client_args: Additional provider-specific arguments that will be passed to the provider's client instantiation.
        **kwargs: Additional provider-specific arguments that will be passed to the provider's API call.

    Returns:
        Either a `Response` object (non-streaming) or an iterator of
        `ResponseStreamEvent` (streaming).

    Raises:
        NotImplementedError: If the selected provider does not support the Responses API.

    """
    all_args = locals()
    all_args.pop("provider")
    kwargs = all_args.pop("kwargs")

    model = all_args.pop("model")
    if provider is None:
        provider_key, model_id = AnyLLM.split_model_provider(model)
    else:
        provider_key = LLMProvider.from_string(provider)
        model_id = model
    all_args["model"] = model_id

    llm = AnyLLM.create(
        provider_key,
        api_key=all_args.pop("api_key"),
        api_base=all_args.pop("api_base"),
        **all_args.pop("client_args") or {},
    )
    return await llm.aresponses(**all_args, **kwargs)