podcast_llm.utils.llm
Utilities for working with Large Language Models (LLMs) in a standardized way.
This module provides utilities for interacting with various LLM providers through a unified interface. It handles provider-specific implementation details while exposing a consistent API for the rest of the application.
Key components:
- LLMWrapper: A class that wraps different LLM providers (OpenAI, Google, Anthropic) with standardized interfaces for structured output parsing and rate limiting
- Helper functions for configuring and instantiating LLM instances with appropriate settings for podcast generation tasks
The module abstracts away provider differences around:
- Message formatting and chunking
- Rate limiting implementation
- Structured output parsing
- Temperature and token limit settings
This allows the rest of the application to work with LLMs in a provider-agnostic way while still leveraging provider-specific capabilities when beneficial.
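For orientation, a minimal usage sketch is shown below; the PodcastConfig import path and the from_yaml loader are assumptions for illustration, not part of this reference.

    # Provider-agnostic usage sketch; PodcastConfig.from_yaml is a hypothetical loader.
    from podcast_llm.config import PodcastConfig  # import path assumed
    from podcast_llm.utils.llm import get_fast_llm

    config = PodcastConfig.from_yaml('config.yaml')  # hypothetical loader
    llm = get_fast_llm(config)

    # The wrapper behaves as a LangChain Runnable, so invoke() works the same
    # regardless of which provider the config selects.
    response = llm.invoke('Summarize the key ideas of retrieval-augmented generation.')
    print(response.content)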
- class podcast_llm.utils.llm.LLMWrapper(provider: str, model: str, temperature: float = 1.0, max_tokens: int = 8192, rate_limiter: BaseRateLimiter | None = None)[source]
Bases:
Runnable
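The wrapper can also be constructed directly from the signature above. A minimal sketch, assuming 'openai' is an accepted provider identifier (the exact provider strings are assumptions):

    from podcast_llm.utils.llm import LLMWrapper

    # Direct construction; the provider and model names here are illustrative.
    llm = LLMWrapper(
        provider='openai',   # assumed provider identifier
        model='gpt-4o-mini',
        temperature=0.7,
        max_tokens=4096,
    )
    reply = llm.invoke('Suggest one podcast episode title about deep-sea ecology.')
    print(reply.content)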
- coerce_to_schema(llm_output: str)[source]
Coerce raw LLM output into a structured schema object.
Takes unstructured text output from the LLM and attempts to parse it into a structured Pydantic object based on the defined schema. Currently supports Question and Answer schema types.
- Parameters:
llm_output (str) – Raw text output from the LLM to be coerced
- Returns:
Pydantic object matching the defined schema type
- Return type:
BaseModel
- Raises:
ValueError – If no schema is defined
OutputParserException – If output cannot be coerced to the schema
The coercion maps the raw text to the appropriate schema field:
- Question schema -> ‘question’ field
- Answer schema -> ‘answer’ field
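A sketch of the coercion path, using a stand-in Question model with a single question field; the project's actual Question/Answer schemas and their import location are not shown in this reference, so the class below is illustrative only.

    from pydantic import BaseModel
    from podcast_llm.utils.llm import LLMWrapper

    class Question(BaseModel):
        """Stand-in for the project's Question schema (illustrative only)."""
        question: str

    llm = LLMWrapper(provider='openai', model='gpt-4o-mini')  # names assumed
    structured = llm.with_structured_output(Question)

    # If the provider returns plain text rather than a parsed object, the wrapper
    # maps that text onto the schema's field ('question' here) and returns a
    # Question instance; with no schema configured it raises ValueError instead.
    parsed = structured.coerce_to_schema('What first drew you to marine biology?')
    print(parsed.question)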
- invoke(input: PromptValue | str | Sequence[BaseMessage | list[str] | tuple[str, str] | str | dict[str, Any]], config: RunnableConfig | None = None, **kwargs: Any) → BaseMessage [source]
Invoke the LLM with the given input and configuration.
This method handles provider-specific invocation details while maintaining a consistent interface. It manages structured output formatting and parsing based on the provider’s capabilities.
- Parameters:
input (LanguageModelInput) – The input to send to the LLM, typically messages or prompts
config (Optional[RunnableConfig]) – Optional configuration for the invocation
**kwargs (Any) – Additional keyword arguments passed to the underlying LLM
- Returns:
The LLM’s response message
- Return type:
BaseMessage
The implementation varies by provider:
- OpenAI/Anthropic: Direct invocation with native structured output support
- Google: Custom handling for structured output via parser and format instructions
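A sketch of a typical invocation with LangChain message objects; the provider and model strings are illustrative.

    from langchain_core.messages import HumanMessage, SystemMessage
    from podcast_llm.utils.llm import LLMWrapper

    llm = LLMWrapper(provider='anthropic', model='claude-3-5-sonnet-latest')  # names assumed

    messages = [
        SystemMessage(content='You are an enthusiastic podcast co-host.'),
        HumanMessage(content='React briefly to this fact: octopuses have three hearts.'),
    ]

    # invoke() accepts a plain string, a PromptValue, or a sequence of messages
    # and returns a BaseMessage; provider differences are handled inside the wrapper.
    response = llm.invoke(messages)
    print(response.content)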
- with_structured_output(schema: BaseModel)[source]
Configure the LLM wrapper to output structured data using a Pydantic schema.
This method adapts the underlying LLM to output responses conforming to the provided Pydantic model schema. The implementation varies by provider:
- OpenAI/Anthropic: Uses native structured output support
- Google: Implements structured output via output parser and format instructions
- Parameters:
schema (pydantic.BaseModel) – The Pydantic model class defining the expected response structure
- Returns:
The wrapper instance configured for structured output
- Return type:
LLMWrapper
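Because the method returns the wrapper itself, it can be chained directly into invoke(). A sketch using a stand-in Answer model (illustrative, not the project's actual schema); per the coercion logic described above, the result is expected to be a parsed Answer instance.

    from pydantic import BaseModel
    from podcast_llm.utils.llm import LLMWrapper

    class Answer(BaseModel):
        """Stand-in for the project's Answer schema (illustrative only)."""
        answer: str

    llm = LLMWrapper(provider='google', model='gemini-1.5-flash')  # names assumed

    # For Google, format instructions are appended and the reply is parsed; for
    # OpenAI/Anthropic, the provider's native structured output is used instead.
    result = llm.with_structured_output(Answer).invoke(
        'In one sentence, why do leaves change color in autumn?'
    )
    print(result)  # expected to be an Answer instance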
- podcast_llm.utils.llm.get_fast_llm(config: PodcastConfig, rate_limiter: BaseRateLimiter | None = None)[source]
Get a fast LLM model optimized for quick responses.
Creates and returns an LLM wrapper configured with a fast model variant from the specified provider. Fast models trade some quality for improved response speed.
- Parameters:
config (PodcastConfig) – Configuration object containing provider settings
rate_limiter (BaseRateLimiter | None, optional) – Rate limiter to control API request frequency. Defaults to None.
- Returns:
Wrapper instance configured with a fast model variant
- Return type:
LLMWrapper
- Raises:
ValueError – If the configured fast_llm_provider is not supported
The function maps providers to their respective fast model variants:
- OpenAI: gpt-4o-mini
- Google: gemini-1.5-flash
- Anthropic: claude-3-5-sonnet
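A usage sketch; config.fast_llm_provider comes from the configuration described above, while the PodcastConfig import path and loader are assumptions.

    from podcast_llm.config import PodcastConfig  # import path assumed
    from podcast_llm.utils.llm import get_fast_llm

    config = PodcastConfig.from_yaml('config.yaml')  # hypothetical loader

    # Returns an LLMWrapper backed by the fast variant for the configured provider,
    # e.g. gpt-4o-mini when config.fast_llm_provider == 'openai'. A BaseRateLimiter
    # instance can be passed as rate_limiter= to throttle request frequency.
    fast_llm = get_fast_llm(config)
    titles = fast_llm.invoke('Suggest three snappy titles for an episode on urban beekeeping.')
    print(titles.content)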
- podcast_llm.utils.llm.get_long_context_llm(config: PodcastConfig, rate_limiter: BaseRateLimiter | None = None)[source]
Get a long context LLM model optimized for handling larger prompts.
Creates and returns an LLM wrapper configured with a model variant that can handle longer context windows from the specified provider. These models are optimized for processing larger amounts of text at once.
- Parameters:
config (PodcastConfig) – Configuration object containing provider settings
rate_limiter (BaseRateLimiter | None, optional) – Rate limiter to control API request frequency. Defaults to None.
- Returns:
Wrapper instance configured with a long context model variant
- Return type:
LLMWrapper
- Raises:
ValueError – If the configured long_context_llm_provider is not supported
The function maps providers to their respective long context model variants:
- OpenAI: gpt-4o
- Google: gemini-1.5-pro-latest
- Anthropic: claude-3-5-sonnet
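A parallel sketch for the long-context variant, which suits prompts that embed entire articles or transcripts; the config loader and the file name are illustrative.

    from podcast_llm.config import PodcastConfig  # import path assumed
    from podcast_llm.utils.llm import get_long_context_llm

    config = PodcastConfig.from_yaml('config.yaml')  # hypothetical loader
    long_llm = get_long_context_llm(config)  # e.g. gemini-1.5-pro-latest for 'google'

    with open('source_article.txt') as fh:  # illustrative source document
        article = fh.read()

    outline = long_llm.invoke(f'Draft a podcast episode outline grounded in this article:\n\n{article}')
    print(outline.content)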