podcast_llm.utils.llm

Utilities for working with large language models (LLMs) in a standardized way.

This module provides utilities for interacting with various LLM providers through a unified interface. It handles provider-specific implementation details while exposing a consistent API for the rest of the application.

Key components:

  • LLMWrapper: A class that wraps different LLM providers (OpenAI, Google, Anthropic) with standardized interfaces for structured output parsing and rate limiting

  • Helper functions for configuring and instantiating LLM instances with appropriate settings for podcast generation tasks

The module abstracts away provider differences around:

  • Message formatting and chunking

  • Rate limiting implementation

  • Structured output parsing

  • Temperature and token limit settings

This allows the rest of the application to work with LLMs in a provider-agnostic way while still leveraging provider-specific capabilities when beneficial.

class podcast_llm.utils.llm.LLMWrapper(provider: str, model: str, temperature: float = 1.0, max_tokens: int = 8192, rate_limiter: BaseRateLimiter | None = None)[source]

Bases: Runnable
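
A minimal construction sketch; only the constructor signature above comes from this page, and the provider string and model name below are assumptions:

    # Construct a wrapper directly (illustrative values).
    from podcast_llm.utils.llm import LLMWrapper

    llm = LLMWrapper(
        provider="openai",     # assumed provider identifier
        model="gpt-4o-mini",
        temperature=1.0,
        max_tokens=8192,
        rate_limiter=None,     # optionally a BaseRateLimiter instance
    )

In practice the helper functions get_fast_llm() and get_long_context_llm() documented below handle this construction from a PodcastConfig.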

coerce_to_schema(llm_output: str)[source]

Coerce raw LLM output into a structured schema object.

Takes unstructured text output from the LLM and attempts to parse it into a structured Pydantic object based on the defined schema. Currently supports Question and Answer schema types.

Parameters:

llm_output (str) – Raw text output from the LLM to be coerced

Returns:

Pydantic object matching the defined schema type

Return type:

BaseModel

Raises:
  • ValueError – If no schema is defined

  • OutputParserException – If output cannot be coerced to the schema

The coercion maps the raw text to the appropriate schema field:

  • Question schema -> ‘question’ field

  • Answer schema -> ‘answer’ field
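
A hedged sketch of the coercion, assuming Question is a Pydantic model with a single question field (its actual definition is not shown on this page) and that a schema has first been configured via with_structured_output():

    from pydantic import BaseModel
    from podcast_llm.utils.llm import LLMWrapper

    class Question(BaseModel):   # assumed shape of the Question schema
        question: str

    llm = LLMWrapper(provider="openai", model="gpt-4o-mini")  # illustrative values
    structured = llm.with_structured_output(Question)

    # Raw text is mapped onto the schema's field, e.g.
    # "What is an RSS feed?" -> Question(question="What is an RSS feed?")
    parsed = structured.coerce_to_schema("What is an RSS feed?")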

invoke(input: PromptValue | str | Sequence[BaseMessage | list[str] | tuple[str, str] | str | dict[str, Any]], config: RunnableConfig | None = None, **kwargs: Any) → BaseMessage[source]

Invoke the LLM with the given input and configuration.

This method handles provider-specific invocation details while maintaining a consistent interface. It manages structured output formatting and parsing based on the provider’s capabilities.

Parameters:
  • input (LanguageModelInput) – The input to send to the LLM, typically messages or prompts

  • config (Optional[RunnableConfig]) – Optional configuration for the invocation

  • **kwargs (Any) – Additional keyword arguments passed to the underlying LLM

Returns:

The LLM’s response message

Return type:

BaseMessage

The implementation varies by provider:

  • OpenAI/Anthropic: Direct invocation with native structured output support

  • Google: Custom handling for structured output via parser and format instructions
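
A brief invocation sketch, using (role, content) message tuples, which the signature above accepts; the provider and model strings are illustrative:

    from podcast_llm.utils.llm import LLMWrapper

    llm = LLMWrapper(provider="anthropic", model="claude-3-5-sonnet")  # illustrative values

    # Input may be a plain string, a PromptValue, or a sequence of messages.
    response = llm.invoke([
        ("system", "You are a podcast script writer."),
        ("human", "Suggest a catchy episode title about RSS feeds."),
    ])
    print(response.content)  # response is a BaseMessage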

with_structured_output(schema: BaseModel)[source]

Configure the LLM wrapper to output structured data using a Pydantic schema.

This method adapts the underlying LLM to output responses conforming to the provided Pydantic model schema. The implementation varies by provider:

  • OpenAI/Anthropic: Uses native structured output support

  • Google: Implements structured output via output parser and format instructions

Parameters:

schema (pydantic.BaseModel) – The Pydantic model class defining the expected response structure

Returns:

The wrapper instance configured for structured output

Return type:

LLMWrapper
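
Because the method returns the configured wrapper, it can be chained with invoke(). A sketch using an assumed Answer schema (per coerce_to_schema() above, Question and Answer are the currently supported schema types):

    from pydantic import BaseModel
    from podcast_llm.utils.llm import LLMWrapper

    class Answer(BaseModel):     # assumed shape of the Answer schema
        answer: str

    llm = LLMWrapper(provider="google", model="gemini-1.5-pro-latest")  # illustrative values
    result = llm.with_structured_output(Answer).invoke(
        "In one sentence, what is a podcast RSS feed?"
    )
    # result conforms to the Answer schema; for Google this goes through the
    # output parser and format instructions described above.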

podcast_llm.utils.llm.get_fast_llm(config: PodcastConfig, rate_limiter: BaseRateLimiter | None = None)[source]

Get a fast LLM model optimized for quick responses.

Creates and returns an LLM wrapper configured with a fast model variant from the specified provider. Fast models trade some quality for improved response speed.

Parameters:
  • config (PodcastConfig) – Configuration object containing provider settings

  • rate_limiter (BaseRateLimiter | None, optional) – Rate limiter to control API request frequency. Defaults to None.

Returns:

Wrapper instance configured with a fast model variant

Return type:

LLMWrapper

Raises:

ValueError – If the configured fast_llm_provider is not supported

The function maps providers to their respective fast model variants:

  • OpenAI: gpt-4o-mini

  • Google: gemini-1.5-flash

  • Anthropic: claude-3-5-sonnet
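
A usage sketch, assuming a PodcastConfig instance is constructed elsewhere in the application; the helper's name and signature come from this page, everything else is illustrative:

    from podcast_llm.utils.llm import get_fast_llm

    def draft_episode_title(config, topic: str) -> str:
        # config is a PodcastConfig; its fast_llm_provider selects the fast model variant.
        fast_llm = get_fast_llm(config)
        response = fast_llm.invoke(f"Suggest a short podcast episode title about {topic}.")
        return response.content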

podcast_llm.utils.llm.get_long_context_llm(config: PodcastConfig, rate_limiter: BaseRateLimiter | None = None)[source]

Get a long context LLM model optimized for handling larger prompts.

Creates and returns an LLM wrapper configured with a model variant that can handle longer context windows from the specified provider. These models are optimized for processing larger amounts of text at once.

Parameters:
  • config (PodcastConfig) – Configuration object containing provider settings

  • rate_limiter (BaseRateLimiter | None, optional) – Rate limiter to control API request frequency. Defaults to None.

Returns:

Wrapper instance configured with a long context model variant

Return type:

LLMWrapper

Raises:

ValueError – If the configured long_context_llm_provider is not supported

The function maps providers to their respective long context model variants:

  • OpenAI: gpt-4o

  • Google: gemini-1.5-pro-latest

  • Anthropic: claude-3-5-sonnet
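
A similar sketch with an optional rate limiter; it assumes the BaseRateLimiter referenced here is LangChain's rate limiter base class, for which InMemoryRateLimiter is a stock implementation:

    from langchain_core.rate_limiters import InMemoryRateLimiter
    from podcast_llm.utils.llm import get_long_context_llm

    def summarize_sources(config, combined_text: str) -> str:
        # Throttle to roughly one request every two seconds (illustrative value).
        limiter = InMemoryRateLimiter(requests_per_second=0.5)
        llm = get_long_context_llm(config, rate_limiter=limiter)
        response = llm.invoke(f"Summarize the following research material:\n\n{combined_text}")
        return response.content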