podcast_llm.utils.checkpointer
Utilities for checkpointing and resuming long-running processes.
This module provides functionality for saving and loading intermediate computation results to disk, enabling efficient resumption of processing from the last successful checkpoint. This is particularly useful for long-running podcast generation tasks that may need to be interrupted and resumed.
Key components:
- Checkpointer: A class that manages saving/loading of checkpoint data with configurable paths and serialization
- to_snake_case: Helper function for converting checkpoint names to valid filenames
The checkpointing system helps with:
- Saving intermediate results during multi-step processing
- Resuming interrupted processes without recomputing completed steps
- Debugging by examining saved checkpoint states
- Reducing wasted computation on process restarts
The module uses pickle for serialization by default but is designed to be extensible to other serialization formats as needed.
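One way to picture the extensibility mentioned above is a small serializer interface with interchangeable implementations. The sketch below is illustrative only: `Serializer`, `PickleSerializer`, `JsonSerializer`, and `round_trip` are hypothetical names and are not part of podcast_llm.

```python
import json
import pickle
from typing import Any, Protocol


class Serializer(Protocol):
    """Hypothetical interface for illustration; not part of podcast_llm."""

    extension: str

    def dumps(self, obj: Any) -> bytes: ...

    def loads(self, data: bytes) -> Any: ...


class PickleSerializer:
    """Handles most Python objects; this mirrors the pickle default."""

    extension = ".pkl"

    def dumps(self, obj: Any) -> bytes:
        return pickle.dumps(obj)

    def loads(self, data: bytes) -> Any:
        return pickle.loads(data)


class JsonSerializer:
    """Human-readable alternative, limited to JSON-compatible values."""

    extension = ".json"

    def dumps(self, obj: Any) -> bytes:
        return json.dumps(obj).encode("utf-8")

    def loads(self, data: bytes) -> Any:
        return json.loads(data.decode("utf-8"))


def round_trip(serializer: Serializer, obj: Any) -> Any:
    """Serialize and immediately deserialize a value."""
    return serializer.loads(serializer.dumps(obj))
```

Under this assumed design, swapping formats would only change the object passed in; pickle preserves types such as tuples, while JSON keeps the file inspectable in a text editor.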
- class podcast_llm.utils.checkpointer.Checkpointer(checkpoint_key: str, checkpoint_dir: str = '.checkpoints', enabled: bool = True)
Bases: object
A class for managing checkpointing of intermediate results during processing.
The Checkpointer allows saving and loading of intermediate computation results to disk, enabling resumption of long-running processes from the last successful checkpoint.
Key features:
- Configurable checkpoint directory and key prefix for files
- Can be enabled/disabled via constructor
- Automatically creates checkpoint directory if needed
- Saves results as pickle files with stage-specific names
- Loads from existing checkpoints when available
Example usage:

    checkpointer = Checkpointer(
        checkpoint_key='my_process_',
        enabled=True
    )

    # Will save result to disk and return it
    result = checkpointer.checkpoint(
        expensive_computation(),
        stage_name='stage1'
    )

    # On subsequent runs, will load from disk instead of recomputing
    result = checkpointer.checkpoint(
        expensive_computation(),
        stage_name='stage1'
    )
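The save-or-load behavior described for this class can be sketched roughly as follows. This is a minimal illustration, not the podcast_llm implementation: `SketchCheckpointer`, its `_path` helper, the `.pkl` file naming, and `demo` are all assumptions. Because Python evaluates arguments before a call is made, the sketch accepts a zero-argument callable so that the computation can actually be skipped when a checkpoint exists; that is a design choice of the sketch and says nothing about the library's real signature.

```python
import os
import pickle
import tempfile
from typing import Any, Callable


class SketchCheckpointer:
    """Illustrative stand-in; not the podcast_llm Checkpointer."""

    def __init__(self, checkpoint_key: str,
                 checkpoint_dir: str = ".checkpoints",
                 enabled: bool = True):
        self.checkpoint_key = checkpoint_key
        self.checkpoint_dir = checkpoint_dir
        self.enabled = enabled
        if self.enabled:
            # Create the checkpoint directory if it does not exist yet.
            os.makedirs(self.checkpoint_dir, exist_ok=True)

    def _path(self, stage_name: str) -> str:
        # Assumed naming scheme: <key prefix><stage name>.pkl
        filename = f"{self.checkpoint_key}{stage_name}.pkl"
        return os.path.join(self.checkpoint_dir, filename)

    def checkpoint(self, compute: Callable[[], Any], stage_name: str) -> Any:
        if not self.enabled:
            return compute()
        path = self._path(stage_name)
        if os.path.exists(path):
            # A checkpoint exists: load it instead of recomputing.
            with open(path, "rb") as f:
                return pickle.load(f)
        result = compute()
        # Write to a temporary file first, then rename, so an interrupted
        # write does not leave a truncated checkpoint behind.
        tmp_path = path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(result, f)
        os.replace(tmp_path, path)
        return result


def demo():
    """Run the same stage twice and count how often the work is done."""
    calls = []

    def expensive_computation():
        calls.append(1)
        return {"answer": 42}

    with tempfile.TemporaryDirectory() as tmp_dir:
        cp = SketchCheckpointer("my_process_", checkpoint_dir=tmp_dir)
        first = cp.checkpoint(expensive_computation, stage_name="stage1")
        second = cp.checkpoint(expensive_computation, stage_name="stage1")
    return first, second, len(calls)
```

In `demo`, the second call finds the file written by the first and returns the stored value, so the expensive function runs only once.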
- podcast_llm.utils.checkpointer.to_snake_case(text: str) -> str
Convert a string to snake_case format.
Takes any string input and converts it to snake_case by:
1. Replacing spaces and hyphens with underscores
2. Converting to lowercase
3. Removing any non-alphanumeric characters except underscores
- Parameters:
text (str) – Input string to convert
- Returns:
Snake case formatted string
- Return type:
str
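The three conversion steps listed for this function can be sketched with regular expressions. This is a hedged re-implementation for illustration, not the library's source: `to_snake_case_sketch` is a hypothetical name, each space or hyphen is assumed to map to exactly one underscore, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re


def to_snake_case_sketch(text: str) -> str:
    """Illustrative version of the documented steps; details are assumptions."""
    # Step 1: replace each space or hyphen with an underscore.
    text = re.sub(r"[ -]", "_", text)
    # Step 2: convert to lowercase.
    text = text.lower()
    # Step 3: drop everything except ASCII letters, digits, and underscores.
    return re.sub(r"[^a-z0-9_]", "", text)


print(to_snake_case_sketch("My Process-Name!"))   # my_process_name
print(to_snake_case_sketch("Stage 1: Research"))  # stage_1_research
```

The order matters in this sketch: separators are converted before the cleanup step, otherwise step 3 would delete the spaces and hyphens and merge adjacent words.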