API
This section provides detailed documentation for all public APIs in the Online Retail Simulator.
Core Functions
Online Retail Simulator - Generate synthetic retail data for experimentation.
- online_retail_simulator.simulate(config_path, products_df=None, job_id=None)[source]
Runs simulate_products (or uses provided products), optionally simulate_product_details, and simulate_metrics.
All results are automatically saved to a job-based directory structure under the configured storage path.
- Parameters:
config_path (str) – Path to configuration file
products_df (DataFrame | None) – Optional DataFrame of existing products. If provided, skips product generation and uses this DataFrame instead. Expected columns: product_identifier, category, price
job_id (str | None) – Optional job ID, auto-generated if not provided
- Returns:
Information about the saved job
- Return type:
JobInfo
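A minimal usage sketch of the end-to-end call, assuming the package is installed and that `config.yaml` (a hypothetical path) is a valid configuration file. It uses only the signatures documented in this section. That `simulate` returns a `JobInfo` which `load_job_results` accepts is inferred from the descriptions here, not confirmed. The wrapper name `run_simulation` is illustrative.

```python
def run_simulation(config_path, products_df=None):
    """Run the full simulation and return the saved DataFrames by name."""
    # Imported lazily so the sketch can be defined without the package installed.
    import online_retail_simulator as ors

    # Generates products (unless products_df is given) and metrics, then saves the job.
    job_info = ors.simulate(config_path, products_df=products_df)

    # Dict of the DataFrames available for the job, e.g. 'products' and 'metrics'.
    return ors.load_job_results(job_info)


# Example call (requires the package and a real configuration file):
# results = run_simulation("config.yaml")  # hypothetical config path
# for name, df in results.items():
#     print(name, df.shape)
```

Passing `products_df` skips product generation, so the same wrapper covers both the generate-everything and bring-your-own-products cases.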
- online_retail_simulator.simulate_products(config_path, job_id=None)[source]
Simulate products using the backend specified in config.
- online_retail_simulator.simulate_product_details(job_info, config_path)[source]
Simulate product details using configured backend.
Loads existing products, enriches with title/description/brand/features, and saves back to the same job.
- Config example:
PRODUCT_DETAILS:
  FUNCTION: simulate_product_details_mock  # or simulate_product_details_ollama
- online_retail_simulator.simulate_metrics(job_info, config_path)[source]
Simulate product metrics using the backend specified in config.
- online_retail_simulator.enrich(config_path, job_info)[source]
Apply enrichment to metrics data using a config file.
Saves enriched results to the same job directory.
- online_retail_simulator.register_enrichment_function(name, func)[source]
Register an enrichment function.
- online_retail_simulator.register_enrichment_module(module_name)[source]
Register all compatible functions from a module.
- Parameters:
module_name (str)
- Return type:
None
- online_retail_simulator.list_enrichment_functions()[source]
List all registered enrichment functions.
- online_retail_simulator.clear_enrichment_registry()[source]
Clear all registered enrichment functions.
- Return type:
None
- online_retail_simulator.register_products_function(name, func)[source]
Register a products generation function.
- online_retail_simulator.register_metrics_function(name, func)[source]
Register a metrics generation function.
- online_retail_simulator.register_simulation_module(module_name, prefix='')[source]
Register all compatible functions from a module.
Functions are automatically detected based on their signatures:
  - Products functions: must have a ‘config’ parameter
  - Metrics functions: must have ‘products’ and ‘config’ parameters
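The signature rule above can be illustrated with the standard library's `inspect` module. This is a sketch of one way such detection might work, not the library's implementation. `classify_by_signature` and the three sample functions are hypothetical names, and checking the metrics rule first is an assumption made here to resolve the overlap (a metrics function also has a `config` parameter).

```python
import inspect


def classify_by_signature(func):
    """Return 'metrics', 'products', or None based on parameter names."""
    params = inspect.signature(func).parameters
    # Check the stricter rule first: metrics functions need both parameters.
    if "products" in params and "config" in params:
        return "metrics"
    if "config" in params:
        return "products"
    return None


def my_products(config):  # hypothetical products function
    return []


def my_metrics(products, config):  # hypothetical metrics function
    return []


def helper(x):  # matches neither rule, so it would not be registered
    return x
```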
- online_retail_simulator.list_simulation_functions()[source]
List all registered simulation functions.
- online_retail_simulator.get_simulation_function(func_type, name)[source]
Get a registered simulation function.
- class online_retail_simulator.JobInfo(job_id, storage_path)[source]
Bases: object
Information about a simulation job and its storage location.
- save_df(name, df)[source]
Save a DataFrame to this job’s directory.
- Parameters:
name (str)
df (DataFrame)
- Return type:
None
- online_retail_simulator.create_job(config, config_path, job_id=None)[source]
Create a new job directory with config.
- online_retail_simulator.load_job_results(job_info)[source]
Load simulation results for a job.
- Parameters:
job_info (JobInfo) – JobInfo containing job details
- Returns:
Dict of the available DataFrames, keyed by ‘products’, ‘metrics’, and ‘enriched’
- Return type:
Dict
- Raises:
FileNotFoundError – If job directory doesn’t exist
- online_retail_simulator.load_job_metadata(job_info)[source]
Load metadata for a job.
- Parameters:
job_info (JobInfo) – JobInfo containing job details
- Returns:
Job metadata
- Return type:
Dict
- Raises:
FileNotFoundError – If job directory or metadata file doesn’t exist
- online_retail_simulator.list_jobs(storage_path='.')[source]
List all available job IDs in a storage path.
Simulation Module
Simulation module for generating synthetic retail data.
- online_retail_simulator.simulate.simulate(config_path, products_df=None, job_id=None)[source]
Runs simulate_products (or uses provided products), optionally simulate_product_details, and simulate_metrics.
All results are automatically saved to a job-based directory structure under the configured storage path.
- Parameters:
config_path (str) – Path to configuration file
products_df (DataFrame | None) – Optional DataFrame of existing products. If provided, skips product generation and uses this DataFrame instead. Expected columns: product_identifier, category, price
job_id (str | None) – Optional job ID, auto-generated if not provided
- Returns:
Information about the saved job
- Return type:
JobInfo
- online_retail_simulator.simulate.simulate_products(config_path, job_id=None)[source]
Simulate products using the backend specified in config.
- online_retail_simulator.simulate.simulate_product_details(job_info, config_path)[source]
Simulate product details using configured backend.
Loads existing products, enriches with title/description/brand/features, and saves back to the same job.
- Config example:
PRODUCT_DETAILS:
  FUNCTION: simulate_product_details_mock  # or simulate_product_details_ollama
- online_retail_simulator.simulate.simulate_metrics(job_info, config_path)[source]
Simulate product metrics using the backend specified in config.
- online_retail_simulator.simulate.register_products_function(name, func)[source]
Register a products generation function.
- online_retail_simulator.simulate.register_metrics_function(name, func)[source]
Register a metrics generation function.
- online_retail_simulator.simulate.register_simulation_module(module_name, prefix='')[source]
Register all compatible functions from a module.
Functions are automatically detected based on their signatures:
  - Products functions: must have a ‘config’ parameter
  - Metrics functions: must have ‘products’ and ‘config’ parameters
- online_retail_simulator.simulate.list_simulation_functions()[source]
List all registered simulation functions.
- online_retail_simulator.simulate.get_simulation_function(func_type, name)[source]
Get a registered simulation function.
Products Generation
Interface for simulating products. Dispatches to appropriate backend based on config.
- online_retail_simulator.simulate.products.simulate_products(config_path, job_id=None)[source]
Simulate products using the backend specified in config.
Rule-based product simulation.
- online_retail_simulator.simulate.products_rule_based.generate_random_product_identifier(rng, prefix='B')[source]
Generate a random product identifier.
  - 10 characters total
  - Alphanumeric
  - Defaults to starting with ‘B’
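A minimal sketch of an identifier generator with the properties listed above. It assumes `rng` is a `random.Random` instance and that the alphabet is uppercase letters plus digits. The library's actual generator type and alphabet are not stated here, and `random_product_identifier` is an illustrative name.

```python
import random
import string

# Assumed alphabet: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


def random_product_identifier(rng, prefix="B", length=10):
    """Return an alphanumeric identifier of `length` characters starting with `prefix`."""
    n_random = length - len(prefix)
    if n_random < 0:
        raise ValueError("prefix is longer than the identifier length")
    return prefix + "".join(rng.choice(ALPHABET) for _ in range(n_random))


rng = random.Random(0)  # seeded so repeated runs give the same identifier
identifier = random_product_identifier(rng)
```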
- online_retail_simulator.simulate.products_rule_based.simulate_products_rule_based(config)[source]
Generate synthetic products (rule-based).
- Parameters:
config (Dict) – Complete configuration dictionary
- Returns:
DataFrame of products
- Return type:
DataFrame
Synthesizer-based product simulation. Reads a DataFrame from the path specified in config['SYNTHESIZER']['dataframe_path']. Performs no error handling; any failure propagates as an exception.
- online_retail_simulator.simulate.products_synthesizer_based.simulate_products_synthesizer_based(config)[source]
Generate synthetic products using Gaussian Copula synthesizer.
- Parameters:
config (Dict) – Complete configuration dictionary
- Returns:
DataFrame of synthetic products
- Return type:
DataFrame
Metrics Generation
Interface for simulating product metrics. Dispatches to appropriate backend based on config.
- online_retail_simulator.simulate.metrics.simulate_metrics(job_info, config_path)[source]
Simulate product metrics using the backend specified in config.
Rule-based product metrics simulation (minimal skeleton).
- online_retail_simulator.simulate.metrics_rule_based.simulate_metrics_rule_based(products, config)[source]
Generate synthetic product metrics with customer journey funnel (rule-based).
Simulates a realistic conversion funnel: impressions → visits → cart adds → orders.
- Parameters:
products (DataFrame) – DataFrame of products
config (Dict) – Complete configuration dictionary
- Returns:
DataFrame of product metrics (one row per product per time period). Columns: product_identifier, category, price, date, impressions, visits, cart_adds, ordered_units, revenue.
- Return type:
DataFrame
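The funnel described above (impressions → visits → cart adds → orders) can be sketched for a single product-day with the standard library. The conversion rates, the function name `simulate_funnel_row`, and computing revenue as units times price are all assumptions for illustration. The library's actual rates and sampling scheme are not documented here.

```python
import random


def simulate_funnel_row(rng, price, impressions,
                        visit_rate=0.1, cart_rate=0.3, order_rate=0.5):
    """Draw one day of funnel counts; each stage is a subset of the previous one."""
    # Each stage keeps every unit of the previous stage with a fixed probability.
    visits = sum(rng.random() < visit_rate for _ in range(impressions))
    cart_adds = sum(rng.random() < cart_rate for _ in range(visits))
    ordered_units = sum(rng.random() < order_rate for _ in range(cart_adds))
    return {
        "impressions": impressions,
        "visits": visits,
        "cart_adds": cart_adds,
        "ordered_units": ordered_units,
        "revenue": round(ordered_units * price, 2),  # assumed: units x price
    }


row = simulate_funnel_row(random.Random(42), price=19.99, impressions=500)
```

Sampling each stage from the previous one guarantees the counts never increase down the funnel, which is the property that makes the output look like a realistic conversion path.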
Synthesizer-based simulation backend for metrics. Takes products DataFrame and config path. Performs no error handling; any failure propagates as an exception.
- online_retail_simulator.simulate.metrics_synthesizer_based.simulate_metrics_synthesizer_based(products, config)[source]
Generate synthetic product metrics using Gaussian Copula synthesizer.
- Parameters:
products (DataFrame) – DataFrame of products (unused in current implementation)
config (Dict) – Complete configuration dictionary
- Returns:
DataFrame of synthetic metrics
- Return type:
DataFrame
Enrichment Module
Enrichment module for applying treatments to sales data.
- online_retail_simulator.enrich.enrich(config_path, job_info)[source]
Apply enrichment to metrics data using a config file.
Saves enriched results to the same job directory.
- online_retail_simulator.enrich.register_enrichment_function(name, func)[source]
Register an enrichment function.
- online_retail_simulator.enrich.register_enrichment_module(module_name)[source]
Register all compatible functions from a module.
- Parameters:
module_name (str)
- Return type:
None
- online_retail_simulator.enrich.list_enrichment_functions()[source]
List all registered enrichment functions.
- online_retail_simulator.enrich.clear_enrichment_registry()[source]
Clear all registered enrichment functions.
- Return type:
None
Interface for applying enrichment treatments to metrics data. Dispatches to impact-based implementation based on config.
- online_retail_simulator.enrich.enrichment.parse_impact_spec(impact_spec)[source]
Parse IMPACT specification into module, function, and params.
Supports dict format with capitalized keys:
{"FUNCTION": "product_detail_boost", "PARAMS": {"effect_size": 0.5, "ramp_days": 7}}
{"MODULE": "my_module", "FUNCTION": "my_func", "PARAMS": {…}}  # MODULE ignored, kept for compatibility
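A sketch of how a spec in this format could be split into its three parts. The name `parse_impact_spec_sketch`, the error behaviour for a missing `FUNCTION` key, and defaulting `PARAMS` to an empty dict are assumptions. Only the key names and the (module, function, params) result come from the description above.

```python
def parse_impact_spec_sketch(impact_spec):
    """Split an IMPACT dict into (module, function, params)."""
    if not isinstance(impact_spec, dict):
        raise ValueError("IMPACT spec must be a dict")
    if "FUNCTION" not in impact_spec:
        raise ValueError("IMPACT spec requires a 'FUNCTION' key")
    module = impact_spec.get("MODULE")  # accepted for compatibility, otherwise unused
    function = impact_spec["FUNCTION"]
    params = dict(impact_spec.get("PARAMS") or {})  # copy; empty when omitted
    return module, function, params
```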
- online_retail_simulator.enrich.enrichment.assign_enrichment(products, fraction, seed=None)[source]
Assign enrichment treatment to a fraction of products.
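One way to pick a reproducible fraction of products, sketched with the standard library. It assumes products are represented by a list of identifiers and that the treated count is rounded down. The library's input type, rounding rule, and return value may differ, and `assign_enrichment_sketch` is an illustrative name.

```python
import random


def assign_enrichment_sketch(product_ids, fraction, seed=None):
    """Return the set of product identifiers selected for treatment."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # a fixed seed gives a reproducible selection
    n_treated = int(len(product_ids) * fraction)  # assumed: round down
    return set(rng.sample(list(product_ids), n_treated))
```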
- online_retail_simulator.enrich.enrichment.apply_enrichment_to_metrics(metrics, enriched_products, enrichment_start, effect_function, **kwargs)[source]
Apply enrichment treatment effect to metrics data.
- Parameters:
- Returns:
List of modified metrics with treatment effect applied
- Return type:
list
- online_retail_simulator.enrich.enrichment.enrich(config_path, df, job_info=None, products_df=None)[source]
Apply enrichment to a DataFrame using a config file.
- Parameters:
config_path (str) – Path to enrichment config (YAML or JSON, local or S3)
df (DataFrame) – DataFrame with metrics data (must include product_identifier)
job_info – Optional JobInfo for product-aware enrichment functions
products_df – Optional products DataFrame for product-aware enrichment functions
- Returns:
enriched_df: DataFrame with enrichment applied (factual version)
potential_outcomes_df: DataFrame with Y0/Y1 for all products, or None if not provided
- Return type:
Tuple of (enriched_df, potential_outcomes_df)
Library of predefined treatment effect functions for catalog enrichment.
- online_retail_simulator.enrich.enrichment_library.quantity_boost(metrics, **kwargs)[source]
Boost ordered units by a percentage for enriched products.
- Parameters:
metrics (list) – List of metric record dictionaries
**kwargs – Parameters including:
  - effect_size: Percentage increase in ordered units (default: 0.5 for 50% boost)
  - enrichment_fraction: Fraction of products to enrich (default: 0.3)
  - enrichment_start: Start date of enrichment (default: “2024-11-15”)
  - seed: Random seed for product selection (default: 42)
  - min_units: Minimum units for enriched products with zero sales (default: 1)
- Returns:
treated_metrics: List of modified metric dictionaries with treatment applied
potential_outcomes_df: DataFrame with Y0_revenue and Y1_revenue for all products
- Return type:
Tuple of (treated_metrics, potential_outcomes_df)
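A simplified sketch of the boost step on a list of metric dictionaries. It assumes dates are ISO strings, that the treated product set is already chosen, that zero-sale rows are set to `min_units`, and that revenue is recomputed as units times price. It omits the potential-outcomes DataFrame that the documented function also returns, and `boost_units` is an illustrative name.

```python
from datetime import date


def boost_units(metrics, treated_ids, effect_size=0.5,
                enrichment_start="2024-11-15", min_units=1):
    """Return new metric records with ordered units boosted for treated products."""
    start = date.fromisoformat(enrichment_start)
    boosted = []
    for record in metrics:
        row = dict(record)  # copy so the input records are left unchanged
        treated = row["product_identifier"] in treated_ids
        if treated and date.fromisoformat(row["date"]) >= start:
            units = row["ordered_units"]
            # Zero-sale rows get min_units; others are scaled by (1 + effect_size).
            row["ordered_units"] = int(units * (1 + effect_size)) if units > 0 else min_units
            row["revenue"] = round(row["ordered_units"] * row["price"], 2)
        boosted.append(row)
    return boosted
```

Rows for untreated products, and rows dated before `enrichment_start`, pass through unchanged, so the output doubles as the factual series for a before/after comparison.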
- online_retail_simulator.enrich.enrichment_library.probability_boost(metrics, **kwargs)[source]
Boost sale probability (simulated by ordered units increase as proxy).
- online_retail_simulator.enrich.enrichment_library.product_detail_boost(metrics, **kwargs)[source]
Product detail regeneration and metrics boost for enrichment experiments.
Selects a fraction of products for treatment, regenerates their product details (title, description, features) while preserving brand/category/price, and applies metrics boost effect.
- Parameters:
metrics (list) – List of metric record dictionaries
**kwargs – Parameters including:
  - job_info: JobInfo for saving product artifacts (required for saving)
  - products: List of product dictionaries (required for product details)
  - effect_size: Percentage increase in ordered units (default: 0.5)
  - ramp_days: Number of days for ramp-up period (default: 7)
  - enrichment_fraction: Fraction of products to enrich (default: 0.3)
  - enrichment_start: Start date of enrichment (default: “2024-11-15”)
  - seed: Random seed for product selection (default: 42)
  - prompt_path: Path to custom prompt template file (optional)
  - backend: Backend to use for regeneration (“mock” or “ollama”, default: “mock”)
- Returns:
treated_metrics: List of modified metric dictionaries with treatment applied
potential_outcomes_df: DataFrame with Y0_revenue and Y1_revenue for all products
- Return type:
Tuple of (treated_metrics, potential_outcomes_df)
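The `ramp_days` parameter implies the effect builds up over time instead of applying in full on the first day. The sketch below assumes a linear ramp, which this section does not confirm; the library may use a different curve. `ramp_multiplier` is a hypothetical helper name.

```python
def ramp_multiplier(days_since_start, effect_size=0.5, ramp_days=7):
    """Multiplier applied to ordered units on a given day of the treatment."""
    if days_since_start < 0:
        return 1.0  # before enrichment starts: no effect
    if ramp_days <= 0:
        return 1.0 + effect_size  # no ramp: full effect immediately
    # Assumed linear ramp: the effect grows from 0 to effect_size over ramp_days.
    progress = min(days_since_start / ramp_days, 1.0)
    return 1.0 + effect_size * progress
```

With the documented defaults, this gives no uplift on the start day and the full 50% uplift from day 7 onward.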
Impact-based enrichment registry for custom user-defined enrichment functions.
This module provides a registration system that allows users to register their own impact-based enrichment functions.
- online_retail_simulator.enrich.enrichment_registry.register_enrichment_function(name, func)[source]
Register an enrichment function.
- online_retail_simulator.enrich.enrichment_registry.register_enrichment_module(module_name)[source]
Register all compatible functions from a module.
- Parameters:
module_name (str)
- Return type:
None
- online_retail_simulator.enrich.enrichment_registry.list_enrichment_functions()[source]
List all registered enrichment functions.
- online_retail_simulator.enrich.enrichment_registry.clear_enrichment_registry()[source]
Clear all registered enrichment functions.
- Return type:
None
Configuration Module
Configuration processing with defaults and validation.
- online_retail_simulator.config_processor.load_defaults()[source]
Load default configuration from package.
- online_retail_simulator.config_processor.get_impact_defaults(function_name)[source]
Get default parameters for an IMPACT enrichment function.
- online_retail_simulator.config_processor.deep_merge(base, override)[source]
Deep merge two dictionaries, with override values taking precedence.
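A minimal sketch of a recursive merge with override precedence, as described above. `deep_merge_sketch` is an illustrative name. How the library treats lists or type mismatches is not documented here, so this version simply lets the override value replace the base value in those cases.

```python
def deep_merge_sketch(base, override):
    """Return a new dict with override merged into base; override wins on conflicts."""
    merged = dict(base)  # top-level copy, so base itself is not modified
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key.
            merged[key] = deep_merge_sketch(merged[key], value)
        else:
            # Scalars, lists, and mismatched types: override replaces base.
            merged[key] = value
    return merged
```

Merging nested dictionaries key by key lets a user configuration override a single nested setting while keeping the remaining defaults in that section.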
- online_retail_simulator.config_processor.validate_config(config)[source]
Validate configuration has required fields and valid parameters.
- online_retail_simulator.config_processor.process_config(config_path)[source]
Load, merge with defaults, and validate configuration.
- Parameters:
config_path (str) – Path to user configuration file (local or S3)
- Returns:
Complete validated configuration
- Raises:
FileNotFoundError – If config file doesn’t exist
ValueError – If configuration is invalid
- Return type:
Dict