Online Retail Simulator

License: MIT

Generating synthetic product and sales data for testing causal inference pipelines

Building and validating causal inference pipelines requires realistic data, but production datasets come with privacy, compliance, and operational constraints.

Online Retail Simulator generates fully synthetic retail data for end-to-end testing of causal inference workflows. It simulates products, customers, sales, and conversion funnels while preserving key statistical and behavioral patterns found in real e-commerce systems.

Unlike generic data generators, the simulator supports controlled treatment effects. Because the true effect is set by the user, teams can validate estimators, stress-test identifying assumptions, and compare causal models against known ground truth before running causal analysis on production data.
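The idea of checking an estimator against known ground truth can be sketched in a few lines. The example below is a standalone illustration using only the Python standard library. It does not use the simulator's API, and `simulate_sales`, `difference_in_means`, and the revenue parameters are hypothetical names and values chosen for this sketch. It generates data with a treatment effect we choose, then checks that a simple difference-in-means estimator recovers it under randomized assignment.

```python
import random
import statistics


def simulate_sales(n_products, true_effect, seed=0):
    """Hypothetical stand-in for synthetic data: one (treated, revenue) row per product."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_products):
        treated = rng.random() < 0.5        # randomized treatment assignment
        baseline = rng.gauss(100.0, 10.0)   # revenue without treatment
        revenue = baseline + (true_effect if treated else 0.0)
        rows.append((treated, revenue))
    return rows


def difference_in_means(rows):
    """Estimate the average treatment effect as mean(treated) - mean(control)."""
    treated = [y for t, y in rows if t]
    control = [y for t, y in rows if not t]
    return statistics.fmean(treated) - statistics.fmean(control)


# The ground truth is known because we set it ourselves.
TRUE_EFFECT = 5.0
rows = simulate_sales(n_products=20_000, true_effect=TRUE_EFFECT, seed=42)
estimate = difference_in_means(rows)

print(f"true effect: {TRUE_EFFECT:.2f}, estimate: {estimate:.2f}")
```

With randomized assignment the estimate should land close to the chosen effect. Replacing the assignment rule with one that depends on baseline revenue is a simple way to see how an estimator behaves when its identifying assumptions fail.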

Quick Start

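Install from GitHub:

pip install git+https://github.com/eisenhauerIO/tools-online-retail-simulator.git

Then run a simulation from Python:

from online_retail_simulator import simulate, load_job_results

# Run the simulation described in config.yaml
job_info = simulate("config.yaml")
# Load the outputs of that job
results = load_job_results(job_info)

# Results are keyed by name
products_df = results["products"]
metrics_df = results["metrics"]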

Documentation

Guide            Description
Usage            Getting started with basic workflow
Configuration    All configuration options
Design           System design and architecture