{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Demo\n", "\n", "This notebook provides a high-level overview of the **Online Retail Simulator** package and its capabilities.\n", "\n", "## What is Online Retail Simulator?\n", "\n", "A Python package for generating **synthetic e-commerce data** for:\n", "- Testing and demos without exposing real business data\n", "- ML model training with realistic retail patterns\n", "- A/B test simulation and experimentation\n", "- Teaching analytics and data science concepts\n", "\n", "## Key Capabilities\n", "\n", "- **Rule-based generation**: Fast, configurable synthetic data\n", "- **ML-based synthesis**: Learn patterns from real data (optional SDV integration)\n", "- **Reproducible results**: Seed control for deterministic output\n", "- **8 product categories**: Electronics, Books, Clothing, and more\n", "- **Funnel metrics**: Impressions, visits, cart adds, orders" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n", "First, let's install the package (if running in Colab) and import the necessary libraries." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2026-03-01T03:45:44.637615Z", "iopub.status.busy": "2026-03-01T03:45:44.637404Z", "iopub.status.idle": "2026-03-01T03:45:44.640537Z", "shell.execute_reply": "2026-03-01T03:45:44.639715Z" } }, "outputs": [], "source": [ "# Uncomment if running in Google Colab\n", "# !pip install online-retail-simulator matplotlib seaborn" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2026-03-01T03:45:44.642025Z", "iopub.status.busy": "2026-03-01T03:45:44.641872Z", "iopub.status.idle": "2026-03-01T03:45:46.520421Z", "shell.execute_reply": "2026-03-01T03:45:46.519513Z" } }, "outputs": [], "source": [ "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "\n", "from online_retail_simulator import simulate, load_job_results\n", "\n", "# Set plot style\n", "sns.set_theme(style=\"whitegrid\")\n", "plt.rcParams[\"figure.figsize\"] = (10, 6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generate Sample Data\n", "\n", "We'll generate 30 days of synthetic sales data with a simple configuration." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2026-03-01T03:45:46.522685Z", "iopub.status.busy": "2026-03-01T03:45:46.522451Z", "iopub.status.idle": "2026-03-01T03:45:46.933544Z", "shell.execute_reply": "2026-03-01T03:45:46.932552Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generated 100 products\n", "Generated 3000 metrics records\n" ] } ], "source": [ "import os\n", "\n", "# Run simulation using config file\n", "config_path = os.path.join(os.path.dirname(__file__) if \"__file__\" in dir() else \".\", \"config_demo.yaml\")\n", "job_info = simulate(config_path)\n", "\n", "# Load results\n", "results = load_job_results(job_info)\n", "products_df = results[\"products\"]\n", "metrics_df = results[\"metrics\"]\n", "\n", "print(f\"Generated {len(products_df)} products\")\n", "print(f\"Generated {len(metrics_df)} metrics records\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploring the Generated Data\n", "\n", "Let's look at the structure and contents of our synthetic dataset." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2026-03-01T03:45:46.966140Z", "iopub.status.busy": "2026-03-01T03:45:46.965970Z", "iopub.status.idle": "2026-03-01T03:45:46.978728Z", "shell.execute_reply": "2026-03-01T03:45:46.978001Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Date range: 2024-11-01 to 2024-11-30\n", "Categories: 8\n", "Total revenue: $110,511.49\n", "\n" ] }, { "data": { "text/html": [ "
| | product_identifier | category | price | date | impressions | visits | cart_adds | ordered_units | revenue |
|---|---|---|---|---|---|---|---|---|---|
| 0 | B1P4DZHDS9 | Electronics | 686.37 | 2024-11-01 | 0 | 0 | 0 | 0 | 0.00 |
| 1 | B1SE4QSNG7 | Toys & Games | 80.75 | 2024-11-01 | 100 | 16 | 3 | 3 | 242.25 |
| 2 | BXTPQIDT5C | Food & Beverage | 42.02 | 2024-11-01 | 0 | 0 | 0 | 0 | 0.00 |
| 3 | B3F1ZMC8Q6 | Food & Beverage | 33.42 | 2024-11-01 | 0 | 0 | 0 | 0 | 0.00 |
| 4 | B2NQRBTF0Y | Toys & Games | 27.52 | 2024-11-01 | 25 | 3 | 0 | 0 | 0.00 |
| 5 | B0OL6NCQ2G | Health & Beauty | 77.66 | 2024-11-01 | 50 | 7 | 1 | 0 | 0.00 |
| 6 | BELIUY7PF3 | Books | 33.79 | 2024-11-01 | 10 | 1 | 0 | 0 | 0.00 |
| 7 | BZ13P24N6K | Toys & Games | 38.11 | 2024-11-01 | 0 | 0 | 0 | 0 | 0.00 |
| 8 | BY3H2A222X | Clothing | 40.85 | 2024-11-01 | 200 | 34 | 9 | 1 | 40.85 |
| 9 | BZUQSUBFIE | Books | 49.04 | 2024-11-01 | 10 | 1 | 0 | 0 | 0.00 |
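
The funnel columns in the preview (impressions, visits, cart adds, ordered units) lend themselves to stage-to-stage conversion rates. The sketch below is one way to compute them with plain pandas, using three rows copied from the table above. The names `sample`, `funnel`, and `rates`, and the rate definitions themselves, are illustrative assumptions and not part of the Online Retail Simulator API.

```python
import pandas as pd

# Three rows copied from the preview table above (rows 1, 4, and 8).
sample = pd.DataFrame(
    {
        "product_identifier": ["B1SE4QSNG7", "B2NQRBTF0Y", "BY3H2A222X"],
        "category": ["Toys & Games", "Toys & Games", "Clothing"],
        "price": [80.75, 27.52, 40.85],
        "impressions": [100, 25, 200],
        "visits": [16, 3, 34],
        "cart_adds": [3, 0, 9],
        "ordered_units": [3, 0, 1],
        "revenue": [242.25, 0.00, 40.85],
    }
)

# In these rows, revenue equals price times ordered units.
expected_revenue = (sample["price"] * sample["ordered_units"]).round(2)
revenue_matches = bool((expected_revenue == sample["revenue"]).all())

# Sum the funnel counts per category.
stages = ["impressions", "visits", "cart_adds", "ordered_units"]
funnel = sample.groupby("category")[stages].sum()


def safe_rate(numerator: pd.Series, denominator: pd.Series) -> pd.Series:
    """Divide two count series, yielding NaN where the denominator is zero."""
    return numerator / denominator.where(denominator > 0)


# Hypothetical rate definitions: each stage divided by the stage before it.
rates = pd.DataFrame(
    {
        "visit_rate": safe_rate(funnel["visits"], funnel["impressions"]),
        "cart_rate": safe_rate(funnel["cart_adds"], funnel["visits"]),
        "order_rate": safe_rate(funnel["ordered_units"], funnel["cart_adds"]),
    }
)

print(rates.round(3))
```

The same aggregation should carry over to the full `metrics_df` by replacing `sample` with it, assuming it exposes the columns shown in the preview. Masking zero denominators matters here because many product-days in the preview have no impressions at all.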