Usage
Workflow
Impact Engine follows the same three steps regardless of which measurement model is used.
1. Prepare a product catalog. Provide a CSV with product characteristics (product_identifier, category, price). In demo notebooks, the catalog simulator generates this automatically.
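A minimal sketch of such a catalog, built with pandas (the column names match those listed above; the values are illustrative only):

```python
import os
import pandas as pd

# Minimal product catalog with the three required columns.
# All values here are made up for illustration.
catalog = pd.DataFrame(
    {
        "product_identifier": ["P-001", "P-002", "P-003"],
        "category": ["electronics", "home", "electronics"],
        "price": [199.99, 24.50, 89.00],
    }
)
os.makedirs("data", exist_ok=True)
catalog.to_csv("data/products.csv", index=False)
```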
2. Write a YAML configuration file. The config has three sections: DATA selects the data source and optional transformations, MEASUREMENT selects the model and its parameters, and OUTPUT sets the storage path. See Configuration for the full parameter reference.
```yaml
DATA:
  SOURCE:
    type: simulator
    CONFIG:
      path: data/products.csv
      start_date: "2024-01-01"
      end_date: "2024-01-31"
  TRANSFORM:
    FUNCTION: aggregate_by_date
    PARAMS:
      metric: revenue
MEASUREMENT:
  MODEL: interrupted_time_series
  PARAMS:
    intervention_date: "2024-01-15"
    dependent_variable: revenue
OUTPUT:
  PATH: output
```
3. Run the analysis.
```python
from impact_engine_measure import measure_impact, load_results

job_info = measure_impact(
    config_path="config.yaml",
    storage_url="./results",
)

results = load_results(job_info)
print(results.model_type)                  # "interrupted_time_series"
print(results.job_id)                      # "job-20260101-abc123"
print(results.impact_results)              # {"schema_version": "2.0", "model_type": ..., "data": {...}}
print(results.transformed_metrics.head())  # DataFrame with aggregated revenue
```
The engine loads products, retrieves metrics, applies transformations, fits the model, and writes results. measure_impact() returns a JobInfo object; pass it to load_results() to get a typed MeasureJobResult with all artifacts loaded: config, impact_results, products, business_metrics, transformed_metrics, and any model-specific model_artifacts.
Output
Every run produces a standardized output regardless of which model was used.
impact_results.json contains the result envelope:
```json
{
  "schema_version": "2.0",
  "model_type": "<model_name>",
  "data": {
    "model_params": { },
    "impact_estimates": { },
    "model_summary": { }
  },
  "metadata": {
    "executed_at": "2026-02-08T12:00:00+00:00"
  }
}
```
The three keys inside data are standardized across all models. model_params echoes the input parameters. impact_estimates holds the treatment effect measurements. model_summary provides fit diagnostics and sample sizes.
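As a sketch, the envelope can be consumed with the standard-library json module (the field values below are illustrative; in practice you would read the envelope from impact_results.json in the output directory):

```python
import json

# Illustrative envelope matching the documented shape.
envelope = json.loads("""
{
  "schema_version": "2.0",
  "model_type": "interrupted_time_series",
  "data": {
    "model_params": {"intervention_date": "2024-01-15"},
    "impact_estimates": {"effect": 123.4},
    "model_summary": {"n_obs": 31}
  },
  "metadata": {"executed_at": "2026-02-08T12:00:00+00:00"}
}
""")

# The three standardized keys are always present under "data".
data = envelope["data"]
params = data["model_params"]         # echoed input parameters
estimates = data["impact_estimates"]  # treatment effect measurements
summary = data["model_summary"]       # fit diagnostics and sample sizes
```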
manifest.json lists all output files and their formats, making the output self-describing. Consumers should read the manifest to resolve file paths rather than hardcoding filenames.
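A sketch of manifest-driven path resolution. The manifest schema shown here (a "files" list of entries with name, path, and format) is an assumption for illustration, not the documented format; inspect your actual manifest.json for the real shape:

```python
import json
from pathlib import Path

# Hypothetical manifest content; the real schema may differ.
manifest = json.loads("""
{
  "files": [
    {"name": "impact_results", "path": "impact_results.json", "format": "json"},
    {"name": "transformed_metrics", "path": "transformed_metrics.parquet", "format": "parquet"}
  ]
}
""")

def resolve(output_dir: str, name: str) -> Path:
    """Look up an output file by logical name instead of hardcoding filenames."""
    for entry in manifest["files"]:
        if entry["name"] == name:
            return Path(output_dir) / entry["path"]
    raise KeyError(name)

results_path = resolve("output", "impact_results")
```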
Some models produce supplementary artifacts as Parquet files (e.g., per-stratum breakdowns, matched data). These are listed in the manifest and named {model_type}__{artifact_name}.parquet.
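To illustrate the naming convention only (in practice, prefer resolving artifact paths through the manifest as described above; the artifact name stratum_effects below is hypothetical):

```python
from pathlib import Path

def artifact_path(output_dir: str, model_type: str, artifact_name: str) -> Path:
    # Follows the documented {model_type}__{artifact_name}.parquet convention.
    return Path(output_dir) / f"{model_type}__{artifact_name}.parquet"

# Once a run has completed, such a file could be loaded with e.g.
# pandas.read_parquet(path).
path = artifact_path("output", "subclassification", "stratum_effects")
```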
Available models
Each model has a demo notebook with a runnable end-to-end example including truth recovery validation and convergence analysis.
| Model | Library | Interface | Description | Demo |
|---|---|---|---|---|
| Experiment | | | Linear regression for randomized A/B tests | |
| Interrupted Time Series | statsmodels | | ARIMA-based pre/post intervention comparison on aggregated time series | |
| Nearest Neighbour Matching | | | Causal matching on covariates for ATT/ATC estimation | |
| Subclassification | | | Propensity stratification with within-stratum treatment effects | |
| Synthetic Control | | | Synthetic control method for aggregate intervention analysis | |
| Metrics Approximation | (built-in) | Response function registry | Response function approximation using a library of candidate functions | |