Download the notebook here!
Full Pipeline Tutorial
This notebook walks through a complete pipeline run: Measure, Evaluate, Allocate, Scale.
We use five simulated initiatives with known treatment effects so every result is reproducible.
Prerequisites: Run setup_data.py first to generate the simulated product catalogs:
hatch run python docs/source/impact-loop/setup_data.py
1. Configuration
The pipeline is driven by a single YAML file that specifies the budget, initiative list, and stage adapters.
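For orientation, a sketch of what config.yaml might contain is shown below. All values and initiative names are placeholders, and the adapter section's keys are assumptions for illustration rather than the library's documented schema; the fields actually used in this notebook are budget and the per-initiative initiative_id and cost_to_scale.

budget: 500000
initiatives:
  - initiative_id: initiative_a
    cost_to_scale: 120000
  - initiative_id: initiative_b
    cost_to_scale: 80000
# Stage adapter section: key names here are assumed, check your config schema
adapters:
  measure: ...
  evaluate: ...
  allocate: ...
  scale: ...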
[ ]:
from impact_engine_orchestrator.config import load_config
from impact_engine_orchestrator.orchestrator import Orchestrator
config = load_config("config.yaml")
print(f"Budget: ${config.budget:,.0f}")
print(f"Initiatives: {len(config.initiatives)}")
for init in config.initiatives:
    print(f"  {init.initiative_id} (cost to scale: ${init.cost_to_scale:,.0f})")
2. Run the Pipeline
A single call executes all four stages: measure causal effects, score evidence quality, optimize the portfolio, and scale the selected initiatives.
[ ]:
engine = Orchestrator.from_config(config)
result = engine.run()
print("Pipeline complete.")
print(f" Initiatives measured: {len(result['pilot_results'])}")
print(f" Initiatives selected: {len(result['outcome_reports'])}")
3. Inspect Results by Stage
MEASURE — Causal effect estimates
[ ]:
import pandas as pd
pilots = pd.DataFrame(result["pilot_results"])
pilots[["initiative_id", "effect_estimate", "ci_lower", "ci_upper", "p_value", "model_type"]]
EVALUATE — Confidence scores
[ ]:
evals = pd.DataFrame(result["evaluate_results"])
evals[["initiative_id", "confidence", "return_best", "return_median", "return_worst", "model_type"]]
ALLOCATE — Portfolio selection
[ ]:
alloc = result["allocate_result"]
print(f"Selected: {alloc['selected_initiatives']}")
print(f"\nBudget allocation:")
for iid, budget in alloc["budget_allocated"].items():
predicted = alloc["predicted_returns"][iid]
print(f" {iid}: ${budget:,.0f} (predicted return: {predicted:.2%})")
4. Outcome Reports — Predicted vs. Actual
The outcome report compares each pilot's predicted return against the return actually observed at scale. This comparison is the learning signal that calibrates future cycles.
[ ]:
reports = pd.DataFrame(result["outcome_reports"])
reports[
    [
        "initiative_id",
        "predicted_return",
        "actual_return",
        "prediction_error",
        "confidence_score",
        "budget_allocated",
        "sample_size_pilot",
        "sample_size_scale",
    ]
]
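One aggregate view of the same table (illustrative, and assuming returns are expressed as fractions, matching the percentage formatting used earlier):

[ ]:
# Mean absolute prediction error across the scaled initiatives
mae = reports["prediction_error"].abs().mean()
print(f"Mean absolute prediction error: {mae:.2%}")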