System
A config-driven pipeline of independent, adapter-based components
communicating through versioned dict contracts. Each stage
is a separate Python package with its own repository, CI, and release
cycle — wired together by an orchestrator at runtime.
Design Decisions
Four stages, four packages, one orchestrator.
Four design decisions keep the stages decoupled while the orchestrator wires them together at runtime.
Protocol-based interfaces
Components satisfy a typing.Protocol — structural
subtyping, not inheritance. Each package defines
PipelineComponent locally. No shared base import,
no circular dependencies, no coupling beyond the method signature.
Dict boundaries
execute() takes and returns a plain dict.
No cross-package type imports. Components are coupled only by
dict key names — making each package independently deployable
and testable.
Schema versioning
Every output dict includes schema_version: "X.Y".
Consumers validate the major version and fail explicitly on mismatch.
Minor bumps (additive fields) are always safe. Breaking changes
require coordinated major bumps across producer and consumer.
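A consumer-side check for this contract can be sketched as follows. The `schema_version` key comes from the contract above; the exception and helper names are illustrative, not taken from the actual packages.

```python
# Hypothetical sketch of major-version validation at a consumer boundary.
class SchemaVersionMismatch(Exception):
    """Raised when a producer's major version differs from the consumer's."""

def validate_schema(event: dict, expected_major: int) -> dict:
    major = int(event["schema_version"].split(".")[0])
    if major != expected_major:
        raise SchemaVersionMismatch(
            f"expected major {expected_major}, got {event['schema_version']}"
        )
    return event

# Minor bumps are additive, so "1.3" still satisfies a consumer pinned to major 1.
validate_schema({"schema_version": "1.3", "effect": 0.12}, expected_major=1)
```

Failing explicitly on a major mismatch keeps a stale consumer from silently misreading a restructured payload.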
Configuration-driven execution
A YAML file selects components, parameters, and data sources. A static registry maps component names to classes; a factory instantiates from config at runtime. No code changes to run a different scenario — change the config, re-run.
# Each package defines its own Protocol (no shared import)
from typing import Protocol

class PipelineComponent(Protocol):
    def execute(self, event: dict) -> dict: ...
# Registry maps config names → classes
COMPONENT_REGISTRY = {
    "Measure": Measure,
    "Evaluate": Evaluate,
    "MinimaxRegretAllocate": MinimaxRegretAllocate,
}
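A minimal factory over such a registry might look like this. The registry shape follows `COMPONENT_REGISTRY` above; the stub component classes and the config structure are illustrative assumptions (in practice the config would be parsed from the YAML file).

```python
# Stub components standing in for the real packages.
class Measure:
    def __init__(self, **params): self.params = params
    def execute(self, event: dict) -> dict: ...

class Evaluate:
    def __init__(self, **params): self.params = params
    def execute(self, event: dict) -> dict: ...

COMPONENT_REGISTRY = {"Measure": Measure, "Evaluate": Evaluate}

def build_pipeline(config: dict) -> list:
    """Instantiate the components a config names — no code changes per scenario."""
    return [
        COMPONENT_REGISTRY[stage["component"]](**stage.get("params", {}))
        for stage in config["stages"]
    ]

# Parsed from YAML in practice; inlined here to keep the example self-contained.
config = {
    "stages": [
        {"component": "Measure", "params": {"model": "interrupted_time_series"}},
        {"component": "Evaluate"},
    ]
}
pipeline = build_pipeline(config)
```

Swapping a scenario means editing the `stages` list, not the code.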
Impact Engine — Orchestrator
The Impact Engine — Orchestrator enforces these decisions. It owns the registry, the config loader, and the fan-out/fan-in execution model — connecting independently developed stages into a single pipeline run.
Pipeline Architecture
Measure
Causal effect estimation. Parallel — one call per initiative.
Evaluate
Evidence quality scoring. Parallel — one call per initiative.
Allocate
Portfolio optimization. Fan-in — single decision over all initiatives.
Scale
Run at production scale. Fan-out — selected initiatives only.
Each stage is an independent component exposing a single method:
execute(event) → result. The orchestrator wires them
together using a ThreadPoolExecutor for parallel fan-out,
synchronizes at the fan-in boundary before Allocate, and fans back out
for the selected initiatives.
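The fan-out/fan-in loop can be sketched as below, assuming only that each component exposes `execute(event) -> dict`. The stub components and their payload keys are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(component, events: list[dict]) -> list[dict]:
    """Fan out: one execute() call per initiative, in parallel."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(component.execute, events))

class StubMeasure:
    def execute(self, event: dict) -> dict:
        return {**event, "effect": 0.1}  # a real component estimates this

class StubAllocate:
    def execute(self, event: dict) -> dict:
        # Fan-in: a single decision over all measured initiatives.
        ranked = sorted(event["initiatives"], key=lambda e: e["effect"], reverse=True)
        return {"selected": ranked[: event["budget"]]}

measured = run_parallel(StubMeasure(), [{"id": 1}, {"id": 2}, {"id": 3}])
decision = StubAllocate().execute({"initiatives": measured, "budget": 2})
```

The synchronization point falls out naturally: `run_parallel` returns only once every per-initiative call has completed, so Allocate always sees the full set.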
Component Architecture
Each component follows the same internal pattern. The Measure package illustrates it: an internal pipeline of Load, Transform, Measure, and Store — each backed by a pluggable adapter.
Measure component: Load → Transform → Measure → Store
Three-tier adapter pattern. External libraries (SARIMAX from
statsmodels, PuLP) sit behind thin adapter wrappers that implement a
shared interface — connect, validate,
fit. A manager layer coordinates adapters, handles
dependency injection, and manages storage. The adapter is the only code
that knows about the library; everything above works against the interface.
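The three-tier split can be sketched as follows, using the `connect`/`validate`/`fit` interface named above. `SarimaxAdapter` and `ModelManager` are illustrative stand-ins; the real adapter would wrap the statsmodels library behind `fit`.

```python
from typing import Protocol

class ModelAdapter(Protocol):
    def connect(self) -> None: ...
    def validate(self, data: dict) -> bool: ...
    def fit(self, data: dict) -> dict: ...

class SarimaxAdapter:
    """The only code that knows about the underlying library."""
    def connect(self) -> None:
        pass  # import / configure the library here
    def validate(self, data: dict) -> bool:
        return "series" in data
    def fit(self, data: dict) -> dict:
        return {"model": "sarimax", "n_obs": len(data["series"])}

class ModelManager:
    """Manager tier: coordinates adapters, works only against the interface."""
    def __init__(self, adapter: ModelAdapter):
        self.adapter = adapter  # dependency injection
    def run(self, data: dict) -> dict:
        self.adapter.connect()
        if not self.adapter.validate(data):
            raise ValueError("invalid input")
        return self.adapter.fit(data)

result = ModelManager(SarimaxAdapter()).run({"series": [1, 2, 3]})
```

Swapping libraries means swapping the adapter; the manager and everything above it never change.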
Self-registering adapters. Each adapter registers itself
via a decorator:
@MODEL_REGISTRY.register_decorator("interrupted_time_series").
Six model adapters are available today. Adding a new one means writing
one adapter class — zero changes to the pipeline, registry, or orchestrator.
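A decorator-based registry matching the `@MODEL_REGISTRY.register_decorator(...)` call above might look like this. The registry internals and the `create` method are assumptions for illustration.

```python
class Registry:
    def __init__(self):
        self._classes: dict[str, type] = {}

    def register_decorator(self, name: str):
        """Register the decorated class under `name`, returning it unchanged."""
        def wrap(cls: type) -> type:
            self._classes[name] = cls
            return cls
        return wrap

    def create(self, name: str, **kwargs):
        return self._classes[name](**kwargs)

MODEL_REGISTRY = Registry()

@MODEL_REGISTRY.register_decorator("interrupted_time_series")
class InterruptedTimeSeriesAdapter:
    """Adding a model means writing one class like this — nothing else changes."""
    def fit(self, data: dict) -> dict:
        return {"model": "interrupted_time_series"}

adapter = MODEL_REGISTRY.create("interrupted_time_series")
```

Registration happens at import time, so the pipeline and orchestrator discover new adapters without being edited.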
Same pattern across stages. Allocate follows the identical
structure: an AllocationSolver Protocol with pluggable solvers
(minimax regret, Bayesian), injected into the component via constructor.
Evaluate uses deterministic scoring functions keyed by model type.
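Constructor injection of a solver can be sketched as below, assuming the `AllocationSolver` Protocol named above. The minimax-regret logic here is a deliberately toy stand-in, not the actual optimization.

```python
from typing import Protocol

class AllocationSolver(Protocol):
    def solve(self, initiatives: list[dict]) -> list[dict]: ...

class MinimaxRegretSolver:
    def solve(self, initiatives: list[dict]) -> list[dict]:
        # Toy rule: pick the initiative whose worst-case regret is smallest.
        return [min(initiatives, key=lambda i: i["max_regret"])]

class Allocate:
    def __init__(self, solver: AllocationSolver):
        self.solver = solver  # pluggable: minimax regret, Bayesian, ...
    def execute(self, event: dict) -> dict:
        return {"selected": self.solver.solve(event["initiatives"])}

result = Allocate(MinimaxRegretSolver()).execute(
    {"initiatives": [{"id": "a", "max_regret": 3}, {"id": "b", "max_regret": 1}]}
)
```

Because Allocate only depends on the `solve` signature, a Bayesian solver drops in without touching the component.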
Impact Engine — Measure
The Impact Engine — Measure is the reference implementation of this pattern — six model adapters, self-registering via decorator, each independently testable against the shared adapter interface.
Agentic Support
A shared support library (utils-agentic-support) structures AI-assisted development across all packages. Installed as a git submodule, it provides Claude Code with custom skills, specialized subagents, and ecosystem-aware templates.
Conventions as a runtime instruction set. Each repo's
CLAUDE.md is not documentation — it is an active constraint
on every AI-assisted change. When Claude Code opens any package in the
ecosystem, it reads the dependency graph, naming rules, and interface
contracts before touching a single file. Enforcement does not rely on
humans remembering a style guide; it is embedded in the model's context.
A hierarchical structure propagates shared conventions from the workspace
root down to each component, with repo-specific extensions layered on top.
Skills
General-purpose slash commands (feature workflows, code review, tech debt scanning) and ecosystem-specific ones (package scaffolding, cross-repo config sync, convention auditing).
Subagents
Specialized AI agents for code review, architecture evaluation, cross-boundary consistency checks, documentation generation, and test writing — each with scoped tool permissions.
Templates
Feature-type scaffolding for new pipeline components, adapters,
and measurement models. Hierarchical CLAUDE.md
files propagate conventions from workspace to component level.
Utils — Agentic Support
The Utils — Agentic Support library is installed as a git submodule in every package, providing the skills, subagents, and templates that enforce ecosystem conventions through AI-assisted development.
Further Reading
The architecture borrows heavily from three bodies of work: classical software design patterns, messaging-oriented integration, and the operational discipline of running ML in production.
Software Design
E. Gamma, R. Helm, R. Johnson & J. Vlissides — Design Patterns: Elements of Reusable Object-Oriented Software (1994)
M. Fowler — Patterns of Enterprise Application Architecture (2002)
Integration Patterns
G. Hohpe & B. Woolf — Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions (2003)
M. Kleppmann — Designing Data-Intensive Applications (2017)
Production Systems
B. Beyer, C. Jones, J. Petoff & N.R. Murphy — Site Reliability Engineering: How Google Runs Production Systems (2016)
D. Sculley et al. — Hidden Technical Debt in Machine Learning Systems, NeurIPS (2015)