Projects · Python · Multi-Agent RL · Event-driven Simulation · Agent Systems

Heron Event-Driven MARL

A domain-agnostic MARL framework for evaluating trained policies under heterogeneous event-driven execution and realistic observability constraints

2026.05.05 · 1 min read · by Zhenlin Wang

Introduction

HERON is a private, domain-agnostic multi-agent reinforcement learning (MARL) framework for hierarchical control systems whose deployment is neither cleanly timed and synchronized nor fully observable.

Most MARL policies are trained under env.step(), where every agent observes, acts, and updates in lockstep. Real systems behave differently: sensors tick at different rates, observations arrive late, and actuators respond after a delay. HERON keeps the same agent definitions across step-based training and event-driven evaluation, so the deployment gap can be measured instead of guessed.
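To make the contrast concrete, here is a minimal sketch of the two execution modes driven by the same agent definitions. It is not HERON's API: the Agent class, its tick_interval, obs_delay, and act_delay fields, and both schedule functions are hypothetical names chosen for illustration. It only computes when each agent's action would take effect, assuming fixed per-agent rates and delays.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical agent definition, shared by both execution modes.
    name: str
    tick_interval: float     # time between this agent's observe/act ticks
    obs_delay: float = 0.0   # latency before an observation reaches the agent
    act_delay: float = 0.0   # latency before an action takes effect


def lockstep_schedule(agents, n_steps):
    """Step-based mode: every agent acts at every integer step, with no delay."""
    return [(float(t), a.name) for t in range(n_steps) for a in agents]


def event_driven_schedule(agents, horizon):
    """Event-driven mode: each agent ticks at its own rate, and its action
    takes effect only after the observation and actuation delays."""
    # Priority queue of (tick_time, agent_index); all agents first tick at t=0.
    queue = [(0.0, i) for i in range(len(agents))]
    heapq.heapify(queue)
    effects = []
    while queue:
        t, i = heapq.heappop(queue)
        if t > horizon:
            continue  # past the horizon: drop this agent's pending tick
        agent = agents[i]
        effects.append((t + agent.obs_delay + agent.act_delay, agent.name))
        heapq.heappush(queue, (t + agent.tick_interval, i))
    return sorted(effects)


if __name__ == "__main__":
    agents = [
        Agent("fast", tick_interval=1.0),
        Agent("slow", tick_interval=2.0, obs_delay=0.5, act_delay=0.25),
    ]
    print(lockstep_schedule(agents, 3))
    print(event_driven_schedule(agents, 5.0))
```

In the lockstep schedule both agents act at identical times; in the event-driven schedule the slow agent acts less often and its actions land later than the observations that triggered them. A policy trained under the first schedule and evaluated under the second is exposed to exactly the kind of timing mismatch described above.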

What it focuses on

Why I built it

The point of HERON is to make asynchronous deployment failures attributable. Instead of treating performance drops as one vague sim-to-real gap, the framework lets experiments isolate timing, observability, and coordination structure as separate causes.