Learning Robust Reward Machines from Noisy Labels

01 January 2024 Roko Parac, Lorenzo Nodari, Leo Ardon, Daniel Furelos-Blanco, Federico Cerutti, Alessandra Russo

This paper studies how to learn reward machines that remain robust when labels are noisy, a key problem for agents that must infer task structure from imperfect supervision.

In cybersecurity, this matters for adaptive systems operating in uncertain environments, where clean labels are rare and decision rules must remain dependable under noise.
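To make the central object concrete, here is a minimal sketch of a reward machine: a finite-state machine whose transitions fire on labelled events and emit rewards. This is a hypothetical illustration, not the paper's code; the states, labels, and the `noisy` labelling function are assumptions introduced for the example. Noisy labels, as studied in the paper, would corrupt the events the machine observes.

```python
import random

class RewardMachine:
    def __init__(self, transitions, initial_state):
        # transitions: {(state, label): (next_state, reward)}
        self.transitions = transitions
        self.state = initial_state

    def step(self, label):
        # Unknown (state, label) pairs leave the state unchanged with zero reward.
        next_state, reward = self.transitions.get(
            (self.state, label), (self.state, 0.0))
        self.state = next_state
        return reward

# Toy task: observe "key", then "door", to earn the reward.
rm = RewardMachine(
    transitions={
        ("u0", "key"): ("u1", 0.0),
        ("u1", "door"): ("u2", 1.0),
    },
    initial_state="u0",
)

def noisy(label, p=0.1, labels=("key", "door", "empty")):
    # With probability p, the labelling function reports a random wrong label,
    # modelling the noisy supervision the paper addresses.
    return random.choice(labels) if random.random() < p else label

# With p=0.0 the labels are clean and the task reward is earned.
total = sum(rm.step(noisy(l, p=0.0)) for l in ["empty", "key", "door"])
```

With any `p > 0`, the observed label sequence can diverge from the true one, so a learner that treats labels as ground truth can infer a wrong machine; this is the failure mode the paper's robust learning approach targets.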