Causally-Aware Safe Reinforcement Learning for Long-Horizon Partially Observable Environments in IoT Systems
Abstract
Real-world decision-making problems often exhibit partial observability, shifting dynamics, and potentially high-stakes outcomes. In Internet of Things (IoT) deployments, these issues become even more pronounced due to noisy sensor data, intermittent connectivity, and the sheer scale of distributed devices. Traditional reinforcement learning (RL) methods may fail to generalize safely in these environments because they rely on correlational patterns rather than causal mechanisms. In this work, we propose a novel Causally-Aware Safe Reinforcement Learning (CAS-RL) framework that integrates causal structure learning with robust policy optimization. Our approach discovers latent causal factors and enforces safety constraints at every decision step, yielding policies that are more interpretable, safer under distribution shift, and scalable to long-horizon tasks. Empirical results on both synthetic benchmarks and a healthcare-oriented partially observable domain show that CAS-RL significantly outperforms state-of-the-art baselines in robustness, safety compliance, and sample efficiency.
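The per-step safety enforcement described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`infer_causal_factors`, `is_safe`, `choose_action`), the single "overload" factor, and the threshold-based inference are all illustrative stand-ins for the causal structure learning and constrained policy optimization the abstract refers to.

```python
def infer_causal_factors(observation):
    """Stand-in for causal structure learning: map a noisy partial
    observation to a latent causal factor (here, a simple threshold).
    The 'load' key and 0.8 threshold are hypothetical."""
    return {"overload": observation["load"] > 0.8}

def is_safe(action, factors):
    """Safety constraint: forbid the aggressive action when the
    inferred causal factor indicates overload."""
    return not (factors["overload"] and action == "increase_rate")

def choose_action(observation,
                  actions=("increase_rate", "hold", "decrease_rate")):
    """Enforce the safety constraint at every decision step by
    masking unsafe actions before the policy selects one."""
    factors = infer_causal_factors(observation)
    safe_actions = [a for a in actions if is_safe(a, factors)]
    return safe_actions[0]  # placeholder policy: first safe action
```

The key property sketched here is that the safety check runs before every action selection, so the policy can never emit an action that violates the constraint under the currently inferred causal state.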