Multi-Drive Curiosity-Based RL Agent
Overview
This independent research project explores biologically inspired reinforcement learning agents driven by multiple competing drives: curiosity, survival, and safety. Trained in OpenAI Gym environments, the agent learns to navigate while balancing exploration against alignment with its intended objectives. The goal is to mitigate reward hacking and unsafe behavior.
Motivation
Standard RL agents often exploit reward functions in unintended ways. This project investigates whether combining multiple motivational drives, inspired by biological systems, leads to more robust and better-aligned behavior, with the aim of providing insight into safe exploration and intrinsic motivation.
Technical Details
Technologies Used
- Python, PyTorch
- OpenAI Gym
- NumPy, Matplotlib
- Reinforcement learning algorithms (custom multi-drive reward functions)
Methodology
- Designed a multi-objective reward structure balancing curiosity, survival, and safety
- Implemented an RL agent using standard policy gradient methods with intrinsic reward modulation
- Tested in a suite of OpenAI Gym environments to observe exploration patterns and emergent behaviors
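
The reward structure described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the class names (DriveWeights, MultiDriveReward), the weight values, the count-based curiosity bonus, and the boolean unsafe flag are all assumptions chosen for illustration. The sketch combines three drive signals into one scalar reward, then computes the discounted returns that a policy gradient method such as REINFORCE would use to weight log-probabilities.

```python
import math
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DriveWeights:
    """Hypothetical relative weights for each drive (illustrative values)."""
    curiosity: float = 0.1
    survival: float = 1.0
    safety: float = 1.0


@dataclass
class MultiDriveReward:
    """Combines curiosity, survival, and safety signals into one scalar reward."""
    weights: DriveWeights = field(default_factory=DriveWeights)
    visit_counts: dict = field(default_factory=lambda: defaultdict(int))

    def curiosity_bonus(self, state):
        # Count-based novelty bonus (an assumed stand-in for the curiosity
        # drive): it shrinks as the same state is visited more often.
        # `state` must be hashable, e.g. a tuple of discretized observations.
        self.visit_counts[state] += 1
        return 1.0 / math.sqrt(self.visit_counts[state])

    def __call__(self, state, env_reward, unsafe):
        w = self.weights
        curiosity = self.curiosity_bonus(state)
        # Survival drive: here simply the environment's own reward.
        survival = env_reward
        # Safety drive: a fixed penalty whenever the step is flagged unsafe.
        safety_penalty = 1.0 if unsafe else 0.0
        return (w.curiosity * curiosity
                + w.survival * survival
                - w.safety * safety_penalty)


def discounted_returns(rewards, gamma=0.99):
    """Discounted return G_t for every step of one episode."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    reward_fn = MultiDriveReward()
    # A toy three-step episode: (state, environment reward, unsafe flag).
    episode = [((0, 0), 0.0, False), ((0, 0), 0.0, False), ((1, 0), 1.0, True)]
    shaped = [reward_fn(s, r, u) for s, r, u in episode]
    print(shaped)
    print(discounted_returns(shaped))
```

In this sketch a revisited state earns a smaller curiosity bonus than a new one, and the safety penalty can cancel out an environment reward obtained through an unsafe step. Keeping the weights in a separate dataclass makes it easy to vary how strongly each drive modulates the learning signal.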
Links
Project Status: In Progress
Timeline: August 2025 – Ongoing
