Treasure Hunt with SARSA: A Reinforcement Learning Adventure
Welcome to the Treasure Hunt with SARSA! This project is a fun and interactive way to explore reinforcement learning concepts using the SARSA algorithm. You’ll guide an agent through a maze-like environment, avoiding obstacles, collecting treasures, and finding the exit while managing energy consumption.
In this project, you’ll train an agent to navigate a 20x20 grid world. The agent starts at a fixed initial position and must find the exit while avoiding obstacles and collecting treasures. The environment includes:
Obstacles: Randomly placed on the map.
Treasures: Randomly placed near obstacles.
Exit: Randomly placed on the map’s boundary.
The agent receives rewards and penalties based on its actions:
Treasure: +10 reward.
Obstacle: -5 penalty.
Exit: +50 reward.
Boundary: -1 penalty.
Repeated Path: -2 penalty.
Energy Depletion: -10 penalty.
The agent’s energy decreases with each move, and if it runs out of energy, the episode ends.
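To make the reward table and energy rule concrete, here is a minimal sketch of how a single move might be scored. The function name and flag arguments are illustrative assumptions, not the names used in this repository's code:

```python
def step_reward(hit_treasure, hit_obstacle, reached_exit,
                hit_boundary, revisited, energy_left):
    """Accumulate the reward for one move, per the table above.
    All arguments are hypothetical flags describing what the move did."""
    reward = 0
    if hit_treasure:
        reward += 10   # Treasure: +10
    if hit_obstacle:
        reward -= 5    # Obstacle: -5
    if reached_exit:
        reward += 50   # Exit: +50
    if hit_boundary:
        reward -= 1    # Boundary: -1
    if revisited:
        reward -= 2    # Repeated path: -2
    if energy_left <= 0:
        reward -= 10   # Energy depletion: -10, and the episode ends
    return reward
```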
Features
SARSA Algorithm: Implements the on-policy SARSA algorithm for reinforcement learning (see the update-rule sketch after this list).
Dynamic Environment: Randomly generated obstacles, treasures, and exit.
Energy Management: The agent must manage its energy to avoid depletion.
Reward System: A comprehensive reward system to guide the agent’s learning.
Visualization: Real-time visualization of the agent’s progress during training and testing.
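For reference, the core SARSA update looks like the sketch below. The variable names (q_table, alpha, gamma, epsilon) are assumptions for illustration and may differ from those in main.py:

```python
import numpy as np

def sarsa_update(q_table, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.99):
    """One on-policy SARSA step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a))."""
    td_target = reward + gamma * q_table[next_state][next_action]
    td_error = td_target - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return q_table

def epsilon_greedy(q_table, state, n_actions, epsilon=0.1, rng=np.random):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randint(n_actions)
    return int(np.argmax(q_table[state]))
```

Because SARSA bootstraps from the action the policy actually takes next (a'), rather than the best available action, it learns the value of the exploring policy itself, which tends to produce more cautious paths around penalised cells.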
Installation
To get started, clone the repository and install the required dependencies:
git clone https://github.com/yourusername/treasure-hunt-sarsa.git
cd treasure-hunt-sarsa
pip install -r requirements.txt
Usage
To train the agent, run:
python main.py
You’ll be prompted to choose a model type. Currently, only sarsa is supported.
Training
The training process involves running multiple episodes where the agent learns to navigate the environment. The training progress is visualized every 10,000 episodes, and the average reward is printed every 1,000 episodes.
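Conceptually, the training loop pairs the environment with the SARSA update: run each episode step by step, update Q-values on-policy, log the average reward every 1,000 episodes, and refresh the visualization every 10,000. The outline below is a sketch under assumed names (env, env.step, env.render), reusing the epsilon_greedy and sarsa_update helpers from the SARSA sketch above:

```python
def train(env, q_table, n_episodes=100_000, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Illustrative SARSA training loop; names do not mirror main.py exactly."""
    rewards = []
    for episode in range(1, n_episodes + 1):
        state = env.reset()
        action = epsilon_greedy(q_table, state, env.n_actions, epsilon)
        total_reward, done = 0.0, False
        while not done:
            next_state, reward, done = env.step(action)
            next_action = epsilon_greedy(q_table, next_state, env.n_actions, epsilon)
            sarsa_update(q_table, state, action, reward, next_state, next_action,
                         alpha, gamma)
            state, action = next_state, next_action
            total_reward += reward
        rewards.append(total_reward)

        if episode % 1_000 == 0:    # average reward every 1,000 episodes
            print(f"Episode {episode}: avg reward "
                  f"{sum(rewards[-1_000:]) / 1_000:.2f}")
        if episode % 10_000 == 0:   # visualization every 10,000 episodes
            env.render()
    return q_table
```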
Testing
After training, you can test the agent’s performance by running:
python main.py
The agent will navigate the environment using the learned Q-values, and the best path will be visualized.
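At test time, exploration is switched off: the agent simply follows the greedy action argmax over the learned Q-values at each step and records the path it takes. A minimal sketch, again with hypothetical environment names:

```python
import numpy as np

def greedy_rollout(env, q_table, max_steps=400):
    """Follow the learned Q-values greedily and return the visited path."""
    state = env.reset()
    path, done, steps = [state], False, 0
    while not done and steps < max_steps:
        action = int(np.argmax(q_table[state]))   # purely greedy, no exploration
        state, _, done = env.step(action)
        path.append(state)
        steps += 1
    return path
```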
Visualization
The visualization shows the agent’s position, obstacles, treasures, and exit on the map. During training, the map is updated every 10,000 episodes. During testing, the best path taken by the agent is displayed.
TODO List
Here are some exciting features and improvements planned for the future:
Multi-Agent Support: Allow multiple agents to explore the environment simultaneously.
Different Environments: Introduce different map sizes and layouts.
Advanced Algorithms: Implement other reinforcement learning algorithms like Q-Learning, DQN, and PPO.
Energy Regeneration: Introduce energy regeneration points on the map.
Dynamic Obstacles: Make obstacles move or change positions over time.
User Interaction: Allow users to manually guide the agent during testing.
Performance Optimization: Optimize the training process for faster convergence.
Documentation: Improve documentation and add more examples.
Contributing
Contributions are welcome! Feel free to open issues or submit pull requests. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Happy treasure hunting! 🗺️💎