🚀 Local GRPO Training
This is a refactored local version of the Unsloth Colab notebook, based on the excellent work by Daniel Han and the Unsloth team.
Now you can run GRPO training locally and experience the "aha moment" on your own machine! ✨
📚 Sources
🛠️ Prerequisites
🏃‍♂️ Quick Start
make up
⚙️ Configuration
Modify config.yaml to customize settings and parameters. Then simply run:
make train
🧹 Clean up
make down
⚠️ Limitations
🔍 Advanced Instructions
If you prefer not to use make, you can run the Docker commands directly:

# 🏗️ Build the image
docker build -t grpo_unsloth .

# 📦 Create container
docker create -it \
  --gpus=all \
  --name grpo_unsloth_container \
  -v $(pwd)/models:/models \
  -v $(pwd):/workspace \
  -e HF_HOME=/models/cache \
  grpo_unsloth

# 🚀 Start container
docker start grpo_unsloth_container

# 🧪 Run a quick test (dry run)
docker exec -it grpo_unsloth_container bash -c "uv run python main.py 'saving=null' 'training.max_steps=10'"

# 🏃 Run full training
docker exec -it grpo_unsloth_container bash -c "uv run python main.py 'saving=null'"

# ⏹️ Stop container
docker stop grpo_unsloth_container

# 🗑️ Remove container
docker rm grpo_unsloth_container
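The `'saving=null'` and `'training.max_steps=10'` arguments in the commands above look like key=value overrides of entries in config.yaml. That reading is an assumption based on their shape, since the README does not say which tool parses them. Below is a minimal sketch of a helper that assembles such a command. The function names `build_train_cmd` and `run_training` and the `DRY_RUN` switch are hypothetical and not part of the repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): run main.py inside the
# container with optional key=value overrides.
CONTAINER="grpo_unsloth_container"   # container name used in the commands above

build_train_cmd() {
  # Start from the base command and append each override in single quotes.
  # Assumes the overrides themselves contain no single quotes.
  cmd="uv run python main.py"
  for override in "$@"; do
    cmd="$cmd '$override'"
  done
  printf '%s\n' "$cmd"
}

run_training() {
  cmd="$(build_train_cmd "$@")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Only print the command that would be executed.
    printf 'docker exec -it %s bash -c "%s"\n' "$CONTAINER" "$cmd"
  else
    docker exec -it "$CONTAINER" bash -c "$cmd"
  fi
}
```

For example, `DRY_RUN=1 run_training saving=null training.max_steps=10` prints the quick-test command shown above without executing it, which is a cheap way to check the overrides before starting a long run.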
🤝 Contributing
Feel free to open issues and pull requests!
📄 License
This project is open-source and available under the MIT License.