
🚀 Local GRPO Training

This is a refactored local version of the Unsloth Colab notebook, based on the excellent work by Daniel Han and the Unsloth team.

Now you can run GRPO training locally and experience the "aha moment" on your own machine! ✨

📚 Sources

🛠️ Prerequisites

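  • 🖥️ NVIDIA GPU
  • 🐳 Docker with GPU support (the commands below pass --gpus=all, which requires the NVIDIA Container Toolkit)
  • 🔧 make (optional; see Advanced Instructions if not using make)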
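
A quick preflight check can confirm the required tools are on your PATH before you start. This is a minimal sketch, not part of the project: the helper names `has_cmd` and `preflight` are hypothetical, and the tool list assumes the Docker-based workflow shown in Advanced Instructions.

```shell
# Hypothetical helper: succeeds if the named command is on PATH.
has_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report each tool this README relies on; 'make' is optional.
preflight() {
    for tool in docker nvidia-smi make; do
        if has_cmd "$tool"; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
        fi
    done
}

preflight
```

A "missing: make" line is not fatal, since the Docker commands can be run directly.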

🏃‍♂️ Quick Start

make up

⚙️ Configuration

Edit config.yaml to customize training settings and parameters, then run:

make train

🧹 Clean up

make down

⚠️ Limitations

  • 🎮 Currently supports single-GPU training only
  • 💪 For multi-GPU or H100 access, please visit runpod.io

🔍 Advanced Instructions

If you prefer not to use make, you can run the Docker commands directly:

# 🏗️ Build the image
docker build -t grpo_unsloth .

# 📦 Create container
docker create -it \
    --gpus=all \
    --name grpo_unsloth_container \
    -v "$(pwd)/models":/models \
    -v "$(pwd)":/workspace \
    -e HF_HOME=/models/cache \
    grpo_unsloth

# 🚀 Start container
docker start grpo_unsloth_container

# 🧪 Run a quick test (dry run)
docker exec -it grpo_unsloth_container bash -c "uv run python main.py 'saving=null' 'training.max_steps=10'"

# 🏃 Run full training
docker exec -it grpo_unsloth_container bash -c "uv run python main.py 'saving=null'"

# ⏹️ Stop container
docker stop grpo_unsloth_container

# 🗑️ Remove container
docker rm grpo_unsloth_container
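
The two training commands above differ only in their `key=value` overrides, where a dotted key such as `training.max_steps` presumably addresses a nested entry in config.yaml. The sketch below is a hypothetical helper (`train_cmd` is not part of the project) that assembles the `docker exec` line for any set of overrides and prints it, so you can inspect the command before running it.

```shell
# Hypothetical helper: print the docker exec command for the given overrides.
# It only prints the command; it does not start training.
train_cmd() {
    overrides=""
    for kv in "$@"; do
        # Wrap each override in single quotes, as in the commands above.
        overrides="$overrides '$kv'"
    done
    printf '%s\n' "docker exec -it grpo_unsloth_container bash -c \"uv run python main.py$overrides\""
}

# Same overrides as the dry run above.
train_cmd saving=null training.max_steps=10
```

Printing rather than executing keeps the helper safe to try on a machine without a GPU. To run the result, copy the printed line into your shell.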

🤝 Contributing

Feel free to open issues and pull requests!

📄 License

This project is open-source and available under the MIT License.
