Meta-Critic Network in RL

PyTorch implementation of Online Meta-Critic Learning for Off-Policy Actor-Critic Methods.
Getting Started
Prerequisites
The environment can be run locally using conda; you need to have Miniconda3 installed. Also, most of our environments currently require a MuJoCo license.
cd ${Miniconda3_PATH}
bash Miniconda3-latest-Linux-x86_64.sh
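You can verify the installation afterwards:
conda --version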
Conda Installation
1 Download and install MuJoCo 1.31 (used for the rllab environments) and MuJoCo 1.50 from the MuJoCo website. Some experiments also require the rllab framework, which you need to install separately.
2 We assume that the MuJoCo files are extracted to the default locations (~/.mujoco/mjpro150 and ~/.mujoco/mjpro131). The version of gym is 0.14.0 and mujoco_py is 1.50.1.68.
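For example, assuming the downloaded archives use the default names from the MuJoCo website (mjpro131_linux.zip and mjpro150_linux.zip), extraction and version pinning might look like:
mkdir -p ~/.mujoco
unzip mjpro131_linux.zip -d ~/.mujoco
unzip mjpro150_linux.zip -d ~/.mujoco
pip install gym==0.14.0 mujoco_py==1.50.1.68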
3 Copy your MuJoCo license key (mjkey.txt) to ~/.mujoco/mjkey.txt:
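For example, if the key file is in your current directory:
cp mjkey.txt ~/.mujoco/mjkey.txt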
4 Edit your PYTHONPATH to include the rllab directory, and have the MuJoCo 1.31 zip file and the license file ready:
export PYTHONPATH=path_to_rllab:$PYTHONPATH
./scripts/setup_linux.sh
./scripts/setup_mujoco.sh
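Depending on your system, mujoco_py may also need the MuJoCo binaries on the library path; if imports fail later, try:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mjpro150/bin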
5 Create and activate the conda environment, then install meta-critic to enable the command line interface:
cd ${Meta_Critic_PATH}
conda env create -f environment.yaml
conda activate meta_critic
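As a quick sanity check that the environment works (this assumes the MuJoCo key and binaries are already in place):
python -c "import gym, mujoco_py; print(gym.__version__)"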
Examples
Training and simulating a DDPG_MC policy agent
1 Enter the TD3_DDPG_MC directory
cd ${TD3_DDPG_MC_PATH}
2 Train with the hw(pi(s)) variant of the auxiliary loss network (see the sketch after these examples):
python main.py --env_name HalfCheetahEnv --method DDPG_MC
3 Train with the hw(pi(s),s,a) variant of the auxiliary loss network:
python main.py --env_name HalfCheetahEnv --method DDPG_MC_sa
Training and simulating a TD3_MC policy agent
1 Enter the TD3_DDPG_MC directory
cd ${TD3_DDPG_MC_PATH}
2 Train with the hw(pi(s)) variant of the auxiliary loss network:
python main.py --env_name HalfCheetahEnv --method TD3_MC
3 Train with the hw(pi(s),s,a) variant of the auxiliary loss network:
python main.py --env_name HalfCheetahEnv --method TD3_MC_sa
Training and simulating a SAC_MC policy agent
1 Enter the SAC_MC directory
cd ${SAC_MC_PATH}
2 Train with the hw(pi(s)) variant of the auxiliary loss network:
python main.py --env_name HalfCheetahEnv --method SAC_MC
3 Train with the hw(pi(s),s,a) variant of the auxiliary loss network:
python main.py --env_name HalfCheetahEnv --method SAC_MC_sa
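The two variants in each example differ only in what the meta-critic's auxiliary loss network hw sees: hw(pi(s)) scores the policy output alone, while hw(pi(s),s,a) also conditions on the state and the sampled action. The following is a minimal PyTorch sketch of that interface; the class name, layer sizes, and dimensions are illustrative assumptions, not the exact networks in this repository.

import torch
import torch.nn as nn

# Illustrative sketch of the auxiliary loss network hw (not the repo's
# exact architecture). The *_MC methods feed only the policy output
# pi(s); the *_MC_sa methods concatenate the state s and action a too.
class AuxLossNet(nn.Module):
    def __init__(self, state_dim, action_dim, use_sa=False, hidden=256):
        super().__init__()
        # Input is pi(s) alone, or [pi(s), s, a] for the _sa variant.
        in_dim = action_dim + (state_dim + action_dim if use_sa else 0)
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, pi_s, s=None, a=None):
        x = pi_s if s is None else torch.cat([pi_s, s, a], dim=-1)
        # The scalar output serves as an auxiliary loss term that is
        # added to the actor's usual objective during training.
        return self.net(x).mean()

In the meta-critic setup, this scalar is added to the base actor loss (e.g. -Q(s, pi(s)) for DDPG/TD3), and the network's parameters w are themselves updated online by a meta-learning step so that the auxiliary loss actually improves the actor.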