Robust Task Representations for Offline Meta-Reinforcement Learning via Contrastive Learning

Requirements

The code requires pytorch==1.6.0 and mujoco-py==2.0.2.13. All requirements are specified in requirements.txt.

Code Usage

We demonstrate with the Half-Cheetah-Vel environment. For other environments, change the --env-type argument according to the table:

Environment	Argument
Point-Robot	point_robot_v1
Half-Cheetah-Vel	cheetah_vel
Ant-Dir	ant_dir
Hopper-Param	hopper_param
Walker-Param	walker_param
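
When scripting runs over several environments, the table above can be captured as a small lookup. This is a minimal sketch: the names `ENV_TYPES` and `env_type_for` are hypothetical helpers and are not part of the repository.

```python
# Mapping from environment name to the value passed to --env-type,
# copied from the table above.
ENV_TYPES = {
    "Point-Robot": "point_robot_v1",
    "Half-Cheetah-Vel": "cheetah_vel",
    "Ant-Dir": "ant_dir",
    "Hopper-Param": "hopper_param",
    "Walker-Param": "walker_param",
}


def env_type_for(name):
    """Return the --env-type argument for a given environment name."""
    try:
        return ENV_TYPES[name]
    except KeyError:
        known = ", ".join(sorted(ENV_TYPES))
        raise ValueError(
            f"Unknown environment {name!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    print(env_type_for("Ant-Dir"))
```

Failing early on an unknown name avoids launching a long run with a mistyped --env-type value.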

Data Collection

Copy the following code into a shell script, and run the script.

for seed in {1..40}
do
	python train_data_collection.py --env-type cheetah_vel --save-models 1 --log-tensorboard 1 --seed $seed
done
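
The same loop can be driven from Python, which makes it easy to preview the commands before launching all 40 runs. This is a minimal sketch, assuming it is executed from the repository root. `collection_commands` and `run_all` are hypothetical helpers, and only the flags shown in the shell script above are used.

```python
import subprocess
import sys


def collection_commands(env_type="cheetah_vel", seeds=range(1, 41)):
    """Build one train_data_collection.py command per seed (1..40 by default)."""
    return [
        [
            sys.executable, "train_data_collection.py",
            "--env-type", env_type,
            "--save-models", "1",
            "--log-tensorboard", "1",
            "--seed", str(seed),
        ]
        for seed in seeds
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute sequentially only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: nothing is executed until dry_run=False is passed.
    run_all(collection_commands(), dry_run=True)
```

Keeping the dry run as the default is a deliberate choice here, since each seed trains a separate behavior policy and the full sweep can take a long time.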

Train the Task Encoder

To use generative modeling, first run python train_generative_model.py --env-type cheetah_vel to pre-train the CVAE. Then run python train_contrastive.py --env-type cheetah_vel --relabel-type generative --generative-model-path logs/*** --output-file-prefix contrastive_generative to train the encoder, setting --generative-model-path to the path of the last saved CVAE model. To use reward randomization instead, set --relabel-type to reward_randomize.
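
The two relabeling options differ only in the arguments passed to train_contrastive.py. The sketch below assembles the commands described above without running them. `encoder_commands` is a hypothetical helper. It assumes that reward randomization needs neither the CVAE pre-training step nor --generative-model-path, which the text above implies but does not state outright. The `logs/***` placeholder is kept as written because the real path depends on your run.

```python
def encoder_commands(env_type="cheetah_vel", relabel_type="generative",
                     generative_model_path="logs/***",
                     output_file_prefix="contrastive_generative"):
    """Return the encoder-training commands as a list of argument lists."""
    if relabel_type not in ("generative", "reward_randomize"):
        raise ValueError(f"unsupported relabel type: {relabel_type!r}")

    commands = []
    train = ["python", "train_contrastive.py",
             "--env-type", env_type,
             "--relabel-type", relabel_type]

    if relabel_type == "generative":
        # Step 1: pre-train the CVAE.
        commands.append(
            ["python", "train_generative_model.py", "--env-type", env_type]
        )
        # Step 2 needs the path of the last saved CVAE model.
        train += ["--generative-model-path", generative_model_path]

    train += ["--output-file-prefix", output_file_prefix]
    commands.append(train)
    return commands


if __name__ == "__main__":
    for cmd in encoder_commands():
        print(" ".join(cmd))
```

When calling it with `relabel_type="reward_randomize"`, pass an `output_file_prefix` of your own so the logs are not labeled as generative.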

Offline Meta-RL

Set --encoder-model-path to the path of the last saved encoder, then run: python train_offpolicy_with_trained_encoder.py --env-type cheetah_vel --encoder-model-path logs/*** --output-file-prefix offpolicy_contrastive_generative. Check the training results with TensorBoard.
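
Both --generative-model-path and --encoder-model-path ask for the last saved model. One way to locate it is to pick the most recently modified file under the run's log directory. This is a sketch under two assumptions: that checkpoints are regular files somewhere below the directory you pass in, and that modification time reflects save order. `latest_file` and its default `*` pattern are hypothetical. Narrow the pattern to whatever naming your checkpoints actually use.

```python
from pathlib import Path


def latest_file(log_dir, pattern="*"):
    """Return the most recently modified regular file under log_dir.

    The search is recursive; `pattern` is a glob applied to file names.
    """
    candidates = [p for p in Path(log_dir).rglob(pattern) if p.is_file()]
    if not candidates:
        raise FileNotFoundError(
            f"no files matching {pattern!r} under {log_dir}"
        )
    # Newest modification time wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The returned path can then be passed as the value of the relevant model-path argument. Check it by eye first, since a log directory may also hold TensorBoard event files that are newer than the last checkpoint.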

OOD Test

Replace the contents of ood_test_config/cheetah_vel.txt with the paths of the sampled behavior policies. Modify lines 343-346 of test_ood_context.py to set the correct test model path. Then run python test_ood_context.py --env-type cheetah_vel.

Citation

If you use this code, please cite our paper.

@inproceedings{yuan2022robust,
    title={Robust Task Representations for Offline Meta-Reinforcement Learning via Contrastive Learning},
    author={Yuan, Haoqi and Lu, Zongqing},
    booktitle={International Conference on Machine Learning},
    pages={25747--25759},
    year={2022},
    organization={PMLR}
}
