
MR-COGraphs

💻 Code

Coming soon!

🎥 Video

Watch the demo video
MR_COGraphs_video_submission.mp4

Due to size restrictions, the full video can be downloaded here: 🎥 Download the video

🗃️ Dataset

Isaac Small & Large Environment

(Figure: the Isaac Sim small and large environments)

We provide both small and large environments as USD files, which can be downloaded and opened in the Isaac Sim platform.

Once loaded, you can generate rosbag files using the following commands:

# single robot example
rosbag record /clock /robot1/camera_info_left \
/robot1/depth_left /robot1/odom /robot1/rgb_left \
/robot1/imu /robot1/scan /tf -o rosbag_name.bag
# two robots example
rosbag record /clock /robot1/camera_info_left \
/robot1/depth_left /robot1/odom /robot1/rgb_left \
/robot1/imu /robot1/scan /robot2/camera_info_left \
/robot2/depth_left /robot2/odom /robot2/rgb_left \
/robot2/imu /robot2/scan /tf -o rosbag_name.bag
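
The recorded bag can then be inspected with rosbag info rosbag_name.bag and replayed in simulated time with rosbag play --clock rosbag_name.bag.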

Replica Apartment2 Environment

We developed a ROS wrapper that extracts RGB-D sequences and ground-truth poses from the Replica Dataset and converts them into rosbag files.
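
As a rough illustration of what such a conversion involves (not the wrapper's actual code), the sketch below writes RGB-D frames and ground-truth poses into a bag with the ROS 1 Python API. The topic names, the frames iterable, and the image encodings are illustrative assumptions, not the wrapper's interface.

# Minimal sketch, not the actual wrapper: pack RGB-D frames and poses into a rosbag.
import rosbag
import rospy
from cv_bridge import CvBridge
from geometry_msgs.msg import PoseStamped

def write_replica_bag(frames, out_path="replica_apartment2.bag"):
    # frames: iterable of (timestamp_sec, rgb_ndarray, depth_ndarray, (xyz, quat))
    bridge = CvBridge()
    with rosbag.Bag(out_path, "w") as bag:
        for t_sec, rgb, depth, (xyz, quat) in frames:
            stamp = rospy.Time.from_sec(t_sec)

            rgb_msg = bridge.cv2_to_imgmsg(rgb, encoding="rgb8")      # color frame
            rgb_msg.header.stamp = stamp
            bag.write("/robot1/rgb_left", rgb_msg, stamp)

            depth_msg = bridge.cv2_to_imgmsg(depth, encoding="32FC1") # depth frame (encoding assumed)
            depth_msg.header.stamp = stamp
            bag.write("/robot1/depth_left", depth_msg, stamp)

            pose = PoseStamped()                                      # ground-truth pose
            pose.header.stamp = stamp
            pose.header.frame_id = "world"
            (pose.pose.position.x, pose.pose.position.y, pose.pose.position.z) = xyz
            (pose.pose.orientation.x, pose.pose.orientation.y,
             pose.pose.orientation.z, pose.pose.orientation.w) = quat
            bag.write("/robot1/pose", pose, stamp)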

For the Replica Apartment2 environment, you can directly download the single-robot rosbag and the two-robot rosbag.

Real-world Environment

We integrate iPhones (iPhone 12 Pro or later) as sensors in our framework in two ways:

  • Data Collection & Conversion: Captured data is processed and converted into rosbag files, as demonstrated in /r3d_to_ROS/r3d_to_rosbag.py.
  • Real-time Streaming: RGB-D and pose information are continuously converted into ROS messages and published to the corresponding ROS topics. This enables real-time COGraph construction, implemented in /r3d_to_ROS/record3d_ros.zip (a rough sketch of this path is shown below).
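
As a rough sketch of the real-time streaming path (not the code in /r3d_to_ROS/record3d_ros.zip), the snippet below publishes RGB-D frames and poses as ROS messages; next_frame() is a stand-in for the Record3D capture loop, and the topic names, encodings, and rate are assumptions.

# Minimal sketch, not record3d_ros: publish iPhone RGB-D frames and poses as ROS topics.
import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import Image
from geometry_msgs.msg import PoseStamped

def stream_iphone(next_frame):
    # next_frame() -> (rgb_ndarray, depth_ndarray, PoseStamped); stand-in for the Record3D capture loop
    rospy.init_node("record3d_streamer")
    bridge = CvBridge()
    rgb_pub = rospy.Publisher("/robot1/rgb_left", Image, queue_size=10)
    depth_pub = rospy.Publisher("/robot1/depth_left", Image, queue_size=10)
    pose_pub = rospy.Publisher("/robot1/pose", PoseStamped, queue_size=10)

    rate = rospy.Rate(10)  # assumed publishing rate
    while not rospy.is_shutdown():
        rgb, depth, pose_msg = next_frame()
        stamp = rospy.Time.now()
        rgb_msg = bridge.cv2_to_imgmsg(rgb, encoding="rgb8")
        depth_msg = bridge.cv2_to_imgmsg(depth, encoding="32FC1")
        rgb_msg.header.stamp = depth_msg.header.stamp = pose_msg.header.stamp = stamp
        rgb_pub.publish(rgb_msg)
        depth_pub.publish(depth_msg)
        pose_pub.publish(pose_msg)
        rate.sleep()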

(Figure: the real-world environment)

Below are the rosbag files collected from our real-world environment, a 9m × 9m space with three rooms:

📝 Appendix

GPU usage information

(Figure: GPU utilization during COGraph generation)

The GPU utilization during the COGraph generation process is shown above. For detailed metrics, please refer to the log file gpu_usage_log.txt.

More Demonstrations

(Figure: the expanded real-world environment with a corridor and three rooms)

We have also tested our system in a larger real-world setting featuring a corridor and three rooms. The illustration below shows the COGraph nodes created by robot1, along with the merged nodes contributed by robot1 and robot2.

(Figure: COGraph nodes created by robot1, and merged nodes from robot1 and robot2)

How to train the encoder and decoder

1. Download the dataset from https://www.kaggle.com/c/imagenet-object-localization-challenge/data.

2. Feed the file LOC_synset_mapping.txt to a large language model (GPT/Kimi) with the prompt: "Based on the information in the dataset, each line begins with a serial number, followed by the word it represents. Select the serial numbers and words that will definitely appear in a room, and output the results in the format of 'serial number + word (reason)'." The resulting file is imagenet_classes_in_house_last.txt.

3. Using the list of household objects in imagenet_classes_in_house_last.txt together with the bounding-box annotation files in ILSVRC/Annotations, crop the corresponding object regions from the images.

4. Feed the cropped images into CLIP to extract features, which are then used to train the encoder and decoder (a minimal sketch of this step follows the list).
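
As an illustrative sketch of step 4 (the crop directory, the output file, and the ViT-B/32 variant of OpenAI's CLIP package are assumptions), features can be extracted as follows; the saved feature matrix is the training input for the encoder and decoder.

# Minimal sketch: encode cropped object images with CLIP (ViT-B/32 assumed).
import glob
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

features = []
for path in sorted(glob.glob("cropped_images/*.jpg")):        # placeholder crop directory
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
    with torch.no_grad():
        feat = model.encode_image(image)                       # 1 x 512 image feature
    features.append(feat / feat.norm(dim=-1, keepdim=True))    # L2-normalize

torch.save(torch.cat(features), "clip_features.pt")            # input for encoder/decoder training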

How to generate query content

1. Ranking Node Labels by Frequency in COGraph: Sort the node labels in the COGraph dataset by their occurrence frequency and select the top 10 labels, referred to as "Appeared" in the Query Type (a minimal sketch of this step follows the list).

2. Synonym Generation Using Large Language Models: Feed the 10 labels identified in step 1 to a large language model (GPT/Kimi) with the prompt "Find synonyms for these words" to generate a list of synonyms for each label, denoted as "Similar" in the Query Type.

3. Descriptive Phrase Generation Using Large Language Models: Feed the 10 labels from step 1 to a large language model (GPT/Kimi) with the prompt "Provide brief descriptions in English for these terms" to obtain a brief description for each label, labeled as "Descriptive" in the Query Type.
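
A minimal sketch of step 1, assuming the node labels are already available as a Python list; the top-10 selection and the "Appeared" naming follow the description above.

# Minimal sketch: rank COGraph node labels by frequency and keep the top 10 ("Appeared" queries).
from collections import Counter

def appeared_queries(node_labels, k=10):
    # node_labels: list of label strings taken from the COGraph nodes
    return [label for label, _ in Counter(node_labels).most_common(k)]

# e.g. appeared_queries(["chair", "table", "chair", "lamp"]) -> ["chair", "table", "lamp"]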

🎓 Citation

If you find our paper and datasets useful, please cite us:

@article{gu2024mr,
  title={MR-COGraphs: Communication-efficient Multi-Robot Open-vocabulary Mapping System via 3D Scene Graphs},
  author={Gu, Qiuyi and Ye, Zhaocheng and Yu, Jincheng and Tang, Jiahao and Yi, Tinghao and Dong, Yuhan and Wang, Jian and Cui, Jinqiang and Chen, Xinlei and Wang, Yu},
  journal={arXiv preprint arXiv:2412.18381},
  year={2024}
}