Commit b5a3566

add swin-l and swin-b
1 parent 892e8ec commit b5a3566

7 files changed: +791 -23 lines changed

configs/mm_grounding_dino/README.md

+30-10
Original file line number | Diff line number | Diff line change
@@ -26,16 +26,23 @@ Please refer to [usage.md](usage.md) or [中文版用法说明](usage_zh-CN.md)
2626

2727
## Zero-Shot COCO Results and Models
2828

29-
| Model | Backbone | Style | COCO mAP | Pre-Train Data | Config | Download |
30-
| :--------: | :------: | :-------: | :--------: | :-------------------: | :------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
31-
| GDINO-T | Swin-T | Zero-shot | 46.7 | O365 | | |
32-
| GDINO-T | Swin-T | Zero-shot | 48.1 | O365,GoldG | | |
33-
| GDINO-T | Swin-T | Zero-shot | 48.4 | O365,GoldG,Cap4M | [config](../grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth) |
34-
| MM-GDINO-T | Swin-T | Zero-shot | 48.5(+1.8) | O365 | [config](grounding_dino_swin-t_pretrain_obj365.py) | |
35-
| MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.3) | O365,GoldG | [config](grounding_dino_swin-t_pretrain_obj365_goldg.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602-4ea751ce.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602.log.json) |
36-
| MM-GDINO-T | Swin-T | Zero-shot | 50.5(+2.1) | O365,GoldG,GRIT | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818-169cc352.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818.log.json) |
37-
| MM-GDINO-T | Swin-T | Zero-shot | 50.6(+2.2) | O365,GoldG,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741-e316e297.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741.log.json) |
38-
| MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.0) | O365,GoldG,GRIT,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047.log.json) |
29+
| Model | Backbone | Style | COCO mAP | Pre-Train Data | Config | Download |
30+
| :----------: | :------: | :-------: | :--------: | :----------------------: | :------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
31+
| GDINO-T | Swin-T | Zero-shot | 46.7 | O365 | | |
32+
| GDINO-T | Swin-T | Zero-shot | 48.1 | O365,GoldG | | |
33+
| GDINO-T | Swin-T | Zero-shot | 48.4 | O365,GoldG,Cap4M | [config](../grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth) |
34+
| MM-GDINO-T | Swin-T | Zero-shot | 48.5(+1.8) | O365 | [config](grounding_dino_swin-t_pretrain_obj365.py) | |
35+
| MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.3) | O365,GoldG | [config](grounding_dino_swin-t_pretrain_obj365_goldg.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602-4ea751ce.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602.log.json) |
36+
| MM-GDINO-T | Swin-T | Zero-shot | 50.5(+2.1) | O365,GoldG,GRIT | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818-169cc352.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818.log.json) |
37+
| MM-GDINO-T | Swin-T | Zero-shot | 50.6(+2.2) | O365,GoldG,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741-e316e297.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741.log.json) |
38+
| MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.0) | O365,GoldG,GRIT,V3Det | [config](grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth) \| [log](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047.log.json) |
39+
| MM-GDINO-B | Swin-B | Zero-shot | 52.5 | O365,GoldG,V3Det | [config](grounding_dino_swin-b_pretrain_obj365_goldg_v3det.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-b_pretrain_obj365_goldg_v3det/grounding_dino_swin-b_pretrain_obj365_goldg_v3de-f83eef00.pth) \| [log](<>) |
40+
| MM-GDINO-B\* | Swin-B | - | | O365,ALL | [config](grounding_dino_swin-b_pretrain_all.py) | [model](<>) \| [log](<>) |
41+
| MM-GDINO-L | Swin-L | Zero-shot | 53.0 | O365V2,OpenImageV6,GoldG | [config](grounding_dino_swin-l_pretrain_obj365_goldg.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-l_pretrain_obj365_goldg/grounding_dino_swin-l_pretrain_obj365_goldg-34dcdc53.pth) \| [log](<>) |
42+
| MM-GDINO-L\* | Swin-L | - | 60.3 | O365V2,OpenImageV6,ALL | [config](grounding_dino_swin-l_pretrain_all.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-l_pretrain_all/grounding_dino_swin-l_pretrain_all-56d69e78.pth) \| [log](<>) |
43+
44+
- \* indicates that the model has not been fully trained yet. The final weights will be released later.
45+
- ALL: GoldG,V3Det,COCO2017,LVISV1,COCO2014,GRIT,RefCOCO,RefCOCO+,RefCOCOg,gRefCOCO.
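
The parenthesised gains in the COCO mAP column are not explained in the table itself. The numbers are consistent with each MM-GDINO-T row being compared against a GDINO-T row: the one with the same pre-training data where it exists, and the O365,GoldG,Cap4M row otherwise. That pairing is an assumption inferred from the arithmetic, not something the README states. A minimal sketch that recomputes the gains under that assumption (the names `GDINO_T`, `MM_GDINO_T` and `computed_gain` are illustrative only):

```python
# Baseline GDINO-T zero-shot COCO mAP, keyed by pre-training data (values from the table).
GDINO_T = {
    "O365": 46.7,
    "O365,GoldG": 48.1,
    "O365,GoldG,Cap4M": 48.4,
}

# MM-GDINO-T rows: (pre-train data, COCO mAP, reported gain, ASSUMED baseline key).
# The baseline pairing is inferred from the numbers, not documented in the README.
MM_GDINO_T = [
    ("O365", 48.5, 1.8, "O365"),
    ("O365,GoldG", 50.4, 2.3, "O365,GoldG"),
    ("O365,GoldG,GRIT", 50.5, 2.1, "O365,GoldG,Cap4M"),
    ("O365,GoldG,V3Det", 50.6, 2.2, "O365,GoldG,Cap4M"),
    ("O365,GoldG,GRIT,V3Det", 50.4, 2.0, "O365,GoldG,Cap4M"),
]


def computed_gain(map_value, baseline_key):
    """Return the mAP difference to the assumed baseline, rounded to one decimal."""
    return round(map_value - GDINO_T[baseline_key], 1)


for data, map_value, reported, baseline_key in MM_GDINO_T:
    gain = computed_gain(map_value, baseline_key)
    print(f"{data}: {map_value} - {GDINO_T[baseline_key]} = +{gain} (reported +{reported})")
```

The Swin-B and Swin-L rows report no gain, so they are left out of the check.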
3946

4047
## Zero-Shot LVIS Results
4148

@@ -361,3 +368,16 @@ Note:
361368
| MM-GDINO | Swin-T | 5e | 45.1 | 64.7 | 42.5 | 65.5 | 40.3 | 63.2 |
362369

363370
- The MM-GDINO-T config file is [here](refcoco/grounding_dino_swin-t_finetune_8xb4_5e_grefcoco.py)
371+
372+
## Citation
373+
374+
If you find this project useful in your research, please consider citing:
375+
376+
```latex
377+
@article{zhao2024open,
378+
title={An Open and Comprehensive Pipeline for Unified Object Grounding and Detection},
379+
author={Zhao, Xiangyu and Chen, Yicheng and Xu, Shilin and Li, Xiangtai and Wang, Xinjiang and Li, Yining and Huang, Haian},
380+
journal={arXiv preprint arXiv:2401.02361},
381+
year={2024}
382+
}
383+
```
