@@ -26,16 +26,23 @@ Please refer to [usage.md](usage.md) or [Chinese usage guide](usage_zh-CN.md)
## Zero-Shot COCO Results and Models
- | Model | Backbone | Style | COCO mAP | Pre-Train Data | Config | Download |
- | :--------: | :------: | :-------: | :--------: | :-------------------: | :------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
- | GDINO-T | Swin-T | Zero-shot | 46.7 | O365 | | |
- | GDINO-T | Swin-T | Zero-shot | 48.1 | O365,GoldG | | |
- | GDINO-T | Swin-T | Zero-shot | 48.4 | O365,GoldG,Cap4M | [ config] ( ../grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth ) |
- | MM-GDINO-T | Swin-T | Zero-shot | 48.5(+1.8) | O365 | [ config] ( grounding_dino_swin-t_pretrain_obj365.py ) | |
- | MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.3) | O365,GoldG | [ config] ( grounding_dino_swin-t_pretrain_obj365_goldg.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602-4ea751ce.pth ) \| [ log] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602.log.json ) |
- | MM-GDINO-T | Swin-T | Zero-shot | 50.5(+2.1) | O365,GoldG,GRIT | [ config] ( grounding_dino_swin-t_pretrain_obj365_goldg_grit9m.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818-169cc352.pth ) \| [ log] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818.log.json ) |
- | MM-GDINO-T | Swin-T | Zero-shot | 50.6(+2.2) | O365,GoldG,V3Det | [ config] ( grounding_dino_swin-t_pretrain_obj365_goldg_v3det.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741-e316e297.pth ) \| [ log] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741.log.json ) |
- | MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.0) | O365,GoldG,GRIT,V3Det | [ config] ( grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth ) \| [ log] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047.log.json ) |
+ | Model | Backbone | Style | COCO mAP | Pre-Train Data | Config | Download |
+ | :----------: | :------: | :-------: | :--------: | :----------------------: | :------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+ | GDINO-T | Swin-T | Zero-shot | 46.7 | O365 | | |
+ | GDINO-T | Swin-T | Zero-shot | 48.1 | O365,GoldG | | |
+ | GDINO-T | Swin-T | Zero-shot | 48.4 | O365,GoldG,Cap4M | [ config] ( ../grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth ) |
+ | MM-GDINO-T | Swin-T | Zero-shot | 48.5(+1.8) | O365 | [ config] ( grounding_dino_swin-t_pretrain_obj365.py ) | |
+ | MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.3) | O365,GoldG | [ config] ( grounding_dino_swin-t_pretrain_obj365_goldg.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602-4ea751ce.pth ) \| [ log] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg/grounding_dino_swin-t_pretrain_obj365_goldg_20231122_132602.log.json ) |
+ | MM-GDINO-T | Swin-T | Zero-shot | 50.5(+2.1) | O365,GoldG,GRIT | [ config] ( grounding_dino_swin-t_pretrain_obj365_goldg_grit9m.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818-169cc352.pth ) \| [ log] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_20231128_200818.log.json ) |
+ | MM-GDINO-T | Swin-T | Zero-shot | 50.6(+2.2) | O365,GoldG,V3Det | [ config] ( grounding_dino_swin-t_pretrain_obj365_goldg_v3det.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741-e316e297.pth ) \| [ log] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_v3det_20231218_095741.log.json ) |
+ | MM-GDINO-T | Swin-T | Zero-shot | 50.4(+2.0) | O365,GoldG,GRIT,V3Det | [ config] ( grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth ) \| [ log] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047.log.json ) |
+ | MM-GDINO-B | Swin-B | Zero-shot | 52.5 | O365,GoldG,V3Det | [ config] ( grounding_dino_swin-b_pretrain_obj365_goldg_v3det.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-b_pretrain_obj365_goldg_v3det/grounding_dino_swin-b_pretrain_obj365_goldg_v3de-f83eef00.pth ) \| [ log] ( < > ) |
+ | MM-GDINO-B\* | Swin-B | - | | O365,ALL | [ config] ( grounding_dino_swin-b_pretrain_all.py ) | [ model] ( < > ) \| [ log] ( < > ) |
+ | MM-GDINO-L | Swin-L | Zero-shot | 53.0 | O365V2,OpenImageV6,GoldG | [ config] ( grounding_dino_swin-l_pretrain_obj365_goldg.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-l_pretrain_obj365_goldg/grounding_dino_swin-l_pretrain_obj365_goldg-34dcdc53.pth ) \| [ log] ( < > ) |
+ | MM-GDINO-L\* | Swin-L | - | 60.3 | O365V2,OpenImageV6,ALL | [ config] ( grounding_dino_swin-l_pretrain_all.py ) | [ model] ( https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-l_pretrain_all/grounding_dino_swin-l_pretrain_all-56d69e78.pth ) \| [ log] ( < > ) |
+
+ - \* indicates that the model has not been fully trained yet; the final weights will be released in the future.
+ - ALL: GoldG, V3Det, COCO2017, LVISV1, COCO2014, GRIT, RefCOCO, RefCOCO+, RefCOCOg, gRefCOCO.
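
The parenthesized values in the COCO mAP column (for example `48.5(+1.8)`) can be sanity-checked with a few lines of arithmetic. The sketch below recomputes them from the table, under the assumption that each MM-GDINO-T row is compared against the GDINO-T row with the closest pre-training data. The pairing of the GRIT and V3Det rows with the `O365,GoldG,Cap4M` baseline is inferred from the numbers and is not stated in the README; the function and variable names are illustrative only.

```python
# Zero-shot COCO mAP of the GDINO-T baselines, copied from the table above.
gdino_t = {
    "O365": 46.7,
    "O365,GoldG": 48.1,
    "O365,GoldG,Cap4M": 48.4,
}

# (MM-GDINO-T pre-train data, mAP, reported delta, ASSUMED GDINO-T reference row)
mm_gdino_t = [
    ("O365", 48.5, 1.8, "O365"),
    ("O365,GoldG", 50.4, 2.3, "O365,GoldG"),
    ("O365,GoldG,GRIT", 50.5, 2.1, "O365,GoldG,Cap4M"),
    ("O365,GoldG,V3Det", 50.6, 2.2, "O365,GoldG,Cap4M"),
    ("O365,GoldG,GRIT,V3Det", 50.4, 2.0, "O365,GoldG,Cap4M"),
]


def recompute_deltas(rows, baselines):
    """Return (data, recomputed delta, reported delta) for each row."""
    results = []
    for data, m_ap, reported, reference in rows:
        # Round to one decimal place, matching the precision used in the table.
        delta = round(m_ap - baselines[reference], 1)
        results.append((data, delta, reported))
    return results


deltas = recompute_deltas(mm_gdino_t, gdino_t)
for data, delta, reported in deltas:
    print(f"{data:<24} recomputed +{delta:.1f}  reported +{reported:.1f}")
```

Under this pairing, all five recomputed gains match the reported ones to one decimal place. The Swin-B and Swin-L rows carry no delta, so they are left out.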
## Zero-Shot LVIS Results
@@ -361,3 +368,16 @@ Note:
| MM-GDINO | Swin-T | 5e | 45.1 | 64.7 | 42.5 | 65.5 | 40.3 | 63.2 |
- The MM-GDINO-T config file is [here](refcoco/grounding_dino_swin-t_finetune_8xb4_5e_grefcoco.py)
+
+ ## Citation
+
+ If you find this project useful in your research, please consider citing:
+
+ ```latex
+ @article{zhao2024open,
+ title={An Open and Comprehensive Pipeline for Unified Object Grounding and Detection},
+ author={Zhao, Xiangyu and Chen, Yicheng and Xu, Shilin and Li, Xiangtai and Wang, Xinjiang and Li, Yining and Huang, Haian},
+ journal={arXiv preprint arXiv:2401.02361},
+ year={2024}
+ }
+ ```