|
1 | 1 | # Arcface Pytorch (Distributed Version of ArcFace)
|
2 | 2 |
|
3 |
| - |
4 | 3 | ## Contents
|
5 | 4 |
|
6 | 5 | ## Set Up
|
7 | 6 | ```shell
|
8 | 7 | torch >= 1.6.0
|
9 |
| -``` |
10 |
| - |
11 |
| -## Train on a single node |
12 |
| -If you want to use 8 GPU to train, you should set `--nproc_per_node=8` and set `CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ` |
13 |
| -If you want to use 4 GPU to train, you should set `--nproc_per_node=4` and set `CUDA_VISIBLE_DEVICES=0,1,2,3` |
14 |
| -If you want to use 1 GPU to train, you should set `--nproc_per_node=1` ... |
| 8 | +``` |
| 9 | +For more details, see [install.md](docs/install.md) in docs. |
15 | 10 |
|
| 11 | +## Training |
| 12 | +### 1. Single node, 1 GPU: |
16 | 13 | ```shell
|
17 |
| -export OMP_NUM_THREADS=4 |
18 |
| -export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 |
19 |
| -python -m torch.distributed.launch \ |
20 |
| ---nproc_per_node=8 --nnodes=1 \ |
21 |
| ---node_rank=0 --master_addr="127.0.0.1" \ |
22 |
| ---master_port=1234 train.py |
23 |
| -ps -ef | grep "train" | grep -v grep | awk '{print "kill -9 "$2}' | sh |
| 14 | +python -m torch.distributed.launch --nproc_per_node=1 --nnodes=1 --node_rank=0 --master_addr="127.0.0.1" --master_port=1234 train.py |
24 | 15 | ```
|
25 |
| - |
26 |
| -## Train on multi-node |
| 16 | +### 2. Single node, 8 GPUs: |
27 | 17 | ```shell
|
28 |
| -pass |
| 18 | +python -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --node_rank=0 --master_addr="127.0.0.1" --master_port=1234 train.py |
29 | 19 | ```
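
The single-node commands above differ only in `--nproc_per_node`. As a minimal sketch (not part of the repository), the process count can be derived from a `CUDA_VISIBLE_DEVICES`-style device list so that the two values stay in sync. The helper names `count_gpus` and `build_launch_cmd` are hypothetical; the launch flags are the ones already shown in this README, and the command is only printed, not executed.

```shell
# Hypothetical helpers (not part of the repository).

count_gpus() {
    # Count the comma-separated entries in a device list such as "0,1,2,3".
    old_ifs=$IFS
    IFS=','
    set -- $1          # split the list on commas into positional parameters
    IFS=$old_ifs
    echo "$#"
}

build_launch_cmd() {
    # Print the single-node launch command for the given device list.
    n=$(count_gpus "$1")
    printf 'CUDA_VISIBLE_DEVICES=%s python -m torch.distributed.launch --nproc_per_node=%s --nnodes=1 --node_rank=0 --master_addr="127.0.0.1" --master_port=1234 train.py\n' "$1" "$n"
}

# Example: four visible GPUs give --nproc_per_node=4.
build_launch_cmd 0,1,2,3
```

Printing the command rather than running it makes it easy to inspect before launching a long training job.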
|
30 |
| - |
31 |
| -## Evaluation |
| 20 | +### 3. Multiple nodes, 8 GPUs per node: |
| 21 | +Node 0: |
| 22 | +```shell |
| 23 | +python -m torch.distributed.launch --nproc_per_node=8 --nnodes=2 --node_rank=0 --master_addr="ip1" --master_port=1234 train.py |
| 24 | +``` |
| 25 | +Node 1: |
32 | 26 | ```shell
|
33 |
| -# model-prefix your model path |
34 |
| -# image-path your IJBC path |
35 |
| -# result-dir your result path |
36 |
| -# network your backbone |
37 |
| -CUDA_VISIBLE_DEVICES=0,1 python eval_ijbc.py \ |
38 |
| ---model-prefix ms1mv3_arcface_r50/backbone.pth \ |
39 |
| ---image-path IJB_release/IJBC \ |
40 |
| ---result-dir ms1mv3_arcface_r50 \ |
41 |
| ---batch-size 128 \ |
42 |
| ---job ms1mv3_arcface_r50 \ |
43 |
| ---target IJBC \ |
44 |
| ---network iresnet50 |
| 27 | +python -m torch.distributed.launch --nproc_per_node=8 --nnodes=2 --node_rank=1 --master_addr="ip1" --master_port=1234 train.py |
45 | 28 | ```
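
With `torch.distributed.launch`, the total number of processes (the world size) is `nnodes * nproc_per_node`, and each process gets the global rank `node_rank * nproc_per_node + local_rank`. The following is a small sketch of that arithmetic for the two-node example above; the function names `world_size` and `global_rank` are hypothetical and are not part of the repository.

```shell
# Hypothetical helpers (not part of the repository).

world_size() {
    # $1 = nnodes, $2 = nproc_per_node
    echo $(( $1 * $2 ))
}

global_rank() {
    # $1 = node_rank, $2 = nproc_per_node, $3 = local_rank
    echo $(( $1 * $2 + $3 ))
}

# Two nodes with 8 GPUs each, as in the commands above.
echo "world size: $(world_size 2 8)"
echo "node 1, local rank 0 -> global rank $(global_rank 1 8 0)"
```

Both nodes must be started with the same `--master_addr` and `--master_port`, as in the commands above; only `--node_rank` differs between them.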
|
| 29 | + |
| 30 | + |
| 31 | +## Evaluation on IJBC |
46 | 32 | For more details, see [eval.md](docs/eval.md) in docs.
|
47 | 33 |
|
48 | 34 | ## Speed Benchmark
|
@@ -89,14 +75,12 @@ All Model Can be found in here.
|
89 | 75 | ### Glint360k
|
90 | 76 | | Datasets | log |backbone | IJBC(1e-05) | IJBC(1e-04) |agedb30|cfp_fp|lfw |
|
91 | 77 | | :---: | :--- |:--- | :--- | :--- |:--- |:--- |:--- |
|
92 |
| -| Glint360k-Cosface |[log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/glint360k_cosface_r100/training.log) |r100 | 96.19 | 97.39 | 98.52 | 99.26 | 99.83 | |
93 |
| -| Glint360k-Cosface |[log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/glint360k_cosface_r100_fp16_0.1/training.log)|r100-fp16-sample-0.1 | 95.95 | 97.35 | 98.57 | 99.30 | 99.85 | |
94 |
| -| Glint360k-Cosface | - | - | - | - | - | - | - | |
95 |
| -| Glint360k-Cosface | - | - | - | - | - | - | - | |
96 |
| -| Glint360k-Cosface | - | - | - | - | - | - | - | |
97 |
| - |
98 |
| - |
99 |
| - |
| 78 | +| Glint360k-Cosface |[log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/glint360k_cosface_r18_fp16_0.1/training.log) |r18-fp16-0.1 | 93.16 | 95.33 | 97.72 | 97.73 | 99.77 | |
| 79 | +| Glint360k-Cosface |[log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/glint360k_cosface_r34_fp16_0.1/training.log) |r34-fp16-0.1 | 95.16 | 96.56 | 98.33 | 98.78 | 99.82 | |
| 80 | +| Glint360k-Cosface |[log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/glint360k_cosface_r50_fp16_0.1/training.log) |r50-fp16-0.1 | 95.61 | 96.97 | 98.38 | 99.20 | 99.83 | |
| 81 | +| Glint360k-Cosface |[log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/glint360k_cosface_r100_fp16_0.1/training.log)|r100-fp16-0.1 | 95.88 | 97.32 | 98.48 | 99.29 | 99.82 | |
| 82 | + |
| 83 | +The `0.1` suffix in the backbone name means the sample rate is 0.1. |
100 | 84 |
|
101 | 85 | For more details, see [modelzoo.md](docs/modelzoo.md) in docs.
|
102 | 86 |
|