
Commit f0b8dac

V2.0.0 alpha (#4)

* Format with flake8
* Release pretrained model of usnets
* Create MODEL_ZOO.md
* Update README.md

1 parent: 74fc3ff

16 files changed: +601, -92 lines

.gitignore (+1)

@@ -1,2 +1,3 @@
 logs
 data
+.flake8

MODEL_ZOO.md (new file, +19)

@@ -0,0 +1,19 @@
+# Slimmable Model Zoo
+
+## Slimmable Neural Networks ([ICLR 2019](https://arxiv.org/abs/1812.08928))
+
+
+| Model | Switches (Widths) | Top-1 Err. | MFLOPs | Model ID |
+| :--- | :---: | :---: | ---: | :---: |
+| S-MobileNet v1 | 1.00<br>0.75<br>0.50<br>0.25 | 28.5<br>30.5<br>35.2<br>46.9 | 569<br>325<br>150<br>41 | [a6285db](https://github.com/JiahuiYu/slimmable_networks/files/2709079/s_mobilenet_v1_0.25_0.5_0.75_1.0.pt.zip) |
+| S-MobileNet v2 | 1.00<br>0.75<br>0.50<br>0.35 | 29.5<br>31.1<br>35.6<br>40.3 | 301<br>209<br>97<br>59 | [0593ffd](https://github.com/JiahuiYu/slimmable_networks/files/2709080/s_mobilenet_v2_0.35_0.5_0.75_1.0.pt.zip) |
+| S-ShuffleNet | 2.00<br>1.00<br>0.50 | 28.6<br>34.5<br>42.8 | 524<br>138<br>38 | [1427f66](https://github.com/JiahuiYu/slimmable_networks/files/2709082/s_shufflenet_0.5_1.0_2.0.pt.zip) |
+| S-ResNet-50 | 1.00<br>0.75<br>0.50<br>0.25 | 24.0<br>25.1<br>27.9<br>35.0 | 4.1G<br>2.3G<br>1.1G<br>278 | [3fca9cc](https://drive.google.com/open?id=1f6q37OkZaz_0GoOAwllHlXNWuKwor2fC) |
+
+
+## Universally Slimmable Networks and Improved Training Techniques ([Preprint](https://arxiv.org/abs/1903.05134))
+
+| Model | Widths | Top-1 Err. | MFLOPs | Model ID |
+| :--- | :--- | :---: | ---: | :---: |
+| US-MobileNet v1 | 1.0<br> 0.975<br> 0.95<br> 0.925<br> 0.9<br> 0.875<br> 0.85<br> 0.825<br> 0.8<br> 0.775<br> 0.75<br> 0.725<br> 0.7<br> 0.675<br> 0.65<br> 0.625<br> 0.6<br> 0.575<br> 0.55<br> 0.525<br> 0.5<br> 0.475<br> 0.45<br> 0.425<br> 0.4<br> 0.375<br> 0.35<br> 0.325<br> 0.3<br> 0.275<br> 0.25 | 28.2<br> 28.3<br> 28.4<br> 28.7<br> 28.7<br> 29.1<br> 29.4<br> 29.7<br> 30.2<br> 30.3<br> 30.5<br> 30.9<br> 31.2<br> 31.7<br> 32.2<br> 32.5<br> 33.2<br> 33.7<br> 34.4<br> 35.0<br> 35.8<br> 36.5<br> 37.3<br> 38.1<br> 39.0<br> 40.0<br> 41.0<br> 41.9<br> 42.7<br> 44.2<br> 44.3 | 568<br> 543<br> 517<br> 490<br> 466<br> 443<br> 421<br> 389<br> 366<br> 345<br> 325<br> 306<br> 287<br> 267<br> 249<br> 232<br> 217<br> 201<br> 177<br> 162<br> 149<br> 136<br> 124<br> 114<br> 100<br> 89<br> 80<br> 71<br> 64<br> 48<br> 41 | [13d5af2](https://github.com/JiahuiYu/slimmable_networks/files/2979952/us_mobilenet_v1_calibrated.pt.zip) |
+| US-MobileNet v2 | 1.0<br> 0.975<br> 0.95<br> 0.925<br> 0.9<br> 0.875<br> 0.85<br> 0.825<br> 0.8<br> 0.775<br> 0.75<br> 0.725<br> 0.7<br> 0.675<br> 0.65<br> 0.625<br> 0.6<br> 0.575<br> 0.55<br> 0.525<br> 0.5<br> 0.475<br> 0.45<br> 0.425<br> 0.4<br> 0.375<br> 0.35 | 28.5<br> 28.5<br> 28.8<br> 28.9<br> 29.1<br> 29.1<br> 29.4<br> 29.9<br> 30.0<br> 30.2<br> 30.4<br> 30.7<br> 31.1<br> 31.4<br> 31.7<br> 31.7<br> 32.4<br> 32.4<br> 34.4<br> 34.6<br> 34.9<br> 35.1<br> 35.8<br> 35.8<br> 36.6<br> 36.7<br> 37.7 | 300<br> 299<br> 284<br> 274<br> 269<br> 268<br> 254<br> 235<br> 222<br> 213<br> 209<br> 185<br> 173<br> 165<br> 161<br> 161<br> 151<br> 150<br> 106<br> 100<br> 97<br> 96<br> 88<br> 88<br> 80<br> 80<br> 59 | [3880cad](https://github.com/JiahuiYu/slimmable_networks/files/2979953/us_mobilenet_v2_calibrated.pt.zip) |
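As a rough sanity check on the MFLOPs columns above, convolution-dominated cost scales approximately with the square of the width multiplier. A quick sketch (the 569-MFLOPs full-width S-MobileNet v1 figure comes from the table; the quadratic model is an approximation, not the repo's profiler):

```python
def approx_mflops(full_width_mflops, width_mult):
    """Estimate MFLOPs at a reduced width, assuming conv FLOPs
    scale ~quadratically with the width multiplier."""
    return full_width_mflops * width_mult ** 2

# S-MobileNet v1 reports 569 MFLOPs at width 1.0.
for w, reported in [(0.75, 325), (0.50, 150), (0.25, 41)]:
    print(w, round(approx_mflops(569, w)), reported)
```

The reported numbers sit slightly above the quadratic estimate because some layers (e.g. the classifier) scale only linearly with width.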

README.md (+22, -12)

@@ -1,12 +1,25 @@
-# Slimmable Neural Networks
+# Slimmable Networks
 
-[ICLR 2019 Paper](https://arxiv.org/abs/1812.08928) | [ArXiv](https://arxiv.org/abs/1812.08928) | [OpenReview](https://openreview.net/forum?id=H1gMCsAqY7) | [Detection](https://github.com/JiahuiYu/slimmable_networks/tree/detection) | [Model Zoo](#model-zoo) | [BibTex](#citing)
+An open-source framework for slimmable training on ImageNet classification and COCO detection, which has enabled numerous projects.
+
+## [Slimmable Neural Networks](https://arxiv.org/abs/1812.08928)
+
+[ICLR 2019 Paper](https://arxiv.org/abs/1812.08928) | [OpenReview](https://openreview.net/forum?id=H1gMCsAqY7) | [Detection](https://github.com/JiahuiYu/slimmable_networks/tree/detection) | [Model Zoo](/MODEL_ZOO.md) | [BibTex](#citing)
 
 <img src="https://user-images.githubusercontent.com/22609465/50390872-1b3fb600-0702-11e9-8034-d0f41825d775.png" width=95%/>
 
 Illustration of slimmable neural networks. The same model can run at different widths (number of active channels), permitting instant and adaptive accuracy-efficiency trade-offs.
 
 
+## [Universally Slimmable Networks and Improved Training Techniques](https://arxiv.org/abs/1903.05134)
+
+[Preprint](https://arxiv.org/abs/1903.05134) | [Model Zoo](/MODEL_ZOO.md) | [BibTex](#citing)
+
+<img src="https://user-images.githubusercontent.com/22609465/54562571-45b5ae00-4995-11e9-8984-49e32d07e325.png" width=95%/>
+
+Illustration of universally slimmable networks. The same model can run at **arbitrary** widths.
+
+
 ## Run
 
 0. Requirements:
@@ -22,16 +35,6 @@ Illustration of slimmable neural networks. The same model can run at different w
 * If you still have questions, please search closed issues first. If the problem is not solved, please open a new.
 
 
-## Model Zoo
-
-| Model | Switches (Widths) | Top-1 Err. | MFLOPs | Model ID |
-| :--- | :---: | :---: | ---: | :---: |
-| S-MobileNet v1 | 1.00<br>0.75<br>0.50<br>0.25 | 28.5<br>30.5<br>35.2<br>46.9 | 569<br>325<br>150<br>41 | [a6285db](https://github.com/JiahuiYu/slimmable_networks/files/2709079/s_mobilenet_v1_0.25_0.5_0.75_1.0.pt.zip) |
-| S-MobileNet v2 | 1.00<br>0.75<br>0.50<br>0.35 | 29.5<br>31.1<br>35.6<br>40.3 | 301<br>209<br>97<br>59 | [0593ffd](https://github.com/JiahuiYu/slimmable_networks/files/2709080/s_mobilenet_v2_0.35_0.5_0.75_1.0.pt.zip) |
-| S-ShuffleNet | 2.00<br>1.00<br>0.50 | 28.6<br>34.5<br>42.8 | 524<br>138<br>38 | [1427f66](https://github.com/JiahuiYu/slimmable_networks/files/2709082/s_shufflenet_0.5_1.0_2.0.pt.zip) |
-| S-ResNet-50 | 1.00<br>0.75<br>0.50<br>0.25 | 24.0<br>25.1<br>27.9<br>35.0 | 4.1G<br>2.3G<br>1.1G<br>278 | [3fca9cc](https://drive.google.com/open?id=1f6q37OkZaz_0GoOAwllHlXNWuKwor2fC) |
-
-
 ## Technical Details
 
 Implementing slimmable networks and slimmable training is straightforward:
@@ -54,4 +57,11 @@ The software is for educaitonal and academic research purpose only.
   journal={arXiv preprint arXiv:1812.08928},
   year={2018}
 }
+
+@article{yu2019universally,
+  title={Universally Slimmable Networks and Improved Training Techniques},
+  author={Yu, Jiahui and Huang, Thomas},
+  journal={arXiv preprint arXiv:1903.05134},
+  year={2019}
+}
 ```
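The README's Technical Details section calls slimmable training straightforward: each layer holds weights for the largest width and, at a given width multiplier, only the leading slice of channels is active (with a separate BatchNorm per switch). A dependency-free sketch of the slicing idea, using a toy class rather than the repo's actual torch modules in models/slimmable_ops.py:

```python
class ToySlimmableLinear:
    """Keeps one full weight matrix; at width w, only the leading
    int(w * n) rows/columns participate. Illustrative stand-in for
    SlimmableConv2d / SlimmableLinear in models/slimmable_ops.py."""

    def __init__(self, max_in, max_out):
        self.max_in, self.max_out = max_in, max_out
        # weight[o][i]; a constant pattern keeps the demo checkable
        self.weight = [[1.0] * max_in for _ in range(max_out)]

    def forward(self, x, width_mult):
        in_f = int(self.max_in * width_mult)
        out_f = int(self.max_out * width_mult)
        # only the top-left out_f x in_f block of the shared weight is used
        return [sum(self.weight[o][i] * x[i] for i in range(in_f))
                for o in range(out_f)]

layer = ToySlimmableLinear(4, 4)
x = [1.0, 1.0, 1.0, 1.0]
print(layer.forward(x, 1.0))  # 4 outputs, each summing 4 inputs
print(layer.forward(x, 0.5))  # 2 outputs over the first 2 inputs
```

Because narrower widths reuse the leading weights of wider ones, a single set of parameters serves every switch; only the BatchNorm statistics are kept per width.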

apps/us_mobilenet_v1_val.yml (new file, +64)

@@ -0,0 +1,64 @@
+# =========================== Basic Settings ===========================
+# machine info
+num_gpus_per_job: 4 # number of gpus each job need
+num_cpus_per_job: 63 # number of cpus each job need
+memory_per_job: 380 # memory requirement each job need
+gpu_type: "nvidia-tesla-p100"
+
+# data
+dataset: imagenet1k
+data_transforms: imagenet1k_basic
+data_loader: imagenet1k_basic
+dataset_dir: data/imagenet
+data_loader_workers: 62
+
+# info
+num_classes: 1000
+image_size: 224
+topk: [1, 5]
+num_epochs: 100
+
+# optimizer
+optimizer: sgd
+momentum: 0.9
+weight_decay: 0.0001
+nesterov: True
+
+# lr
+lr: 0.1
+lr_scheduler: multistep
+multistep_lr_milestones: [30, 60, 90]
+multistep_lr_gamma: 0.1
+
+# model profiling
+profiling: [gpu]
+
+# pretrain, resume, test_only
+pretrained: ''
+resume: ''
+test_only: False
+
+#
+random_seed: 1995
+batch_size: 256
+model: ''
+reset_parameters: True
+
+
+# =========================== Override Settings ===========================
+log_dir: logs/
+slimmable_training: True
+model: models.us_mobilenet_v1
+width_mult: 1.0
+width_mult_list: [0.25, 0.275, 0.3, 0.325, 0.35, 0.375, 0.4, 0.425, 0.45, 0.475, 0.5, 0.525, 0.55, 0.575, 0.6, 0.625, 0.65, 0.675, 0.7, 0.725, 0.75, 0.775, 0.8, 0.825, 0.85, 0.875, 0.9, 0.925, 0.95, 0.975, 1.0]
+width_mult_range: [0.25, 1.0]
+data_transforms: imagenet1k_mobile
+# num_gpus_per_job:
+# lr:
+# lr_scheduler:
+# exp_decaying_lr_gamma:
+# num_epochs:
+# batch_size:
+# test pretrained
+test_only: True
+pretrained: logs/us_mobilenet_v1_calibrated.pt
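Note that the config above defines some keys twice: `model`, `test_only`, and `pretrained` appear under Basic Settings and again under Override Settings. This relies on later keys winning over earlier ones, which is how a plain YAML load with duplicate mapping keys typically behaves; the repo's actual loader is `FLAGS` in utils/config.py, whose exact semantics I'm assuming here. A dict-based sketch of the merge:

```python
# Stand-in for the config loader: the Override Settings section
# replaces earlier values for any key it repeats.
basic = {"model": "", "test_only": False, "pretrained": ""}
override = {
    "model": "models.us_mobilenet_v1",
    "test_only": True,
    "pretrained": "logs/us_mobilenet_v1_calibrated.pt",
}
flags = {**basic, **override}  # later dict wins on shared keys
print(flags["model"])      # models.us_mobilenet_v1
print(flags["test_only"])  # True
```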

apps/us_mobilenet_v2_val.yml (new file, +64)

@@ -0,0 +1,64 @@
+# =========================== Basic Settings ===========================
+# machine info
+num_gpus_per_job: 4 # number of gpus each job need
+num_cpus_per_job: 63 # number of cpus each job need
+memory_per_job: 380 # memory requirement each job need
+gpu_type: "nvidia-tesla-p100"
+
+# data
+dataset: imagenet1k
+data_transforms: imagenet1k_basic
+data_loader: imagenet1k_basic
+dataset_dir: data/imagenet
+data_loader_workers: 62
+
+# info
+num_classes: 1000
+image_size: 224
+topk: [1, 5]
+num_epochs: 100
+
+# optimizer
+optimizer: sgd
+momentum: 0.9
+weight_decay: 0.0001
+nesterov: True
+
+# lr
+lr: 0.1
+lr_scheduler: multistep
+multistep_lr_milestones: [30, 60, 90]
+multistep_lr_gamma: 0.1
+
+# model profiling
+profiling: [gpu]
+
+# pretrain, resume, test_only
+pretrained: ''
+resume: ''
+test_only: False
+
+#
+random_seed: 1995
+batch_size: 256
+model: ''
+reset_parameters: True
+
+
+# =========================== Override Settings ===========================
+log_dir: logs/
+slimmable_training: True
+model: models.us_mobilenet_v2
+width_mult: 1.0
+width_mult_list: [0.35, 0.375, 0.4, 0.425, 0.45, 0.475, 0.5, 0.525, 0.55, 0.575, 0.6, 0.625, 0.65, 0.675, 0.7, 0.725, 0.75, 0.775, 0.8, 0.825, 0.85, 0.875, 0.9, 0.925, 0.95, 0.975, 1.0]
+width_mult_range: [0.35, 1.0]
+data_transforms: imagenet1k_mobile
+# num_gpus_per_job:
+# lr:
+# lr_scheduler:
+# exp_decaying_lr_gamma:
+# num_epochs:
+# batch_size:
+# test pretrained
+test_only: True
+pretrained: logs/us_mobilenet_v2_calibrated.pt

models/s_mobilenet_v1.py (+5, -21)

@@ -3,26 +3,10 @@
 
 
 from .slimmable_ops import SwitchableBatchNorm2d
-from .slimmable_ops import SlimmableConv2d, SlimmableLinear
+from .slimmable_ops import SlimmableConv2d, SlimmableLinear, make_divisible
 from utils.config import FLAGS
 
 
-def _make_divisible(v, divisor=8, min_value=8):
-    """
-    forked from slim:
-    https://github.com/tensorflow/models/blob/\
-    0344c5503ee55e24f0de7f37336a6e08f10976fd/\
-    research/slim/nets/mobilenet/mobilenet.py#L62-L69
-    """
-    if min_value is None:
-        min_value = divisor
-    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
-    # Make sure that round down does not go down by more than 10%.
-    if new_v < 0.9 * v:
-        new_v += divisor
-    return new_v
-
-
 class DepthwiseSeparableConv(nn.Module):
     def __init__(self, inp, outp, stride):
         super(DepthwiseSeparableConv, self).__init__()
@@ -63,10 +47,10 @@ def __init__(self, num_classes=1000, input_size=224):
         # head
         assert input_size % 32 == 0
         channels = [
-            _make_divisible(32 * width_mult)
+            make_divisible(32 * width_mult)
             for width_mult in FLAGS.width_mult_list]
         self.outp = [
-            _make_divisible(1024 * width_mult)
+            make_divisible(1024 * width_mult)
             for width_mult in FLAGS.width_mult_list]
         first_stride = 2
         self.features.append(
@@ -81,7 +65,7 @@ def __init__(self, num_classes=1000, input_size=224):
         # body
         for c, n, s in self.block_setting:
             outp = [
-                _make_divisible(c * width_mult)
+                make_divisible(c * width_mult)
                 for width_mult in FLAGS.width_mult_list]
             for i in range(n):
                 if i == 0:
@@ -92,7 +76,7 @@ def __init__(self, num_classes=1000, input_size=224):
                         DepthwiseSeparableConv(channels, outp, 1))
                 channels = outp
 
-        avg_pool_size = input_size//32
+        avg_pool_size = input_size // 32
         self.features.append(nn.AvgPool2d(avg_pool_size))
 
         # make it nn.Sequential
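The helper deleted above is consolidated into models/slimmable_ops.py and imported as `make_divisible`. Its body can be reconstructed from the removed lines; note the two deleted copies disagreed on the default `min_value` (8 in the v1 file, 1 in the v2 file), so the `None`-means-`divisor` default below is an assumption about the shared version, not shown in this diff:

```python
def make_divisible(v, divisor=8, min_value=None):
    """Round channel count v to a multiple of divisor, following the
    TF-Slim MobileNet helper credited in the deleted docstring."""
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # Make sure that rounding down does not lose more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v

print(make_divisible(32 * 0.75))  # 24
print(make_divisible(32 * 0.25))  # 8
```

This is what turns arbitrary products like `1024 * width_mult` into hardware-friendly channel counts in the model constructors.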

models/s_mobilenet_v2.py (+6, -21)

@@ -3,25 +3,10 @@
 
 
 from .slimmable_ops import SwitchableBatchNorm2d, SlimmableConv2d
+from .slimmable_ops import make_divisible
 from utils.config import FLAGS
 
 
-def _make_divisible(v, divisor=8, min_value=1):
-    """
-    forked from slim:
-    https://github.com/tensorflow/models/blob/\
-    0344c5503ee55e24f0de7f37336a6e08f10976fd/\
-    research/slim/nets/mobilenet/mobilenet.py#L62-L69
-    """
-    if min_value is None:
-        min_value = divisor
-    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
-    # Make sure that round down does not go down by more than 10%.
-    if new_v < 0.9 * v:
-        new_v += divisor
-    return new_v
-
-
 class InvertedResidual(nn.Module):
     def __init__(self, inp, outp, stride, expand_ratio):
         super(InvertedResidual, self).__init__()
@@ -31,7 +16,7 @@ def __init__(self, inp, outp, stride, expand_ratio):
 
         layers = []
         # expand
-        expand_inp = [i*expand_ratio for i in inp]
+        expand_inp = [i * expand_ratio for i in inp]
         if expand_ratio != 1:
             layers += [
                 SlimmableConv2d(inp, expand_inp, 1, 1, 0, bias=False),
@@ -80,9 +65,9 @@ def __init__(self, num_classes=1000, input_size=224):
         # head
         assert input_size % 32 == 0
         channels = [
-            _make_divisible(32 * width_mult)
+            make_divisible(32 * width_mult)
            for width_mult in FLAGS.width_mult_list]
-        self.outp = _make_divisible(
+        self.outp = make_divisible(
            1280 * max(FLAGS.width_mult_list)) if max(
                FLAGS.width_mult_list) > 1.0 else 1280
         first_stride = 2
@@ -98,7 +83,7 @@ def __init__(self, num_classes=1000, input_size=224):
         # body
         for t, c, n, s in self.block_setting:
             outp = [
-                _make_divisible(c * width_mult)
+                make_divisible(c * width_mult)
                 for width_mult in FLAGS.width_mult_list]
             for i in range(n):
                 if i == 0:
@@ -120,7 +105,7 @@ def __init__(self, num_classes=1000, input_size=224):
                 nn.ReLU6(inplace=True),
             )
         )
-        avg_pool_size = input_size//32
+        avg_pool_size = input_size // 32
         self.features.append(nn.AvgPool2d(avg_pool_size))
 
         # make it nn.Sequential
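A detail worth noting in the v2 diff above: slimmable layers carry a *list* of channel counts, one per candidate width, and `InvertedResidual` expands that list elementwise by `expand_ratio`. A standalone restatement (the concrete channel numbers are illustrative, not taken from the model):

```python
# Per-width channel bookkeeping in InvertedResidual: each candidate
# width keeps its own channel count, and expansion applies elementwise,
# mirroring `expand_inp = [i * expand_ratio for i in inp]` in the diff.
inp = [8, 16, 24, 32]   # hypothetical input channels at widths 0.25/0.5/0.75/1.0
expand_ratio = 6        # MobileNet v2's usual expansion factor
expand_inp = [i * expand_ratio for i in inp]
print(expand_inp)  # [48, 96, 144, 192]
```

`SlimmableConv2d` then receives both lists and picks the pair matching the active width at run time.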

models/s_resnet.py (+3, -3)

@@ -12,7 +12,7 @@ def __init__(self, inp, outp, stride):
         super(Block, self).__init__()
         assert stride in [1, 2]
 
-        midp = [i//4 for i in outp]
+        midp = [i // 4 for i in outp]
         layers = [
             SlimmableConv2d(inp, midp, 1, 1, 0, bias=False),
             SwitchableBatchNorm2d(midp),
@@ -79,7 +79,7 @@ def __init__(self, num_classes=1000, input_size=224):
         # body
         for stage_id, n in enumerate(self.block_setting):
             outp = [
-                int(feats[stage_id]*width_mult*4)
+                int(feats[stage_id] * width_mult * 4)
                 for width_mult in FLAGS.width_mult_list]
             for i in range(n):
                 if i == 0 and stage_id != 0:
@@ -88,7 +88,7 @@ def __init__(self, num_classes=1000, input_size=224):
                 self.features.append(Block(channels, outp, 1))
                 channels = outp
 
-        avg_pool_size = input_size//32
+        avg_pool_size = input_size // 32
         self.features.append(nn.AvgPool2d(avg_pool_size))
 
         # make it nn.Sequential
