Commit 2f5c892

prototype_source/backend_config_tutorial.rst translation (#909)
* prototype_source/backend_config_tutorial.rst translation
1 parent f0e9318 commit 2f5c892

File tree: 1 file changed, +76 −87 lines


prototype_source/backend_config_tutorial.rst (+76 −87)
@@ -1,21 +1,21 @@
 (prototype) PyTorch BackendConfig Tutorial
 ==========================================
-**Author**: `Andrew Or <https://github.com/andrewor14>`_
-
-
-The BackendConfig API enables developers to integrate their backends
-with PyTorch quantization. It is currently only supported in FX graph
-mode quantization, but support may be extended to other modes of
-quantization in the future. In this tutorial, we will demonstrate how to
-use this API to customize quantization support for specific backends.
-For more information on the motivation and implementation details behind
-BackendConfig, please refer to this
+**Author**: `Andrew Or <https://github.com/andrewor14>`_
+**Translated by**: `Jang Seungho <https://github.com/jason9865>`_
+
+
+The BackendConfig API enables backends to integrate with PyTorch quantization.
+It is currently supported only in FX graph mode quantization,
+but other modes may be supported in the future.
+This tutorial covers how to use the BackendConfig API
+to customize quantization support for a specific backend.
+For details on the motivation and implementation behind BackendConfig,
+refer to this
 `README <https://github.com/pytorch/pytorch/tree/master/torch/ao/quantization/backend_config>`__.

-Suppose we are a backend developer and we wish to integrate our backend
-with PyTorch's quantization APIs. Our backend consists of two ops only:
-quantized linear and quantized conv-relu. In this section, we will walk
-through how to achieve this by quantizing an example model using a custom
-BackendConfig through `prepare_fx` and `convert_fx`.
+Suppose we are a backend developer who wants to use PyTorch's quantization APIs with our backend.
+The backend supports only two ops: quantized linear and quantized conv-relu.
+In this section, we build a custom BackendConfig and use it to quantize
+an example model through `prepare_fx` and `convert_fx`.

 .. code:: ipython3
@@ -36,32 +36,30 @@ BackendConfig through `prepare_fx` and `convert_fx`.
     )
     from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

-1. Derive reference pattern for each quantized operator
+1. Derive a reference pattern for each quantized operator
 --------------------------------------------------------

-For quantized linear, suppose our backend expects the reference pattern
-`[dequant - fp32_linear - quant]` and lowers it into a single quantized
-linear op. The way to achieve this is to first insert quant-dequant ops
-before and after the float linear op, such that we produce the following
-reference model::
+For quantized linear, suppose the backend lowers the reference pattern
+`[dequant - fp32_linear - quant]` into a single quantized linear op.
+To achieve this, first insert quant-dequant ops before and after
+the float linear op, producing the following reference model::

     quant1 - [dequant1 - fp32_linear - quant2] - dequant2

-Similarly, for quantized conv-relu, we wish to produce the following
-reference model, where the reference pattern in the square brackets will
-be lowered into a single quantized conv-relu op::
+Similarly, for quantized conv-relu, we produce the following reference model,
+where the reference pattern in square brackets is lowered into a single
+quantized conv-relu op::

     quant1 - [dequant1 - fp32_conv_relu - quant2] - dequant2

-2. Set DTypeConfigs with backend constraints
+2. Set backend constraints with DTypeConfigs
 ---------------------------------------------

-In the reference patterns above, the input dtype specified in the
-DTypeConfig will be passed as the dtype argument to quant1, while the
-output dtype will be passed as the dtype argument to quant2. If the output
-dtype is fp32, as in the case of dynamic quantization, then the output
-quant-dequant pair will not be inserted. This example also shows how to
-specify restrictions on quantization and scale ranges on a particular dtype.
+In the reference patterns above, the input dtype specified in the DTypeConfig
+is passed as the dtype argument to quant1, and the output dtype
+is passed as the dtype argument to quant2.
+If the output dtype is fp32, as in dynamic quantization,
+the output quant-dequant pair is not inserted.
+The example below also shows how to restrict quantization
+and scale ranges for a particular dtype.

 .. code:: ipython3
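As background for the patterns above: the quant/dequant ops in a reference model compute simple affine maps. A minimal torch-free sketch of that arithmetic (illustrative names, not PyTorch's implementation):

```python
def quantize(x, scale, zero_point, quant_min=0, quant_max=127):
    """Affine-quantize a float to an integer code, like a `quant` op."""
    q = round(x / scale) + zero_point
    return max(quant_min, min(quant_max, q))  # clamp to the dtype range

def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate float, like a `dequant` op."""
    return (q - zero_point) * scale

# Wrapping a float op in dequant/quant, as in [dequant - fp32_linear - quant],
# means the op itself still runs in fp32 on approximated inputs.
scale, zp = 0.05, 10
q = quantize(1.23, scale, zp)         # integer code
x_approx = dequantize(q, scale, zp)   # float close to 1.23
```

This is why the tutorial's observer ranges (quant_min/quant_max) matter: they define the clamp applied by the quant op.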
@@ -79,35 +77,33 @@ specify restrictions on quantization and scale ranges on a particular dtype.
         weight_dtype=torch.qint8,
         bias_dtype=torch.float)

-3. Set up fusion for conv-relu
+3. Set up conv-relu fusion
 -------------------------------

-Note that the original user model contains separate conv and relu ops,
-so we need to first fuse the conv and relu ops into a single conv-relu
-op (`fp32_conv_relu`), and then quantize this op similar to how the linear
-op is quantized. We can set up fusion by defining a function that accepts
-3 arguments, where the first is whether or not this is for QAT, and the
-remaining arguments refer to the individual items of the fused pattern.
+The original user model contains separate conv and relu ops,
+so we first fuse them into a single conv-relu op,
+then quantize that op the same way the linear op is quantized.
+To set up fusion, define a function that takes 3 arguments:
+the first indicates whether this is for QAT, and the remaining two
+refer to the individual items of the fused pattern (here, conv and relu).

 .. code:: ipython3

     def fuse_conv2d_relu(is_qat, conv, relu):
         """Return a fused ConvReLU2d from individual conv and relu modules."""
         return torch.ao.nn.intrinsic.ConvReLU2d(conv, relu)

-4. Define the BackendConfig
+4. Define the BackendConfig
 ----------------------------

-Now we have all the necessary pieces, so we go ahead and define our
-BackendConfig. Here we use different observers (will be renamed) for
-the input and output for the linear op, so the quantization params
-passed to the two quantize ops (quant1 and quant2) will be different.
-This is commonly the case for weighted ops like linear and conv.
+Now that everything is in place, let's define the BackendConfig.
+We use different observers (to be renamed) for the input and output
+of the linear op, so the quantization parameters passed to the two
+quantize ops (quant1 and quant2) differ.
+This is common for weighted ops such as linear and conv.

-For the conv-relu op, the observation type is the same. However, we
-need two BackendPatternConfigs to support this op, one for fusion
-and one for quantization. For both conv-relu and linear, we use the
-DTypeConfig defined above.
+For the conv-relu op, the observation type is the same,
+but we need two BackendPatternConfigs to support it:
+one for fusion and one for quantization.
+For both conv-relu and linear, we use the DTypeConfig defined above.

 .. code:: ipython3
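The role of a fuser method like `fuse_conv2d_relu` can be sketched without torch: it packs two adjacent ops into one callable so later passes treat them as a single unit. A toy stand-in (not `torch.ao.nn.intrinsic.ConvReLU2d`; all names here are illustrative):

```python
class FusedOp:
    """Toy fused module: applies `op` then `activation` as one unit."""
    def __init__(self, op, activation):
        self.op = op
        self.activation = activation

    def __call__(self, x):
        return self.activation(self.op(x))

def fuse_op_relu(is_qat, op, relu):
    """Mirror of the tutorial's fuser signature: (is_qat, *pattern_items)."""
    # In a real backend, is_qat would select a QAT-aware fused module instead.
    return FusedOp(op, relu)

double = lambda x: 2 * x    # stand-in for a conv module
relu = lambda x: max(x, 0)  # stand-in for nn.ReLU
fused = fuse_op_relu(False, double, relu)
```

The fused callable behaves exactly like running the two ops in sequence, which is why fusion preserves model semantics while giving the quantizer a single pattern to lower.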
@@ -141,35 +137,32 @@ DTypeConfig defined above.
         .set_backend_pattern_config(conv_relu_config) \
         .set_backend_pattern_config(fused_conv_relu_config)

-5. Set up QConfigMapping that satisfies the backend constraints
+5. Set up a QConfigMapping that satisfies the backend constraints
 ----------------------------------------------------------------

-In order to use the ops defined above, the user must define a QConfig
-that satisfies the constraints specified in the DTypeConfig. For more
-detail, see the documentation for `DTypeConfig <https://pytorch.org/docs/stable/generated/torch.ao.quantization.backend_config.DTypeConfig.html>`__.
-We will then use this QConfig for all the modules used in the patterns
-we wish to quantize.
+To use the ops defined above, the user must define a QConfig that
+satisfies the constraints specified in the DTypeConfig. For details,
+see `DTypeConfig <https://pytorch.org/docs/stable/generated/torch.ao.quantization.backend_config.DTypeConfig.html>`__.
+We then use this QConfig for all the modules used in the patterns
+we wish to quantize.

 .. code:: ipython3

-    # Note: Here we use a quant_max of 127, but this could be up to 255 (see `quint8_with_constraints`)
+    # Note: quant_max is 127 here, but it could be as high as 255 (see `quint8_with_constraints`)
     activation_observer = MinMaxObserver.with_args(quant_min=0, quant_max=127, eps=2 ** -12)
     qconfig = QConfig(activation=activation_observer, weight=default_weight_observer)

-    # Note: All individual items of a fused pattern, e.g. Conv2d and ReLU in
-    # (Conv2d, ReLU), must have the same QConfig
+    # Note: all individual items of a fused pattern, e.g. Conv2d and ReLU in
+    # (Conv2d, ReLU), must have the same QConfig
     qconfig_mapping = QConfigMapping() \
         .set_object_type(torch.nn.Linear, qconfig) \
         .set_object_type(torch.nn.Conv2d, qconfig) \
         .set_object_type(torch.nn.BatchNorm2d, qconfig) \
         .set_object_type(torch.nn.ReLU, qconfig)

-6. Quantize the model through prepare and convert
+6. Quantize the model through prepare and convert
 --------------------------------------------------

-Finally, we quantize the model by passing the BackendConfig we defined
-into prepare and convert. This produces a quantized linear module and
-a fused quantized conv-relu module.
+Finally, we quantize the model by passing the BackendConfig we defined
+through prepare and convert. This produces a quantized linear module
+and a fused quantized conv-relu module.

 .. code:: ipython3
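The "satisfies the backend constraints" requirement amounts to a range check: the observer's quant_min/quant_max must fall inside the bounds the backend declared for that dtype. A minimal torch-free sketch of that check (illustrative names, not the torch.ao implementation):

```python
from dataclasses import dataclass

@dataclass
class DtypeBounds:
    """Backend-declared bounds for a quantized dtype, like DTypeWithConstraints."""
    quant_min_lower_bound: int
    quant_max_upper_bound: int

def qconfig_satisfies(quant_min, quant_max, bounds):
    """A QConfig is usable only if its observer range fits the backend's bounds."""
    return (bounds.quant_min_lower_bound <= quant_min
            and quant_max <= bounds.quant_max_upper_bound)

# quint8 spans [0, 255], so the tutorial's observer range [0, 127] is valid.
quint8_bounds = DtypeBounds(0, 255)
ok = qconfig_satisfies(0, 127, quint8_bounds)
```

When this check fails, FX graph mode quantization silently ignores the QConfig for that pattern, which is exactly what the faulty-setup experiment below demonstrates.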
@@ -218,16 +211,16 @@ a fused quantized conv-relu module.
         sigmoid = self.sigmoid(dequantize_2); dequantize_2 = None
         return sigmoid

-(7. Experiment with faulty BackendConfig setups)
+(7. Experiment with faulty BackendConfig setups)
 -------------------------------------------------

-As an experiment, here we modify the model to use conv-bn-relu
-instead of conv-relu, but use the same BackendConfig, which doesn't
-know how to quantize conv-bn-relu. As a result, only linear is
-quantized, but conv-bn-relu is neither fused nor quantized.
+As an experiment, we modify the model to use conv-bn-relu instead of
+conv-relu, while keeping the same BackendConfig, which does not know
+how to quantize conv-bn-relu. As a result, only linear is quantized,
+and conv-bn-relu is neither fused nor quantized.

 .. code:: ipython3

-    # Only linear is quantized, since there's no rule for fusing conv-bn-relu
+    # Only linear is quantized, since there is no rule for fusing conv-bn-relu
     example_inputs = (torch.rand(1, 3, 10, 10, dtype=torch.float),)
     model = MyModel(use_bn=True)
     prepared = prepare_fx(model, qconfig_mapping, example_inputs, backend_config=backend_config)
@@ -258,9 +251,8 @@ quantized, but conv-bn-relu is neither fused nor quantized.
         sigmoid = self.sigmoid(relu); relu = None
         return sigmoid

-As another experiment, here we use the default QConfigMapping that
-doesn't satisfy the dtype constraints specified in the backend. As
-a result, nothing is quantized since the QConfigs are simply ignored.
+As another experiment, we use the default QConfigMapping, which does
+not satisfy the dtype constraints specified in the backend. As a
+result, the QConfigs are simply ignored and nothing is quantized.

 .. code:: ipython3

     # Nothing is quantized or fused, since backend constraints are not satisfied
@@ -291,36 +283,33 @@ a result, nothing is quantized since the QConfigs are simply ignored.
         return sigmoid

-Built-in BackendConfigs
+Built-in BackendConfigs
 -----------------------

-PyTorch quantization supports a few built-in native BackendConfigs under
-the ``torch.ao.quantization.backend_config`` namespace:
+PyTorch quantization supports several built-in BackendConfigs under
+the ``torch.ao.quantization.backend_config`` namespace:

 - `get_fbgemm_backend_config <https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/backend_config/fbgemm.py>`__:
-  for server target settings
+  a BackendConfig for server targets
 - `get_qnnpack_backend_config <https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/backend_config/qnnpack.py>`__:
-  for mobile and edge device target settings, also supports XNNPACK
-  quantized ops
+  a BackendConfig for mobile and edge device targets; also supports
+  XNNPACK quantized ops
 - `get_native_backend_config <https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/backend_config/native.py>`__
-  (default): a BackendConfig that supports a union of the operator
-  patterns supported in the FBGEMM and QNNPACK BackendConfigs
+  (default): a BackendConfig that supports the operator patterns
+  provided by the FBGEMM and QNNPACK BackendConfigs

-There are also other BackendConfigs under development (e.g. for
-TensorRT and x86), but these are still mostly experimental at the
-moment. If the user wishes to integrate a new, custom backend with
-PyTorch's quantization API, they may define their own BackendConfigs
-using the same set of APIs used to define the natively supported
-ones as in the example above.
+Other BackendConfigs (e.g. for TensorRT and x86) are under development,
+but they are still largely experimental.
+To integrate a new custom backend with PyTorch's quantization API,
+you can define your own BackendConfig using the same APIs
+shown in the example above.

-Further Reading
+Further Reading
 ---------------

-How BackendConfig is used in FX graph mode quantization:
+How BackendConfig is used in FX graph mode quantization:
 https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/fx/README.md

-Motivation and implementation details behind BackendConfig:
+Motivation and implementation details behind BackendConfig:
 https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/backend_config/README.md

-Early design of BackendConfig:
+Early design of BackendConfig:
 https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md