(prototype) PyTorch BackendConfig Tutorial
==========================================
**Author**: `Andrew Or <https://github.com/andrewor14>`_
**Translator**: `jason9865 <https://github.com/jason9865>`_

The BackendConfig API enables developers to integrate their backends
with PyTorch quantization. It is currently only supported in FX graph
mode quantization, but support may be extended to other modes of
quantization in the future. In this tutorial, we will demonstrate how to
use this API to customize quantization support for specific backends.
For more information on the motivation and implementation details behind
BackendConfig, please refer to this
`README <https://github.com/pytorch/pytorch/tree/master/torch/ao/quantization/backend_config>`__.

Suppose we are a backend developer and we wish to integrate our backend
with PyTorch's quantization APIs. Our backend consists of two ops only:
quantized linear and quantized conv-relu. In this section, we will walk
through how to achieve this by quantizing an example model using a custom
BackendConfig through ``prepare_fx`` and ``convert_fx``.

.. code:: ipython3

    import torch
    from torch.ao.quantization import (
        default_weight_observer,
        MinMaxObserver,
        QConfig,
        QConfigMapping,
    )
    from torch.ao.quantization.backend_config import (
        BackendConfig,
        BackendPatternConfig,
        DTypeConfig,
        DTypeWithConstraints,
        ObservationType,
    )
    from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

1. Derive reference pattern for each quantized operator
--------------------------------------------------------

For quantized linear, suppose our backend expects the reference pattern
``[dequant - fp32_linear - quant]`` and lowers it into a single quantized
linear op. The way to achieve this is to first insert quant-dequant ops
before and after the float linear op, such that we produce the following
reference model::

    quant1 - [dequant1 - fp32_linear - quant2] - dequant2

Similarly, for quantized conv-relu, we wish to produce the following
reference model, where the reference pattern in the square brackets will
be lowered into a single quantized conv-relu op::

    quant1 - [dequant1 - fp32_conv_relu - quant2] - dequant2

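To make the reference pattern concrete, the linear case can be traced by hand in eager mode. This is an illustrative sketch only; the scale and zero_point values below are arbitrary choices for demonstration, not values prescribed by any backend:

```python
import torch

# Hand-executed linear reference pattern:
# quant1 - [dequant1 - fp32_linear - quant2] - dequant2
linear = torch.nn.Linear(4, 4)
x = torch.randn(2, 4)

q1 = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)  # quant1
dq1 = q1.dequantize()                                                           # dequant1
y = linear(dq1)                                                                 # fp32_linear
q2 = torch.quantize_per_tensor(y, scale=0.1, zero_point=0, dtype=torch.quint8)  # quant2
out = q2.dequantize()                                                           # dequant2

assert out.shape == (2, 4) and out.dtype == torch.float32
```

During lowering, the backend replaces the bracketed ``[dequant1 - fp32_linear - quant2]`` subgraph with its own quantized linear kernel, leaving quant1 and dequant2 at the boundaries.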
2. Set DTypeConfigs with backend constraints
---------------------------------------------

In the reference patterns above, the input dtype specified in the
DTypeConfig will be passed as the dtype argument to quant1, while the
output dtype will be passed as the dtype argument to quant2. If the output
dtype is fp32, as in the case of dynamic quantization, then the output
quant-dequant pair will not be inserted. This example also shows how to
specify restrictions on quantization and scale ranges on a particular dtype.

.. code:: ipython3

    # The constraint values below (bounds of [0, 255] and a minimum scale of
    # 2 ** -12) match the observer settings used later in this tutorial
    quint8_with_constraints = DTypeWithConstraints(
        dtype=torch.quint8,
        quant_min_lower_bound=0,
        quant_max_upper_bound=255,
        scale_min_lower_bound=2 ** -12,
    )

    dtype_config = DTypeConfig(
        input_dtype=quint8_with_constraints,
        output_dtype=quint8_with_constraints,
        weight_dtype=torch.qint8,
        bias_dtype=torch.float)

3. Set up fusion for conv-relu
-------------------------------

Note that the original user model contains separate conv and relu ops,
so we need to first fuse the conv and relu ops into a single conv-relu
op (``fp32_conv_relu``), and then quantize this op similar to how the linear
op is quantized. We can set up fusion by defining a function that accepts
3 arguments, where the first is whether or not this is for QAT, and the
remaining arguments refer to the individual items of the fused pattern.

.. code:: ipython3

    def fuse_conv2d_relu(is_qat, conv, relu):
        """Return a fused ConvReLU2d from individual conv and relu modules."""
        return torch.ao.nn.intrinsic.ConvReLU2d(conv, relu)

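The fuser above can be sanity-checked directly in eager mode. The snippet below redefines ``fuse_conv2d_relu`` so it runs standalone; the conv shapes are arbitrary:

```python
import torch

def fuse_conv2d_relu(is_qat, conv, relu):
    """Return a fused ConvReLU2d from individual conv and relu modules."""
    return torch.ao.nn.intrinsic.ConvReLU2d(conv, relu)

conv = torch.nn.Conv2d(3, 8, kernel_size=3)
relu = torch.nn.ReLU()
fused = fuse_conv2d_relu(is_qat=False, conv=conv, relu=relu)

x = torch.randn(1, 3, 10, 10)
# ConvReLU2d applies conv then relu, so it matches the unfused modules
assert torch.allclose(fused(x), relu(conv(x)))
```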
4. Define the BackendConfig
----------------------------

Now we have all the necessary pieces, so we go ahead and define our
BackendConfig. Here we use different observers (will be renamed) for
the input and output for the linear op, so the quantization params
passed to the two quantize ops (quant1 and quant2) will be different.
This is commonly the case for weighted ops like linear and conv.

For the conv-relu op, the observation type is the same. However, we
need two BackendPatternConfigs to support this op, one for fusion
and one for quantization. For both conv-relu and linear, we use the
DTypeConfig defined above.

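The full config definitions are elided from this excerpt, but a linear pattern along the lines described above might be declared as follows. This is a hedged sketch: the observation type matches the prose, while the root/QAT/reference module choices mirror PyTorch's native configs and may differ from the original cell:

```python
import torch
import torch.ao.nn.qat
import torch.ao.nn.quantized.reference
from torch.ao.quantization.backend_config import (
    BackendPatternConfig,
    DTypeConfig,
    ObservationType,
)

dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)

# Input and output use different observers, as described above
linear_config = BackendPatternConfig(torch.nn.Linear) \
    .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) \
    .add_dtype_config(dtype_config) \
    .set_root_module(torch.nn.Linear) \
    .set_qat_module(torch.ao.nn.qat.Linear) \
    .set_reference_quantized_module(torch.ao.nn.quantized.reference.Linear)

assert linear_config.pattern is torch.nn.Linear
```

The conv-relu pattern gets two such configs: one whose fuser method is the ``fuse_conv2d_relu`` function above, and one that quantizes the resulting fused module.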
.. code:: ipython3

    # Combine the BackendPatternConfigs defined above into a single
    # BackendConfig; the "my_backend" label is just a name for this backend
    backend_config = BackendConfig("my_backend") \
        .set_backend_pattern_config(linear_config) \
        .set_backend_pattern_config(conv_relu_config) \
        .set_backend_pattern_config(fused_conv_relu_config)

5. Set up QConfigMapping that satisfies the backend constraints
----------------------------------------------------------------

In order to use the ops defined above, the user must define a QConfig
that satisfies the constraints specified in the DTypeConfig. For more
detail, see the documentation for `DTypeConfig <https://pytorch.org/docs/stable/generated/torch.ao.quantization.backend_config.DTypeConfig.html>`__.
We will then use this QConfig for all the modules used in the patterns
we wish to quantize.

.. code:: ipython3

    # Note: Here we use a quant_max of 127, but this could be up to 255 (see `quint8_with_constraints`)
    activation_observer = MinMaxObserver.with_args(quant_min=0, quant_max=127, eps=2 ** -12)
    qconfig = QConfig(activation=activation_observer, weight=default_weight_observer)

    # Note: All individual items of a fused pattern, e.g. Conv2d and ReLU in
    # (Conv2d, ReLU), must have the same QConfig
    qconfig_mapping = QConfigMapping() \
        .set_object_type(torch.nn.Linear, qconfig) \
        .set_object_type(torch.nn.Conv2d, qconfig) \
        .set_object_type(torch.nn.BatchNorm2d, qconfig) \
        .set_object_type(torch.nn.ReLU, qconfig)

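As a standalone illustration of how these observer arguments relate to the backend constraints from step 2, the observer can be exercised by itself; the input tensor below is arbitrary:

```python
import torch
from torch.ao.quantization import MinMaxObserver

# Same observer settings as the QConfig above
observer = MinMaxObserver.with_args(quant_min=0, quant_max=127, eps=2 ** -12)()
observer(torch.randn(16, 16))  # record min/max statistics
scale, zero_point = observer.calculate_qparams()

# quant_min/quant_max stay inside the backend's [0, 255] bounds, and the
# computed scale is clamped to at least eps = 2 ** -12
assert 0 <= observer.quant_min and observer.quant_max <= 255
assert scale.item() >= 2 ** -12
```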
6. Quantize the model through prepare and convert
--------------------------------------------------

Finally, we quantize the model by passing the BackendConfig we defined
into prepare and convert. This produces a quantized linear module and
a fused quantized conv-relu module.

.. code:: ipython3

    # `MyModel` is the example model defined earlier (its definition is
    # elided here); the use_bn flag is assumed to default the conv-relu path
    example_inputs = (torch.rand(1, 3, 10, 10, dtype=torch.float),)
    model = MyModel(use_bn=False)
    prepared = prepare_fx(model, qconfig_mapping, example_inputs, backend_config=backend_config)
    converted = convert_fx(prepared, backend_config=backend_config)
    print(converted)

    # Tail of the printed quantized model:
    #     sigmoid = self.sigmoid(dequantize_2); dequantize_2 = None
    #     return sigmoid

(7. Experiment with faulty BackendConfig setups)
-------------------------------------------------

As an experiment, here we modify the model to use conv-bn-relu
instead of conv-relu, but use the same BackendConfig, which doesn't
know how to quantize conv-bn-relu. As a result, only linear is
quantized, but conv-bn-relu is neither fused nor quantized.

.. code:: ipython3

    # Only linear is quantized, since there's no rule for fusing conv-bn-relu
    example_inputs = (torch.rand(1, 3, 10, 10, dtype=torch.float),)
    model = MyModel(use_bn=True)
    prepared = prepare_fx(model, qconfig_mapping, example_inputs, backend_config=backend_config)
    converted = convert_fx(prepared, backend_config=backend_config)
    print(converted)

    # Tail of the printed model: conv-bn-relu stays in fp32
    #     sigmoid = self.sigmoid(relu); relu = None
    #     return sigmoid

As another experiment, here we use the default QConfigMapping that
doesn't satisfy the dtype constraints specified in the backend. As
a result, nothing is quantized since the QConfigs are simply ignored.

.. code:: ipython3

    # Nothing is quantized or fused, since backend constraints are not satisfied
    # (the call below assumes the default mapping comes from
    # `get_default_qconfig_mapping`; the original cell is partially elided)
    from torch.ao.quantization import get_default_qconfig_mapping
    prepared = prepare_fx(model, get_default_qconfig_mapping(), example_inputs, backend_config=backend_config)
    converted = convert_fx(prepared, backend_config=backend_config)
    print(converted)

Built-in BackendConfigs
-----------------------

PyTorch quantization supports a few built-in native BackendConfigs under
the ``torch.ao.quantization.backend_config`` namespace:

- `get_fbgemm_backend_config <https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/backend_config/fbgemm.py>`__:
  for server target settings
- `get_qnnpack_backend_config <https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/backend_config/qnnpack.py>`__:
  for mobile and edge device target settings, also supports XNNPACK
  quantized ops
- `get_native_backend_config <https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/backend_config/native.py>`__
  (default): a BackendConfig that supports a union of the operator
  patterns supported in the FBGEMM and QNNPACK BackendConfigs

There are also other BackendConfigs under development (e.g. for
TensorRT and x86), but these are still mostly experimental at the
moment. If the user wishes to integrate a new, custom backend with
PyTorch's quantization API, they may define their own BackendConfigs
using the same set of APIs used to define the natively supported
ones as in the example above.

Further Reading
---------------

How BackendConfig is used in FX graph mode quantization:
https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/fx/README.md

Motivation and implementation details behind BackendConfig:
https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/backend_config/README.md

Early design of BackendConfig:
https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md