Commit f98e7b4

Merge pull request #8 from devitocodes/examples
Make the Joey documentation and add 'expected' to backward()
2 parents b1efd8e + ce7e9c6 commit f98e7b4

16 files changed, +821 -1446 lines changed

README.md (+7 -1)
@@ -44,4 +44,10 @@ Done! You can now use Joey in your environment. If you want to make changes to t
4444
Joey is not available on PyPI yet.
4545

4646
## How to use
47-
The documentation is currently under construction. In the meantime, you can have a look at examples in `examples`.
47+
To start working with Joey, import the following packages:
48+
```
49+
import joey
50+
import joey.activation # If you want to use activation in neural network layers
51+
```
52+
53+
Afterwards, you are free to use all functions Joey offers. The recommended way to get started is to go through the examples in the `examples` directory of this repository and to read the docstrings (`__doc__`) provided for every Joey class and every public or abstract method.
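
The docstring lookup mentioned in the README text can be sketched as follows. This is a minimal illustration, not part of Joey: `describe` and `Example` are hypothetical names, and a stand-in class is used so the snippet runs without Joey installed. With Joey available, the same helper could presumably be pointed at a Joey class such as `joey.Net`.

```python
import inspect


def describe(obj):
    """Return the cleaned-up docstring of obj, or a placeholder if none."""
    return inspect.getdoc(obj) or "<no docstring>"


class Example:
    """A stand-in class used only for this illustration."""

    def run(self):
        """Do nothing; present so there is a method docstring to read."""


# Print the class docstring, then the method docstring.
print(describe(Example))
print(describe(Example.run))
```

`inspect.getdoc` is used instead of reading `__doc__` directly because it also strips indentation from multi-line docstrings; `help(obj)` in an interactive session shows the same text.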

examples/README.md (+7)
@@ -0,0 +1,7 @@
1+
## Joey examples
2+
In this directory, you can find Jupyter notebooks with step-by-step, explained examples of using Joey. The recommended order for going through them is:
3+
1. `lenet_forward_pass.ipynb`
4+
2. `lenet_backward_pass.ipynb`
5+
3. `lenet_training.ipynb`
6+
7+
Enjoy!

examples/lenet_backward_pass.ipynb (+137 -46)
@@ -1,5 +1,28 @@
11
{
22
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Running a backward pass through LeNet using MNIST and Joey"
8+
]
9+
},
10+
{
11+
"cell_type": "markdown",
12+
"metadata": {},
13+
"source": [
14+
"In this notebook, we will construct LeNet using Joey and run a backward pass through it with some training data from MNIST.\n",
15+
"\n",
16+
"The aim of a backward pass is to calculate the gradients of all network parameters, which a PyTorch optimizer later needs for weight updates. A backward pass follows a forward pass."
17+
]
18+
},
19+
{
20+
"cell_type": "markdown",
21+
"metadata": {},
22+
"source": [
23+
"First, let's import the prerequisites:"
24+
]
25+
},
326
{
427
"cell_type": "code",
528
"execution_count": 1,
@@ -11,7 +34,17 @@
1134
"import torchvision.transforms as transforms\n",
1235
"import joey as ml\n",
1336
"import matplotlib.pyplot as plt\n",
14-
"import numpy as np"
37+
"import numpy as np\n",
38+
"import torch.nn as nn\n",
39+
"import torch.nn.functional as F\n",
40+
"import torch.optim as optim"
41+
]
42+
},
43+
{
44+
"cell_type": "markdown",
45+
"metadata": {},
46+
"source": [
47+
"Then, let's define `imshow()`, which lets us look at the training data we'll use for the backward pass."
1548
]
1649
},
1750
{
@@ -27,6 +60,13 @@
2760
" plt.show()"
2861
]
2962
},
63+
{
64+
"cell_type": "markdown",
65+
"metadata": {},
66+
"source": [
67+
"In this particular example, every training batch will have 4 images."
68+
]
69+
},
3070
{
3171
"cell_type": "code",
3272
"execution_count": 3,
@@ -36,6 +76,13 @@
3676
"batch_size = 4"
3777
]
3878
},
79+
{
80+
"cell_type": "markdown",
81+
"metadata": {},
82+
"source": [
83+
"Once we have `imshow()` and `batch_size` defined, we'll download the MNIST images using PyTorch."
84+
]
85+
},
3986
{
4087
"cell_type": "code",
4188
"execution_count": 4,
@@ -53,6 +100,13 @@
53100
"dataiter = iter(trainloader)"
54101
]
55102
},
103+
{
104+
"cell_type": "markdown",
105+
"metadata": {},
106+
"source": [
107+
"In our case, only one batch will be used for the backward pass. Joey accepts only NumPy arrays, so we have to convert PyTorch tensors to their NumPy equivalents first."
108+
]
109+
},
56110
{
57111
"cell_type": "code",
58112
"execution_count": 5,
@@ -63,6 +117,13 @@
63117
"input_data = images.numpy()"
64118
]
65119
},
120+
{
121+
"cell_type": "markdown",
122+
"metadata": {},
123+
"source": [
124+
"For reference, let's have a look at our training data. There are 4 images corresponding to the following digits: 5, 0, 4, 1."
125+
]
126+
},
66127
{
67128
"cell_type": "code",
68129
"execution_count": 6,
@@ -85,6 +146,20 @@
85146
"imshow(torchvision.utils.make_grid(images))"
86147
]
87148
},
149+
{
150+
"cell_type": "markdown",
151+
"metadata": {},
152+
"source": [
153+
"At this point, we're ready to define `backward_pass()`, which runs the backward pass through a Joey-constructed LeNet. We'll do so using the `Conv`, `MaxPooling`, `Flat`, `FullyConnected` and `FullyConnectedSoftmax` layer classes along with the `Net` class, which packs everything into one network we can interact with."
154+
]
155+
},
156+
{
157+
"cell_type": "markdown",
158+
"metadata": {},
159+
"source": [
160+
"Note that a loss function has to be defined manually. Joey doesn't provide any built-in options here at the moment."
161+
]
162+
},
88163
{
89164
"cell_type": "code",
90165
"execution_count": 7,
@@ -95,64 +170,66 @@
95170
" # Six 3x3 filters, activation RELU\n",
96171
" layer1 = ml.Conv(kernel_size=(6, 3, 3),\n",
97172
" input_size=(batch_size, 1, 32, 32),\n",
98-
" activation=ml.activation.ReLU(),\n",
99-
" generate_code=False)\n",
173+
" activation=ml.activation.ReLU())\n",
100174
" # Max 2x2 subsampling\n",
101175
" layer2 = ml.MaxPooling(kernel_size=(2, 2),\n",
102176
" input_size=(batch_size, 6, 30, 30),\n",
103-
" stride=(2, 2),\n",
104-
" generate_code=False)\n",
177+
" stride=(2, 2))\n",
105178
" # Sixteen 3x3 filters, activation RELU\n",
106179
" layer3 = ml.Conv(kernel_size=(16, 3, 3),\n",
107180
" input_size=(batch_size, 6, 15, 15),\n",
108-
" activation=ml.activation.ReLU(),\n",
109-
" generate_code=False)\n",
181+
" activation=ml.activation.ReLU())\n",
110182
" # Max 2x2 subsampling\n",
111183
" layer4 = ml.MaxPooling(kernel_size=(2, 2),\n",
112184
" input_size=(batch_size, 16, 13, 13),\n",
113185
" stride=(2, 2),\n",
114-
" strict_stride_check=False,\n",
115-
" generate_code=False)\n",
186+
" strict_stride_check=False)\n",
116187
" # Full connection (16 * 6 * 6 -> 120), activation RELU\n",
117188
" layer5 = ml.FullyConnected(weight_size=(120, 576),\n",
118189
" input_size=(576, batch_size),\n",
119-
" activation=ml.activation.ReLU(),\n",
120-
" generate_code=False)\n",
190+
" activation=ml.activation.ReLU())\n",
121191
" # Full connection (120 -> 84), activation RELU\n",
122192
" layer6 = ml.FullyConnected(weight_size=(84, 120),\n",
123193
" input_size=(120, batch_size),\n",
124-
" activation=ml.activation.ReLU(),\n",
125-
" generate_code=False)\n",
194+
" activation=ml.activation.ReLU())\n",
126195
" # Full connection (84 -> 10), output layer\n",
127196
" layer7 = ml.FullyConnectedSoftmax(weight_size=(10, 84),\n",
128-
" input_size=(84, batch_size),\n",
129-
" generate_code=False)\n",
197+
" input_size=(84, batch_size))\n",
130198
" # Flattening layer necessary between layer 4 and 5\n",
131-
" layer_flat = ml.Flat(input_size=(batch_size, 16, 6, 6),\n",
132-
" generate_code=False)\n",
199+
" layer_flat = ml.Flat(input_size=(batch_size, 16, 6, 6))\n",
133200
" \n",
134201
" layers = [layer1, layer2, layer3, layer4,\n",
135202
" layer_flat, layer5, layer6, layer7]\n",
136203
" \n",
137204
" net = ml.Net(layers)\n",
138205
" outputs = net.forward(input_data)\n",
139206
" \n",
140-
" def loss_grad(layer, b):\n",
207+
" def loss_grad(layer, expected):\n",
141208
" gradients = []\n",
142209
" \n",
143-
" for i in range(10):\n",
144-
" result = layer.result.data[i, b]\n",
145-
" if i == expected_results[b]:\n",
146-
" result -= 1\n",
147-
" gradients.append(result)\n",
210+
" for b in range(batch_size):\n",
211+
" row = []\n",
212+
" for i in range(10):\n",
213+
" result = layer.result.data[i, b]\n",
214+
" if i == expected[b]:\n",
215+
" result -= 1\n",
216+
" row.append(result)\n",
217+
" gradients.append(row)\n",
148218
" \n",
149219
" return gradients\n",
150220
" \n",
151-
" net.backward(loss_grad)\n",
221+
" net.backward(expected_results, loss_grad)\n",
152222
" \n",
153223
" return (layer1, layer2, layer3, layer4, layer_flat, layer5, layer6, layer7)"
154224
]
155225
},
226+
{
227+
"cell_type": "markdown",
228+
"metadata": {},
229+
"source": [
230+
"Afterwards, we're ready to run the backward pass."
231+
]
232+
},
156233
{
157234
"cell_type": "code",
158235
"execution_count": 8,
@@ -167,9 +244,6 @@
167244
"/home/maksymilian/Desktop/UROP/devito/devito/types/grid.py:206: RuntimeWarning: divide by zero encountered in true_divide\n",
168245
" spacing = (np.array(self.extent) / (np.array(self.shape) - 1)).astype(self.dtype)\n",
169246
"Operator `Kernel` run in 0.01 s\n",
170-
"Operator `Kernel` run in 0.01 s\n",
171-
"Operator `Kernel` run in 0.01 s\n",
172-
"Operator `Kernel` run in 0.01 s\n",
173247
"Operator `Kernel` run in 0.01 s\n"
174248
]
175249
}
@@ -182,23 +256,26 @@
182256
"cell_type": "markdown",
183257
"metadata": {},
184258
"source": [
185-
"PyTorch:"
259+
"Results are stored in the `kernel_gradients` and `bias_gradients` properties of each layer (where applicable)."
186260
]
187261
},
188262
{
189-
"cell_type": "code",
190-
"execution_count": 9,
263+
"cell_type": "markdown",
191264
"metadata": {},
192-
"outputs": [],
193265
"source": [
194-
"import torch.nn as nn\n",
195-
"import torch.nn.functional as F\n",
196-
"import torch.optim as optim"
266+
"In order to check the numerical correctness, we'll create the same network with PyTorch, run a backward pass through it using the same initial weights and data and compare the results with Joey's."
267+
]
268+
},
269+
{
270+
"cell_type": "markdown",
271+
"metadata": {},
272+
"source": [
273+
"Here's the PyTorch code:"
197274
]
198275
},
199276
{
200277
"cell_type": "code",
201-
"execution_count": 10,
278+
"execution_count": 9,
202279
"metadata": {},
203280
"outputs": [],
204281
"source": [
@@ -230,7 +307,7 @@
230307
},
231308
{
232309
"cell_type": "code",
233-
"execution_count": 11,
310+
"execution_count": 10,
234311
"metadata": {},
235312
"outputs": [],
236313
"source": [
@@ -252,7 +329,7 @@
252329
},
253330
{
254331
"cell_type": "code",
255-
"execution_count": 12,
332+
"execution_count": 11,
256333
"metadata": {},
257334
"outputs": [],
258335
"source": [
@@ -263,31 +340,38 @@
263340
"loss.backward()"
264341
]
265342
},
343+
{
344+
"cell_type": "markdown",
345+
"metadata": {},
346+
"source": [
347+
"After running the backward pass in PyTorch, we're ready to make comparisons. Let's calculate relative errors between Joey and PyTorch in terms of weight/bias gradients."
348+
]
349+
},
266350
{
267351
"cell_type": "code",
268-
"execution_count": 13,
352+
"execution_count": 12,
269353
"metadata": {},
270354
"outputs": [
271355
{
272356
"name": "stdout",
273357
"output_type": "stream",
274358
"text": [
275-
"layers[0] maximum relative error: 1.599673499123359e-14\n",
276-
"layers[1] maximum relative error: 5.710234136667345e-12\n",
277-
"layers[2] maximum relative error: 1.9638017195468526e-11\n",
278-
"layers[3] maximum relative error: 1.8676488586249282e-11\n",
279-
"layers[4] maximum relative error: 3.4692340371450744e-13\n",
359+
"layers[0] maximum relative error: 1.4935025269750558e-14\n",
360+
"layers[1] maximum relative error: 1.0457210947850931e-13\n",
361+
"layers[2] maximum relative error: 3.0920027811804816e-12\n",
362+
"layers[3] maximum relative error: 2.615895862310905e-13\n",
363+
"layers[4] maximum relative error: 1.4951643318957554e-12\n",
280364
"\n",
281-
"Maximum relative error is in layers[2]: 1.9638017195468526e-11\n"
365+
"Maximum relative error is in layers[2]: 3.0920027811804816e-12\n"
282366
]
283367
},
284368
{
285369
"name": "stderr",
286370
"output_type": "stream",
287371
"text": [
288-
"<ipython-input-13-c5fd7a032cbe>:11: RuntimeWarning: invalid value encountered in true_divide\n",
372+
"<ipython-input-12-c5fd7a032cbe>:11: RuntimeWarning: invalid value encountered in true_divide\n",
289373
" kernel_error = abs(kernel_grad - pytorch_kernel_grad) / abs(pytorch_kernel_grad)\n",
290-
"<ipython-input-13-c5fd7a032cbe>:16: RuntimeWarning: invalid value encountered in true_divide\n",
374+
"<ipython-input-12-c5fd7a032cbe>:16: RuntimeWarning: invalid value encountered in true_divide\n",
291375
" bias_error = abs(bias_grad - pytorch_bias_grad) / abs(pytorch_bias_grad)\n"
292376
]
293377
}
@@ -320,6 +404,13 @@
320404
"print()\n",
321405
"print('Maximum relative error is in layers[' + str(index) + ']: ' + str(max_error))"
322406
]
407+
},
408+
{
409+
"cell_type": "markdown",
410+
"metadata": {},
411+
"source": [
412+
"As we can see, the maximum relative error (about 3.1e-12) is low enough, given floating-point accuracy and the complexity of our network, for Joey's results to be considered correct."
413+
]
323414
}
324415
],
325416
"metadata": {

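Two numerical steps in the notebook above, the loss gradient passed to `net.backward()` and the relative-error comparison against PyTorch, can be sketched in plain NumPy. This is an illustrative sketch under stated assumptions, not Joey's or the notebook's actual code: both function names are hypothetical, the softmax output is assumed to be laid out as `(num_classes, batch_size)` (matching `layer.result.data[i, b]` in `loss_grad`), and entries whose reference value is zero are skipped, which is one possible way to avoid the `invalid value encountered in true_divide` warning seen in the cell output.

```python
import numpy as np


def softmax_cross_entropy_grad(probs, expected):
    """Per-sample gradient of cross-entropy w.r.t. the pre-softmax scores.

    probs    -- softmax outputs, shape (num_classes, batch_size) (assumed layout)
    expected -- integer class labels, shape (batch_size,)
    Returns an array of shape (batch_size, num_classes): softmax minus one-hot.
    """
    grad = np.array(probs, dtype=float).T  # copy; one row per batch item
    grad[np.arange(grad.shape[0]), expected] -= 1.0
    return grad


def max_relative_error(actual, reference):
    """Largest |actual - reference| / |reference|, ignoring zero references."""
    actual = np.asarray(actual, dtype=float)
    reference = np.asarray(reference, dtype=float)
    nonzero = reference != 0  # assumption: zero references are simply skipped
    if not nonzero.any():
        return 0.0
    diff = np.abs(actual[nonzero] - reference[nonzero])
    return float(np.max(diff / np.abs(reference[nonzero])))


# Small usage example: 3 classes, batch of 2, labels 0 and 2.
probs = np.array([[0.7, 0.2],
                  [0.2, 0.5],
                  [0.1, 0.3]])
grad = softmax_cross_entropy_grad(probs, np.array([0, 2]))
print(grad)
print(max_relative_error([1.0, 0.0, 2.2], [1.0, 0.0, 2.0]))
```

The gradient here is not averaged over the batch, mirroring the nested loop in `loss_grad`; whether averaging is applied elsewhere is not shown in the diff.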