
Commit 2c6c5c2

Add files via upload
1 parent ce1bf9a commit 2c6c5c2

10 files changed

+3537
-0
lines changed

00-Overview.ipynb

+78
@@ -0,0 +1,78 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {
6+
"collapsed": true
7+
},
8+
"source": [
9+
"Clustering Analysis\n",
10+
"=====================\n",
11+
"\n",
12+
"### Building Models with Distance Metrics\n",
13+
"\n",
14+
"This chapter will cover the following topics:\n",
15+
"* Using KMeans to cluster data\n",
16+
"* Optimizing the number of centroids\n",
17+
"* Assessing cluster correctness\n",
18+
"* Using MiniBatch KMeans to handle more data\n",
19+
"* Quantizing an image with KMeans clustering\n",
20+
"* Finding the closest objects in the feature space\n",
21+
"* Probabilistic clustering with Gaussian Mixture Models\n",
22+
"* Using KMeans for outlier detection\n",
23+
"* Using k-NN for regression\n",
24+
"\n",
25+
"### Introduction\n",
26+
"* Clustering is often grouped together with unsupervised techniques.\n",
27+
"These techniques assume that we do not know the outcome variable.\n",
28+
"\n",
29+
"* This leads to ambiguity in outcomes and objectives in practice, but\n",
30+
"clustering can nevertheless be useful. As we'll see, we can use\n",
31+
"clustering to \"localize\" our estimates in a supervised setting. This is\n",
32+
"perhaps why clustering is so effective: it can handle a wide range of\n",
33+
"situations, and often the results are, for lack of a better term,\n",
34+
"\"sane\". We'll walk through a wide variety of applications in this\n",
35+
"chapter, from image processing to regression and outlier detection.\n",
36+
"\n",
37+
"* Through these applications, we'll see that clustering can often be\n",
38+
"viewed through a probabilistic or\n",
39+
"optimization lens. Different\n",
40+
"interpretations lead to various\n",
41+
"trade-offs. We'll walk through\n",
42+
"how to fit\n",
43+
"the models here so that you have the tools to try out many models\n",
44+
"when faced with a clustering problem.\n"
45+
]
46+
},
47+
{
48+
"cell_type": "code",
49+
"execution_count": null,
50+
"metadata": {
51+
"collapsed": true
52+
},
53+
"outputs": [],
54+
"source": []
55+
}
56+
],
57+
"metadata": {
58+
"kernelspec": {
59+
"display_name": "Python 3",
60+
"language": "python",
61+
"name": "python3"
62+
},
63+
"language_info": {
64+
"codemirror_mode": {
65+
"name": "ipython",
66+
"version": 3
67+
},
68+
"file_extension": ".py",
69+
"mimetype": "text/x-python",
70+
"name": "python",
71+
"nbconvert_exporter": "python",
72+
"pygments_lexer": "ipython3",
73+
"version": "3.5.1"
74+
}
75+
},
76+
"nbformat": 4,
77+
"nbformat_minor": 2
78+
}

AssessingClusterCorrectness.ipynb

+440
Large diffs are not rendered by default.
