PyDataWorkshop
diff --git a/‎00-Overview.ipynb
+78 b/‎00-Overview.ipynb
+78
diff --git a/‎AssessingClusterCorrectness.ipynb
+440 b/‎AssessingClusterCorrectness.ipynb
+440
@@ -0,0 +1,78 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": true
+   },
+   "source": [
+    "Clustering Analysis\n",
+    "=====================\n",
+    "\n",
+    "### Building Models with Distance Metrics\n",
+    "\n",
+    "This chapter will cover the following topics:\n",
+    "* Using KMeans to cluster data\n",
+    "* Optimizing the number of centroids\n",
+    "* Assessing cluster correctness\n",
+    "* Using MiniBatch KMeans to handle more data\n",
+    "* Quantizing an image with KMeans clustering\n",
+    "* Finding the closest objects in the feature space\n",
+    "* Probabilistic clustering with Gaussian Mixture Models\n",
+    "* Using KMeans for outlier detection\n",
+    "* Using k-NN for regression\n",
+    "\n",
+    "### Introduction\n",
+    "* Clustering is often grouped together with unsupervised techniques.\n",
+    "These techniques assume that we do not know the outcome variable.\n",
+    "\n",
+    "* This leads to ambiguity in outcomes and objectives in practice, but\n",
+    "clustering can nevertheless be useful. As we'll see, we can use clustering\n",
+    "to \"localize\" our estimates in a supervised setting. This is\n",
+    "perhaps why clustering is so effective: it can handle a wide range of\n",
+    "situations, and the results are often, for lack of a better term,\n",
+    "\"sane\". We'll walk through a wide variety of applications in this\n",
+    "chapter, from image processing to regression and outlier detection.\n",
+    "\n",
+    "* Through these applications, we'll see that clustering can often be\n",
+    "viewed through a probabilistic or optimization lens.\n",
+    "Different interpretations\n",
+    "lead to various trade-offs.\n",
+    "We'll walk through how to fit\n",
+    "the models here\n",
+    "so that you have the tools to try out many models\n",
+    "when faced with a clustering problem.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}