PyDataWorkshop
diff --git a/09-DescriptiveStatistics.md b/09-DescriptiveStatistics.md (+55 -26)
diff --git a/12-iteration.md b/12-iteration.md (+89 -72)
@@ -1,27 +1,24 @@
 
-
-```python
 Python Pandas - Descriptive Statistics
+=======================================
 
 A large number of methods compute descriptive statistics and related operations on a DataFrame. 
 Most of these are aggregations such as sum() and mean(), but some of them, such as cumsum(), produce an object of the 
 same size. Generally speaking, these methods take an axis argument, just like ndarray.{sum, std, ...}, and the axis can be 
 specified by name or by integer: "index" (axis=0, the default) or "columns" (axis=1). 
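
As a minimal sketch of the axis argument, and of the difference between aggregating and same-size methods, consider the following. The small numeric DataFrame `scores` is hypothetical and is not the chapter's data.

```python
import pandas as pd

# Hypothetical numeric-only frame, used purely for illustration.
scores = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

col_totals = scores.sum(axis=0)          # same as axis='index': one value per column
row_totals = scores.sum(axis='columns')  # same as axis=1: one value per row
running = scores.cumsum()                # cumulative sum keeps the input's shape

print(col_totals)
print(row_totals)
print(running)
```

Here `sum()` reduces each column (or row) to a single value, while `cumsum()` returns a frame with as many rows and columns as `scores`.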
 
 Let us create a DataFrame and use this object throughout this chapter for all the operations. 
-```
-
 
-```python
-Example 
-```
+### Example 
 
 
 ```python
 import pandas as pd 
 import numpy as np
 #Create a Dictionary of series 
-d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
+d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 
+     'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 
+     'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
 #Create a DataFrame 
 df = pd.DataFrame(d) 
 print( df )
@@ -53,7 +50,9 @@ Returns the sum of the values for the requested axis. By default, axis is index
 import pandas as pd 
 import numpy as np
 #Create a Dictionary of series 
-d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
+d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 
+     'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 
+     'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
 #Create a DataFrame 
 df = pd.DataFrame(d) 
 print( df.sum() ) 
@@ -98,18 +97,17 @@ print(df.sum(1))
     dtype: float64
 
 
-
-```python
-mean() 
+### ``mean()``
 Returns the average value
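
The DataFrame used in this chapter has a string column (`Name`). In recent pandas releases (2.0 and later), calling `mean()` on such a frame is expected to raise a `TypeError` unless non-numeric columns are excluded. Passing `numeric_only=True` is one way to keep the example working. The sketch below uses only the first three rows of the chapter's data, and the name `people` is an assumption made for illustration.

```python
import pandas as pd

# First three rows of the chapter's data, shortened for illustration.
people = pd.DataFrame({
    'Name': ['Tom', 'James', 'Ricky'],
    'Age': [25, 26, 25],
    'Rating': [4.23, 3.24, 3.98],
})

# Restrict the aggregation to numeric columns so 'Name' is skipped.
means = people.mean(numeric_only=True)
print(means)
```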
-```
 
 
 ```python
 import pandas as pd
 import numpy as np
 #Create a Dictionary of series 
-d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
+d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 
+     'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 
+     'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
 #Create a DataFrame 
 df = pd.DataFrame(d) 
 print(df.mean()) 
@@ -129,7 +127,9 @@ Returns the Bessel-corrected (sample) standard deviation of the numerical columns.
 import pandas as pd 
 import numpy as np
 #Create a Dictionary of series 
-d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
+d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 
+     'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 
+     'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
 #Create a DataFrame 
 df = pd.DataFrame(d) 
 print(df.std() )
@@ -152,15 +152,16 @@ Let us now understand the functions under Descriptive Statistics in Python Panda
 ```python
 
 
-The following table list down the important functions − S.No. Function Description 
+The following table lists the important functions:
+
 1. count() Number of non-null observations 
 2. sum() Sum of values 
 3. mean() Mean of Values 
 4. median() Median of Values 
 5. mode() Mode of values 
 6. std() Standard Deviation of the Values 
-7.min() Minimum Value 
-8. max() Maximum Value 
+7. min() Minimum Value 
+8. max() Maximum Value 
 9. abs() Absolute Value 
 10. prod() Product of Values 
 11. cumsum() Cumulative Sum 
@@ -180,10 +181,16 @@ The ``describe()`` function computes a summary of statistics pertaining to the D
 
 
 ```python
-import pandas as pd import numpy as np
-#Create a Dictionary of series d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
-#Create a DataFrame df = pd.DataFrame(d) 
-print df.describe() 
+import pandas as pd 
+import numpy as np
+#Create a Dictionary of series 
+d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 
+     'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 
+     'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
+
+#Create a DataFrame 
+df = pd.DataFrame(d) 
+print(df.describe()) 
 ```
 
 
@@ -200,6 +207,11 @@ Takes the list of values; by default, 'number'. object − Summarizes String col
 
  Now, use the following statement in the program and check the output −
 import pandas as pd
 import numpy as np
 #Create a Dictionary of series 
+```
+
+
+```python
+
 import pandas as pd
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 
      'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 
      'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
 #Create a DataFrame 
 df = pd.DataFrame(d) 
 print(df.describe(include=['object']))
 # Its output is as follows −
 #          Name
 # count      12
 # unique     12
 # top     Ricky
 # freq        1
 
@@ -217,10 +229,27 @@ Now, use the following statement and check the output −
 
 import pandas as pd 
 import numpy as np
-#Create a Dictionary of series d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
-#Create a DataFrame df = pd.DataFrame(d) 
-print df. describe(include='all') 
+#Create a Dictionary of series 
+d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack', 'Lee','David','Gasper','Betina','Andres']), 
+     'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]), 
+     'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,2.98,4.80,4.10,3.65])}
+#Create a DataFrame 
+df = pd.DataFrame(d) 
+print(df.describe(include='all'))
 
-Its output is as follows − Age Name Rating count 12.000000 12 12.000000 unique NaN 12 NaN top NaN Ricky NaN freq NaN 1 NaN mean 31.833333 NaN 3.743333 std 9.232682 NaN 0.661628 min 23.000000 NaN 2.560000 25% 25.000000 NaN 3.230000 50% 29.500000 NaN 3.790000 75% 35.500000 NaN 4.132500 max 51.000000 NaN 4.800000
 
 ```
+
+                  Age   Name     Rating
+    count   12.000000     12  12.000000
+    unique        NaN     12        NaN
+    top           NaN  Ricky        NaN
+    freq          NaN      1        NaN
+    mean    31.833333    NaN   3.743333
+    std      9.232682    NaN   0.661628
+    min     23.000000    NaN   2.560000
+    25%     25.000000    NaN   3.230000
+    50%     29.500000    NaN   3.790000
+    75%     35.500000    NaN   4.132500
+    max     51.000000    NaN   4.800000
+
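
The three forms of `describe()` shown above can be compared side by side. The sketch below uses a shortened, three-row version of the chapter's data, and `Name` is created with an explicit `object` dtype. Both are assumptions made to keep the example small and predictable across pandas versions.

```python
import pandas as pd

# Three rows of the chapter's data; 'Name' is forced to object dtype.
people = pd.DataFrame({
    'Name': pd.Series(['Tom', 'James', 'Ricky'], dtype=object),
    'Age': pd.Series([25, 26, 25]),
    'Rating': pd.Series([4.23, 3.24, 3.98]),
})

numeric_summary = people.describe()                    # default: numeric columns only
object_summary = people.describe(include=['object'])   # object (string) columns only
full_summary = people.describe(include='all')          # every column

print(list(numeric_summary.columns))
print(list(object_summary.index))
print(full_summary)
```

Statistics that do not apply to a column, such as `mean` for `Name` or `unique` for `Age`, appear as `NaN` in the combined summary.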
@@ -42,46 +42,40 @@ for col in df:
     y
 
 
-
-```python
 To iterate over the rows of the DataFrame, we can use the following functions −
-iteritems() − to iterate over the (key,value) pairs
-iterrows() − iterate over the rows as (index,series) pairs
-itertuples() − iterate over the rows as namedtuples
-iteritems()
-Iterates over each column as key, value pair with label as key and column value as a Series object.
-```
 
-
-```python
+* ``iteritems()`` - iterate over the columns as (key, value) pairs
+* ``iterrows()`` - iterate over the rows as (index, Series) pairs
+* ``itertuples()`` - iterate over the rows as namedtuples
+
+### ``iteritems()``
+Iterates over each column as a (key, value) pair, with the label as the key and the column values as a Series object.
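
Note that `iteritems()` was deprecated in pandas 1.5 and removed in pandas 2.0; `DataFrame.items()` provides the same column-wise iteration. A minimal sketch follows, using fixed values instead of random data so the result is predictable. The dictionary `column_sums` is a hypothetical name used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1.0, 2.0], 'col2': [3.0, 4.0]})

column_sums = {}
for key, value in df.items():
    # key is the column label, value is that column as a Series
    column_sums[key] = value.sum()

print(column_sums)
```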
 
 
+```python
 import pandas as pd
 import numpy as np
 
 df = pd.DataFrame(np.random.randn(4,3),columns=['col1','col2','col3'])
 for key,value in df.iteritems():
-   print key,value
-Its output is as follows −
-col1 0    0.802390
-1    0.324060
-2    0.256811
-3    0.839186
-Name: col1, dtype: float64
-
-col2 0    1.624313
-1   -1.033582
-2    1.796663
-3    1.856277
-Name: col2, dtype: float64
-
-col3 0   -0.022142
-1   -0.230820
-2    1.160691
-3   -0.830279
-Name: col3, dtype: float64
+    print(key,value)
 ```
 
+    col1 0    1.141317
+    1    0.289031
+    2   -1.269689
+    3   -1.668425
+    Name: col1, dtype: float64
+    col2 0    1.561011
+    1    0.391033
+    2    0.083089
+    3    0.106299
+    Name: col2, dtype: float64
+    col3 0   -0.436407
+    1    0.136565
+    2    0.444321
+    3    0.738629
+    Name: col3, dtype: float64
+
+
 
 ```python
 
@@ -92,68 +86,84 @@ iterrows() returns the iterator yielding each index value along with a series co
 
 
 ```python
-
 import pandas as pd
 import numpy as np
 
 df = pd.DataFrame(np.random.randn(4,3),columns = ['col1','col2','col3'])
 for row_index,row in df.iterrows():
-   print row_index,row
-Its output is as follows −
-0  col1    1.529759
-   col2    0.762811
-   col3   -0.634691
-Name: 0, dtype: float64
-
-1  col1   -0.944087
-   col2    1.420919
-   col3   -0.507895
-Name: 1, dtype: float64
- 
-2  col1   -0.077287
-   col2   -0.858556
-   col3   -0.663385
-Name: 2, dtype: float64
-3  col1    -1.638578
-   col2     0.059866
-   col3     0.493482
-Name: 3, dtype: float64
+    print("\n")
+    print(row_index,row)
+```
+
+    
+    
+    0 col1   -0.469367
+    col2   -1.466803
+    col3    0.493435
+    Name: 0, dtype: float64
+    
+    
+    1 col1    0.686016
+    col2   -1.293819
+    col3   -1.087791
+    Name: 1, dtype: float64
+    
+    
+    2 col1    0.646084
+    col2   -0.312096
+    col3    1.518408
+    Name: 2, dtype: float64
+    
+    
+    3 col1   -2.464781
+    col2    0.211235
+    col3   -0.238992
+    Name: 3, dtype: float64
+
+
+
 Note − Because iterrows() returns each row as a Series, it does not preserve dtypes across the row. 0, 1, 2 are the row indices and col1, col2, col3 are the column labels.
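
The effect on dtypes is easiest to see with a frame that mixes integer and float columns. The following is a small sketch under that assumption; the frame `mixed` and its column names `qty` and `price` are hypothetical.

```python
import pandas as pd

# One integer column and one float column.
mixed = pd.DataFrame({'qty': [1, 2], 'price': [0.5, 1.5]})

# iterrows() packs each row into a single Series, so the integer is upcast to float.
_, first_row = next(mixed.iterrows())

# itertuples() keeps each value separate, so the integer stays an integer.
first_tuple = next(mixed.itertuples())

print(mixed['qty'].dtype)
print(first_row.dtype)
print(first_tuple)
```

When the original dtypes matter, `itertuples()` is usually the safer of the two.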
 
-```
 
 
+### ``itertuples()``
+The ``itertuples()`` method returns an iterator yielding a named tuple for each row in the DataFrame. The first element of the tuple is the row's index value, and the remaining elements are the row values.
+
+
+
+```python
 import pandas as pd
 import numpy as np
 
 df = pd.DataFrame(np.random.randn(4,3),columns = ['col1','col2','col3'])
 for row in df.itertuples():
-    print row
+    print("\n")
+    print(row)
 ```
 
+    
+    
+    Pandas(Index=0, col1=-0.9771341226396765, col2=0.2724475615802741, col3=-0.6589499024186599)
+    
+    
+    Pandas(Index=1, col1=1.6177467086432253, col2=-0.9763868574908899, col3=0.08317561529190409)
+    
+    
+    Pandas(Index=2, col1=-0.988445247281908, col2=-0.6366889592765412, col3=0.3956289433362847)
+    
+    
+    Pandas(Index=3, col1=-0.19595952598665276, col2=-0.13115863172857256, col3=-0.04519796025813786)
 
-```python
-
-Its output is as follows −
-Pandas(Index=0, col1=1.5297586201375899, col2=0.76281127433814944, col3=-
-0.6346908238310438)
 
-Pandas(Index=1, col1=-0.94408735763808649, col2=1.4209186418359423, col3=-
-0.50789517967096232)
 
-Pandas(Index=2, col1=-0.07728664756791935, col2=-0.85855574139699076, col3=-
-0.6633852507207626)
 
-Pandas(Index=3, col1=0.65734942534106289, col2=-0.95057710432604969,
-col3=0.80344487462316527)
 Note − Do not try to modify any object while iterating. Iteration is meant for reading: depending on the data types, the iterator returns a copy of the original data rather than a view, so changes made to it may not be reflected in the original object.
 
 
 
-```
 
 
 ```python
@@ -163,14 +173,21 @@ import numpy as np
 df = pd.DataFrame(np.random.randn(4,3),columns = ['col1','col2','col3'])
 
 for index, row in df.iterrows():
-   row['a'] = 10
-print df
-Its output is as follows −
-        col1       col2       col3
-0  -1.739815   0.735595  -0.295589
-1   0.635485   0.106803   1.527922
-2  -0.939064   0.547095   0.038585
-3  -1.016509  -0.116580  -0.523158
+    row['a'] = 10
+
+print(df)
+
+```
+
+           col1      col2      col3
+    0  0.325979  0.892602 -1.034127
+    1  2.267333 -0.356288 -2.088448
+    2 -1.159300  1.004701  0.742375
+    3  0.132715 -1.565420 -1.142597
+
+
+
+Observe that no changes are reflected in the DataFrame.
+
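
If the goal is to change the data, one option is to assign to the DataFrame itself rather than to the row yielded by the iterator. A minimal sketch follows, using fixed values and the same hypothetical new column `a`.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1.0, 2.0], 'col2': [3.0, 4.0]})

# Writing to the row yielded by iterrows() leaves df untouched.
for index, row in df.iterrows():
    row['a'] = 10
unchanged = 'a' not in df.columns

# Assigning on the DataFrame itself does add the column.
df['a'] = 10

# A single cell can be updated by label with .loc.
df.loc[0, 'col1'] = 99.0

print(unchanged)
print(df)
```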