Linear regression is the first, and therefore probably the most fundamental, model—a straight line through data.
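Concretely, ordinary least squares fits

$y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$

by choosing the coefficients $\beta_j$ that minimize the sum of squared differences between the predicted and actual values.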
6
5
7
6
#### Description
The ***boston*** dataset is useful for learning about linear regression. It contains the median home price of several areas in Boston, along with other factors that might impact housing prices, for example, the crime rate.
#### Getting the data
Firstly, we import the datasets module, then we can load the dataset:
<pre><code>
from sklearn import datasets
boston = datasets.load_boston()
</code></pre>
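The returned object bundles the data with some metadata. A quick sketch of how to inspect it (a minimal example; ***data***, ***target***, and ***feature_names*** are the standard attributes of scikit-learn's Bunch objects, not shown in the original text):

<pre><code>
# boston.data holds the feature matrix, boston.target the median home prices
print(boston.data.shape)      # (506, 13)
print(boston.feature_names)   # names of the 13 predictor columns
print(boston.target[:5])      # first few median home prices
</code></pre>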
### Implementation
Using linear regression in scikit-learn is very simple.
First, import the ***LinearRegression*** object and create an object (let's call it ***lr***):
<pre><code>
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
</code></pre>
To fit the model, supply the independent and dependent variables to the ***fit*** method:
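A minimal sketch of that call (assuming we fit on all of ***boston.data***, with ***boston.target*** as the outcome):

<pre><code>
# Fit the linear model: features in boston.data, median prices in boston.target
lr.fit(boston.data, boston.target)
</code></pre>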
Let's take a look at the regression coefficients:

<pre><code>
lr.coef_
</code></pre>
A common pattern for pairing the coefficients of the features with their names is ***zip(boston.feature_names, lr.coef_)***.
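For example (a small illustrative loop, assuming the model has been fit as above):

<pre><code>
# Print each feature name next to its estimated coefficient
for name, coef in zip(boston.feature_names, lr.coef_):
    print(name, coef)
</code></pre>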
* We can see which factors have a negative relationship with the outcome, and which have a positive one.
* The per capita crime rate is the first coefficient in the regression. An increase in the per capita crime rate by town has a negative relationship with the price of a home in Boston.
### Making Predictions
Now, to get the predictions, use the ***predict*** method of ***LinearRegression***:
<pre><code>
predictions = lr.predict(boston.data)
</code></pre>
### Residuals
The next step is to look at how close the predicted values are to the actual data. These differences are known as ***residuals***.
We can use a histogram to look at these residuals.
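A minimal sketch with matplotlib (the choice of plotting library is an assumption; the original does not name one):

<pre><code>
import matplotlib.pyplot as plt

# Residuals are the differences between actual and predicted values
residuals = boston.target - predictions

plt.hist(residuals)
plt.xlabel("Residual")
plt.ylabel("Frequency")
plt.title("Histogram of residuals")
plt.show()
</code></pre>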
<!-- GRAPHIC: histogram of the residuals -->
### Other Remarks
The ***LinearRegression*** object can automatically normalize (or scale) the inputs:
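A sketch of what that looks like (the ***normalize*** keyword existed in the same older scikit-learn versions that still shipped ***load_boston***; treat it as version-dependent):

<pre><code>
# normalize=True rescales each input feature before fitting
lr2 = LinearRegression(normalize=True)
lr2.fit(boston.data, boston.target)
</code></pre>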