Linear regression with Scikit Learn
Exercise 0: Environment and libraries
The exercise is validated if all questions of the exercise are validated
Activate the virtual environment. If you used conda, run conda activate your_env
Run python --version. Does it print Python 3.x, with x >= 8?
Do import jupyter, import numpy, import pandas, import matplotlib and import sklearn run without any error?
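The checks above can be scripted. The following is a minimal sketch, not part of the original audit: the helper name missing_modules is hypothetical, and it only tests whether each package can be located, which is a lighter check than actually running the imports.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages named in the audit question
required = ["jupyter", "numpy", "pandas", "matplotlib", "sklearn"]

# The audit expects Python 3.x with x >= 8
print("Python version OK:", sys.version_info >= (3, 8))
print("Missing modules:", missing_modules(required))
```

An empty list of missing modules suggests the environment is ready, but running the import statements themselves remains the check the audit asks for.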
Exercise 1: Scikit-learn estimator
For question 1, is the output the following?
array([[3.96013289]])
For question 2, is the output the following?
Coefficients: [[0.99667774]]
Intercept: [-0.02657807]
Score: 0.9966777408637874
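For reference, a sketch of code that produces outputs of this form. The training data below is an assumption (this audit does not state it); it was chosen because a least-squares fit on these three points yields the slope, intercept and prediction listed above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Assumed exercise data: not given in this audit
X = [[1], [2.1], [3]]
y = [[1], [2], [3]]

model = LinearRegression()
model.fit(X, y)

# Question 1: prediction for x = 4
prediction = model.predict([[4]])
print(prediction)

# Question 2: fitted parameters and R^2 score on the training data
score = model.score(X, y)
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
print("Score:", score)
```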
Exercise 2: Linear regression in 1D
The exercise is validated if all questions of the exercise are validated
For question 1, does the plot look like the following?
For question 2, is the equation of the fitted line the following? y = 42.619430291366946 * x + 99.18581817296929
For question 3, does the plot look like the following?
For question 4, are the predictions for the first 10 values the following?
array([ 83.86186727, 140.80961751, 116.3333897 , 64.52998689,
61.34889539, 118.10301628, 57.5347917 , 117.44107847,
108.06237908, 85.90762675])
For question 5, is the MSE returned 114.17148616819485?
For question 6, is the MSE returned 2854.2871542048706?
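As a reminder of what is being checked in questions 5 and 6, the MSE is the mean of the squared differences between targets and predictions. The sketch below uses made-up toy values (not the exercise data) to show that a hand-written version agrees with sklearn.metrics.mean_squared_error.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Toy values for illustration only, unrelated to the exercise data
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MSE computed by hand: mean of squared residuals
mse_manual = np.mean((y_true - y_pred) ** 2)

# MSE computed with scikit-learn
mse_sklearn = mean_squared_error(y_true, y_pred)

print(mse_manual, mse_sklearn)
```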
Exercise 3: Train test split
For question 1, do X_train, y_train, X_test, y_test match this output?
X_train:
[[ 1 2]
[ 3 4]
[ 5 6]
[ 7 8]
[ 9 10]
[11 12]
[13 14]
[15 16]]
y_train:
[1 2 3 4 5 6 7 8]
X_test:
[[17 18]
[19 20]]
y_test:
[ 9 10]
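A sketch that yields a split of this shape. The construction of X and y and the arguments passed to train_test_split are assumptions inferred from the expected output: the first 8 rows go to the train set and the last 2 to the test set, which is what an unshuffled 80/20 split produces.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Assumed inputs, inferred from the expected output
X = np.arange(1, 21).reshape(10, -1)
y = np.arange(1, 11)

# shuffle=False keeps the original order, so the test set is the last 20%
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

print("X_train:\n", X_train)
print("y_train:\n", y_train)
print("X_test:\n", X_test)
print("y_test:\n", y_test)
```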
Exercise 4: Forecast diabetes progression
The exercise is validated if all questions of the exercise are validated
For question 1, are the outputs of y_train.values[:10] and y_test.values[:10] the following?
y_train.values[:10]:
[[202.]
[ 55.]
[202.]
[ 42.]
[214.]
[173.]
[118.]
[ 90.]
[129.]
[151.]]
y_test.values[:10]:
[[ 71.]
[ 72.]
[235.]
[277.]
[109.]
[ 61.]
[109.]
[ 78.]
[ 66.]
[192.]]
For question 2, are the coefficients and the intercept the following?
[('age', -60.40163046086952),
('sex', -226.08740652083418),
('bmi', 529.383623302316),
('bp', 259.96307686274605),
('s1', -859.121931974365),
('s2', 504.70960058378813),
('s3', 157.42034928335502),
('s4', 226.29533600601638),
('s5', 840.7938070846119),
('s6', 34.712225788519554),
('intercept', 152.05314895029233)]
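One way to build a list of (name, value) pairs like the one above is to zip the feature names with the fitted coefficients. This is a sketch only: the split settings below are hypothetical, so the numbers it prints are not expected to match the audit values.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Diabetes dataset bundled with scikit-learn, loaded as pandas objects
diabetes = load_diabetes(as_frame=True)
X, y = diabetes.data, diabetes.target

# Hypothetical split settings: the audit does not state the ones used
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

# Pair each feature name with its coefficient, then append the intercept
named_coefs = [(name, float(c)) for name, c in zip(X.columns, model.coef_)]
named_coefs.append(("intercept", float(model.intercept_)))

for name, value in named_coefs:
    print(name, value)
```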
For question 3, is the output of predictions_on_test[:10] the following?
array([[111.74351759],
[ 98.41335251],
[168.36373195],
[255.05882934],
[168.43764643],
[117.60982186],
[198.86966323],
[126.28961941],
[117.73121787],
[224.83346984]])
For question 4, is the MSE on the train set 2888.326888 and the MSE on the test set 2858.255153?
Exercise 5: Gradient Descent (Optional)
The exercise is validated if all questions of the exercise are validated.
Question 1 is validated if the output plot looks like the following:
Question 2 is validated if the output is: 11808.867339751561
Question 3 is validated if grid.shape is (640000, 2).
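A shape of (640000, 2) corresponds to an 800 x 800 grid of (a, b) candidates flattened into rows. The sketch below shows one way to obtain it; the range and step are assumptions, and any pair of 800-point axes gives the same shape.

```python
import numpy as np

# Assumed ranges: 800 values of a and 800 values of b, step 0.5
aa, bb = np.mgrid[-200:200:0.5, -200:200:0.5]

# Flatten both axes and stack them as columns: one (a, b) pair per row
grid = np.c_[aa.ravel(), bb.ravel()]

print(grid.shape)
```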
Question 4 is validated if the first 10 values of losses are:
array([158315.41493175, 158001.96852692, 157689.02212209, 157376.57571726,
157064.62931244, 156753.18290761, 156442.23650278, 156131.79009795,
155821.84369312, 155512.39728829])
Question 5 is validated if the output plot looks like the following:
Question 6 is validated if the point returned is:
array([42.5, 99. ])
This means that a = 42.5 and b = 99.
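The returned point is the grid row with the smallest loss. A minimal sketch of that brute-force search follows, under stated assumptions: the data is synthetic and noise-free (generated with a = 42.5 and b = 99 so the answer is known in advance), and the grid is smaller than the one in the exercise to keep the example light.

```python
import numpy as np

# Synthetic, noise-free data with known parameters (illustration only)
x = np.linspace(-2.0, 2.0, 20)
y = 42.5 * x + 99.0

# Candidate (a, b) pairs on a coarse grid with step 0.5
aa, bb = np.mgrid[0:100:0.5, 50:150:0.5]
grid = np.c_[aa.ravel(), bb.ravel()]

# Predictions for every candidate: shape (n_candidates, n_samples)
predictions = grid[:, [0]] * x + grid[:, [1]]

# Mean squared error of each candidate line
losses = np.mean((y - predictions) ** 2, axis=1)

# The best candidate is the row with the smallest loss
best = grid[np.argmin(losses)]
print(best)
```

The result can only be as precise as the grid step, which is why the exercise goes on to compare it with gradient descent and with the scikit-learn fit.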
Question 7 is validated if the coefficients returned are:
Coefficients (a): 42.61943031121358
Intercept (b): 99.18581814447936
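For context, a sketch of gradient descent on the MSE loss of a line y = a * x + b. It is not the exercise's reference solution: the function name, learning rate, iteration count and toy data are all assumptions chosen so that the loop converges quickly.

```python
import numpy as np


def gradient_descent(x, y, learning_rate=0.1, n_iterations=1000):
    """Minimise the MSE of y = a * x + b with plain gradient descent."""
    a, b = 0.0, 0.0
    for _ in range(n_iterations):
        residuals = y - (a * x + b)
        # Partial derivatives of mean((y - (a * x + b))**2)
        grad_a = -2.0 * np.mean(x * residuals)
        grad_b = -2.0 * np.mean(residuals)
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Toy noise-free data generated with a = 2 and b = 1 (illustration only)
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x + 1.0

a, b = gradient_descent(x, y)
print("Coefficients (a):", a)
print("Intercept (b):", b)
```

On real data the learning rate usually needs tuning: too large a value makes the updates diverge, too small a value makes convergence slow.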
Question 8 is validated if the output plot looks like the following:
Question 9 is validated if the coefficients and intercept returned are:
Coefficients: [42.61943029]
Intercept: 99.18581817296929