Chore(AI): fix problems of accuracy

3 weeks ago · da46bf9359
5 changed files with 10 additions and 9 deletions
--- a/subjects/ai/credit-scoring/README.md
+++ b/subjects/ai/credit-scoring/README.md
@@ -26,7 +26,7 @@ There are 3 expected deliverables associated with the scoring model:
 - The trained machine learning model with the features engineering pipeline:

  - Do not forget: **Coming up with features is difficult, time-consuming, requires expert knowledge. ‘Applied machine learning’ is basically feature engineering.**
-  - The model is validated if the **AUC on the test set is higher than 50%**.
+  - The model is validated if the **AUC on the test set is at least 55%, and ideally 62% or higher**.
  - The labelled test data is not publicly available. However, a Kaggle competition uses the same data. The procedure to evaluate test set submission is the same as the one used for the project 1.
  - Here are the [DataSets](https://assets.01-edu.org/ai-branch/project5/home-credit-default-risk.zip).

--- a/subjects/ai/credit-scoring/audit/README.md
+++ b/subjects/ai/credit-scoring/audit/README.md
@@ -46,7 +46,7 @@ project

 ###### Is the model trained only on the training set?

-###### Is the AUC on the test set higher than 50%?
+###### Is the AUC on the test set at least 55% (ideally 62% or higher)?

 ###### Do the model's learning curves prove that the model is not overfitting?

@@ -59,7 +59,7 @@ project
 ```prompt
    python predict.py

-    AUC on test set: 0.50
+    AUC on test set: 0.62

 ```
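
Review note: the new audit criterion (minimum 55%, target 62%) could be checked along the lines of the sketch below. It is only an illustration under stated assumptions: the labels are binary (0/1), the scores are predicted default probabilities, and `auc_score` and `audit_verdict` are hypothetical helpers that are not part of the project's `predict.py`. The AUC is computed with the rank-based (Mann-Whitney) formula so that the example needs no third-party library.

```python
def auc_score(labels, scores):
    """Rank-based AUC (Mann-Whitney U), using average ranks for tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the group of samples sharing the same score.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def audit_verdict(auc, minimum=0.55, target=0.62):
    """Map an AUC value to the thresholds stated in the audit (assumed inclusive)."""
    if auc >= target:
        return "validated (target reached)"
    if auc >= minimum:
        return "validated (minimum reached)"
    return "not validated"


if __name__ == "__main__":
    # Toy data, made up for illustration only; not taken from the project dataset.
    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]
    auc = auc_score(y_true, y_score)
    print(f"AUC on test set: {auc:.2f}")
    print(audit_verdict(auc))
```

In a real submission the same value would more likely come from an existing library routine such as scikit-learn's `roc_auc_score`; the hand-written version is used here only to keep the sketch self-contained.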

--- a/subjects/ai/emotions-detector/README.md
+++ b/subjects/ai/emotions-detector/README.md
@@ -164,10 +164,10 @@ Balance technical prowess with psychological insight: as you fine-tune your CNN

 ### Resources

- https://machinelearningmastery.com/what-is-computer-vision/
+- [What is computer vision](https://machinelearningmastery.com/what-is-computer-vision/)

- Use a pre-trained CNN: https://arxiv.org/pdf/1812.06387.pdf
+- [Use a pre-trained CNN](https://arxiv.org/pdf/1812.06387.pdf)

- Hack the CNN https://medium.com/@ageitgey/machine-learning-is-fun-part-8-how-to-intentionally-trick-neural-networks-b55da32b7196
+- [Hack the CNN](https://medium.com/@ageitgey/machine-learning-is-fun-part-8-how-to-intentionally-trick-neural-networks-b55da32b7196)

- https://arxiv.org/pdf/1812.06387.pdf
+- [Convolutional Neural Network](https://arxiv.org/pdf/1812.06387.pdf)
--- a/subjects/ai/kaggle-titanic/README.md
+++ b/subjects/ai/kaggle-titanic/README.md
@@ -74,7 +74,7 @@ All people having 100% of accuracy on the Leaderboard cheated, there's no point
 ```console
 project
 │   README.md
-│   environment.yml
+│   requirements.txt
 │   username.txt
 │
 └───data
@@ -90,7 +90,7 @@ project

 - `README.md` introduction of the project, shows the username, describes the features engineering and the best score on the **leaderboard**. Note the score on the test set using the exact same pipeline that led to the best score on the leaderboard.

- `environment.yml` contains all libraries required to run the code.
+- `requirements.txt` contains all libraries required to run the code.

 - `username.txt` contains the username, the last modified date of the file **has to correspond to the first day of the project**.

--- a/subjects/ai/nlp-scraper/README.md
+++ b/subjects/ai/nlp-scraper/README.md
@@ -155,6 +155,7 @@ project
 ├── data
 │   └── ...
 ├── nlp_enriched_news.py
+├── requirements.txt
 ├── README.md
 ├── results
 │   ├── training_model.py