From da46bf9359de30bf41a918eb91a5a42770990d79 Mon Sep 17 00:00:00 2001
From: Oumaima Fisaoui <48260689+Oumaimafisaoui@users.noreply.github.com>
Date: Wed, 2 Oct 2024 10:48:14 +0100
Subject: [PATCH] Chore(AI): fix problems of accuracy

---
 subjects/ai/credit-scoring/README.md       | 2 +-
 subjects/ai/credit-scoring/audit/README.md | 4 ++--
 subjects/ai/emotions-detector/README.md    | 8 ++++----
 subjects/ai/kaggle-titanic/README.md       | 4 ++--
 subjects/ai/nlp-scraper/README.md          | 1 +
 5 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/subjects/ai/credit-scoring/README.md b/subjects/ai/credit-scoring/README.md
index 66a7f65c6..3ed8741a7 100644
--- a/subjects/ai/credit-scoring/README.md
+++ b/subjects/ai/credit-scoring/README.md
@@ -26,7 +26,7 @@ There are 3 expected deliverables associated with the scoring model:
 - The trained machine learning model with the features engineering pipeline:
 
   - Do not forget: **Coming up with features is difficult, time-consuming, requires expert knowledge. ‘Applied machine learning’ is basically feature engineering.**
-  - The model is validated if the **AUC on the test set is higher than 50%**.
+  - The model is validated if the **AUC on the test set is at least 55%, and ideally 62% or higher**.
   - The labelled test data is not publicly available. However, a Kaggle competition uses the same data. The procedure to evaluate test set submission is the same as the one used for the project 1.
   - Here are the [DataSets](https://assets.01-edu.org/ai-branch/project5/home-credit-default-risk.zip).
 
diff --git a/subjects/ai/credit-scoring/audit/README.md b/subjects/ai/credit-scoring/audit/README.md
index 1cceee536..7eba4ec1f 100644
--- a/subjects/ai/credit-scoring/audit/README.md
+++ b/subjects/ai/credit-scoring/audit/README.md
@@ -46,7 +46,7 @@ project
 
 ###### Is the model trained only the training set?
 
-###### Is the AUC on the test set higher than 50%?
+###### Is the AUC on the test set at least 55% (ideally 62% or higher)?
 
 ###### Does the model learning curves prove that the model is not overfitting?
 
@@ -59,7 +59,7 @@ project
 ```prompt
     python predict.py
 
-    AUC on test set: 0.50
+    AUC on test set: 0.62
 
 ```
 
diff --git a/subjects/ai/emotions-detector/README.md b/subjects/ai/emotions-detector/README.md
index 4d3e31136..f95b70dee 100644
--- a/subjects/ai/emotions-detector/README.md
+++ b/subjects/ai/emotions-detector/README.md
@@ -164,10 +164,10 @@ Balance technical prowess with psychological insight: as you fine-tune your CNN
 
 ### Resources
 
-- https://machinelearningmastery.com/what-is-computer-vision/
+- [What is computer vision](https://machinelearningmastery.com/what-is-computer-vision/)
 
-- Use a pre-trained CNN: https://arxiv.org/pdf/1812.06387.pdf
+- [Use a pre-trained CNN](https://arxiv.org/pdf/1812.06387.pdf)
 
-- Hack the CNN https://medium.com/@ageitgey/machine-learning-is-fun-part-8-how-to-intentionally-trick-neural-networks-b55da32b7196
+- [Hack the CNN](https://medium.com/@ageitgey/machine-learning-is-fun-part-8-how-to-intentionally-trick-neural-networks-b55da32b7196)
 
-- https://arxiv.org/pdf/1812.06387.pdf
+- [Convolutional Neural Network](https://arxiv.org/pdf/1812.06387.pdf)
diff --git a/subjects/ai/kaggle-titanic/README.md b/subjects/ai/kaggle-titanic/README.md
index 93aeed23f..9e1c8f64c 100644
--- a/subjects/ai/kaggle-titanic/README.md
+++ b/subjects/ai/kaggle-titanic/README.md
@@ -74,7 +74,7 @@ All people having 100% of accuracy on the Leaderboard cheated, there's no point
 ```console
 project
 │   README.md
-│   environment.yml
+│   requirements.txt
 │   username.txt
 │
 └───data
@@ -90,7 +90,7 @@ project
 
 - `README.md` introduction of the project, shows the username, describes the features engineering and the best score on the **leaderboard**. Note the score on the test set using the exact same pipeline that led to the best score on the leaderboard.
 
-- `environment.yml` contains all libraries required to run the code.
+- `requirements.txt` contains all libraries required to run the code.
 
 - `username.txt` contains the username, the last modified date of the file **has to correspond to the first day of the project**.
 
diff --git a/subjects/ai/nlp-scraper/README.md b/subjects/ai/nlp-scraper/README.md
index 69545a6bf..71209fb80 100644
--- a/subjects/ai/nlp-scraper/README.md
+++ b/subjects/ai/nlp-scraper/README.md
@@ -155,6 +155,7 @@ project
 ├── data
 │   └── ...
 ├── nlp_enriched_news.py
+├── requirements.txt
 ├── README.md
 ├── results
 │   ├── training_model.py