Chore(AI): fix problems of accuracy

pull/2755/head
Oumaima Fisaoui 3 weeks ago
parent
commit
da46bf9359
  1. subjects/ai/credit-scoring/README.md (2 changed lines)
  2. subjects/ai/credit-scoring/audit/README.md (4 changed lines)
  3. subjects/ai/emotions-detector/README.md (8 changed lines)
  4. subjects/ai/kaggle-titanic/README.md (4 changed lines)
  5. subjects/ai/nlp-scraper/README.md (1 changed line)

2
subjects/ai/credit-scoring/README.md

@@ -26,7 +26,7 @@ There are 3 expected deliverables associated with the scoring model:
- The trained machine learning model with the features engineering pipeline:
- Do not forget: **Coming up with features is difficult, time-consuming, requires expert knowledge. ‘Applied machine learning’ is basically feature engineering.**
- - The model is validated if the **AUC on the test set is higher than 50%**.
+ - The model is validated if the **AUC on the test set is at minimum 55%, ideally to 62% included (or in best cases higher than 62% if you can !)**.
- The labelled test data is not publicly available. However, a Kaggle competition uses the same data. The procedure to evaluate test set submission is the same as the one used for the project 1.
- Here are the [DataSets](https://assets.01-edu.org/ai-branch/project5/home-credit-default-risk.zip).
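
Reviewer note: the new validation threshold can be checked mechanically. The following is a minimal sketch, assuming binary labels (0/1) and one predicted score per sample. The names `roc_auc`, `is_validated`, and `MIN_AUC` are hypothetical and are not part of the repository; the pairwise computation is a plain-Python stand-in for whatever metric implementation the project actually uses.

```python
# Illustrative only: helper names below are assumptions, not repository code.

MIN_AUC = 0.55  # minimum AUC required by the updated README text


def roc_auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def is_validated(labels, scores, min_auc=MIN_AUC):
    """Return True when the AUC reaches the minimum threshold."""
    return roc_auc(labels, scores) >= min_auc


if __name__ == "__main__":
    # Toy data, made up for the example.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(f"AUC on test set: {roc_auc(labels, scores):.2f}")
    print("validated:", is_validated(labels, scores))
```

The quadratic pair loop is only meant to make the definition explicit; on a dataset the size of the Kaggle one, a rank-based library implementation would be the practical choice.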

4
subjects/ai/credit-scoring/audit/README.md

@@ -46,7 +46,7 @@ project
###### Is the model trained only the training set?
- ###### Is the AUC on the test set higher than 50%?
+ ###### Is the AUC on the test set is between 55% (included) to 62%(included) or higher than 62%?
###### Does the model learning curves prove that the model is not overfitting?
@@ -59,7 +59,7 @@ project
```prompt
python predict.py
- AUC on test set: 0.50
+ AUC on test set: 0.62
```

8
subjects/ai/emotions-detector/README.md

@@ -164,10 +164,10 @@ Balance technical prowess with psychological insight: as you fine-tune your CNN
### Resources
- - https://machinelearningmastery.com/what-is-computer-vision/
+ - [What is computer vision](https://machinelearningmastery.com/what-is-computer-vision/)
- - Use a pre-trained CNN: https://arxiv.org/pdf/1812.06387.pdf
+ - [Use a pre-trained CNN](https://arxiv.org/pdf/1812.06387.pdf)
- - Hack the CNN https://medium.com/@ageitgey/machine-learning-is-fun-part-8-how-to-intentionally-trick-neural-networks-b55da32b7196
+ - [Hack the CNN](https://medium.com/@ageitgey/machine-learning-is-fun-part-8-how-to-intentionally-trick-neural-networks-b55da32b7196)
- - https://arxiv.org/pdf/1812.06387.pdf
+ - [Convolutional Neural Network](https://arxiv.org/pdf/1812.06387.pdf)

4
subjects/ai/kaggle-titanic/README.md

@@ -74,7 +74,7 @@ All people having 100% of accuracy on the Leaderboard cheated, there's no point
```console
project
│ README.md
- │ environment.yml
+ │ requirements.txt
│ username.txt
└───data
@@ -90,7 +90,7 @@ project
- `README.md` introduction of the project, shows the username, describes the features engineering and the best score on the **leaderboard**. Note the score on the test set using the exact same pipeline that led to the best score on the leaderboard.
- - `environment.yml` contains all libraries required to run the code.
+ - 'requirements.txt` contains all libraries required to run the code.
- `username.txt` contains the username, the last modified date of the file **has to correspond to the first day of the project**.

1
subjects/ai/nlp-scraper/README.md

@@ -155,6 +155,7 @@ project
├── data
   └── ...
├── nlp_enriched_news.py
├── requirements.txt
├── README.md
├── results
   ├── training_model.py
