Fake News Detection using Deep Learning
Work done with Ekimetrics. The report, in French, can be found here.
In a team of four, we trained several models to detect fake news on the LIAR dataset:
- Ensemble model combining 9 machine learning algorithms (Decision Tree, Logistic Regression, XGBoost, Random Forest, Extra Trees, AdaBoost, Support Vector Machine fitted with either SGD or linear programming, and Naive Bayes).
- LSTM and CNN models with Word2Vec embeddings. The pretrained embeddings capture the meaning of the words in the text. The LSTM then reads the word embeddings in order and produces a text embedding that is used to classify the article.
- BERT model, based on the Transformer architecture, whose attention mechanism captures the context of each word and the interactions between words. BERT achieves 70% accuracy on the test set.
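
To illustrate how an ensemble can merge the outputs of several classifiers, here is a minimal sketch of hard (majority) voting. The combination rule used in the project is not stated above, so majority voting is an assumption, and the function name `majority_vote` and the labels `"fake"` and `"real"` are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by hard (majority) voting.

    predictions: one inner list per model, each holding that model's
    predicted label for every sample, in the same sample order.
    NOTE: majority voting is an assumed combination rule, not
    necessarily the one used in the project.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every model must predict every sample")

    combined = []
    # zip(*predictions) regroups the votes sample by sample.
    for sample_votes in zip(*predictions):
        # On a tie, Counter keeps the label that was encountered first.
        label, _ = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined


if __name__ == "__main__":
    # Three hypothetical models voting on three statements.
    votes = [
        ["fake", "real", "real"],
        ["fake", "fake", "real"],
        ["real", "fake", "real"],
    ]
    print(majority_vote(votes))
```

With an odd number of models and two labels, ties cannot occur, which is one reason odd-sized ensembles are convenient for binary classification.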

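The attention mechanism mentioned for BERT can be sketched as scaled dot-product attention. This is a simplified, single-head version in plain Python, without the learned projection matrices, multiple heads, or masking that BERT actually uses; the function names `softmax` and `attention` are illustrative and not taken from the project.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Each output vector is a weighted average of `values`, where the
    weights reflect how strongly a query matches each key.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one dimension at a time.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for word representations.
    words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in attention(words, words, words):
        print([round(x, 3) for x in row])
```

Because every word attends to every other word, each output vector mixes information from the whole sentence, which is how context is taken into account.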
