Fake News Detection using Deep Learning

Work done with Ekimetrics. The report, in French, can be found here.

In a team of 4, we train several models to detect fake news on the dataset ‘LIAR’ :

Ensemble Model using 9 Machine Learning algorithms (Decision Tree, Logistic Regression, XGBoost, Random Forest, Extra Trees, AdaBoost, Support Vector Machine with SGD or Linear Programming fitting and Naive Bayes)
LSTM model and CNN with Word2Vec embeddings. This allows to use effective pretrained embeddings to capture the meaning of the words in the text. The LSTM model then provides a text embedding that is used to classify the article, by using the word embeddings while using the order of the words.
BERT model using the Transformer architecture for a effective model, which allows to capture the context of the words in the text and interactions between the words thanks to the attention mechanism. Bert achievs a 70% accuracy on the test set.