In this post, I will use the Python programming language, Natural Language Processing (NLP), the Bag of Words model, and the Multinomial Naive Bayes algorithm to classify whether an email is spam or not.
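To make the approach concrete before diving in, here is a minimal from-scratch sketch of a Bag of Words representation feeding a Multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It uses only the standard library and is not necessarily how the rest of this post implements it. The function names (`tokenize`, `train`, `predict`), the "spam"/"ham" labels, and the four toy messages are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace. A real pipeline would usually
    # also strip punctuation and stop words.
    return text.lower().split()


def train(messages, labels):
    """Count words per class (bag of words) and messages per class."""
    word_counts = {}               # label -> Counter of word frequencies
    doc_counts = Counter(labels)   # label -> number of messages
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Log prior: fraction of training messages with this label.
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing keeps unseen word/label pairs from zeroing the score.
            score += math.log((counts[token] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy, hypothetical training data (not a real dataset).
messages = [
    "win money now",
    "send your bank details to claim money",
    "meeting at noon tomorrow",
    "lunch tomorrow with the team",
]
labels = ["spam", "spam", "ham", "ham"]
model = train(messages, labels)

print(predict(model, "claim your money"))
print(predict(model, "team meeting tomorrow"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, which matters once messages get longer than these toy examples.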


Spam emails are unsolicited or junk emails sent out in bulk. A classic example is the email from a so-called Nigerian prince who promises to send you a huge amount of money if you provide your bank account details. Spam is annoying because it takes up a lot of storage space and communication bandwidth. According to Awad (2011), “40% of all emails are spam…

