One of the strengths of machine learning is that it can classify data and even predict events. This time we will play with text data: a set of tweets, each with a keyword, a location, the text, and a label. The label is either 1 or 0, showing whether the tweet contains information about a disaster (label ‘1’) or not (label ‘0’). Some rows have no value for keyword or location, so we have to handle those missing values.
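As a minimal sketch of one way to deal with the missing keyword and location values, the snippet below counts them and fills them with a placeholder. The toy rows, the lowercase column names, and the `"unknown"` placeholder are assumptions for illustration only, not the real dataset or necessarily the approach used in the full tutorial.

```python
import pandas as pd

# Hypothetical toy rows that mimic the columns described above (not the real data).
tweets = pd.DataFrame({
    "keyword": ["fire", None, "flood"],
    "location": [None, "Jakarta", None],
    "text": [
        "Forest fire near the village",
        "What a lovely day",
        "Flood warning issued for the river",
    ],
    "label": [1, 0, 1],
})

# Count the missing values in each column before deciding how to handle them.
missing_counts = tweets.isna().sum()

# One simple option (an assumption): replace missing values with a placeholder string.
filled = tweets.fillna({"keyword": "unknown", "location": "unknown"})

print(missing_counts)
print(filled)
```

Filling with a placeholder keeps every row in the dataset; dropping the two columns or the affected rows are other options, and which one works best depends on what the analysis shows.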
In this experiment, we will learn to use some new packages, such as eli5 and seaborn, and we will also learn how to use pipelines from sklearn. The most important lesson from this experiment is to build a data scientist's mindset: analyze the data before experimenting instead of jumping directly into training the model.
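To give a rough idea of what an sklearn pipeline looks like for text, here is a small sketch that chains a vectorizer and a classifier into one object. The choice of `TfidfVectorizer` and `LogisticRegression`, the step names, and the toy tweets are all assumptions for illustration; the full tutorial may use different components.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical toy tweets and labels (1 = disaster, 0 = not a disaster).
texts = [
    "Forest fire near the village, residents evacuated",
    "Flood warning issued for the river area",
    "Earthquake damages several buildings downtown",
    "I love this sunny weather",
    "Watching a movie with friends tonight",
    "This new song is fire",
]
labels = [1, 1, 1, 0, 0, 0]

# The pipeline applies each step in order: raw text -> TF-IDF features -> classifier.
clf = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("model", LogisticRegression()),
])

# Fitting the pipeline fits the vectorizer and the classifier together.
clf.fit(texts, labels)

# Prediction accepts raw text directly; the same preprocessing is reused.
predictions = clf.predict(["Wildfire spreading near the town"])
print(predictions)
```

Bundling the steps this way means the exact same preprocessing runs at training and prediction time, which helps avoid mistakes such as fitting the vectorizer on test data.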
This tutorial is fairly long, so I published it on my Medium account. Here are some links you might want to visit.