MedStatStudio Projects: Tweet Machine

During large-scale disasters, social media can give insight into the societal implications, concerns, and sentiments in the affected area. Twitter is a widely used social media platform and may be a valuable source of such information. However, because tweets generally consist of unstructured text, they can be difficult and time-consuming to analyze. The present study compares the ability of several machine learning algorithms to classify tweets from the 2012 Emilia-Romagna earthquake into meaningful categories. Three algorithms were used: k-nearest neighbors (KNN), kernel support vector machine using a term-document matrix (KSVM), and string kernel support vector machine (SKSVM). The accuracy of the machine learning classification was compared with that of a group of three skilled reviewers.
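
To make the KNN approach concrete, the following minimal Python sketch classifies short texts by majority vote among their nearest neighbors in a bag-of-words (term-count) representation. It is an illustration only and not the study's implementation: the example tweets, the category names `event` and `aid`, the use of cosine similarity, and all function names are assumptions made for this sketch.

```python
from collections import Counter
import math

PUNCTUATION = ".,!?#@:;"

def tokenize(text):
    # Lowercase, split on whitespace, and strip simple punctuation.
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def vectorize(text):
    # One row of a term-document matrix: term -> count.
    return Counter(tokenize(text))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def knn_classify(train, text, k=3):
    # Rank labeled examples by similarity to the query, then take a
    # majority vote among the k most similar ones.
    query = vectorize(text)
    ranked = sorted(train,
                    key=lambda item: cosine(vectorize(item[0]), query),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def accuracy(predicted, reference):
    # Fraction of predictions that agree with the reference labels,
    # e.g. labels assigned by human reviewers.
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)

# Hypothetical training tweets and categories (not from the study).
TRAIN = [
    ("Strong earthquake felt in Modena, buildings shaking", "event"),
    ("Another aftershock felt just now near Ferrara", "event"),
    ("Earthquake shaking felt across the region", "event"),
    ("Donate blankets and water for the displaced families", "aid"),
    ("Volunteers needed to donate food and water", "aid"),
    ("Please donate to help displaced families", "aid"),
]

if __name__ == "__main__":
    print(knn_classify(TRAIN, "Aftershock felt again, everything shaking"))
    print(knn_classify(TRAIN, "Where can I donate water for families?"))
```

The `accuracy` helper shows one simple way predicted categories could be scored against reference labels; how the study itself measured accuracy is not specified here.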