How might we determine the impact of the Coronavirus on the language around ‘humanitarianism’ in global media discourse?
We used Natural Language Processing (NLP) to examine a corpus of humanitarian news articles from Euro-Atlantic Countries, Gulf Donors, and New Global Media Players, published between December 2019 and August 2020. This analysis gave us insight into the directionality of humanitarian aid, the key topics of that period, and how nations approached the pandemic.
MY ROLE
Researcher
Data Analyst
Designer
Developer
TOOLS & METHODS
Python for NLP
Topic Modelling
TF-IDF
Collocation Analysis
Highcharts.js
AmCharts
Adobe XD
TEAM
Tashfeen Ahmed
Tamara Lottering
Xiaohang Xu
Minjia Zhao
Jin Mu
DURATION
3 months
Methodology and Analyses
We started with initial cleaning and tokenization. Pre-processing involved normalization, stemming, and stopword removal. Once cleaning and pre-processing were complete, we analysed the collocation of terms using n-grams. Trigrams yielded better results than bigrams because they revealed more of the context in which the terms were used.
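As a rough illustration of the steps above, here is a minimal sketch using only the Python standard library. It is not the project's actual pipeline: the stopword list, the sample sentence, and the function names `preprocess` and `ngram_counts` are assumptions made for this example, and stemming is left out for brevity.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "for", "is", "on"}

def preprocess(text):
    """Normalize to lowercase, tokenize on runs of letters, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def ngram_counts(tokens, n):
    """Count contiguous n-grams (tuples of n tokens) in a token list."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)

# Placeholder text, not taken from the project corpus.
sample = "Humanitarian aid sent to the region. Humanitarian aid sent to hospitals in the region."
tokens = preprocess(sample)

# Note: this simple version lets n-grams span sentence boundaries.
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)

print(bigrams.most_common(3))
print(trigrams.most_common(3))
```

In this toy sample, the bigram ("humanitarian", "aid") says little on its own, while the trigram ("humanitarian", "aid", "sent") already hints at an action, which is the kind of added context described above.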
To better understand the topics highlighted over time, we performed topic modelling with Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) and examined TF-IDF scores for the terms. The TF-IDF scores gave us a clearer idea of which topics we could compare.
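To make the TF-IDF step concrete, here is a minimal standard-library sketch. It assumes one common variant of the weighting (term count divided by document length, multiplied by log(N / document frequency)); the project may have used a library with different smoothing. The function name `tf_idf` and the three toy documents are placeholders for this example only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Placeholder documents, already tokenized; not project data.
docs = [
    ["aid", "lockdown", "aid"],
    ["aid", "vaccine"],
    ["aid", "lockdown", "border"],
]
scores = tf_idf(docs)

# The highest-scoring term per document.
top_terms = [max(s, key=s.get) for s in scores]
print(top_terms)
```

A term that appears in every document (here "aid") scores zero under this variant, so the terms that surface are the ones that distinguish one document, or one time period, from the others.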
During the exploratory data analysis (EDA), I looked at news articles from 2010 to 2020 to understand the subjectivity of humanitarian news discourse. The pattern over the decade shows that the US and UK were poles apart in 2010 (the UK being more subjective), but had converged to the same level by 2020. This visualisation has not been included in the ‘COVID in Pixels’ web page.
Visualisations
I used Highcharts.js and AmCharts to build the visualisations. The goal was to help viewers quickly grasp key insights from the data. The website is hosted on GitHub Pages and features a timeline of news articles along with line and bar graphs. The data was generated in Python and exported to JSON for the JavaScript-based visualisations.
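The Python-to-JSON hand-off could look something like the sketch below. The series layout (a `name` plus a list of `{name, y}` points), the helper `to_chart_series`, and the counts are all hypothetical placeholders; the actual schema used on the site is not documented here.

```python
import json

def to_chart_series(name, counts):
    """Convert a {label: value} mapping into a series-like dict of named points."""
    return {
        "name": name,
        "data": [{"name": label, "y": value} for label, value in counts.items()],
    }

# Placeholder monthly counts; not project data.
placeholder_counts = {"2020-01": 3, "2020-02": 5}
series = to_chart_series("example term", placeholder_counts)

# Serialize to a JSON string that a JavaScript chart could load.
payload = json.dumps([series], ensure_ascii=False, indent=2)
print(payload)
```

Keeping the export as plain JSON means the charts only depend on a static file, which suits a site served from static hosting.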
This was a group project for one of the courses in the Design Informatics programme.