Infodemic: first insights into the dynamics of online information on the COVID-19 outbreak


Infodemiology in brief

If social media are a critical source of information for a large part of the population, the “study of the determinants and distribution of health information and misinformation” (“infodemiology”, as defined in Eysenbach, 2002*) becomes extremely important. We have collected almost all the tweets concerning the international debate on the coronavirus outbreak on Twitter. So far we have more than 9 million tweets (from 28 January to 28 February) covering both the international and the Italian debate. Here we present some preliminary insights from our ongoing data analysis.

MAIN INSIGHTS:

  1. INFO-NETWORK & OUTBREAK:
    The main peaks in the debate are linked to specific events of the outbreak (e.g. increases in the number of infected people, new countries with positive cases, WHO decisions);
  2. PRIVATE PROFILES (AND NOT MEDIA OR INSTITUTIONS) LEAD THE ITALIAN DEBATE:
    During this first month, the discussion on the Italian Twitter network has been dominated by private, personal accounts rather than institutional ones. At the international level we did not observe the same phenomenon: international media were able to keep a more central role in the debate;
  3. MOST ACTIVE ITALIAN PROFILES SHARE THE SAME INFO-COMMUNITY:
    The most active nodes of the Italian information network are individual profiles with similar perspectives on a broad range of topics, which often share a large part of their information references and followers;
  4. FEW PROFILES CAN INFLUENCE A LARGE PART OF THE DEBATE:
    A relatively small number of nodes is able to convey or influence a large part of the debate;
  5. EXPERTS COULD BE MISSING AN OPPORTUNITY TO REACH A LARGER AUDIENCE:
    Some prominent experts are able to attract a relevant part of the debate even without using key hashtags. This, however, could limit their ability to communicate outside their established info-network.

Evolution of COVID-19 info network (28 January – 28 February)

Since 28 January we have been collecting tweets containing a specific set of keywords and #hashtags connected with the COVID-19 outbreak. So far we have collected around 9 million tweets. In this online FRI (Fast Research Insights) report we consider the information network built around the keyword #coronavirus at the international level and the keyword #coronavirusitalia at the Italian level.
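As a minimal sketch of the keyword selection described above, the snippet below keeps only tweets that contain one of the two hashtags named in this report. The function names, the sample texts and the exact-match rule are illustrative assumptions; the actual collection pipeline and its full keyword set are not described here.

```python
import re

# The two hashtags named in the report; the full keyword set is not listed there.
TRACKED = {"#coronavirus", "#coronavirusitalia"}

# A hashtag is taken to be '#' followed by word characters (an assumption).
HASHTAG_RE = re.compile(r"#\w+")


def hashtags(text):
    """Return the set of lowercase hashtags found in a tweet text."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}


def matches(text, tracked=TRACKED):
    """True if the tweet contains at least one tracked hashtag (exact match)."""
    return bool(hashtags(text) & tracked)


# Hypothetical sample texts, for illustration only.
sample = [
    "New cases confirmed today #Coronavirus",
    "Aggiornamento dalla Lombardia #coronavirusitalia",
    "Nothing to see here #weather",
]
kept = [text for text in sample if matches(text)]
```

Matching is case-insensitive but exact, so a longer tag such as #coronavirusupdate would not be selected by #coronavirus in this sketch.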

International Network

In recent weeks, COVID-19 has dominated the social media debate, with an average of 293 thousand tweets per day and several peaks above 400 thousand tweets per day. The number of tweets collected in this month (28 January to 28 February) has exceeded 9 million.

COVID-19 Daily Frequency of Tweets #coronavirus
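A daily average and a list of peak days like those quoted above can be derived from tweet dates alone. The following is a rough sketch on toy data; the function names and the threshold value are hypothetical, and the average is taken only over days that have at least one tweet.

```python
from collections import Counter
from datetime import date


def daily_counts(days):
    """Count tweets per calendar day from an iterable of date objects."""
    return Counter(days)


def daily_average(counts):
    """Mean number of tweets per day, over days with at least one tweet."""
    return sum(counts.values()) / len(counts)


def peaks(counts, threshold):
    """Days whose tweet count is strictly above the threshold, in date order."""
    return sorted(day for day, n in counts.items() if n > threshold)


# Toy data: one date per tweet (2, 5 and 2 tweets on three consecutive days).
days = (
    [date(2020, 1, 28)] * 2
    + [date(2020, 1, 29)] * 5
    + [date(2020, 1, 30)] * 2
)
counts = daily_counts(days)
average = daily_average(counts)
peak_days = peaks(counts, threshold=4)
```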

The most frequently used words in these tweets are coronavirus (15%), China (3%), Wuhan (1,6%) and virus (1,4%), mainly reflecting the initial phase of the outbreak.
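One way to obtain word percentages of this kind is to divide each word's count by the total number of word tokens. This is only a sketch under that assumption: the report does not state how the percentages were computed (for example, whether stop words were removed or whether shares are per tweet), and the tokenisation rule and sample texts below are hypothetical.

```python
import re
from collections import Counter


def word_shares(texts):
    """Map each lowercase word to its share of all word tokens in the texts."""
    words = []
    for text in texts:
        # Simple tokenisation: runs of ASCII letters and digits (an assumption).
        words.extend(re.findall(r"[a-z0-9]+", text.lower()))
    total = len(words)
    if total == 0:
        return {}
    return {word: n / total for word, n in Counter(words).items()}


# Hypothetical sample texts: five tokens in total, two of them "coronavirus".
shares = word_shares(["Coronavirus in Wuhan", "coronavirus China"])
top_word = max(shares, key=shares.get)
```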

The overall information network presents several clusters (communities of nodes tightly debating among themselves) [over 24k], a very low density [0,0001] and a high average degree [1,418]. This describes an information network where the debate is brokered by relatively few relevant actors who are able to communicate their perspective within their community.
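Density and average degree are both simple functions of the number of nodes and edges. The sketch below assumes a directed network without self-loops or repeated edges, with density defined as m / (n(n-1)) and average degree as the mean out-degree m / n. These definitions are assumptions: the report does not say which conventions or which tool produced its figures.

```python
def network_stats(edges):
    """Return (nodes, edges, density, average out-degree) for a directed graph.

    edges: iterable of (source, target) pairs, e.g. (retweeter, retweeted).
    Repeated pairs and self-loops are ignored.
    """
    edge_set = {(s, t) for s, t in edges if s != t}
    nodes = {profile for edge in edge_set for profile in edge}
    n, m = len(nodes), len(edge_set)
    density = m / (n * (n - 1)) if n > 1 else 0.0
    avg_out_degree = m / n if n else 0.0
    return n, m, density, avg_out_degree


# Toy network: three profiles interact with one hub, which interacts with a fourth.
edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "d"), ("a", "hub")]
n, m, density, avg_out_degree = network_stats(edges)
```

Under these definitions density equals the average out-degree divided by n - 1, so a large network can have a very low density even when each profile has several connections.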

COVID-19 international information network 2020

Most active profiles at international level

The initial phase of the outbreak was largely dominated by news from and about China, especially the Wuhan region. The initial Twitter debate was largely driven by personal accounts reporting from China and by some media actors that focused on the health emergency. Institutional accounts (e.g. WHO, UN) did not play a leading role in this initial phase of social communication. However, data from the last few days seem to show an increasing influence of their online communication.

The top 10 accounts by answers and retweets are:

Rank  Profile           Description            Retweets+Answers  Profiles with interaction
1     @Howroute         Private profile        14102             4411
2     @Conflits_FR      Media                  12874             3018
3     @jenniferatnd     Private profile/Media  10958             3587
4     @IsChinar         Private profile        9390              2424
5     @livecrisisnews   Media                  6344              2039
6     @Finanzas_times   Media                  5782              1617
7     @EpochTimesChina  Media                  4956              1646
8     @PDChina          Media                  4790              1962
9     @MaihenH          Private profile/Media  4776              1937
10    @QuickTake        Media                  4618              1825
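A ranking with these two columns can be computed from a list of interactions, counting for each account both the total number of retweets and answers it received and the number of distinct profiles that produced them. The sketch below is illustrative: the function name, the input format and the choice to sort by total interactions are assumptions, not the procedure used for this report.

```python
from collections import defaultdict


def rank_accounts(interactions, top=10):
    """Rank accounts by retweets and answers received.

    interactions: iterable of (author, target) pairs, one per retweet or answer,
    where `author` retweeted or answered `target`.
    Returns up to `top` rows of (account, total interactions, distinct authors).
    """
    totals = defaultdict(int)
    audiences = defaultdict(set)
    for author, target in interactions:
        totals[target] += 1
        audiences[target].add(author)
    rows = [(acct, totals[acct], len(audiences[acct])) for acct in totals]
    # Sort by total interactions, then distinct authors, then name for stability.
    rows.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return rows[:top]


# Hypothetical interactions, for illustration only.
interactions = [
    ("u1", "@x"), ("u1", "@x"), ("u2", "@x"),
    ("u3", "@y"), ("u1", "@y"),
    ("u2", "@z"),
]
table = rank_accounts(interactions)
```

Keeping both columns separates accounts that are amplified by many different profiles from accounts that are amplified repeatedly by a few.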

The initial info-network clearly reflects the geography of the initial phase of the outbreak, with Chinese profiles (or accounts focused on Chinese news) dominating the debate. The most active profiles include news platforms, traditional media, journalists and bloggers. The first institutional accounts with an active participation in the debate are:

Rank  Profile  Description                        Retweets+Answers  Profiles with interaction
30    @WHO     World Health Organization Profile  3843              983
40    @UN      United Nations Profile             581               578

The video below shows the evolution of the international information network from 28 January to 20 February, when the cumulative number of tweets with #coronavirus exceeded 5 million.


Italian Network

The debate on COVID-19 has been very active not only at the international level but also at the national and regional levels. In Italy the discussion initially focused on the potential impact of the outbreak on the world and on the country itself. The discussion suddenly widened after the first Italian case was reported in Codogno during the night between 20 and 21 February.

The Italian information network includes tweets, retweets and answers to tweets containing the hashtag #coronavirusitalia. Although it is relatively small compared to the global #coronavirus network, it already has more than 80k profiles (nodes) and more than 100k active information exchanges (edges).

Most active profiles at Italian level


The top 10 accounts by answers and retweets are:


Rank  Profile            Short Description      #retweets  Profiles Involved
1     @RadioSavana       Private profile        3322       1617
2     @ultimenotizie     Media                  1167       965
3     @francescatotolo   Private profile        1353       866
4     @fdragoni          Private profile        1013       835
5     @Conflits_FR       Media                  1546       824
6     @NelValdez         Private profile/Media  1630       813
7     @_grayDorian       Private profile        680        651
8     @ChiodiDonatella   Private profile        583        339
9     @TgrRai            Media                  355        340
10    @NicolaPorro       Private profile/Media  314        583

While in the international network news platforms have a clear leading role in the debate, in the emerging Italian info-network the most active profiles are individuals (often without a clear official role in the emergency). Moreover, a large number of private profiles in the top 100 positions belong to the same info-community and show a certain alignment in their communication.

The first institutional accounts with an active participation in the Italian debate are:

Rank  Profile           #retweets  Profiles Involved
12    @MinisteroSalute  309        304
24    @Reglombardia     205        149
37    @RegPiemonte      188        102

For those following the Italian public debate, the absence of Dr. Roberto Burioni (Professor of Microbiology and Virology and a public figure in discussions on vaccines and now COVID-19) from the list of the most active profiles may be surprising. However, this profile, like other key profiles (e.g. WHO Director-General Dr. Tedros Adhanom Ghebreyesus), is able to engage a considerable number of people even without using the relevant #hashtags. This, however, could limit their ability to communicate outside their established info-network.

The evolution of the Italian information network #coronavirusitalia is shown below.

The sequence of images of the Italian info-network (video above) shows an increasing debate on the topic, clearly spiking between the 20th and the 21st of February due to the first confirmed cases reported in Codogno (the first area with a high number of positive cases in the country).

The initial phases of the online debate in Italy have been largely dominated by private profiles. In the last few days, institutional online communication has instead been emerging more decisively. Analysing these dynamics is needed to better understand fake news in general and, in particular, to design a communication strategy in which health institutions can reach a large audience with appropriate, reliable information from the early stages of a crisis.

Limitations and rationale of this fast report

This is what we call a FRI (Fast Research Insights) report: it extracts some preliminary but timely insights from a much larger and longer research project. A completed research project can be seen as a film ready to be shown in theaters (after being checked, double-checked, modified, cut, post-produced, etc.). What we show here are some frames of that film, with the risk that some of them are blurred and that others will not be very relevant in the end. Yet by sharing up-to-date frames, everyone can understand many elements and characteristics of what is going on (e.g. plot, main actors, background dynamics) long before the film is ready for theaters.

Reliable Health Information on COVID-19 outbreak

If you are looking for health information about the COVID-19 outbreak, we strongly recommend checking reliable sources, such as:

  1. At the international level, WHO is a reliable, timely and structured source of information. WHO also provides a map showing the confirmed cases.
  2. ECDC provides information about the situation in Europe.
  3. For Italy, there is a dedicated page updated by the Italian Health Ministry and one by the Istituto Superiore di Sanità.

*Eysenbach, G. (2002). Infodemiology: The epidemiology of (mis)information. The American Journal of Medicine, 113(9), 763-765.