Background
The epidemiology of airborne infectious diseases such as COVID-19, caused by the coronavirus SARS-CoV-2, is an inherently spatiotemporal phenomenon. As a disease moves through time and space, it can be tracked by interpreting geographical data such as medical incidence reports and news reports, but also informal accounts such as geolocated social media posts. Statistical models for disease outbreak monitoring and prediction can make use of this fact by incorporating data from a region of interest as well as from its neighbourhood.
Methods
Aiming to predict disease patterns, we employ and extend a wide range of methods from the fields of geostatistics, geo-machine learning and natural language processing. Further, we develop our methods with data privacy and the handling of personal data in mind, which gives our work a strong privacy-by-design character.
Results
Exemplary results include a time series of COVID-related tweets, the latest COVID-related tweets clustered on a map, and a list of the latest COVID-related tweets that are geospatially related to the target area. These results are regularly sent to relief organisations to give them a quick overview of the Twitter stream, with the option to focus on particularly interesting tweets. The results also contribute to the work of several crisis teams. They are provided as HTML files that experts can easily explore. In the first example, which covers COVID-19-related tweets in the German-speaking countries shortly before Christmas 2020, users discuss various topics such as the vaccine’s effectiveness and the government’s local measures. The second example shows the immediate response to the disastrous explosion in the harbour of Beirut: within a few minutes, the first images of the explosion were shared on social media.
We are also continuously producing new results for different study regions, for example in central Europe and mainland USA.
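Two of the outputs described above, the list of the latest posts geospatially related to a target area and the time series of posts, can be illustrated with a minimal sketch. This is not the pipeline used in this work: the record layout (`id`, `time`, `lat`, `lon`), the function names, the circular target area, and the sample posts are all assumptions invented for illustration, and a simple great-circle distance threshold stands in for whatever spatial relation the actual system applies.

```python
from collections import Counter
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def posts_near(posts, centre, radius_km):
    """Return posts within radius_km of centre, newest first."""
    lat0, lon0 = centre
    hits = [
        p for p in posts
        if haversine_km(p["lat"], p["lon"], lat0, lon0) <= radius_km
    ]
    return sorted(hits, key=lambda p: p["time"], reverse=True)


def daily_counts(posts):
    """Count posts per calendar day, ordered by date."""
    counts = Counter(p["time"].date() for p in posts)
    return dict(sorted(counts.items()))


# Invented sample records (hypothetical, not real data).
posts = [
    {"id": 1, "time": datetime(2020, 12, 20, 9, 0), "lat": 48.21, "lon": 16.37},
    {"id": 2, "time": datetime(2020, 12, 20, 18, 30), "lat": 48.15, "lon": 16.30},
    {"id": 3, "time": datetime(2020, 12, 21, 8, 15), "lat": 52.52, "lon": 13.40},
    {"id": 4, "time": datetime(2020, 12, 21, 12, 0), "lat": 48.25, "lon": 16.45},
]

# Hypothetical target area: a 25 km circle around a city centre.
nearby = posts_near(posts, centre=(48.2082, 16.3738), radius_km=25.0)
series = daily_counts(nearby)

for post in nearby:
    print(post["id"], post["time"].isoformat())
print(series)
```

Only coordinates and timestamps are needed for these two outputs, so a sketch like this can run on records from which user identifiers and text have already been removed, in keeping with the privacy-by-design aim stated in the Methods.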
Dorian Arifi