We are happy to finally launch the interface for downloading a collection of tweets related to the Covid-19 pandemic. You can choose a date range, an area (Mexico, Argentina, Colombia, Perú, Ecuador, Spain, Miami area), and a language (only for the Miami area, in English and Spanish).
https://covid.dh.miami.edu/get/
The texts are processed by removing accents, punctuation, and user mentions (@users) to protect privacy, and by replacing all links with “URL.” Emojis are transliterated into a UTF-8 charset and transformed into emojilabels. We also decided to unify all the different spellings of Covid-19 under a single form; all other characteristics, including hashtags, are always preserved.
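To give a rough idea of what this kind of cleaning looks like, here is a minimal Python sketch. It is not our actual pipeline: the function names, the regular expressions, and the unified label `covid19` are all assumptions made for illustration, and the emoji-to-emojilabel step is left out entirely.

```python
import re
import unicodedata

# All patterns below are illustrative assumptions, not the project's real rules.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
COVID_RE = re.compile(r"\bcovid[\s\-_]*19\b", re.IGNORECASE)


def strip_accents(text):
    """Drop combining marks, so 'Bogotá' becomes 'Bogota' (and 'ñ' becomes 'n')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def strip_punctuation(text):
    """Remove Unicode punctuation but keep '#', so hashtags survive."""
    return "".join(
        ch for ch in text
        if ch == "#" or not unicodedata.category(ch).startswith("P")
    )


def clean_tweet(text):
    text = URL_RE.sub("URL", text)         # links become the placeholder "URL"
    text = MENTION_RE.sub("", text)        # user mentions are removed
    text = COVID_RE.sub("covid19", text)   # hypothetical unified spelling
    text = strip_accents(text)
    text = strip_punctuation(text)
    return " ".join(text.split())          # collapse leftover whitespace


print(clean_tweet("¡Atención @usuario! Más casos de COVID-19 en #Bogotá https://t.co/abc"))
```

Emojis are untouched by this sketch because they are symbols rather than punctuation in Unicode terms; converting them to labels would need an extra step.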
But there’s more! We have implemented a simple API so you can select your collection without needing to go through the interface.
The API entry point is the same address, https://covid.dh.miami.edu/get/, and it delivers the .txt files that you want.
There are three main query variables: language, geolocalization, and date. The query string always starts with a “?”, the variables are separated by an “&”, and they are abbreviated as follows:
- lang = es or en
- geo = fl, ar, es, co, pe, ec, mx, all
- date = all, a day ‘{year}-{month}-{day}’, a month ‘month-{year}-{month}’, a year ‘year-{year}’, or a range ‘from-{year}-{month}-{day}-to-{year}-{month}-{day}’
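If you would rather build these addresses from a script, the three variables can be assembled like this. This is only a sketch: the helper names (`build_query_url`, `day_value`, `month_value`, `range_value`) are made up for this post, and the validation simply mirrors the lists above rather than anything the server is known to enforce.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "https://covid.dh.miami.edu/get/"
LANGS = {"es", "en"}
GEOS = {"fl", "ar", "es", "co", "pe", "ec", "mx", "all"}


def day_value(day):
    """A single day, e.g. 2020-04-24."""
    return day.isoformat()


def month_value(year, month):
    """A whole month, e.g. month-2020-04."""
    return f"month-{year:04d}-{month:02d}"


def range_value(start, end):
    """A range of days, e.g. from-2020-04-24-to-2020-04-28."""
    return f"from-{start.isoformat()}-to-{end.isoformat()}"


def build_query_url(lang, geo, date_value):
    """Assemble the query string; hypothetical helper, checks only the listed codes."""
    if lang not in LANGS:
        raise ValueError(f"unknown lang: {lang!r}")
    if geo not in GEOS:
        raise ValueError(f"unknown geo: {geo!r}")
    query = urlencode({"lang": lang, "geo": geo, "date": date_value})
    return f"{BASE_URL}?{query}"


print(build_query_url("en", "fl", day_value(date(2020, 4, 24))))
print(build_query_url("es", "ar", range_value(date(2020, 4, 24), date(2020, 4, 28))))
print(build_query_url("es", "es", month_value(2020, 4)))
```

Passing the string "all" as the date value gives the address for the complete collection of an area.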
Here are some examples:
- Tweets in English, from Florida, on April 24th:
https://covid.dh.miami.edu/get/?lang=en&geo=fl&date=2020-04-24
- Tweets in Spanish, from Florida, on April 24th:
https://covid.dh.miami.edu/get/?lang=es&geo=fl&date=2020-04-24
- Tweets in Spanish, from Colombia, on May 17th:
https://covid.dh.miami.edu/get/?lang=es&geo=co&date=2020-05-17
- All tweets in Spanish from Florida:
https://covid.dh.miami.edu/get/?lang=es&geo=fl&date=all
- Tweets from Argentina from April 24th to 28th:
https://covid.dh.miami.edu/get/?lang=es&geo=ar&date=from-2020-04-24-to-2020-04-28
- All tweets from Spain during April:
https://covid.dh.miami.edu/get/?lang=es&geo=es&date=month-2020-04
Please, have fun! 😉
Remember: if the file has not already been generated in the database, it will take a few minutes to be generated.
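Because of that waiting time, a script that fetches a file should be patient. Here is a small Python sketch that saves a collection to disk with a generous timeout. The function name `download_collection` and the ten-minute timeout are our own guesses for illustration, and the sketch assumes the server keeps the request open until the file is ready, which may not be how it behaves in every case.

```python
import shutil
import urllib.request


def download_collection(url, dest_path, timeout=600, opener=urllib.request.urlopen):
    """Stream the .txt file at `url` into `dest_path`.

    `timeout` (in seconds) is an assumed value, chosen to tolerate files
    that still have to be generated. `opener` can be swapped out for testing.
    """
    with opener(url, timeout=timeout) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


if __name__ == "__main__":
    download_collection(
        "https://covid.dh.miami.edu/get/?lang=es&geo=co&date=2020-05-17",
        "covid_es_co_2020-05-17.txt",
    )
```

If the request times out anyway, waiting a few minutes and running the script again is a reasonable first thing to try.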