Visualising the COVID-19 Spread in Singapore using Tableau
Project description
This visualisation shows the spread of COVID-19 in Singapore over time. It is filtered to May 2021 because the dataset was updated more promptly during this period, when community cases were rising. The date range can be changed easily, and the visualisation updates immediately to reflect the new range.
Data preparation
This project uses the Public Places Visited by Singapore Covid-19 Cases dataset by Hui Xiang Chua. The list includes public places COVID-19 cases had visited for more than 30 minutes, released by the Ministry of Health Singapore beginning 25 May 2020.
To plot this data on Tableau, we need the longitude and latitude values of the locations. We obtained this data using OneMap’s API, which provides updated location information from the Singapore Land Authority. The documentation for OneMap can be found here.
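As a rough illustration of such a lookup, the sketch below builds a search URL for one address and reads the coordinates from the first match. The endpoint URL, the query parameter names, and the `LATITUDE`/`LONGITUDE` response fields are assumptions based on OneMap's public search service and should be checked against its current documentation (which may also require an access token). The function names are hypothetical and are not taken from the project's repository.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; verify against OneMap's current documentation.
ONEMAP_SEARCH_URL = "https://www.onemap.gov.sg/api/common/elastic/search"


def build_search_url(address, base_url=ONEMAP_SEARCH_URL):
    """Build a search URL for one address (parameter names are assumptions)."""
    query = urllib.parse.urlencode(
        {"searchVal": address, "returnGeom": "Y", "getAddrDetails": "Y", "pageNum": 1}
    )
    return f"{base_url}?{query}"


def parse_first_result(payload):
    """Return (latitude, longitude) of the first match, or None if nothing matched."""
    results = payload.get("results") or []
    if not results:
        return None
    first = results[0]
    # Assumed response fields; coordinates arrive as strings.
    return float(first["LATITUDE"]), float(first["LONGITUDE"])


def geocode(address, timeout=10):
    """Look up one address over HTTP; returns (lat, lon) or None."""
    with urllib.request.urlopen(build_search_url(address), timeout=timeout) as resp:
        return parse_first_result(json.load(resp))
```

Keeping URL building and response parsing separate from the network call makes it easy to check the parsing logic against a saved response, and to return `None` for addresses that need manual correction.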
Before making GET requests to OneMap’s API, we cleaned the data by extracting the location addresses and fixing typos in them. Addresses that could not be found were also replaced. The processed address for each location was then passed as a parameter to OneMap’s API to obtain the location’s geocode. However, OneMap’s API does not return a location’s region or planning area. To generate this, we retrieved a list of Singapore’s planning areas, found their respective geocodes, and assigned each location to its closest planning area.
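The closest-planning-area step can be sketched as a nearest-neighbour search over each area's reference point. This is a minimal example that assumes great-circle (haversine) distance; the project may measure distance differently. The coordinates below are approximate placeholders for illustration, not values from the project's data, and the function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_area(lat, lon, areas):
    """Return the name of the planning area whose reference point is closest."""
    return min(areas, key=lambda name: haversine_km(lat, lon, *areas[name]))


# Illustrative reference points only (approximate, not the project's data).
PLANNING_AREAS = {
    "Changi": (1.3450, 103.9830),
    "Sengkang": (1.3910, 103.8950),
    "Jurong East": (1.3330, 103.7420),
}

# Example: a location near Changi Airport is assigned to "Changi".
area = nearest_area(1.3570, 103.9880, PLANNING_AREAS)
```

Over distances as short as Singapore's, plain Euclidean distance on the coordinates would likely give the same assignments. Haversine is used here only because it stays correct in general.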
The full Python code used for data processing can be found in this repository.
Visualisation using Tableau
Tableau is an easy-to-use data visualisation tool. It offers free one-year licenses to students at accredited academic institutions through the Tableau for Students program.
We initially wanted to create a filled map in Tableau but were unable to, as a filled map requires geographical data at District/County level or finer. Hence we settled on a symbol map instead. We set the Longitude and Latitude variables as the columns and rows of the plot and selected Tableau’s symbol map suggestion.
However, we could not simply use Count(Area) as a marker, because Tableau ignores the Area variable and instead counts the points that share the same geocode. To get the count by date and area, we defined a new field by selecting Analysis > Create Calculated Field from the toolbar and using the following as the field definition:
{ FIXED [Date],[Area] : COUNT(Time) }
We then set this field as a marker. To accentuate the outbreaks, we varied the colour and size of the points according to the number of cases in the area.
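To make the calculated field more concrete, the sketch below reproduces the same logic in plain Python: it counts the rows with a non-null Time for each (Date, Area) pair and attaches that count to every row in the group, which is what a FIXED level-of-detail expression does. The sample rows and the `Count` field name are hypothetical.

```python
from collections import Counter

# Hypothetical rows: one per public place visited by a case.
rows = [
    {"Date": "2021-05-03", "Area": "Changi", "Time": "1200h to 1300h"},
    {"Date": "2021-05-03", "Area": "Changi", "Time": "1500h to 1600h"},
    {"Date": "2021-05-03", "Area": "Sengkang", "Time": "0900h to 1000h"},
    {"Date": "2021-05-04", "Area": "Changi", "Time": None},
]


def add_fixed_count(rows):
    """Attach the per-(Date, Area) count of non-null Time values to each row."""
    counts = Counter(
        (r["Date"], r["Area"]) for r in rows if r.get("Time") is not None
    )
    # Counter returns 0 for groups with no non-null Time values.
    return [{**r, "Count": counts[(r["Date"], r["Area"])]} for r in rows]


counted = add_fixed_count(rows)
```

Because the count is fixed to Date and Area, it stays the same however many distinct geocodes fall inside an area. This is why it behaves differently from a plain count on the map.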
Lastly, we added the Date variable to the Pages shelf to create a timeline slider and show how the locations of COVID-19 cases change across time.
Result and Insights
The resulting Tableau dashboard here shows the spread of COVID-19 in Singapore across time.
From this visualisation, we can see that:
- Changi’s outbreak started on 3 May (4 cases). The situation started escalating on 5 May, when the spread began in the east, and peaked on 8 May, when the Singapore government introduced Phase 2.
- The outbreak at Sengkang began on 7 May and peaked on 10 May.
- The outbreak at Jurong East began on 10 May and was serious from 12 to 14 May.
- After Phase 2 (Heightened Alert) started on 16 May, there were no outbreaks. The number of places visited by cases was greatly reduced as more people stayed home. The locations are scattered and show no obvious trend.
Note: The above visualisation uses data extracted on 5 June 2021 and may not be up to date. This new visualisation uses the latest data; its data source is updated daily. The code for automated data preparation can be found here.