COVID-19 Real-time Data Tracker

Background

It was 2020 and I had two things I was trying to put to use: I'd just passed the AWS Certified Solutions Architect Associate exam, and I'd been learning statistics with Python. The pandemic gave me a problem worth solving — I wanted to be able to look up what was actually happening with COVID infections in my own area, by ZIP code.

This was the first time I designed and built a production system from scratch entirely on AWS. That mattered to me more than the stats themselves.

How it works

The New York Times was publishing daily COVID-19 data at the county level. I built a pipeline to pull that data every day and make it queryable by ZIP code:

A Python Lambda function ran on a daily schedule, pulling the latest NYT dataset and loading it into Amazon DynamoDB.
A separate table mapped ZIP codes to their corresponding US counties, bridging the gap between how people think about location and how the public health data was structured.
An Amazon API Gateway endpoint accepted a ZIP code, invoked a Lambda function to look it up, and returned the county's infection statistics.
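
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original code: plain dictionaries stand in for the two DynamoDB tables so the example runs without AWS credentials, the function names (load_nyt_csv, lookup_handler), the "zip" query parameter, and the sample ZIP mapping are all hypothetical, and the CSV columns follow the NYT county file layout (date, county, state, fips, cases, deaths). It also assumes each ZIP code maps to a single county, which is a simplification.

```python
import csv
import io
import json

# Assumption: in-memory stand-ins for the two DynamoDB tables.
county_stats = {}                  # county FIPS code -> latest statistics row
zip_to_fips = {"98201": "53061"}   # hypothetical sample: one ZIP -> one county


def load_nyt_csv(csv_text, table):
    """Ingestion step: keep only the most recent row for each county."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        fips = row["fips"]
        if not fips:
            continue  # skip rows that carry no FIPS code
        current = table.get(fips)
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if current is None or row["date"] > current["date"]:
            table[fips] = {
                "date": row["date"],
                "county": row["county"],
                "state": row["state"],
                "cases": int(row["cases"] or 0),
                "deaths": int(row["deaths"] or 0),  # tolerate blank values
            }
    return table


def _response(status, payload):
    """Build a Lambda proxy-style response for API Gateway."""
    return {"statusCode": status, "body": json.dumps(payload)}


def lookup_handler(event, context=None):
    """Lookup step: ZIP code in, county statistics out."""
    params = event.get("queryStringParameters") or {}
    zip_code = (params.get("zip") or "").strip()
    if len(zip_code) != 5 or not (zip_code.isascii() and zip_code.isdigit()):
        return _response(400, {"error": "expected a 5-digit ZIP code"})

    fips = zip_to_fips.get(zip_code)
    stats = county_stats.get(fips) if fips else None
    if stats is None:
        return _response(404, {"error": "no data for this ZIP code"})
    return _response(200, {"zip": zip_code, "fips": fips, **stats})
```

In a deployed version, the dictionaries would presumably be replaced by DynamoDB reads and writes, and load_nyt_csv would be called from the scheduled Lambda after downloading the day's file. Keeping the ZIP-to-county mapping separate from the statistics means the daily job only ever touches county rows.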

The data ran from early 2020 through May 2022, when the NYT wound down its dataset. I stopped updating the pipeline at that point.

Try it — historical data

The tracker below is still wired to the original API. The data reflects statistics through May 2022 — enter any US ZIP code to see what the numbers looked like at the end of the dataset.

What I took from it

What I actually cared about was the system, not the statistics. I had an idea, knew roughly what AWS could do, and built something end-to-end: scheduled ingestion, a database with a lookup pattern, and a live API someone else could call. For a project I built while learning, it held up.

I also learned something about designing for non-technical users. The ZIP code input was a deliberate choice; county FIPS codes would have been easier to work with technically, but no one thinks in FIPS codes.