
covid-analytics

Modeling, simulation, and analytics of COVID-19 data

Coda TimeSeries Library

This site will host documentation for the coda timeseries library and for CodaApp, a TornadoFx application for browsing COVID-19 timeseries data from public sources.

Package Structure

The following two modules provide the core API for working with timeseries data in the coda library:

The following four modules provide specific functionality and data associated with these:

In addition, there is a utility module coda-utils for general-purpose utilities.

TimeSeries Model and Data Structure

COVID (and similar) data involves hundreds of metrics for which daily measurements or observations are worth tracking. Analysis or presentation almost always requires applying a series of operations to the raw values first (e.g. 7-day averages). The coda-time module provides a common baseline for the most common timeseries aggregations, derivations, and other analytics, along with efficient storage utilities.
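As an illustration of the kind of operation meant here, the following is a minimal sketch of a trailing 7-day average over daily values. The function name `movingAverage` and the use of a plain `Map<LocalDate, Double>` are assumptions made for this example, not the coda-time API, and the sketch assumes the dates are contiguous (no missing days).

```kotlin
import java.time.LocalDate

/**
 * Trailing moving average over a daily series.
 * Hypothetical helper for illustration; not part of coda-time.
 * Assumes the input has one value per day with no gaps.
 */
fun movingAverage(daily: Map<LocalDate, Double>, window: Int = 7): Map<LocalDate, Double> {
    require(window > 0) { "window must be positive" }
    val sorted = daily.toSortedMap()
    val dates = sorted.keys.toList()
    val values = sorted.values.toList()
    // One average per full window; the first (window - 1) dates have no result.
    val averages = values.windowed(window) { it.average() }
    return dates.drop(window - 1).zip(averages).toMap()
}
```

Each result is keyed by the last date in its window, so a series with fewer than `window` values yields an empty map.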

Reporting a single value in this context (hundreds of metrics, thousands of areas) involves three steps: (i) select the timeseries (by source, area, and metric), (ii) apply any intermediate operations to that series, and (iii) sample the value. Before selecting a timeseries, one might also need an operation on a “cohort” of many timeseries (e.g. the median across areas), or the derivation of a new timeseries from several others (e.g. positive tests divided by total tests).
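The steps above can be sketched roughly as follows. Everything here is hypothetical: the `Series` class and the functions `select`, `dailyChange`, `sample`, `ratio`, and `cohortMedian` are stand-ins invented for this example and do not reflect the actual coda API.

```kotlin
import java.time.LocalDate

// Hypothetical stand-in for a timeseries; names are illustrative only.
data class Series(
    val source: String,
    val areaId: String,
    val metric: String,
    val values: Map<LocalDate, Double>
)

// (i) Select a single series by source, area, and metric.
fun select(all: List<Series>, source: String, areaId: String, metric: String): Series? =
    all.firstOrNull { it.source == source && it.areaId == areaId && it.metric == metric }

// (ii) An example intermediate operation: day-over-day change of a cumulative series.
fun dailyChange(series: Series): Series {
    val deltas = series.values.toSortedMap().entries
        .zipWithNext { prev, next -> next.key to (next.value - prev.value) }
        .toMap()
    return series.copy(metric = "${series.metric} (daily)", values = deltas)
}

// (iii) Sample the value on one date; null if the series has no value there.
fun sample(series: Series, date: LocalDate): Double? = series.values[date]

// Derivation from several series, e.g. positive tests divided by total tests.
// Dates missing from either input, or with a zero denominator, are skipped.
fun ratio(numerator: Series, denominator: Series, metric: String): Series {
    val values = numerator.values.keys.intersect(denominator.values.keys)
        .filter { denominator.values.getValue(it) != 0.0 }
        .associateWith { numerator.values.getValue(it) / denominator.values.getValue(it) }
    return numerator.copy(metric = metric, values = values)
}

// Cohort operation: median across areas on a single date.
fun cohortMedian(cohort: List<Series>, date: LocalDate): Double? {
    val xs = cohort.mapNotNull { it.values[date] }.sorted()
    if (xs.isEmpty()) return null
    val mid = xs.size / 2
    return if (xs.size % 2 == 1) xs[mid] else (xs[mid - 1] + xs[mid]) / 2
}
```

A single reported value would then be something like `select(all, source, areaId, metric)?.let { sample(dailyChange(it), date) }`, with cohort or derivation steps run beforehand where needed.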

A TimeSeries is a data structure that stores values by date, together with information about where the data comes from and what it refers to. It has the following fields:

In general, timeseries may also specify values by hour, month, etc. or by specific timestamps, but these are not yet supported.
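To make the idea of date-indexed values concrete, here is a minimal sketch of one possible daily series. The class name `DailySeries` and all of its field names are assumptions for illustration, loosely based on the selection keys mentioned earlier (source, area, metric); they are not the coda-time definition. Storing a start date plus a dense list of values is just one possible layout for daily data.

```kotlin
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Hypothetical shape; field names are assumptions, not the coda-time definition.
data class DailySeries(
    val source: String,
    val areaId: String,
    val metric: String,
    val start: LocalDate,      // date of values[0]
    val values: List<Double>   // one value per consecutive day
) {
    /** Last date covered, or null when the series is empty. */
    val end: LocalDate?
        get() = if (values.isEmpty()) null else start.plusDays(values.size - 1L)

    /** Value on the given date, or null when the date is outside the stored range. */
    operator fun get(date: LocalDate): Double? {
        val offset = ChronoUnit.DAYS.between(start, date)
        return if (offset >= 0 && offset < values.size) values[offset.toInt()] else null
    }
}
```

With this layout a lookup is a simple index computation, at the cost of requiring a value for every day between the first and last date.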

The APIs for timeseries and area are defined within the coda-area and coda-time modules, as shown here: Coda Core Module UML

CodaApp

CodaApp is a JavaFx data exploration tool written using TornadoFx. It was initially written early in the COVID-19 pandemic (Spring and Summer 2020), so it emphasizes understanding data from the first wave. However, it can still be used to load and display the most recent data available from the JHU CSSE COVID-19 dataset.

Currently, the tool has four tabs: