Scans of Source Documents

We are systematically contacting data stewards across Canada to access the disparate source documents that contain Canada's historical infectious disease data. We are making scans of these documents conveniently available for all.

Digitized Spreadsheets

We are manually entering the information provided by these source documents into Excel spreadsheets, which we are making publicly available. The layout of each spreadsheet is identical to that of its source document, making it as easy as possible to compare each reproduction with the original.

Tidy Data Frames and CSV Files

We are producing reproducible, automated processes for converting the digitized spreadsheets into tidy data structures. These data structures contain all of the information in the original source documents, but are more convenient for analysis and discovery.
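
As a rough illustration of what such a conversion might look like, the following Python sketch reshapes a small wide-format grid into tidy records and CSV text. The sheet layout, the column names (disease, week, cases), the function names, and the numbers are all assumptions made up for this example; they are not the project's actual data or code.

```python
import csv
import io

# Hypothetical wide layout mimicking a digitized sheet: one row per disease,
# one column per report week. All values are invented for illustration.
WIDE_SHEET = [
    ["disease", "week_1", "week_2", "week_3"],
    ["measles", "12", "15", ""],
    ["whooping cough", "4", "", "7"],
]


def to_tidy(sheet):
    """Turn a wide grid into one record per (disease, week) cell."""
    header, *rows = sheet
    week_labels = header[1:]
    tidy = []
    for row in rows:
        disease, *cells = row
        for week, cell in zip(week_labels, cells):
            # Keep blank cells as None so nothing in the sheet is silently dropped.
            cases = int(cell) if cell.strip() else None
            tidy.append({"disease": disease, "week": week, "cases": cases})
    return tidy


def to_csv(records):
    """Serialize tidy records to CSV text (None is written as an empty field)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["disease", "week", "cases"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


tidy_records = to_tidy(WIDE_SHEET)
csv_text = to_csv(tidy_records)
```

Each cell of the sheet becomes one row of the tidy table, so blank cells remain visible as explicit missing values rather than disappearing.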

Data Harmonization Tools

The information in the original source documents often presents challenges for data analysis and modelling (e.g., missing report weeks, or disease names and classifications that change over the years). We are therefore producing tools that convert the tidy data structures, which are convenient for exploring the data as they are, into data sets ready for analysis. These tools include options and features for exploring the consequences of various data harmonization and imputation techniques.
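
To make the two challenges named above concrete, here is a minimal Python sketch of one possible harmonization step. The name mapping, the function names, the record layout, and the choice of linear interpolation are all assumptions for illustration; the project's tools may work quite differently.

```python
# Hypothetical mapping from historical name variants to one harmonized name.
NAME_MAP = {
    "whooping cough": "pertussis",
    "pertussis": "pertussis",
}


def harmonize_names(records, name_map):
    """Replace each disease name with its harmonized form, if one is known."""
    return [
        {**rec, "disease": name_map.get(rec["disease"].lower(), rec["disease"])}
        for rec in records
    ]


def fill_missing_weeks(records, first_week, last_week, impute=False):
    """Return one record per week for a single disease series.

    Weeks absent from the input get cases=None. With impute=True, interior
    gaps are filled by linear interpolation between the nearest reported
    weeks, which is only one of many possible imputation choices.
    """
    by_week = {rec["week"]: rec["cases"] for rec in records}
    weeks = list(range(first_week, last_week + 1))
    values = [by_week.get(week) for week in weeks]
    if impute:
        known = [i for i, value in enumerate(values) if value is not None]
        for left, right in zip(known, known[1:]):
            step = (values[right] - values[left]) / (right - left)
            for i in range(left + 1, right):
                values[i] = values[left] + step * (i - left)
    return [{"week": week, "cases": value} for week, value in zip(weeks, values)]


# Invented example: the same disease reported under two names, with weeks 2-3 missing.
raw = [
    {"disease": "Whooping cough", "week": 1, "cases": 4},
    {"disease": "pertussis", "week": 4, "cases": 10},
]
harmonized = harmonize_names(raw, NAME_MAP)
as_reported = fill_missing_weeks(harmonized, 1, 4)
imputed = fill_missing_weeks(harmonized, 1, 4, impute=True)
```

Running the same series with and without the impute option is a simple way to compare how an imputation choice changes the data set that reaches the analysis stage.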