Platform

Implementation

OpenAQ aggregates air quality data from disparate sources around the world and provides access to them in a single location.

OpenAQ uses an ETL (Extract-Transform-Load) process to ingest and harmonize air quality data. The data process has four main components: fetch, storage, presentation and archive.

Fetch

After a source of air quality data is identified, a fetch adapter is developed to pull data from the source and parse it into a standard file format. Depending on the source of the data, OpenAQ fetch scripts range from HTML scrapers and FTP directory scanners to REST API clients.
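As a rough illustration of what a fetch adapter does, the sketch below parses rows from an imaginary source into one common record shape. It is written in Python for brevity, whereas the actual adapters are written in NodeJS. The `Measurement` fields and the source keys (`station`, `pollutant`, `reading`, `ts`, and so on) are hypothetical and are not OpenAQ's actual standard format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable


@dataclass(frozen=True)
class Measurement:
    """Hypothetical harmonized record; field names are illustrative only."""
    location: str
    parameter: str
    value: float
    unit: str
    datetime_utc: str  # ISO 8601 timestamp in UTC
    latitude: float
    longitude: float


def parse_source_rows(rows: Iterable[dict]) -> list[Measurement]:
    """Transform rows from an imaginary source into the common record shape."""
    parsed = []
    for row in rows:
        # Skip rows the source reports as missing instead of inventing a value.
        if row.get("reading") in (None, ""):
            continue
        # Assumption: this source reports Unix epoch seconds; normalize to UTC.
        timestamp = datetime.fromtimestamp(int(row["ts"]), tz=timezone.utc)
        parsed.append(
            Measurement(
                location=row["station"].strip(),
                parameter=row["pollutant"].strip().lower(),
                value=float(row["reading"]),
                unit=row["unit"],
                datetime_utc=timestamp.isoformat(),
                latitude=float(row["lat"]),
                longitude=float(row["lon"]),
            )
        )
    return parsed
```

Each real source needs its own parsing logic, but every adapter ends by emitting records in the same shape, which is what lets the later storage step treat all sources uniformly.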

Currently, data fetching is split between two repositories:

https://github.com/openaq/openaq-fetch: fetches primarily reference-grade/government sources. Written in NodeJS and runs on AWS Fargate.
https://github.com/openaq/openaq-lcs-fetch: our newer generation of fetch processes, primarily fetching low-cost sensor sources. Written in NodeJS and runs on AWS Lambda.

Storage

Data is stored in a PostgreSQL database that uses the TimescaleDB extension for time-series functionality and PostGIS for geospatial functionality.

View the source code for the database on GitHub: https://github.com/openaq/openaq-db

Presentation

OpenAQ provides a REST API for programmatic access to the database.

Read more in the API section below, or see the API Reference for detailed endpoint documentation.
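A minimal sketch of programmatic access might look like the following. The base URL, the `measurements` endpoint name, and the `parameter` and `limit` query parameters are assumptions made for illustration; consult the API Reference for the endpoints and parameters that are actually supported.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL for illustration; check the API Reference for the current one.
BASE_URL = "https://api.openaq.org/v2"


def build_query_url(endpoint: str, **params) -> str:
    """Build a GET URL for an endpoint, dropping parameters left as None."""
    # Sort the parameters so the resulting URL is deterministic.
    query = urlencode({k: v for k, v in sorted(params.items()) if v is not None})
    url = f"{BASE_URL}/{endpoint.strip('/')}"
    return f"{url}?{query}" if query else url


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Perform the GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `build_query_url("measurements", parameter="pm25", limit=100)` produces a URL that can be passed to `fetch_json`. Keeping URL construction separate from the network call makes the query logic easy to check without contacting the API.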

View the source code for the REST API on GitHub: https://github.com/openaq/openaq-api-v2
