stratum-duckdb

Run full analytical SQL against Parquet, CSV, and JSON files on any static website. Powered by DuckDB WASM — no server required. Part of the Stratum toolkit for research data publishing.

DuckDB WASM Parquet native Remote file queries Window functions No server Quarto / OJS ready

Live demo

Loading countries.csv, languages.csv, and indicators.csv from this site and joining them with DuckDB entirely in your browser.

⏳ Initialising DuckDB WASM…

Quick start — plain HTML

stratum-duckdb is an ES module loaded via import(). Only the two DuckDB binary files need to be self-hosted (see README for download instructions).

<script type="module">
  const StratumDuckDB =
    (await import('libs/duckdb/stratum-duckdb.esm.js')).default;

  const db = await StratumDuckDB.open({
    duckdbPath: 'libs/duckdb/',   // folder with worker.js and .wasm
  });

  // Register local files once
  await db.registerFile('countries',  'data/countries.parquet');
  await db.registerFile('indicators', 'data/indicators.parquet');

  // Full analytical SQL — including window functions
  const rows = await db.query(`
    SELECT
      c.name,
      i.life_expectancy,
      RANK() OVER (ORDER BY i.life_expectancy DESC) AS rank
    FROM countries c
    JOIN indicators i ON c.iso3 = i.country_iso
    WHERE i.year = 2022
    ORDER BY rank
  `);
</script>

Quick start — Quarto / ObservableJS

Add this block to your project's _duckdb-init.html (included via _quarto.yml). It loads stratum-duckdb via import() and exposes a ready Database as a Promise that OJS cells can safely await.

<!-- _duckdb-init.html -->
<script type="module">
(async function () {
  const siteLibScript = document.querySelector('script[src*="site_libs/"]');
  const root = siteLibScript
    ? siteLibScript.getAttribute('src').replace(/site_libs\/.*$/, '')
    : '';

  window._duckdbBase = root + 'libs/duckdb/';
  window._dataRoot   = root + 'data/';

  // Expose the initialised Database as a Promise so OJS cells can await it.
  window._stratumDuckDB = (async () => {
    const StratumDuckDB =
      (await import(window._duckdbBase + 'stratum-duckdb.esm.js')).default;
    return StratumDuckDB.open({ duckdbPath: window._duckdbBase });
  })();
}());
</script>

// In your .qmd file (OJS cell):
db = {
  const database = await window._stratumDuckDB;
  await database.registerFile('indicators', 'indicators.csv');
  return database;
}

rows = db.query("SELECT * FROM indicators WHERE year = 2022")
Inputs.table(await rows)

Remote file query — no registration needed

DuckDB uses HTTP range requests to fetch only the columns and row groups each query needs — even for files hundreds of MB in size.

const rows = await db.query(`
  SELECT country_iso, AVG(life_expectancy) AS avg_life
  FROM 'https://github.com/org/repo/releases/download/v1/indicators.parquet'
  WHERE year = 2022
  GROUP BY country_iso
  ORDER BY avg_life DESC
  LIMIT 10
`);

Hosting your data

Your data files can live anywhere with public HTTPS and Access-Control-Allow-Origin: *. For Parquet files, the server must also support HTTP range requests (Accept-Ranges: bytes):

GitHub Pages — same origin, no CORS needed
GitHub Releases — free, 2 GB per file, CORS + range requests
Zenodo / Hugging Face Datasets — public research repositories
Cloudflare R2 / AWS S3 — configure bucket CORS policy

See STRATUM.md for a comparison of stratum-duckdb and stratum-sqlite and guidance on choosing the right library for your dataset.

stratum-duckdb

Live demo

Top 8 countries by GDP (USD bn)

Life expectancy vs internet access — 2022 (JOIN across two files)

Official languages per country

Quick start — plain HTML

Quick start — Quarto / ObservableJS

Remote file query — no registration needed

Hosting your data