stratum-duckdb
Run full analytical SQL against Parquet, CSV, and JSON files on any static website. Powered by DuckDB WASM — no server required. Part of the Stratum toolkit for research data publishing.
Live demo
Loading countries.csv, languages.csv, and
indicators.csv from this site and joining them with DuckDB
entirely in your browser.
Quick start — plain HTML
stratum-duckdb is an ES module loaded via import().
Only the two DuckDB binary files need to be self-hosted
(see README
for download instructions).
<script type="module">
const StratumDuckDB =
(await import('libs/duckdb/stratum-duckdb.esm.js')).default;
const db = await StratumDuckDB.open({
duckdbPath: 'libs/duckdb/', // folder with worker.js and .wasm
});
// Register local files once
await db.registerFile('countries', 'data/countries.parquet');
await db.registerFile('indicators', 'data/indicators.parquet');
// Full analytical SQL — including window functions
const rows = await db.query(`
SELECT
c.name,
i.life_expectancy,
RANK() OVER (ORDER BY i.life_expectancy DESC) AS rank
FROM countries c
JOIN indicators i ON c.iso3 = i.country_iso
WHERE i.year = 2022
ORDER BY rank
`);
</script>
Quick start — Quarto / ObservableJS
Add this block to your project's _duckdb-init.html
(included via _quarto.yml). It loads stratum-duckdb via
import() and exposes a ready Database as a
Promise that OJS cells can safely await.
<!-- _duckdb-init.html -->
<script type="module">
(async function () {
const siteLibScript = document.querySelector('script[src*="site_libs/"]');
const root = siteLibScript
? siteLibScript.getAttribute('src').replace(/site_libs\/.*$/, '')
: '';
window._duckdbBase = root + 'libs/duckdb/';
window._dataRoot = root + 'data/';
// Expose the initialised Database as a Promise so OJS cells can await it.
window._stratumDuckDB = (async () => {
const StratumDuckDB =
(await import(window._duckdbBase + 'stratum-duckdb.esm.js')).default;
return StratumDuckDB.open({ duckdbPath: window._duckdbBase });
})();
}());
</script>
// In your .qmd file (OJS cell):
db = {
const database = await window._stratumDuckDB;
await database.registerFile('indicators', 'indicators.csv');
return database;
}
rows = db.query("SELECT * FROM indicators WHERE year = 2022")
Inputs.table(await rows)
Remote file query — no registration needed
DuckDB uses HTTP range requests to fetch only the columns and row groups each query needs — even for files hundreds of MB in size.
const rows = await db.query(`
SELECT country_iso, AVG(life_expectancy) AS avg_life
FROM 'https://github.com/org/repo/releases/download/v1/indicators.parquet'
WHERE year = 2022
GROUP BY country_iso
ORDER BY avg_life DESC
LIMIT 10
`);
Hosting your data
Your data files can live anywhere with public HTTPS and
Access-Control-Allow-Origin: *.
For Parquet files, the server must also support HTTP range requests
(Accept-Ranges: bytes):
- GitHub Pages — same origin, no CORS needed
- GitHub Releases — free, 2 GB per file, CORS + range requests
- Zenodo / Hugging Face Datasets — public research repositories
- Cloudflare R2 / AWS S3 — configure bucket CORS policy
See STRATUM.md for a comparison of stratum-duckdb and stratum-sqlite and guidance on choosing the right library for your dataset.