This tutorial covers multiple use cases with different database backends. The table below shows which components are needed for each:
| Component | Solar system / OMERO | OMOP CDM | GBIF | GraphDB mode |
|---|---|---|---|---|
| Java 21+ | required | required | required | required |
| Ontop CLI | required | required | required | optional |
| PostgreSQL client | required | — | — | — |
| DuckDB | — | required | required | optional |
| Python 3 + rdflib + pandas | warmup chapter | — | — | — |
| Apache Jena (arq) | warmup chapter | — | — | — |
| GraphDB Desktop | — | — | — | required |
| DuckDB JDBC driver | — | — | — | required |
| AWS CLI | — | — | for extraction | — |
1. Java 21+
Check whether Java is already installed:

    java -version

You should see version 21 or higher. If not, install it.

macOS (Homebrew):

    brew install openjdk@21
    sudo ln -sfn /opt/homebrew/opt/openjdk@21/libexec/openjdk.jdk \
      /Library/Java/JavaVirtualMachines/openjdk-21.jdk

Linux (Debian/Ubuntu):

    sudo apt update
    sudo apt install openjdk-21-jdk

Windows: Download OpenJDK 21 from https://

After installing, run the check again:

    java -version

2. Ontop CLI
Ontop CLI is the core component that powers the SPARQL endpoint.
Download the latest 5.x CLI distribution from https://ontop-vkg.org/download/

Unzip the archive:

    unzip ontop-cli-5.x.zip

Add the `bin` directory to your PATH.

macOS / Linux:

    export PATH=$PATH:/path/to/ontop-cli-5.x/bin

Add this line to your ~/.zshrc or ~/.bashrc to make it permanent.

Windows: Add the `bin` directory to your System PATH via Environment Variables.

Verify the installation:

    ontop --version

Expected output:

    Ontop version 5.x

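The Java and Ontop checks above can also be scripted. The following is a minimal sketch, not part of the tutorial itself: the function names and the 30-second timeout are illustrative choices, and it only confirms that an `ontop` launcher is on the PATH rather than checking its version.

```python
import re
import shutil
import subprocess

REQUIRED_JAVA_MAJOR = 21  # minimum stated in section 1


def parse_java_major(version_output):
    """Return the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy releases report themselves as 1.x ("1.8.0_292" is Java 8).
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version` and parse it; None if Java is unavailable."""
    if shutil.which("java") is None:
        return None
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # The JVM prints its version banner to stderr, so read both streams.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("java: not found on PATH")
    elif major < REQUIRED_JAVA_MAJOR:
        print(f"java: found version {major}, need {REQUIRED_JAVA_MAJOR}+")
    else:
        print(f"java: OK (version {major})")
    # Only checks that the launcher is reachable, not which version it is.
    print("ontop:", shutil.which("ontop") or "not found on PATH")
```

Parsing is kept separate from running the command so the version logic can be checked against sample banners without a JVM installed.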
JDBC drivers
Place the appropriate JDBC driver .jar files in the Ontop jdbc/ directory:

- PostgreSQL (for solar system and OMERO): Download from https://jdbc.postgresql.org/download/
- DuckDB (for OMOP and GBIF): Download from https://repo1.maven.org/maven2/org/duckdb/duckdb_jdbc/

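A quick way to confirm that both drivers landed in the right place is to look for their `.jar` files in the Ontop `jdbc/` directory. The sketch below assumes the usual published filenames (`postgresql-<version>.jar` and `duckdb_jdbc-<version>.jar`); the function name and patterns are illustrative, so adjust them if your files are named differently.

```python
from pathlib import Path

# Assumption: filename patterns follow how the two drivers are usually
# published (postgresql-<version>.jar, duckdb_jdbc-<version>.jar).
DRIVER_PATTERNS = {
    "PostgreSQL": "postgresql-*.jar",
    "DuckDB": "duckdb_jdbc-*.jar",
}


def missing_drivers(jdbc_dir, patterns=DRIVER_PATTERNS):
    """Return the names of drivers with no matching .jar in jdbc_dir."""
    jdbc_path = Path(jdbc_dir)
    if not jdbc_path.is_dir():
        # A missing directory means every driver is missing.
        return sorted(patterns)
    return sorted(
        name
        for name, pattern in patterns.items()
        if not any(jdbc_path.glob(pattern))
    )
```

For example, `missing_drivers("/path/to/ontop-cli-5.x/jdbc")` returns an empty list once both drivers are in place. Only the driver for the use case you plan to run is actually needed.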
3. Python 3 with rdflib and pandas (warmup chapter)
The warmup exercise uses Python to convert CSV data to RDF. Install the two required packages:

    pip install rdflib pandas

4. Apache Jena: arq (warmup chapter)
arq is a command-line SPARQL query tool for testing RDF files locally.

- Download Apache Jena from https://jena.apache.org/download/
- Unzip it and add the `bin/` directory to your PATH

Verify:

    arq --version

5. DuckDB (OMOP and GBIF use cases)
Verify:

    duckdb --version

6. GraphDB Desktop (optional, Desktop mode only)
Download from https://
After installation, start GraphDB and open the Workbench at:
http://
DuckDB JDBC driver for GraphDB
GraphDB’s embedded Ontop needs the DuckDB JDBC driver:

Download the driver `.jar` from: https://repo1.maven.org/maven2/org/duckdb/duckdb_jdbc/

Place it in the GraphDB `lib/ext` directory:

| OS | Path |
|---|---|
| macOS | GraphDB.app/Contents/app/lib/ext/ |
| Linux | graphdb/lib/ext/ |
| Windows | graphdb\lib\ext\ |

Restart GraphDB after placing the driver.
7. AWS CLI (GBIF extraction only)
The GBIF chapter uses the AWS CLI to list S3 bucket contents. No AWS account or credentials are needed — the GBIF bucket is public.
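Anonymous access to a public bucket is normally requested with the AWS CLI's `--no-sign-request` option, which skips credential lookup. The sketch below only builds and prints such a command. The bucket name and the `occurrence/` prefix are assumptions that are not taken from this tutorial, so confirm both in the GBIF chapter.

```python
import shlex

# Assumption: name of a GBIF public bucket; verify it in the GBIF chapter.
ASSUMED_GBIF_BUCKET = "gbif-open-data-us-east-1"


def s3_list_command(bucket, prefix=""):
    """Build an `aws s3 ls` command that skips request signing."""
    # --no-sign-request tells the AWS CLI not to load or use credentials,
    # which is what allows anonymous access to a public bucket.
    return ["aws", "s3", "ls", "--no-sign-request", f"s3://{bucket}/{prefix}"]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing touches the network.
    # The "occurrence/" prefix is a hypothetical example.
    print(shlex.join(s3_list_command(ASSUMED_GBIF_BUCKET, "occurrence/")))
```

Passing the returned list to `subprocess.run` would execute the listing once the AWS CLI is installed.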
Tested versions
| Software | Tested version |
|---|---|
| Ontop CLI | 5.4.x / 5.5.x |
| GraphDB Desktop | 10.4.x |
| DuckDB | 0.10.x |
| Java | 17+ |
| Python | 3.10+ |
| Apache Jena | 4.x / 5.x |
| PostgreSQL | 14+ |
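To compare an installed version against the table above, the `x` in each entry can be treated as a wildcard for the remaining version components. The helper below is a small sketch under that assumption. It covers only the rows written in `N.x` form, and the function names are illustrative.

```python
# Rows from the table above that use the "N.x" wildcard form.
TESTED_VERSIONS = {
    "Ontop CLI": ["5.4.x", "5.5.x"],
    "GraphDB Desktop": ["10.4.x"],
    "DuckDB": ["0.10.x"],
    "Apache Jena": ["4.x", "5.x"],
}


def matches_pattern(version, pattern):
    """True if version matches a pattern such as "5.4.x" (x = anything)."""
    fixed = [part for part in pattern.split(".") if part != "x"]
    # Some tools prefix the number with "v" (for example "v0.10.2").
    parts = version.lstrip("v").split(".")
    return parts[: len(fixed)] == fixed


def is_tested(tool, version):
    """True if version falls inside one of the tested ranges for tool."""
    return any(
        matches_pattern(version, pattern)
        for pattern in TESTED_VERSIONS.get(tool, [])
    )
```

A version outside these ranges is not necessarily broken; it simply has not been confirmed against this tutorial.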