Installation

This tutorial covers multiple use cases with different database backends. The table below shows which components are needed for each:

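Component                  | Solar system / OMERO | OMOP CDM | GBIF           | GraphDB mode
---------------------------|----------------------|----------|----------------|-------------
Java 21+                   | required             | required | required       | required
Ontop CLI                  | required             | required | required       | optional
PostgreSQL client          | required             |          |                |
DuckDB                     |                      | required | required       | optional
Python 3 + rdflib + pandas | warmup chapter       |          |                |
Apache Jena (arq)          | warmup chapter       |          |                |
GraphDB Desktop            |                      |          |                | required
DuckDB JDBC driver         |                      |          |                | required
AWS CLI                    |                      |          | for extraction |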

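As a quick way to see which of these components are already present, the following sketch looks each command up on PATH. The command names in the example call (psql for the PostgreSQL client, python3, aws) are the usual defaults and are assumptions here; trim the list to the use cases you plan to follow.

```shell
# Report which command-line tools are available on PATH.
# The function accepts any list of command names.
report_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    # Exit status is non-zero when at least one tool is missing.
    [ "$missing" -eq 0 ]
}

# Example call (assumed command names):
# report_tools java ontop psql duckdb python3 arq aws
```

A tool reported as missing may simply be installed outside PATH, so treat the output as a first check rather than a verdict.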
1. Java 21+

Check whether Java is already installed:

java -version

You should see version 21 or higher. If not, install it:

macOS (Homebrew):

brew install openjdk@21
sudo ln -sfn /opt/homebrew/opt/openjdk@21/libexec/openjdk.jdk \
    /Library/Java/JavaVirtualMachines/openjdk-21.jdk
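The version check at the start of this section can also be scripted. The sketch below is one possible approach: it parses the first line printed by java -version (which goes to stderr) and compares the major version against 21. The helper name java_major_from_line is made up for this illustration.

```shell
# Extract the major version from the first line of `java -version` output.
# Handles the modern scheme ("21.0.2") and the legacy one ("1.8.0_292").
java_major_from_line() {
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$version" in
        1.*) version=${version#1.} ;;   # legacy naming: 1.8 means Java 8
    esac
    # Keep only the digits before the first '.', '_' or '-'.
    printf '%s\n' "${version%%[._-]*}"
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, hence the 2>&1 redirect.
    major=$(java_major_from_line "$(java -version 2>&1 | head -n 1)")
    if [ "${major:-0}" -ge 21 ] 2>/dev/null; then
        echo "Java $major is sufficient"
    else
        echo "Java 21+ required, found: ${major:-unknown}" >&2
    fi
fi
```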

2. Ontop CLI

Ontop CLI is the core component that powers the SPARQL endpoint.

  1. Download the latest 5.x CLI distribution from https://ontop-vkg.org/download/

  2. Unzip the archive:

    unzip ontop-cli-5.x.zip
  3. Add the bin directory to your PATH:

    macOS / Linux:

    export PATH="$PATH:/path/to/ontop-cli-5.x/bin"

    Add this line to your ~/.zshrc or ~/.bashrc to make it permanent.

  4. Verify the installation:

    ontop --version

    Expected output: Ontop version 5.x

JDBC drivers

Place the JDBC driver .jar file for each database you plan to use in the jdbc/ directory of the Ontop installation.

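Copying a driver into place might look like the sketch below. The function name and both paths in the example call are placeholders, not part of Ontop; the only thing taken from this page is that drivers belong in the jdbc/ directory of the unzipped Ontop CLI.

```shell
# Copy a JDBC driver .jar into the jdbc/ directory of an Ontop CLI install.
# $1: path to the downloaded driver .jar
# $2: directory where the Ontop CLI archive was unzipped
install_jdbc_driver() {
    jar=$1
    ontop_home=$2
    if [ ! -f "$jar" ]; then
        echo "driver not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$ontop_home/jdbc" ]; then
        echo "no jdbc/ directory under: $ontop_home" >&2
        return 1
    fi
    cp "$jar" "$ontop_home/jdbc/"
}

# Example call (hypothetical paths):
# install_jdbc_driver ~/Downloads/some-driver.jar /path/to/ontop-cli-5.x
```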

3. Python 3 with rdflib and pandas (warmup chapter)

The warmup exercise uses Python to convert CSV data to RDF:

pip install rdflib pandas
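To confirm that both packages are importable by the interpreter you will actually run, a small check along these lines may help. It assumes the interpreter is available as python3; the helper name have_py_module is made up for this sketch.

```shell
# Return success if python3 can import the given module.
have_py_module() {
    python3 -c "import $1" >/dev/null 2>&1
}

for module in rdflib pandas; do
    if have_py_module "$module"; then
        echo "ok:      $module"
    else
        echo "missing: $module (try: pip install $module)"
    fi
done
```

If pip and python3 belong to different Python installations, the packages can install successfully and still be reported as missing here.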

4. Apache Jena — arq (warmup chapter)

arq is a command-line SPARQL query tool for testing RDF files locally:

  1. Download Apache Jena from https://jena.apache.org/download/

  2. Unzip and add the bin/ directory to your PATH

Verify:

arq --version
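Beyond the version check, a small end-to-end query shows that arq can load data and evaluate SPARQL. The sketch below writes a one-triple Turtle file and a SELECT query to a temporary directory and runs them through arq if it is on PATH. The example.org data and the file names are invented for illustration.

```shell
# Create a tiny RDF file and a SPARQL query in a scratch directory.
workdir=$(mktemp -d)

cat > "$workdir/sample.ttl" <<'EOF'
@prefix ex: <http://example.org/> .
ex:Earth ex:orbits ex:Sun .
EOF

cat > "$workdir/all.rq" <<'EOF'
SELECT ?s ?p ?o WHERE { ?s ?p ?o }
EOF

# Run the query against the file; skip quietly if arq is not installed.
if command -v arq >/dev/null 2>&1; then
    arq --data="$workdir/sample.ttl" --query="$workdir/all.rq"
else
    echo "arq not found on PATH" >&2
fi
```

A working installation should print a result table containing the single triple from sample.ttl.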

5. DuckDB (OMOP and GBIF use cases)

macOS:

brew install duckdb

Verify:

duckdb --version
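As an additional smoke test, the CLI can run a single statement against a transient in-memory database. This is a minimal sketch; the wrapper name duckdb_scalar is made up, and it relies on the CLI's -c, -csv, and -noheader options.

```shell
# Run one SQL statement in an in-memory DuckDB session and print the
# result as CSV without a header row.
duckdb_scalar() {
    duckdb -csv -noheader -c "$1"
}

# Example call:
# duckdb_scalar "SELECT 6 * 7;"
```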

6. GraphDB Desktop (optional — Desktop mode only)

Download from https://www.ontotext.com/products/graphdb/ and follow the vendor’s installation instructions.

After installation, start GraphDB and open the Workbench at: http://localhost:7200

DuckDB JDBC driver for GraphDB

GraphDB’s embedded Ontop needs the DuckDB JDBC driver:

  1. Download the driver .jar from: https://repo1.maven.org/maven2/org/duckdb/duckdb_jdbc/

  2. Place it in the GraphDB lib/ext directory:

    OS      | Path
    --------|-----------------------------------
    macOS   | GraphDB.app/Contents/app/lib/ext/
    Linux   | graphdb/lib/ext/
    Windows | graphdb\lib\ext\
  3. Restart GraphDB after placing the driver.

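The download in step 1 can be scripted as well. Maven Central uses a fixed layout (version directory, then artifact-version.jar), so the direct URL can be derived from a version number. In the sketch below the function name is made up and DRIVER_VERSION is a placeholder; choose a version from the directory listing that matches your DuckDB release.

```shell
# Build the Maven Central download URL for a DuckDB JDBC driver version,
# following the standard repository layout:
#   <base>/<version>/duckdb_jdbc-<version>.jar
duckdb_jdbc_url() {
    version=$1
    base="https://repo1.maven.org/maven2/org/duckdb/duckdb_jdbc"
    printf '%s/%s/duckdb_jdbc-%s.jar\n' "$base" "$version" "$version"
}

# Example call (DRIVER_VERSION is a placeholder you must set yourself):
# curl -fLO "$(duckdb_jdbc_url "$DRIVER_VERSION")"
```

The -f flag makes curl fail on an HTTP error instead of saving an error page, which catches a mistyped version early.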

7. AWS CLI (GBIF extraction only)

The GBIF chapter uses the AWS CLI to list S3 bucket contents. No AWS account or credentials are needed — the GBIF bucket is public.

macOS:

brew install awscli
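Without configured credentials, the AWS CLI normally refuses to run S3 commands. For public buckets, the global option --no-sign-request sends anonymous requests instead. The wrapper below is a sketch of that usage; the function name is made up, and BUCKET_NAME stands in for the bucket named in the GBIF chapter.

```shell
# List the contents of a public S3 bucket without AWS credentials.
# --no-sign-request tells the AWS CLI to send unsigned (anonymous) requests.
# $1: bucket name, $2: optional key prefix (for example "some/folder/")
list_public_bucket() {
    bucket=$1
    prefix=${2:-}
    aws s3 ls --no-sign-request "s3://$bucket/$prefix"
}

# Example call (BUCKET_NAME is a placeholder):
# list_public_bucket "$BUCKET_NAME"
```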

Tested versions

Software        | Tested version
----------------|---------------
Ontop CLI       | 5.4.x / 5.5.x
GraphDB Desktop | 10.4.x
DuckDB          | 0.10.x
Java            | 17+
Python          | 3.10+
Apache Jena     | 4.x / 5.x
PostgreSQL      | 14+