benchmarking

Splitgraph benchmarking

Since Splitgraph defers to Postgres for a lot of its operations (e.g. after checkout, a Splitgraph table becomes a normal PostgreSQL table with change tracking enabled), it's difficult to specify what is considered a benchmark for Splitgraph.

There are two Jupyter notebooks here. The first one, benchmarking, tests the overhead of common Splitgraph operations on a series of synthetic PostgreSQL tables and compares dataset sizes when stored in Splitgraph vs when stored as PostgreSQL tables.

The second one, benchmarking_real_data, uses some datasets that are available on the Splitgraph registry to compare the size improvement from storing them as Splitgraph objects as well as benchmarks querying Splitgraph repositories directly (using layered querying) vs querying them as PostgreSQL tables.

Running the example

You can view the notebooks in your browser. Alternatively, you can build and start up the engine:

export COMPOSE_PROJECT_NAME=splitgraph_example 
docker-compose down -v
docker-compose build
docker-compose up -d
sgr init

You need to have been logged into the registry (sgr cloud login or sgr cloud login-api).

You can also use your own engine that's managed by sgr engine.

Install this package with Poetry: poetry install

Open the notebook in Jupyter: jupyter notebook benchmarking.ipynb or jupyter notebook benchmarking_real_data.ipynb

Name		Name	Last commit message	Last commit date
parent directory ..
.sgconfig		.sgconfig
README.md		README.md
benchmarking.ipynb		benchmarking.ipynb
benchmarking_real_data.ipynb		benchmarking_real_data.ipynb
docker-compose.yml		docker-compose.yml
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Splitgraph benchmarking

Running the example

FilesExpand file tree

benchmarking

Directory actions

More options

Directory actions

More options

Latest commit

History

benchmarking

Folders and files

parent directory

README.md

Splitgraph benchmarking

Running the example