Results

The scripts in this directory load TPC-DS data into BigQuery and run the benchmark. We run the following two scripts:

400-PopulateBigQuery.sh
401-BenchmarkBigQuery.sh
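
As a rough sketch of running the two scripts above back to back, the helper below executes each script it is given in order and stops at the first failure, so the benchmark never runs against a partly loaded dataset. The function name `run_in_order` is hypothetical and not part of this repository. It also assumes the scripts can be executed non-interactively as files, which has not been verified here: the original README says they are meant to be pasted into a terminal by hand.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): run each script passed
# as an argument, in order, and stop as soon as one of them fails.
run_in_order() {
  local script
  for script in "$@"; do
    echo "Running ${script}"
    bash "${script}" || { echo "Failed: ${script}" >&2; return 1; }
  done
}

# Assumed usage, from the directory that contains the scripts:
#   run_in_order 400-PopulateBigQuery.sh 401-BenchmarkBigQuery.sh
```

If the scripts turn out to need interactive input or manual edits between steps, pasting them by hand as the original README describes remains the safer route.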

This directory was originally cloned from https://github.com/fivetran/benchmark, and the scripts have been modified to set the right project name.

The README for the original project follows.

Results

https://fivetran.com/blog/warehouse-benchmark

Design

This is based on the TPC-DS benchmark, a standard data warehouse benchmark that uses many joins, aggregations, and subqueries. The TPC-DS queries have been modified somewhat to improve portability across implementations and to eliminate the use of obscure SQL features such as grouping sets. We generated 1 TB of data, which contains about 4 billion rows in the largest fact table. We used the following warehouse configurations:

| Warehouse | Configuration | Cost / Hour |
| --- | --- | --- |
| Redshift | 5x ra3.4xlarge | $16.30 |
| Snowflake | Large | $16.00 |
| Presto | 4x n2-highmem-32 | $8.02 |
| BigQuery | Flat-rate 500 slots | $13.70 |
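
To relate the hourly prices above to a single benchmark run, the cost of a run can be estimated as the hourly rate multiplied by the runtime in hours. The sketch below does that arithmetic. The function name `run_cost` is hypothetical, and the runtimes in the usage comments are made-up inputs for illustration, not measured results.

```shell
#!/usr/bin/env bash

# Hypothetical helper: estimated cost in dollars of one run.
#   $1 = hourly rate in dollars (from the table above)
#   $2 = runtime in seconds (illustrative input, not a benchmark result)
run_cost() {
  local rate_per_hour="$1" runtime_seconds="$2"
  awk -v rate="${rate_per_hour}" -v secs="${runtime_seconds}" \
    'BEGIN { printf "%.2f\n", rate * secs / 3600 }'
}

# Example with assumed runtimes:
#   run_cost 13.70 3600   # one hour at the BigQuery rate
#   run_cost 16.00 1800   # half an hour at the Snowflake rate
```

This only captures compute billed by the hour. Storage, data transfer, and any idle time on the cluster are outside what this estimate covers.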

Usage

These scripts are intended to be manually copy-pasted into various terminals. You can skip steps 1-4 since gs://fivetran-benchmark and s3://fivetran-benchmark are already populated.

microsoft_sql
query
.gitignore
001-LaunchDataproc.sh
002-GenerateData.sh
003-GenerateGs.sh
004-CopyToS3.sh
006-LaunchPresto.sh
007-ConnectPresto.sh
008-ConnectToEc2Instance.sh
100-PopulatePresto.sh
101-BenchmarkPresto.sh
102-PrestoTiming.sh
200-PopulateRedshift.sh
201-BenchmarkRedshift.sh
202-RedshiftTiming.sh
300-PopulateSnowflake.sh
301-BenchmarkSnowflake.sh
302-SnowflakeTiming.sql
400-PopulateBigQuery.sh
401-BenchmarkBigQuery.sh
500-BenchmarkAzure.sh
500-PopulateAzure.sql
502-AzureTiming.sql
600-GenPopulateDatabricks.js
601-PopulateDatabricks.sql
602-BenchmarkDatabricks.sh
AzureQueryRunner.sh
ForwardHttp.sh
ForwardProfiler.sh
MicrosoftTools.sh
Presto.sh
PrestoShutdown.sh
README.md
RedshiftUtilization.sql
Scratch.md
ShutdownPresto.sh
Warmup.sql
