This repository contains all scripts, results, and further information related to our paper Using Microbenchmark Suites to Detect Application Performance Changes.
Software performance changes are costly and often hard to detect pre-release. Similar to software testing frameworks, either application benchmarks or microbenchmarks can be integrated into quality assurance pipelines to detect performance changes before releasing a new application version. Unfortunately, extensive benchmarking studies usually take several hours, which is problematic when examining dozens of daily code changes in detail; hence, trade-offs have to be made. Optimized microbenchmark suites, which only include a small subset of the microbenchmarks, could solve this problem, but should still reliably detect (almost) all application performance changes, such as an increased request latency. It is, however, unclear whether microbenchmarks and application benchmarks detect the same performance problems and whether one can be a proxy for the other.
In our paper, we explore whether microbenchmark suites can detect the same application performance changes as an application benchmark. For this, we run extensive benchmark experiments with both the complete and the optimized microbenchmark suites of InfluxDB and VictoriaMetrics and compare their results to the respective results of an application benchmark. We do this for 70 and 110 commits, respectively. Our results show that it is indeed possible to detect application performance changes using an optimized microbenchmark suite. This detection, however, (i) is only possible when the optimized microbenchmark suite covers all application-relevant code sections, (ii) is prone to false alarms, and (iii) cannot precisely quantify the impact on application performance. Overall, an optimized microbenchmark suite can, thus, provide fast performance feedback to developers (e.g., as part of a local build process), help to estimate the impact of code changes on application performance, and support a detailed analysis while a daily application benchmark detects major performance problems. Thus, although a regular application benchmark cannot be substituted, our results motivate further studies to validate and optimize microbenchmark suites.
If you use (parts of) this software in a publication, please cite it as:
Martin Grambow, Denis Kovalev, Christoph Laaber, Philipp Leitner, David Bermbach. Using Microbenchmark Suites to Detect Application Performance Changes. In: IEEE Transactions on Cloud Computing. IEEE 2022.
@article{grambow_using_2022,
title = {{Using Microbenchmark Suites to Detect Application Performance Changes}},
journal = {{IEEE} Transactions on Cloud Computing},
volume = {Early Access},
author = {Grambow, Martin and Kovalev, Denis and Laaber, Christoph and Leitner, Philipp and Bermbach, David},
year = {2022}
}
For a full list of publications, please see our website.
A full replication package including all raw result files is available at Deposit Once: http://dx.doi.org/10.14279/depositonce-15532
Files:
- results_all.zip: all raw result files
- results_aggr.zip: aggregated result files (see section analysis)
- scripts.zip:
  - analysis: Scripts to analyze the result files
  - appBenchmarks: Scripts to run the application benchmarks
  - createCommitTable: Scripts to create the commit tables
  - GoABS: Scripts to run the microbenchmark suites using RMIT execution
  - gocg: Scripts to analyse the call graphs and find the optimal MB suites
  - microbenchmarks: Scripts to setup and run the microbenchmarks
  - optimizingMB-Suite: Scripts to generate the call graphs
The code in this repository is licensed under the terms of the MIT license.
- analysis: Scripts to analyze the result files
- appBenchmarks: Scripts to run the application benchmarks
- createCommitTable: Scripts to create the commit tables
- microbenchmarks: Scripts to setup and run the microbenchmarks
- optimizingMB-Suite: Scripts to generate the call graphs
- results_aggr: Aggregated result files (see section analysis)
- Clone this project
git clone https://github.com/martingrambow/benchmarkStrategy
- Install the Google Cloud SDK
echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
sudo apt-get install apt-transport-https ca-certificates gnupg
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
sudo apt-get update && sudo apt-get install google-cloud-sdk
gcloud auth activate-service-account --key-file=microbenchmarkevaluation-275929759504.json
- Install golang and graphviz
sudo apt-get install golang
sudo apt-get install graphviz
- Create a Google Cloud project
- Create a service account and download the JSON key
- Activate Compute Engine
- Open the firewall (e.g., open ports 8086, 8087, 80, and 81 for InfluxDB traffic)
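The firewall step can also be scripted; a minimal sketch using the Google Cloud SDK, where the rule name "benchmark-ports" and the open source range are placeholders, not part of the original setup (adjust network, targets, and source ranges to your project):

```shell
# Sketch only: open the benchmark ports from the CLI.
# "benchmark-ports" is a placeholder rule name; 0.0.0.0/0 opens the ports
# to everyone, so restrict the source range for real experiments.
gcloud compute firewall-rules create benchmark-ports \
  --direction=INGRESS \
  --allow=tcp:8086,tcp:8087,tcp:80,tcp:81 \
  --source-ranges=0.0.0.0/0
```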
- Navigate to the application benchmark folder of the respective system, i.e., "appBenchmarks/influxdb" or "appBenchmarks/vm"
- Check the commitTable.csv
- Start the application benchmark using the main.sh script and three arguments:
- Start commit number
- End commit number
- Run number
E.g., the following command runs the application benchmark for every commit from 1 to 120 and saves the results in a folder named run1:
cd appBenchmarks/influxdb
screen -d -m -L -Logfile experiment1.log ./main.sh 1 120 1
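The start and end arguments index rows in commitTable.csv. A minimal sketch of how such a table might look; the column layout below is an assumption for illustration, so inspect the actual file before starting a run:

```shell
# Hypothetical commit table layout; the columns are an assumption, check the
# real commitTable.csv in the respective folder.
cat > /tmp/commitTable.csv <<'EOF'
number,commit
1,1a2b3c4
2,5d6e7f8
EOF
# main.sh's start/end arguments select a range of these commit numbers:
head -n 3 /tmp/commitTable.csv
```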
To run the AA test, replace the commit table name in the main.sh script with the file that refers to the AA-test commit table and run the script.
- Navigate to the microbenchmarks folder of the respective system
- Check the commitTable.csv
- Rename the file abs_config_all.json to abs_config.json
- Start the microbenchmark suite using the main.sh script and three arguments:
- Start commit number
- End commit number
- Run number
E.g., the following command runs the microbenchmark suite for every commit from 1 to 120 and saves the results in a folder named run1:
cd microbenchmarks/influxdb
screen -d -m -L -Logfile experiment1.log ./main.sh 1 120 1
- Generate the application benchmark call graph
  - Move to the respective folder: cd optimizingMB-Suite/cgApp_influxdb
  - Adjust 0_main.sh and enter the base commit
  - Run ./0_main.sh and save the generated .pprof and .dot file
- Generate call graphs for all microbenchmarks
  - Move to the respective folder: cd optimizingMB-Suite/cgMicro_influxdb
  - Adjust 0_main.sh and enter the base commit
  - Review the abs_config.json configuration
  - Run ./0_main.sh and save the generated .pprof and .dot files (copy the result folder)
- Transform pprof profiles to dot files
  - Clone the gocg tool: https://bitbucket.org/sealuzh/gocg/src/master/ (this tool is also part of our replication package)
  - Run gocg/cmd/transform_profiles with 3 positional arguments:
    - Folder with the generated .pprof files of the microbenchmarks
    - Folder to which the .dot files will be written
    - Configuration parameters in the form type:maximumNumberOfNodes:minimalNodeFraction:minimalEdgeFraction
  - E.g., the following call generates dot files, includes all nodes (not only the most important ones), and keeps all nodes and edges even if their fraction is very small:

gocg/cmd/transform_profiles \
  ../benchmarkStrategy/results_all/optimizing_influx/micro/profiles \
  ../benchmarkStrategy/results_all/optimizing_influx/micro/dots \
  dot:100000:0.000:0.000
- Determine practical relevance (optional)
  - Clone the gocg tool: https://bitbucket.org/sealuzh/gocg/src/master/
  - Run gocg/cmd/overlap with 4 arguments:
    - Project name to distinguish between project and library nodes
    - Folder with the application benchmark call graph
    - Folder with the microbenchmark call graphs
    - Output folder

gocg/cmd/overlap \
  github.com/influxdata/influxdb \
  ../benchmarkStrategy/results_all/optimizing_influx/app \
  ../benchmarkStrategy/results_all/optimizing_influx/micro/dots \
  ../benchmarkStrategy/results_all/optimizing_influx/overlap

  - View the file struct_node_overlap.csv in the output folder
    - The file lists every microbenchmark and its respective overlap with the application benchmark call graph
    - There are two metrics for every microbenchmark: one considering project-only nodes, another considering all nodes
    - There is an aggregated "ALL" row which states the practical relevance
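A quick way to pull the aggregated row out of that CSV; the column names and values below are invented for illustration, only the "ALL" row convention comes from the gocg output described above:

```shell
# Hypothetical struct_node_overlap.csv; headers and values are invented,
# check the actual file gocg writes to the output folder.
cat > /tmp/struct_node_overlap.csv <<'EOF'
benchmark,overlap_project_nodes,overlap_all_nodes
BenchmarkA,0.12,0.08
BenchmarkB,0.09,0.05
ALL,0.41,0.27
EOF
# Pull the aggregated row that states the practical relevance:
grep '^ALL,' /tmp/struct_node_overlap.csv
```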
- Remove redundancies
  - Run gocg/cmd/minimization with 4 arguments:
    - Project name to distinguish between project and library nodes
    - Folder with the application benchmark call graph
    - Folder with the microbenchmark call graphs
    - Output folder

gocg/cmd/minimization \
  github.com/influxdata/influxdb \
  ../benchmarkStrategy/results_all/optimizing_influx/app \
  ../benchmarkStrategy/results_all/optimizing_influx/micro/dots \
  ../benchmarkStrategy/results_all/optimizing_influx/overlap

  - Review the 4 additional files in the output folder
    - The file app_minFile_GreedySystem.csv shows the optimized suite without redundancies
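To carry the minimized suite over to the next step, the kept benchmark names can be extracted from app_minFile_GreedySystem.csv; the layout and values below are assumptions for illustration, so check the generated file:

```shell
# Hypothetical app_minFile_GreedySystem.csv; layout and values are invented.
# Each row names one microbenchmark kept in the redundancy-free suite.
cat > /tmp/app_minFile_GreedySystem.csv <<'EOF'
benchmark,added_overlap
BenchmarkWritePoints,0.15
BenchmarkQuerySelect,0.07
EOF
# Extract only the microbenchmark names, e.g., to fill the optimized
# abs_config_opti.json by hand:
tail -n +2 /tmp/app_minFile_GreedySystem.csv | cut -d, -f1
```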
- Recommend functions
- Navigate to the microbenchmarks folder of the respective system
- Check the commitTable.csv
- Rename the file abs_config_opti.json to abs_config.json (and/or adjust the microbenchmark names in the file)
- Start the optimized microbenchmark suite using the main.sh script and three arguments:
- Start commit number
- End commit number
- Run number
E.g., the following command runs the optimized microbenchmark suite for every commit from 1 to 120 and saves the results in a folder named run1:
cd microbenchmarks/influxdb
screen -d -m -L -Logfile experiment1.log ./main.sh 1 120 1
- Open the analysis folder as a PyCharm project
- The analysis folder contains:
  - appBenchmarks_influxdb and appBenchmarks_vm:
    - app_AATests_prepare_XX.ipynb: Aggregates raw results in a csv file
    - app_AATests_bootstrap_XX.ipynb: Runs the bootstrapping on the csv file and finds median values and CIs
    - app_regression_aggregate_XX.ipynb: Aggregates raw results in a csv file
    - app_regression_bootstrap_XX.ipynb: Finds median performance changes and determines CIs
    - app_regression_draw_XX.ipynb: Draws the performance history
  - microbenchmarks_influxdb and microbenchmarks_vm:
    - micro_regression_bootstrap_prepare_XX.ipynb: Aggregates raw results
    - micro_regression_bootstrap_analyze_XX.ipynb: Finds median performance changes and determines CIs
    - micro_regression_bootstrap_draw_XX.ipynb: Draws the performance history
  - paperplots: Scripts to draw the paper figures
- Detailed documentation can be found in the respective scripts and notebooks