Artifact for PLDI2020: Question Selection for Interactive Program Synthesis
- Install dependencies
$ apt-get install cmake libgoogle-glog-dev python3-pip
$ pip3 install pyparsing z3-solver xlrd matplotlib - Clone IntSy and all its submodules (or just unzip
IntSy.zipshared on Dropbox)
$ git clone --recursive https://github.com/jiry17/IntSy.git- Build the whole project. Under the root directory of the project:
$ ./installWe also release a docker container in which this project is already built at ~/IntSy.
docker pull takanashirikka/intsyThe docker container can be executed using the following command:
docker run --rm -it takanashirikka/intsyHowever, since docker will affect the execution speed, we still recommend building the project from source.
-
Test whether Euphony is successfully installed:
$ cd recommend/euphony $ . bin/setenv $ ./bin/run_string benchmarks/string/train/firstname.sl
The expected output is a program which splits an input string by a space and returns the first substring:
(define-fun f ((name String)) String (str.substr name 0 (str.indexof name " " 0))) -
Test whther Eusolver is successfully installed:
$ cd recommend/eusolver $ ./eusolver benchmarks/max/max_2.slThe expected output is a program which takes two integers as the input and returns the larger one:
(define-fun max2 ((a0 Int) (a1 Int)) Int (ite (<= a1 a0) a0 a1)) -
Test whether the project is successfully built:
$ cd run $ ./run_benchmark -t repair -b t1.slThe expected output contains a series of log information describing the interaction between the synthesizer and a simulated developer. The last line of the log should be a program semantically equivalent with the following one:
{== #P10 #P7}
We provide two scripts run_benchmark and run_run under the directory run: the former one helps run synthesizers one a single benchmark, and the latter one helps reproduce the experiment results.
$ cd run
$ ./run_benchmark -b BENCHMARK [-d DEPTH] [-l LOGFILE] [-s {SampleSy,EpsSy,RandomSy}] -t {string,repair}-b: the name of the benchmark.-t: the type of the benchmark.-l: the name of the log file, e.g., if its value istest.log, all the logs will be stored inlog/test.log. The default value of this flag isNONE, which means directly output the logs to STDOUT.-d: the depth of the program space. The default value is stored inbenchmark.xls.-s: the name of the synthesizer. The default synthesizer is SampleSy, and the threshold for EpsSy is 5.
Some examples are listed below:
# Run RandomSy on benchmark t1.sl inside the repair track
$ ./run_benchmark -s RandomSy -t repair -b t1.sl
# Run SampleSy on benchmark phone.sl inside the string track with the depth 5
$ ./run_benchmark -t string -b phone.sl -d 5If you want to execute the synthesizers on a new benchmark, the benchmark should be firstly put to the corresponding directories: benchmarks/repair for the repair track and benchmarks/string/final for the string track.
$ cd run
$ ./run_exp [-tn THREADNUM] [-m MEMORY <GB>] [-exp {1,2,3,4}] [-r REPEATNUM] [-c {R <Restart>,C <Clear>}]
# For example, to reproduce all results:
$ ./run_exp -c R-
-tn: the number of threads. The default value is 8. -
-m: the memory limit for each thread. The default value is 6 GB. Please ensure the number of threads times the memory limit is smaller than the memory size of your computer. -
-exp: the id of the experiment you want to run. All experiments will be executed by default. -
-r: the number of repetitions of each execution. The default value is 5. Note that all the algorithms are random, the smaller this value is, the more volatile the result will be. -
-c: whether to clear the cache:Rrepresents yes whileCrepresents no, and the default value isC.
The result of each single execution is cached in run/result.pkl . run_exp will firstly check whether result.pkl contains enough data to draw the figure: only the insufficient part will be executed.
The raw data of each single execution can be found in run/temp with the following format:
Finished
n
size[0] size[2] ... size[n]
survive[1] survive[2] ... survive[n]
where run/log.
The figure of each experiment will be stored in run/figure.
$ cd run
$ ./run_exp -exp 1For Figure 2, the script will redraw it as run/figure/exp1.png while the original figure used in our paper can be found as run/expected/exp1.png.
For data listed in Section 7.3, run_exp will recalculate them and print them to the standard output.
$ cd run
$ ./run_exp -exp 2For data listed in Table 2, run_exp will recalculate them and print them to the standard output.
$ cd run
$ ./run_exp -exp 3For Figure 4, the script will redraw it as run/figure/exp3.png while the original figure used in our paper can be found as run/expected/exp3.png.
For data listed in Section 7.5, run_exp will recalculate them and print them to the standard output.
$ cd run
$ ./run_exp -exp 4For Figure 5, the script will redraw it as run/figure/exp4.png while the original figure used in our paper can be found as run/expected/exp4.png.
Note: There may be some small differences between the results listed in our paper and the reproduced ones since all the synthesizers in our paper are random. The results of the string track are more stable than those of the repair track since the former one contains much more benchmarks. Besides, to reduce the influence of randomness, run_exp repeats each execution for -r 1 could significantly speed up the whole process.