Welcome to ARG-V Automated Program Transformations. This application automatically downloads, filters and transforms open source repositories into benchmarks for static analysis tools.
Clone the repository:
git clone [email protected]:unl-pal/argv-transformer.gitThis project uses gradle to build, compile and run the program.
After cloning the repository run the gradle wrapper with no arguments ./gradlew to set up the project.
This will fetch the appropriate version of Gradle, if needed, as well as all dependencies, and set to build Java 8 compatible code.
Run ./gradlew followed by any of the following tasks to get desired effect.
reset- Resets the program - deletesbuild,database,suitablePrgmsandbenchmarksdirectories
javadoc- Generates Javadocs in thebuild/docs/javadocdirectory withindex.htmlas the entry point. These docs can be viewed and navigated from the browser.
download- Compiles and Runs the download programfilter- Compiles and Runs the filter programfull- Compiles Runs the full applicationtransform- Compiles and Runs the transformer program
regressionTransformer- Runs regression test for transformertest- Runs the test suite and creates a report that can be viewed in a browser. The report is located in thebuild/reports/tests/testdirectory with theindex.htmlfile as the main entry point.
Other tasks are available and can be seen by running ./gradlew tasks from
the command line. This will display all tasks available and a brief description
of what they do.
The tool uses the file config.properties to set the options and properties of the tool before each run. This file can be edited manually to suit the users needs by changing the values of the file.
The available options to set are as follows:
csv- relative file path to file with OSS GitHub URLsprojectCount- how many projects to grab that meet the requirementsmaxLoc- upper limit to number of lines of codeminLoc- lower limit to number of lines of codedownloadDir- where to download the repos tobenchmarkDir- where to write the benchmarks todebug- run code with debugging logs and features ondebugLevel- what level of debug errors are reportedtype- type of expressionminExpr- a minimum number of infix, prefix or postfix expression of the defined type encountered in a methodminIfStmt- int minimum number of if statements requiredminParams- int minimum number of parameters requiredminTypeExpr- int minimum number of type expressions requiredminTypeCond- int minimum number of type conditions requiredminTypeParams- int minimum number of type parameters requiredtransformAll- boolean to transform all code regardless of if it already compilestarget- string for compatibility i.e. SVCOMPverifier- location of the verifier code needed to compile benchmarks using x compatibility
expression Types Available for value:type:
X- don't care what type of expression isI- integer type (int, short, byte, long)R- real type (double, float)S- string type (String, char) Currently code also ensures that the arguments to a method of that type too (plus boolean).
ifStmt:
X- don't care if there is an ifStmt in a methodY- a conditional statement should be present in a methodN- a method should have no ifStmt
ifStmtType -- what expression type should be present in a conditional statement:
X- don't care what type of expression isI- integer type (int, short, byte, long)R- real type (double, float)S- string type (String, char)
Given a csv file with names and meta data of repositories this program downloads repositories and stores them in a database.
Input:
csv- csv file with meta data and names of repositories such as created by RepoReaper
Config Settings:
csv- relative file path to file with OSS github URLsprojectCount- how many projects to grab that meet the requirementsmaxLoc- upper limit to number of lines of codeminLoc- lower limit to number of lines of codedownloadDir- where to download the repos to
Output:
- creates database directory of downloaded repositories for next steps
This program filters a directory of repositories for java files suitable for symbolic execution using the values set by the user.
(Suitability is defined in sourceAnalysis.AnalyzedMethod.java by the isSymbolicSuitable() method. A Java file
is suitable if at least one of its methods is suitable.)
Input:
database- Directory of repositories.
Config Settings:
maxLoc- upper limit to number of lines of codeminLoc- lower limit to number of lines of codedownloadDir- where to download the repos tobenchmarkDir- where to write the benchmarks totype- type of expressionminExpr- int minimum number of expressions requiredminIfStmt- int minimum number of if statements requiredminParams- int minimum number of parameters requiredminTypeExpr- int minimum number of type expressions requiredminTypeCond- int minimum number of type conditions requiredminTypeParams- int minimum number of type parameters required
Output:
suitablePrgms- Directory of repositories containing only java files suitable for analysis (in original directory structure).
Given a directory of suitable Java files, this program attempts to transform each into a compilable benchmark.
A directory of benchmarks is created, containing the programs that would successfully compile in their original directory structure.
Input:
suitablePrgms- Directory of repositories containing only Java files suitable for symbolic execution.
Config Settings:
-
debug- run code with debugging logs and features on -
debugLevel- what level of debug errors are reported -
transformAll- boolean to transform all code regardless of if it already compiles -
target- string for compatibility i.e. SVCOMP -
verifier- location of the verifier code needed to compile benchmarks using x compatibility -
csv- relative file path to file with OSS GitHub URLs
Output:
benchmarks- Directory of compilable, suitable programs (in original directory structure).
Given a CSV of GitHub repositories (as gathered by RepoReaper), this program will select suitable repositories, download them, search for classes containing SPF-suitable methods, and transform suitable classes into compilable, benchmark programs.
Input:
- CSV of GitHub repositories as gathered by RepoReaper.
Config Settings:
- All settings
Output:
database- Directory of GitHub repos that meet project filter specification.suitablePrgms- Directory containing suitable files extracted from GitHub repos. (This is where the transformation of the source code takes place.)benchmarks- Directory of compilable, suitable programs, in their original directory structure.
dataset.csv- First 5,000 entries (each entry is a GitHub project url with metadata) from RepoReaper dataset.filtered-dataset.csv- 250 Java projects with min and max loc of 100 and 10,000config.properties- Properties for running the full framework (i.e. downloading, filtering, transforming).
The javac version (used in main.MainTransform.java and mainFullFramework.MainAnalysis.java to compile benchmarks) needs to be the same version as JDK for rt.jar set in Soot classpath (in jpf.ProgramUnderTest.java, used for loop detection).
download- Downloading GitHub projectsfilter- Filtering for relevant projects and fileslogging- For simple loggingfull- Contains main for running with full framework, i.e. download, filter, transform, outputsourceAnalysis- Used to track files and methods suitable for symbolic executionteststransform- Transforming files into compilable benchmarks