Using the HAM Endpoint Detection API
The endpoint detection feature of the ASTAM Correlator is available as a Maven dependency on the ASTAM HAM module. This library lets you detect endpoints in source code for any of the supported frameworks listed in the project's README.md.
Add the following to your pom.xml:
<dependency>
<groupId>com.github.secdec.astam-correlator</groupId>
<artifactId>threadfix-entities</artifactId>
<version>1.3.5</version>
</dependency>
<dependency>
<groupId>com.github.secdec.astam-correlator</groupId>
<artifactId>threadfix-ham</artifactId>
<version>1.3.5</version>
</dependency>
The HAM module API consists of three main classes:
- EndpointDatabase - Generates and contains the framework-specific set of detected Endpoints within a codebase
- FrameworkCalculator - Detects the possible framework types within a codebase
- Endpoint - A framework-specific Endpoint that contains all available data for a detected endpoint
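import java.io.File;
import java.util.List;
// The FrameworkType package below is assumed; adjust it if your version differs.
import com.denimgroup.threadfix.data.enums.FrameworkType;
import com.denimgroup.threadfix.data.interfaces.Endpoint;
import com.denimgroup.threadfix.framework.engine.full.EndpointDatabase;
import com.denimgroup.threadfix.framework.engine.full.EndpointDatabaseFactory;
...
File sourceCodeFolderPath = new File("/my/source");
// Change SPRING_MVC as appropriate, or remove the FrameworkType to use auto-detection.
EndpointDatabase database = EndpointDatabaseFactory.getDatabase(sourceCodeFolderPath, FrameworkType.SPRING_MVC);
// getDatabase returns null if no framework could be detected, so check for null before use.
List<Endpoint> parsedEndpoints = database.generateEndpoints();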
An EndpointDatabase can be created using EndpointDatabaseFactory.getDatabase(..). This method accepts a File for the folder containing source code and, optionally, a FrameworkType that the source code should be parsed as. If the FrameworkType is not provided, a best guess is made based on the contents of the source code. The getDatabase method returns null if no framework could be detected.
Each EndpointDatabase instance contains endpoints for one specific FrameworkType. For projects that use multiple frameworks, the FrameworkCalculator can be invoked manually to collect all possible frameworks used in the given source code.
Invoke FrameworkCalculator.getTypes(file) to get a List<FrameworkType>. Iterate over these framework types and invoke EndpointDatabaseFactory.getDatabase(file, frameworkType) for each one to get all possible EndpointDatabases for the given source code.
Once an EndpointDatabase is available, call myEndpointDatabase.generateEndpoints() to get a List<Endpoint>.
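The multi-framework flow described above (detect framework types, open one database per type, collect endpoints from each) can be sketched without the library on the classpath. The class `MultiFrameworkScan` and its method are hypothetical names introduced for illustration; the three steps are passed in as functions so the block compiles on its own.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

/**
 * Library-independent sketch of the multi-framework loop.
 * F = framework type, D = endpoint database, E = endpoint.
 */
final class MultiFrameworkScan {
    static <F, D, E> List<E> collectEndpoints(
            File sourceRoot,
            Function<File, List<F>> detectFrameworks,
            BiFunction<File, F, D> openDatabase,
            Function<D, List<E>> generateEndpoints) {
        List<E> all = new ArrayList<>();
        for (F framework : detectFrameworks.apply(sourceRoot)) {
            D database = openDatabase.apply(sourceRoot, framework);
            if (database == null) {
                continue; // defensive: skip frameworks that produced no database
            }
            all.addAll(generateEndpoints.apply(database));
        }
        return all;
    }
}
```

With HAM available, the three arguments would presumably be `FrameworkCalculator::getTypes`, `EndpointDatabaseFactory::getDatabase`, and `EndpointDatabase::generateEndpoints`, assuming their signatures match the description above.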
Endpoints contain the following data:
- Source file (getFilePath)
- Endpoint path (getUrlPath)
- Endpoint path structure (getUrlPathNodes)
- HTTP method (getHttpMethod)
- Start and end line number, -1 if unavailable (getStarting/EndingLineNumber)
- Parameters (getParameters)
- Variants (getVariants)
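As a rough illustration of consuming these fields, the sketch below formats a one-line summary and omits the line range when it is reported as -1. `EndpointView` is a hypothetical stand-in, not the library's Endpoint interface; the getter return types and the expanded names `getStartingLineNumber` and `getEndingLineNumber` are assumptions.

```java
final class EndpointSummarySketch {
    /** Hypothetical stand-in exposing a few of the getters listed above. */
    interface EndpointView {
        String getHttpMethod();
        String getUrlPath();
        String getFilePath();
        int getStartingLineNumber();
        int getEndingLineNumber();
    }

    /** Convenience factory so the sketch can be exercised without the library. */
    static EndpointView of(final String method, final String path, final String file,
                           final int start, final int end) {
        return new EndpointView() {
            public String getHttpMethod() { return method; }
            public String getUrlPath() { return path; }
            public String getFilePath() { return file; }
            public int getStartingLineNumber() { return start; }
            public int getEndingLineNumber() { return end; }
        };
    }

    /** Builds "METHOD path (file:start-end)", dropping the range when unavailable. */
    static String describe(EndpointView e) {
        StringBuilder sb = new StringBuilder();
        sb.append(e.getHttpMethod()).append(' ').append(e.getUrlPath())
          .append(" (").append(e.getFilePath());
        // -1 signals that line information is unavailable
        if (e.getStartingLineNumber() >= 0 && e.getEndingLineNumber() >= 0) {
            sb.append(':').append(e.getStartingLineNumber())
              .append('-').append(e.getEndingLineNumber());
        }
        return sb.append(')').toString();
    }
}
```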
Some of these data are described below.
A given endpoint may have multiple possible representations. For example, / and /index.html represent the same Endpoint but have different URL paths. The shortest Endpoint is chosen as the canonical version, and the other Endpoints are added to the canonical endpoint as its variants. In this example, / would be the returned endpoint, and the /index.html endpoint would be available as one of its variants.
Variants contain data duplicated from the canonical version.
A set of endpoints can be flattened into a single list containing both canonical and variant endpoints using EndpointUtil.flattenWithVariants(endpointsCollection).
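The canonical/variant relationship can be pictured with a small self-contained sketch. It is not the library's implementation: `Node`, `canonicalize`, and `flatten` are hypothetical names, and "shortest" is assumed here to mean the shortest URL path string.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class VariantSketch {
    /** Stand-in for an endpoint: a URL path plus its variants. */
    static final class Node {
        final String urlPath;
        final List<Node> variants = new ArrayList<>();
        Node(String urlPath) { this.urlPath = urlPath; }
    }

    /**
     * Picks the shortest path as canonical and attaches the rest as variants.
     * Expects a non-empty list of paths that all refer to the same endpoint.
     */
    static Node canonicalize(List<String> equivalentPaths) {
        List<String> sorted = new ArrayList<>(equivalentPaths);
        sorted.sort(Comparator.comparingInt(String::length));
        Node canonical = new Node(sorted.get(0));
        for (String other : sorted.subList(1, sorted.size())) {
            canonical.variants.add(new Node(other));
        }
        return canonical;
    }

    /** Returns each endpoint path followed by its variant paths, in one flat list. */
    static List<String> flatten(List<Node> endpoints) {
        List<String> flat = new ArrayList<>();
        for (Node endpoint : endpoints) {
            flat.add(endpoint.urlPath);
            for (Node variant : endpoint.variants) {
                flat.add(variant.urlPath);
            }
        }
        return flat;
    }
}
```

Applied to the example above, `canonicalize` on the paths /index.html and / yields / as the canonical node with /index.html as its single variant.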
Parameters are available as a Map<String, RouteParameter>, where the parameter name is the key and details of the parameter are the value.
A RouteParameter contains the following information:
- Name (getName) - The name of the parameter, same as the Map's key
- Data type (getDataType) - A primitive representation of the parameter's data type, such as String, Integer, and DateTime. Parameters with an undetected or unknown data type may be set to null or String
- Parameter type (getParamType) - The way the parameter is used in an HTTP request, such as QUERY_STRING, FORM_DATA, COOKIE, and so on. Parameters with an unknown type will be assigned to UNKNOWN
- Accepted values (getAcceptedValues) - The set of accepted values for this parameter, if available
- Data type source (getDataTypeSource) - The original string used to determine the data type for the parameter
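Because the data type may be null and the parameter type may be UNKNOWN, consumers usually need some defensive handling. The sketch below shows one way to do that over a name-keyed map. `Param` is a hypothetical stand-in for RouteParameter, and representing the data type and parameter type as plain strings is a simplification, not the library's actual types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RouteParameterSketch {
    /** Hypothetical stand-in; fields mirror some of the getters listed above. */
    static final class Param {
        final String name;
        final String dataType;   // may be null when undetected
        final String paramType;  // e.g. "QUERY_STRING", "FORM_DATA", "UNKNOWN"
        Param(String name, String dataType, String paramType) {
            this.name = name;
            this.dataType = dataType;
            this.paramType = paramType;
        }
    }

    /** Treats a missing data type as "String". */
    static String effectiveDataType(Param p) {
        return p.dataType == null ? "String" : p.dataType;
    }

    /** Names of parameters with the given parameter type, in map iteration order. */
    static List<String> namesOfType(Map<String, Param> params, String paramType) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Param> entry : params.entrySet()) {
            if (paramType.equals(entry.getValue().paramType)) {
                names.add(entry.getKey());
            }
        }
        return names;
    }
}
```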
The Endpoint interface provides various methods for matching Endpoint objects with a given URL.
A general measure of relevancy is provided through myEndpoint.compareRelevance(urlString), which returns an integer score to be compared against other endpoints. Any value greater than 0 may be considered relevant. This is a broad comparison, such that the endpoint /foo will have relevance with /foo/bar/baz. An endpoint representing /foo/bar will have a higher relevance than the /foo endpoint.
A simpler relevancy check is available through myEndpoint.isRelevant(urlString, strictness), which returns true or false depending on strictness. Strictness can be either EndpointRelevanceStrictness.STRICT or EndpointRelevanceStrictness.LOOSE. A LOOSE comparison defers to compareRelevance(urlString) and returns true if the relevance is greater than 0. A STRICT comparison returns true only if the endpoint is a perfect match.
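To make the relationship between the two checks concrete, here is a simplified stand-in. The scoring rule (number of endpoint path segments when the endpoint is a segment-wise prefix of the URL) is an assumption chosen only to reproduce the /foo versus /foo/bar behavior described above; the library's real scoring is framework-specific and is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

final class RelevanceSketch {
    enum Strictness { STRICT, LOOSE }

    /** Splits a path into its non-empty segments, e.g. "/foo/bar" gives [foo, bar]. */
    static List<String> segments(String path) {
        List<String> out = new ArrayList<>();
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                out.add(part);
            }
        }
        return out;
    }

    /**
     * Stand-in score: the number of endpoint segments if the endpoint path is a
     * segment-wise prefix of the URL, otherwise 0.
     */
    static int compareRelevance(String endpointPath, String url) {
        List<String> endpoint = segments(endpointPath);
        List<String> target = segments(url);
        if (endpoint.size() > target.size()) {
            return 0;
        }
        for (int i = 0; i < endpoint.size(); i++) {
            if (!endpoint.get(i).equals(target.get(i))) {
                return 0;
            }
        }
        return endpoint.size();
    }

    /** LOOSE defers to the score; STRICT requires identical segments. */
    static boolean isRelevant(String endpointPath, String url, Strictness strictness) {
        if (strictness == Strictness.STRICT) {
            return segments(endpointPath).equals(segments(url));
        }
        return compareRelevance(endpointPath, url) > 0;
    }
}
```

One limitation of this toy rule is that the root path / has no segments and therefore always scores 0 in a LOOSE comparison.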
There are two available formats for JSON serialization - Info and Full. Info serialization will output all endpoints in the same format, making the data easier to parse but removing the ability to deserialize into the original Endpoint types. Full serialization will output endpoints in a format unique to their framework type, and can be deserialized to their original Endpoint types at the cost of file size and parsing difficulty. Info serialization may be used to generically import endpoints, whereas Full serialization may be used to take advantage of HAM capabilities such as URL matching.
Note that neither of these approaches will serialize or deserialize endpoint variants. Variants are ignored. To serialize variant endpoints, use the EndpointUtil.flattenWithVariants utility method to get a full list of endpoints including variants before serializing.
Info serialization involves converting all Endpoint objects to Endpoint.Info objects, and manually serializing those objects using the JSON library of your choice. Any Endpoint can be converted to an Endpoint.Info by using Endpoint.Info.fromEndpoint(...).
Endpoints in this format include:
- HTTP method
- URL path
- Source code path (relative to project root)
- Start and end line numbers
- Parameters
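The manual serialization step might look roughly like the following. `Info` here is a hypothetical flat record, not the library's Endpoint.Info class, and the JSON field names are illustrative only. The JSON is built by hand to keep the sketch dependency-free; in practice a JSON library is preferable, since this `quote` helper escapes only quotes and backslashes. Parameters are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

final class InfoJsonSketch {
    /** Hypothetical flat record; field names are illustrative only. */
    static final class Info {
        final String httpMethod;
        final String urlPath;
        final String filePath;
        final int startLine;
        final int endLine;
        Info(String httpMethod, String urlPath, String filePath, int startLine, int endLine) {
            this.httpMethod = httpMethod;
            this.urlPath = urlPath;
            this.filePath = filePath;
            this.startLine = startLine;
            this.endLine = endLine;
        }
    }

    /** Minimal string quoting: escapes backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes one record as a JSON object. */
    static String toJson(Info i) {
        return "{\"httpMethod\":" + quote(i.httpMethod)
            + ",\"urlPath\":" + quote(i.urlPath)
            + ",\"filePath\":" + quote(i.filePath)
            + ",\"startLine\":" + i.startLine
            + ",\"endLine\":" + i.endLine + "}";
    }

    /** Serializes a list of records as a JSON array. */
    static String toJsonArray(List<Info> infos) {
        List<String> parts = new ArrayList<>();
        for (Info i : infos) {
            parts.add(toJson(i));
        }
        return "[" + String.join(",", parts) + "]";
    }
}
```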
Full serialization serializes all data in an Endpoint object, allowing it to be reconstructed later with its complete feature set (e.g. relevancy checks and code-line/parameter mapping).
Full endpoint serialization can be done for any Endpoint implementation using the EndpointSerialization class. Endpoints can be serialized individually or as a collection. Serialization returns a JSON array of objects. These objects are wrappers around the endpoints, each containing the framework type of the endpoint and the endpoint itself. Endpoints can be serialized and deserialized by using EndpointSerialization.serialize[All] and EndpointSerialization.deserialize[All], respectively.
Endpoints serialized using this method will have different formats depending on the web framework type they were generated from. Deserialized endpoints will have all features of their original objects, excluding variant data.
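One plausible way to read the wrapper format is as a tagged envelope: the framework type selects which decoder rebuilds the framework-specific payload. The sketch below illustrates that idea only. It is an assumption about the general pattern, not a description of how EndpointSerialization works internally, and every name in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class TaggedWrapperSketch {
    /** Wrapper pairing a framework type tag with a framework-specific payload. */
    static final class Wrapper {
        final String frameworkType;
        final String payload;
        Wrapper(String frameworkType, String payload) {
            this.frameworkType = frameworkType;
            this.payload = payload;
        }
    }

    private final Map<String, Function<String, Object>> decoders = new HashMap<>();

    /** Registers the decoder to use for payloads tagged with the given framework type. */
    void register(String frameworkType, Function<String, Object> decoder) {
        decoders.put(frameworkType, decoder);
    }

    /** Looks up the decoder by tag and applies it to the payload. */
    Object decode(Wrapper wrapper) {
        Function<String, Object> decoder = decoders.get(wrapper.frameworkType);
        if (decoder == null) {
            throw new IllegalArgumentException("No decoder for " + wrapper.frameworkType);
        }
        return decoder.apply(wrapper.payload);
    }
}
```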
A URL can be compared to a set of endpoints to determine whether any endpoint perfectly matches the given URL. This check does not return the specific endpoint that matches.
Create an EndpointStructure instance and add endpoints via acceptEndpointPath, acceptEndpoint, and acceptAllEndpoints. Check for matching by calling matchesUrlPath.
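A membership check of this kind is commonly backed by a tree of path segments. The sketch below shows that general technique with method names borrowed from the description above. It is a stand-in, not the EndpointStructure implementation, and it handles literal segments only (no path parameters or wildcards).

```java
import java.util.HashMap;
import java.util.Map;

final class PathStructureSketch {
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        boolean terminal; // true if an accepted path ends at this node
    }

    private final Node root = new Node();

    /** Adds a path to the structure, one node per non-empty segment. */
    void acceptEndpointPath(String path) {
        Node current = root;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue;
            }
            current = current.children.computeIfAbsent(segment, k -> new Node());
        }
        current.terminal = true;
    }

    /** Returns true only if the whole path was previously accepted. */
    boolean matchesUrlPath(String path) {
        Node current = root;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue;
            }
            current = current.children.get(segment);
            if (current == null) {
                return false;
            }
        }
        return current.terminal;
    }
}
```

As in the text, the result is a yes/no answer; the structure does not keep a reference back to the endpoint that contributed the path.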
This material is based on research sponsored by the Department of Homeland Security (DHS) Science and Technology Directorate, Cyber Security Division (DHS S&T/CSD) via contract number HHSP233201600058C.