Commit a3bffd0

Chris Schmidt committed
Implemented the framework for the new Encoding API
1 parent 35fa31f commit a3bffd0

7 files changed

Lines changed: 248 additions & 14 deletions

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+package org.owasp.esapi.core.encoding;
+
+import org.owasp.esapi.core.validation.ValidationException;
+
+/**
+ * The Encoder interface provides an API for contextually output encoding untrusted data.
+ *
+ * // TODO: Provide lots more documentation here on using the new Encoder APIs
+ *
+ * @author Chris Schmidt ([email protected]) http://www.ContrastSecurity.com
+ */
+public interface Encoder {
+
+
+    /**
+     * Encodes the given untrusted input for the supplied context.
+     *
+     * @param context The context to apply encoding for
+     * @param input The input to be encoded
+     * @param <Context> The Context Type implied by the context parameter.
+     * @return The input encoded for the supplied context.
+     */
+    <Context extends EncodingContext> String encode(Context context, String input) throws EncodingException;
+
+    /**
+     * Canonicalization is simply the operation of reducing a possibly encoded
+     * string down to its simplest form. This is important, because attackers
+     * frequently use encoding to change their input in a way that will bypass
+     * validation filters, but still be interpreted properly by the target of
+     * the attack. Note that data encoded more than once is not something that a
+     * normal user would generate and should be regarded as an attack.
+     * <p>
+     * Everyone <a href="http://cwe.mitre.org/data/definitions/180.html">says</a> you shouldn't do validation
+     * without canonicalizing the data first. This is easier said than done. The canonicalize method can
+     * be used to simplify just about any input down to its most basic form. Note that canonicalize doesn't
+     * handle Unicode issues; it focuses on higher level encoding and escaping schemes. In addition to simple
+     * decoding, canonicalize also handles:
+     * <ul><li>Perverse but legal variants of escaping schemes</li>
+     * <li>Multiple escaping (%2526 or &#x26;lt;)</li>
+     * <li>Mixed escaping (%26lt;)</li>
+     * <li>Nested escaping (%%316 or &%6ct;)</li>
+     * <li>All combinations of multiple, mixed, and nested encoding/escaping (%2&#x35;3c or &#x2526gt;)</li></ul>
+     * <p>
+     * Using canonicalize is simple. The default is just...
+     * <pre>
+     * String clean = encoder.canonicalize(request.getParameter("input"));
+     * </pre>
+     *
+     * Although ESAPI is able to canonicalize multiple, mixed, or nested encoding, it's safer to not accept
+     * this stuff in the first place. In ESAPI, the default is "strict" mode that throws an {@link EncodingException}
+     * if it receives anything not single-encoded with a single scheme.
+     * <p/>
+     * Implementors can choose to allow overriding of this default policy either by explicitly not throwing the
+     * {@link EncodingException} in their implementation or by providing a means of configuring the behavior of the
+     * Encoder implementation.
+     *
+     * @see <a href="http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4">W3C specifications</a>
+     *
+     * @param input The input to be canonicalized
+     * @return The supplied input reduced to its simplest form.
+     */
+    String canonicalize(String input) throws EncodingException;
+
+}
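
The "strict" behaviour described in the canonicalize Javadoc can be illustrated with a minimal sketch. It is not the implementation from this commit (none is included): it handles percent-encoding only, treats each %XX escape as a single Latin-1 character, and uses a local, unchecked stand-in for MultipleEncodingException so that it compiles on its own. The names StrictCanonicalizeSketch and percentDecode are hypothetical.

```java
public class StrictCanonicalizeSketch {

    // Local stand-in for the commit's MultipleEncodingException.
    // Assumption: unchecked here only to keep the sketch short.
    static class MultipleEncodingException extends RuntimeException {
        MultipleEncodingException(String message) {
            super(message);
        }
    }

    // Decodes every well-formed %XX escape exactly once; everything else is copied through.
    static String percentDecode(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean escape = c == '%' && i + 2 < s.length()
                    && Character.digit(s.charAt(i + 1), 16) >= 0
                    && Character.digit(s.charAt(i + 2), 16) >= 0;
            if (escape) {
                out.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Strict mode: one decoding pass is allowed. If a second pass still changes
    // the value, the input was encoded more than once and is rejected.
    static String canonicalize(String input) {
        String once = percentDecode(input);
        if (!percentDecode(once).equals(once)) {
            // The message deliberately contains no untrusted data.
            throw new MultipleEncodingException("Multiple encoding detected");
        }
        return once;
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("%3Cb%3E"));
        try {
            canonicalize("%2526");
        } catch (MultipleEncodingException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Comparing a first and second decoding pass is one simple way to detect multiple encoding; detecting mixed encoding would additionally require running decoders for more than one scheme, which this sketch leaves out.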
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+package org.owasp.esapi.core.encoding;
+
+/**
+ * The basic functionality of the EncodingContext is to encode a single character to its correct format for the
+ * current Context. For example, if this implementation was meant to encode characters for HTML it would encode the {@literal <}
+ * character and return the value &amp;lt;.
+ *
+ * @author Chris Schmidt ([email protected]) https://www.ContrastSecurity.com
+ */
+public interface EncodingContext {
+    /**
+     * Encodes a single character returning either the character itself or the encoded version of the character for the
+     * current context.
+     *
+     * @param c The character to be encoded.
+     * @return Either the character itself or the encoded representation of the character.
+     * @throws UnencodableCharacterException If the supplied character cannot be encoded in the current context. An example
+     * could be a UTF-16 character passed into a UTF-8 encoder implementation.
+     */
+    String encode(char c) throws UnencodableCharacterException;
+
+    /**
+     * Decodes the supplied string.
+     * @param str The encoded string to be decoded.
+     * @return The decoded representation of the supplied string.
+     * @throws EncodingException If the supplied string cannot be decoded in the current context.
+     */
+    String decode(String str) throws EncodingException;
+
+    /**
+     * Returns the character sequence that marks the beginning of a control sequence. For example, in HTML this would simply
+     * return the '&amp;' character.
+     *
+     * @return The character or character sequence that marks the beginning of a control sequence.
+     */
+    String getControlSequence();
+
+    /**
+     * Returns the character or character sequence that marks the end of a control sequence. For example, in HTML this would
+     * simply return the ';' character.
+     *
+     * @return The character or character sequence that marks the end of a control sequence.
+     */
+    String getControlSequenceEnd();
+}
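
To make the EncodingContext contract concrete, here is a rough sketch of what an HTML context might look like. The commit ships no implementation, so everything below is an assumption: HtmlContextSketch, HtmlContext, and encodeAll are hypothetical names, the interface is a local stand-in with the checked exceptions left out, and only five characters are mapped.

```java
import java.util.Map;

public class HtmlContextSketch {

    // Local stand-in for the commit's EncodingContext, simplified (no checked
    // exceptions) so that the sketch compiles on its own.
    interface EncodingContext {
        String encode(char c);
        String decode(String str);
        String getControlSequence();
        String getControlSequenceEnd();
    }

    // Hypothetical HTML context that knows only five characters.
    static class HtmlContext implements EncodingContext {
        private static final Map<Character, String> ENTITIES = Map.of(
                '<', "lt", '>', "gt", '&', "amp", '"', "quot", '\'', "#39");

        @Override
        public String encode(char c) {
            String name = ENTITIES.get(c);
            // Characters without a mapping are returned unchanged.
            return name == null
                    ? String.valueOf(c)
                    : getControlSequence() + name + getControlSequenceEnd();
        }

        @Override
        public String decode(String str) {
            // Decodes a single complete entity such as "&lt;"; anything else is returned as-is.
            for (Map.Entry<Character, String> e : ENTITIES.entrySet()) {
                String entity = getControlSequence() + e.getValue() + getControlSequenceEnd();
                if (entity.equals(str)) {
                    return String.valueOf(e.getKey());
                }
            }
            return str;
        }

        @Override
        public String getControlSequence() {
            return "&";
        }

        @Override
        public String getControlSequenceEnd() {
            return ";";
        }
    }

    // How an Encoder.encode implementation might drive a context: one character at a time.
    static String encodeAll(EncodingContext context, String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            out.append(context.encode(input.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeAll(new HtmlContext(), "<b>Tom & Jerry</b>"));
    }
}
```

Keeping the per-character rules in the context and the iteration in the caller mirrors the split between EncodingContext and Encoder in this commit; how the real Encoder implementation divides the work is not shown here.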
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+package org.owasp.esapi.core.encoding;
+
+import org.owasp.esapi.core.EnterpriseSecurityException;
+
+public abstract class EncodingException extends EnterpriseSecurityException {
+    /**
+     * Creates a new instance of EncodingException. This exception is automatically logged, so that simply by
+     * using this API, applications will generate an extensive security log. In addition, this exception is
+     * automatically registered with the IntrusionDetector, so that quotas can be checked.
+     * <p/>
+     * It should be noted that messages that are intended to be displayed to the user should be safe for display. In
+     * other words, don't pass in unsanitized data here. This may also hold true for the logging message depending on the
+     * context of the exception.
+     *
+     * @param userMessage the message displayed to the user
+     * @param logMessage the message logged
+     */
+    public EncodingException(String userMessage, String logMessage) {
+        super(userMessage, logMessage);
+    }
+
+    /**
+     * Creates a new instance of EncodingException that includes a root cause Throwable.
+     * <p/>
+     * It should be noted that messages that are intended to be displayed to the user should be safe for display. In
+     * other words, don't pass in unsanitized data here. This may also hold true for the logging message depending on the
+     * context of the exception.
+     *
+     * @param userMessage the message displayed to the user
+     * @param logMessage the message logged
+     * @param cause the cause
+     */
+    public EncodingException(String userMessage, String logMessage, Throwable cause) {
+        super(userMessage, logMessage, cause);
+    }
+}
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+package org.owasp.esapi.core.encoding;
+
+public class MixedEncodingException extends EncodingException {
+    /**
+     * Creates a new instance of MixedEncodingException. This exception is automatically logged, so that simply by
+     * using this API, applications will generate an extensive security log. In addition, this exception is
+     * automatically registered with the IntrusionDetector, so that quotas can be checked.
+     * <p/>
+     * It should be noted that messages that are intended to be displayed to the user should be safe for display. In
+     * other words, don't pass in unsanitized data here. This may also hold true for the logging message depending on the
+     * context of the exception.
+     *
+     * @param userMessage the message displayed to the user
+     * @param logMessage the message logged
+     */
+    public MixedEncodingException(String userMessage, String logMessage) {
+        super(userMessage, logMessage);
+    }
+
+    /**
+     * Creates a new instance of MixedEncodingException that includes a root cause Throwable.
+     * <p/>
+     * It should be noted that messages that are intended to be displayed to the user should be safe for display. In
+     * other words, don't pass in unsanitized data here. This may also hold true for the logging message depending on the
+     * context of the exception.
+     *
+     * @param userMessage the message displayed to the user
+     * @param logMessage the message logged
+     * @param cause the cause
+     */
+    public MixedEncodingException(String userMessage, String logMessage, Throwable cause) {
+        super(userMessage, logMessage, cause);
+    }
+}
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+package org.owasp.esapi.core.encoding;
+
+public class MultipleEncodingException extends EncodingException {
+    /**
+     * Creates a new instance of MultipleEncodingException. This exception is automatically logged, so that simply by
+     * using this API, applications will generate an extensive security log. In addition, this exception is
+     * automatically registered with the IntrusionDetector, so that quotas can be checked.
+     * <p/>
+     * It should be noted that messages that are intended to be displayed to the user should be safe for display. In
+     * other words, don't pass in unsanitized data here. This may also hold true for the logging message depending on the
+     * context of the exception.
+     *
+     * @param userMessage the message displayed to the user
+     * @param logMessage the message logged
+     */
+    public MultipleEncodingException(String userMessage, String logMessage) {
+        super(userMessage, logMessage);
+    }
+
+    /**
+     * Creates a new instance of MultipleEncodingException that includes a root cause Throwable.
+     * <p/>
+     * It should be noted that messages that are intended to be displayed to the user should be safe for display. In
+     * other words, don't pass in unsanitized data here. This may also hold true for the logging message depending on the
+     * context of the exception.
+     *
+     * @param userMessage the message displayed to the user
+     * @param logMessage the message logged
+     * @param cause the cause
+     */
+    public MultipleEncodingException(String userMessage, String logMessage, Throwable cause) {
+        super(userMessage, logMessage, cause);
+    }
+}
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+package org.owasp.esapi.core.encoding;
+
+public class UnencodableCharacterException extends EncodingException {
+    /**
+     * Creates a new instance of UnencodableCharacterException. This exception is automatically logged, so that simply by
+     * using this API, applications will generate an extensive security log. In addition, this exception is
+     * automatically registered with the IntrusionDetector, so that quotas can be checked.
+     * <p/>
+     * It should be noted that messages that are intended to be displayed to the user should be safe for display. In
+     * other words, don't pass in unsanitized data here. This may also hold true for the logging message depending on the
+     * context of the exception.
+     *
+     * @param userMessage the message displayed to the user
+     * @param logMessage the message logged
+     */
+    public UnencodableCharacterException(String userMessage, String logMessage) {
+        super(userMessage, logMessage);
+    }
+
+    /**
+     * Creates a new instance of UnencodableCharacterException that includes a root cause Throwable.
+     * <p/>
+     * It should be noted that messages that are intended to be displayed to the user should be safe for display. In
+     * other words, don't pass in unsanitized data here. This may also hold true for the logging message depending on the
+     * context of the exception.
+     *
+     * @param userMessage the message displayed to the user
+     * @param logMessage the message logged
+     * @param cause the cause
+     */
+    public UnencodableCharacterException(String userMessage, String logMessage, Throwable cause) {
+        super(userMessage, logMessage, cause);
+    }
+}

src/main/java/org/owasp/esapi/core/validation/Validator.java

Lines changed: 1 addition & 14 deletions
@@ -1,7 +1,7 @@
 package org.owasp.esapi.core.validation;
 
 /**
- * The Validator interface defines a set of methods for canonicalizing and validating untrusted input. Validators can be
+ * The Validator interface defines a set of methods for validating untrusted input. Validators can be
  * used to validate simple or complex data-types depending on the implementation.
  * <p/>
  * Implementations must adopt a "whitelist" approach to validation where a specific pattern or character set is
@@ -40,17 +40,4 @@ public interface Validator {
      * @return True if this validator supports the supplied data, false otherwise.
      */
     boolean supports(Object input);
-
-    /**
-     * Canonicalizes the given input to it's simplest form. Implementors should ensure that canonicalize throws
-     * a {@link ValidationException} if the input contains multiple or mixed encodings in most cases, special cases
-     * may allow these specific situations.
-     *
-     * Validation should invoke this method prior to validating data.
-     *
-     * @param input The input to be canonicalized
-     * @param <T> Data-Type inferred by the input argument
-     * @return The supplied input reduced to it's simplest form.
-     */
-    <T> T canonicalize(T input) throws ValidationException;
 }
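
With canonicalize removed from Validator, callers would presumably canonicalize through the Encoder first and validate the result afterwards. The sketch below illustrates that ordering only. It is an assumption about usage, not code from this commit: the UnaryOperator parameter stands in for Encoder.canonicalize, the whitelist pattern is hypothetical, and isAcceptable is not a method of the Validator interface.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

public class CanonicalizeThenValidateSketch {

    // Hypothetical whitelist: 1 to 32 ASCII letters, digits, or underscores.
    private static final Pattern USERNAME = Pattern.compile("[A-Za-z0-9_]{1,32}");

    // Canonicalize first, then apply the whitelist to the canonical form.
    // The canonicalizer argument stands in for Encoder.canonicalize.
    static boolean isAcceptable(UnaryOperator<String> canonicalizer, String raw) {
        String canonical = canonicalizer.apply(raw);
        return USERNAME.matcher(canonical).matches();
    }

    public static void main(String[] args) {
        // Toy canonicalizer that only understands "%5F" (an encoded underscore).
        UnaryOperator<String> toy = s -> s.replace("%5F", "_");
        System.out.println(isAcceptable(toy, "bob%5F42"));
        System.out.println(isAcceptable(toy, "bob<script>"));
    }
}
```

The point of the ordering is that the whitelist is checked against the same form that downstream code will eventually interpret, rather than against an encoded form that merely looks harmless.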
