@@ -24,7 +24,7 @@ The parser extracts the following information from HTTP messages:
2424 * Response status code
2525 * Transfer-Encoding
2626 * HTTP version
27- * Request path, query string, fragment
27+ * Request URL
2828 * Message body
2929
3030Building
@@ -49,3 +49,135 @@ Usage
4949 help or have suggestions, feel free to contact me at
50505151
52+
53+ One ` http_parser ` object is used per TCP connection. Initialize the struct
54+ using ` http_parser_init() ` and set the callbacks. That might look something
55+ like this for a request parser:
56+
57+ http_parser_settings settings;
58+ settings.on_url = my_url_callback;
59+ settings.on_header_field = my_header_field_callback;
60+ /* ... */
61+
62+ http_parser *parser = malloc(sizeof(http_parser));
63+ http_parser_init(parser, HTTP_REQUEST);
64+ parser->data = my_socket;
65+
66+ When data is received on the socket execute the parser and check for errors.
67+
68+ size_t len = 80*1024, nparsed;
69+ char buf[len];
70+ ssize_t recved;
71+
72+ recved = recv(fd, buf, len, 0);
73+
74+ if (recved < 0) {
75+ /* Handle error. */
76+ }
77+
78+ /* Start up / continue the parser.
79+ * Note we pass recved==0 to signal that EOF has been received.
80+ */
81+ nparsed = http_parser_execute(parser, &settings, buf, recved);
82+
83+ if (parser->upgrade) {
84+ /* handle new protocol */
85+ } else if (nparsed != recved) {
86+ /* Handle error. Usually just close the connection. */
87+ }
88+
89+ HTTP needs to know where the end of the stream is. For example, sometimes
90+ servers send responses without Content-Length and expect the client to
91+ consume input (for the body) until EOF. To tell http_parser about EOF, give
92+ ` 0 ` as the fourth parameter to ` http_parser_execute() ` . Callbacks and errors
93+ can still be encountered during an EOF, so one must still be prepared
94+ to receive them.
95+
96+ Scalar valued message information such as ` status_code ` , ` method ` , and the
97+ HTTP version are stored in the parser structure. This data is only
98+ temporarily stored in ` http_parser ` and gets reset on each new message. If
99+ this information is needed later, copy it out of the structure during the
100+ ` headers_complete ` callback.
101+
102+ The parser decodes the transfer-encoding for both requests and responses
103+ transparently. That is, a chunked encoding is decoded before being sent to
104+ the on_body callback.
105+
106+
107+ The Special Problem of Upgrade
108+ ------------------------------
109+
110+ HTTP supports upgrading the connection to a different protocol. An
111+ increasingly common example of this is the Web Socket protocol which sends
112+ a request like
113+
114+ GET /demo HTTP/1.1
115+ Upgrade: WebSocket
116+ Connection: Upgrade
117+ Host: example.com
118+ Origin: http://example.com
119+ WebSocket-Protocol: sample
120+
121+ followed by non-HTTP data.
122+
123+ (See http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-75 for more
124+ information on the Web Socket protocol.)
125+
126+ To support this, the parser will treat this as a normal HTTP message without a
127+ body, issuing both the on_headers_complete and on_message_complete callbacks. However,
128+ http_parser_execute() will stop parsing at the end of the headers and return.
129+
130+ The user is expected to check if ` parser->upgrade ` has been set to 1 after
131+ ` http_parser_execute() ` returns. Non-HTTP data begins in the supplied buffer
132+ at the offset given by the return value of ` http_parser_execute() ` .
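
The offset arithmetic described above can be sketched as follows. This is a minimal illustration, not part of http-parser: `struct span` and `upgrade_remainder()` are hypothetical names, and the sketch assumes `recved` is non-negative and `nparsed` is the value returned by `http_parser_execute()` for that same buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: a view of the bytes that belong to the new
 * protocol. Not part of http-parser. */
struct span {
  const char *data;
  size_t len;
};

/* Given the buffer passed to http_parser_execute(), the number of bytes
 * received, and the number of bytes the parser consumed, return the
 * trailing bytes the parser did not touch. */
struct span upgrade_remainder(const char *buf, size_t recved, size_t nparsed) {
  struct span rest;
  assert(nparsed <= recved);    /* the parser never consumes more than it got */
  rest.data = buf + nparsed;    /* first byte after the HTTP headers */
  rest.len = recved - nparsed;  /* may be 0 if nothing followed the headers */
  return rest;
}
```

When `parser->upgrade` is set, the returned span would be handed to whatever code speaks the upgraded protocol. The span points into the receive buffer, so it has to be consumed or copied before that buffer is reused.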
133+
134+
135+ Callbacks
136+ ---------
137+
138+ During the ` http_parser_execute() ` call, the callbacks set in
139+ ` http_parser_settings ` will be executed. The parser maintains state and
140+ never looks behind, so buffering the data is not necessary. If you need to
141+ save certain data for later usage, you can do that from the callbacks.
142+
143+ There are two types of callbacks:
144+
145+ * notification ` typedef int (*http_cb) (http_parser*); `
146+ Callbacks: on_message_begin, on_headers_complete, on_message_complete.
147+ * data ` typedef int (*http_data_cb) (http_parser*, const char *at, size_t length); `
148+ Callbacks: (requests only) on_url,
149+ (common) on_header_field, on_header_value, on_body.
150+
151+ Callbacks must return 0 on success. Returning a non-zero value indicates
152+ error to the parser, making it exit immediately.
153+
154+ If you parse an HTTP message in chunks (i.e. ` read() ` the request line
155+ from the socket, parse, read half of the headers, parse, etc.), your data callbacks
156+ may be called more than once. The data pointer is guaranteed to be valid only
157+ for the lifetime of the callback. You can also ` read() ` into a heap-allocated
158+ buffer to avoid copying memory around if this fits your application.
159+
160+ Reading headers can be tricky if you read/parse headers partially.
161+ You need to remember whether the last header callback was a field or a value
162+ and apply the following logic:
163+
164+ (on_header_field and on_header_value shortened to on_h_*)
165+ ------------------------ ------------ --------------------------------------------
166+ | State (prev. callback) | Callback | Description/action |
167+ ------------------------ ------------ --------------------------------------------
168+ | nothing (first call) | on_h_field | Allocate new buffer and copy callback data |
169+ | | | into it |
170+ ------------------------ ------------ --------------------------------------------
171+ | value | on_h_field | New header started. |
172+ | | | Copy current name,value buffers to headers |
173+ | | | list and allocate new buffer for new name |
174+ ------------------------ ------------ --------------------------------------------
175+ | field | on_h_field | Previous name continues. Reallocate name |
176+ | | | buffer and append callback data to it |
177+ ------------------------ ------------ --------------------------------------------
178+ | field | on_h_value | Value for current header started. Allocate |
179+ | | | new buffer and copy callback data to it |
180+ ------------------------ ------------ --------------------------------------------
181+ | value | on_h_value | Value continues. Reallocate value buffer |
182+ | | | and append callback data to it |
183+ ------------------------ ------------ --------------------------------------------
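
The table above can be sketched in C roughly as follows. Everything here is hypothetical and not part of http-parser: `struct header_ctx`, `on_h_field()`, `on_h_value()` and `on_h_complete()` are illustrative names, the list is capped at 32 headers to keep the sketch short, and the functions take the context directly. In a real program they would be wrapped in `http_data_cb` callbacks, with the context presumably reached through `parser->data`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only. */
enum last_cb { LAST_NONE, LAST_FIELD, LAST_VALUE };

struct header { char *name; char *value; };

struct header_ctx {
  struct header list[32];  /* completed headers (fixed cap for the sketch) */
  size_t count;
  char *name;              /* header currently being assembled */
  char *value;
  enum last_cb last;       /* which callback ran previously */
};

/* Append len bytes to a NUL-terminated heap string (NULL means empty). */
static int append(char **dst, const char *at, size_t len) {
  size_t old = *dst ? strlen(*dst) : 0;
  char *p = realloc(*dst, old + len + 1);
  if (p == NULL) return -1;
  memcpy(p + old, at, len);
  p[old + len] = '\0';
  *dst = p;
  return 0;
}

/* Move the pending name/value pair into the headers list. */
static int flush_header(struct header_ctx *ctx) {
  if (ctx->name == NULL) return 0;   /* nothing pending */
  if (ctx->count == 32) return -1;   /* list full */
  if (ctx->value == NULL && append(&ctx->value, "", 0) != 0) return -1;
  ctx->list[ctx->count].name = ctx->name;
  ctx->list[ctx->count].value = ctx->value;
  ctx->count++;
  ctx->name = ctx->value = NULL;
  return 0;
}

/* Rows 1-3 of the table: start a new name, or continue the current one. */
int on_h_field(struct header_ctx *ctx, const char *at, size_t len) {
  assert(ctx != NULL);
  if (ctx->last == LAST_VALUE && flush_header(ctx) != 0) return -1;
  ctx->last = LAST_FIELD;
  return append(&ctx->name, at, len);
}

/* Rows 4-5 of the table: start a new value, or continue the current one. */
int on_h_value(struct header_ctx *ctx, const char *at, size_t len) {
  assert(ctx != NULL);
  ctx->last = LAST_VALUE;
  return append(&ctx->value, at, len);
}

/* Call when the headers are complete to store the final pending header. */
int on_h_complete(struct header_ctx *ctx) {
  assert(ctx != NULL);
  return flush_header(ctx);
}
```

The context must be zero-initialized before the first call (for example with `memset`), so that `last` starts as `LAST_NONE` and both buffers start as `NULL`. The sketch copies every fragment because the data pointer is only valid during the callback. Freeing the stored strings is left to the caller.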