Monday, December 15, 2014

PicoHTTPParser now has a chunked-encoding decoder

Today I added phr_decode_chunked, a function for decoding chunked-encoded input, to picohttpparser.

As the doc-comment of the function (shown below) suggests, the function is designed to decode the data in place. In other words, it is not copy-free.
/* the function rewrites the buffer given as (buf, bufsz) removing the chunked-
 * encoding headers. When the function returns without an error, bufsz is
 * updated to the length of the decoded data available. Applications should
 * repeatedly call the function while it returns -2 (incomplete) every time
 * supplying newly arrived data. If the end of the chunked-encoded data is
 * found, the function returns a non-negative number indicating the number of
 * octets left undecoded at the tail of the supplied buffer. Returns -1 on
 * error.
 */
ssize_t phr_decode_chunked(struct phr_chunked_decoder *decoder, char *buf,
                           size_t *bufsz);
It is intentionally designed as such.

Consider an input like the following. The example is more than 2MB long even though it contains only 2 bytes of data. The input conforms to the HTTP/1.1 specification, since the specification does not define a maximum length for chunk extensions and requires every conforming implementation to ignore unknown extensions.
1 very-very-veery long extension that lasts ...(snip) 1MB
a
1 very-very-veery long extension that lasts ...(snip) 1MB
a
To handle such input without exhausting memory, a decoder should either a) preserve only the decoded data (which requires a copy), or b) limit the size of the chunked-encoded data.

Option b might have been easier to implement, but such a limit might be difficult to administer. So I decided to take route a, and for simplicity implemented the decoder to always adjust the position of the data in place.

Always calling memmove to adjust the position might induce some overhead, but I assume it to be negligible for two reasons: both the source and the destination would be in the CPU cache, and the overhead of unaligned memory access is small on recent Intel CPUs.
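
The in-place adjustment described above can be illustrated with a minimal sketch. This is not the picohttpparser implementation: decode_chunked_inplace is a hypothetical helper that assumes the whole chunked body is already in the buffer and that does not handle trailers or incremental input. It only shows how memmove compacts the chunk data towards the head of the buffer, so that chunk extensions of any length are dropped rather than retained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of picohttpparser. Decodes a complete
 * chunked-encoded body held in buf[0..len) in place and returns the length
 * of the decoded data, or -1 if the input is malformed or truncated.
 * Trailers and incremental input are out of scope for this sketch. */
static long decode_chunked_inplace(char *buf, size_t len)
{
    size_t src = 0; /* read position within the encoded input */
    size_t dst = 0; /* write position of the decoded output */

    for (;;) {
        size_t chunk_size = 0, digits = 0;

        /* parse the hexadecimal chunk size */
        for (; src < len; ++src, ++digits) {
            int c = (unsigned char)buf[src], v;
            if ('0' <= c && c <= '9')
                v = c - '0';
            else if ('a' <= c && c <= 'f')
                v = c - 'a' + 10;
            else if ('A' <= c && c <= 'F')
                v = c - 'A' + 10;
            else
                break;
            if (digits >= sizeof(size_t) * 2)
                return -1; /* the size would overflow size_t */
            chunk_size = chunk_size * 16 + (size_t)v;
        }
        if (digits == 0)
            return -1;

        /* skip the chunk extensions, however long, up to the CRLF */
        while (src + 1 < len && !(buf[src] == '\r' && buf[src + 1] == '\n'))
            ++src;
        if (src + 1 >= len)
            return -1;
        src += 2;

        if (chunk_size == 0)
            return (long)dst; /* last chunk: decoding is complete */

        /* the chunk data and its terminating CRLF must fit in the input */
        if (chunk_size > len - src || len - src - chunk_size < 2)
            return -1;

        /* the write position never overtakes the read position */
        assert(dst <= src);
        memmove(buf + dst, buf + src, chunk_size);
        dst += chunk_size;
        src += chunk_size;

        if (buf[src] != '\r' || buf[src + 1] != '\n')
            return -1;
        src += 2;
    }
}
```

Given a buffer holding "1;ext\r\na\r\n1;ext\r\nb\r\n0\r\n\r\n", the helper leaves "ab" at the head of the buffer and returns 2. Because the decoded bytes are always moved to the front, the memory an application needs to retain is bounded by the size of the decoded data rather than by the size of the extensions.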

For ease-of-use, I have added examples to the README.
