Monday, December 15, 2014

PicoHTTPParser now has a chunked-encoding decoder

Today I added phr_decode_chunked, a function for decoding chunked-encoded input, to picohttpparser.

As the doc-comment of the function (shown below) suggests, the function decodes the data in place by rewriting the buffer it is given. In other words, it is not copy-free.
/* the function rewrites the buffer given as (buf, bufsz) removing the chunked-
 * encoding headers. When the function returns without an error, bufsz is
 * updated to the length of the decoded data available. Applications should
 * repeatedly call the function while it returns -2 (incomplete) every time
 * supplying newly arrived data. If the end of the chunked-encoded data is
 * found, the function returns a non-negative number indicating the number of
 * octets left undecoded at the tail of the supplied buffer. Returns -1 on
 * error.
 */
ssize_t phr_decode_chunked(struct phr_chunked_decoder *decoder, char *buf,
                           size_t *bufsz);
This design is intentional.

Consider an input like the following. The example is more than 2MB long even though it contains only 2 bytes of data. The input conforms to the HTTP/1.1 specification, since the specification does not define a maximum length for chunk extensions and requires every conforming implementation to ignore unknown extensions.
1 very-very-veery long extension that lasts ...(snip) 1MB
a
1 very-very-veery long extension that lasts ...(snip) 1MB
a
To handle such input without exhausting memory, a decoder should either a) preserve only the decoded data (which requires a copy), or b) limit the size of the chunked-encoded data.

Option b might have been easier to implement, but such a limit might be difficult to administer. So I decided to take route a and, for simplicity, implemented the decoder to always adjust the position of the data in place.

Always calling memmove to adjust the position might induce some overhead, but I assume it to be negligible for two reasons: both the source and the destination would already be in the CPU cache, and the overhead of unaligned memory access is small on recent Intel CPUs.
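To illustrate the idea of decoding in place with memmove, here is a minimal sketch. It is not the picohttpparser implementation: decode_chunked_inplace is a hypothetical helper that assumes the whole chunked-encoded body is already in the buffer, whereas phr_decode_chunked handles input that arrives piece by piece. The sketch also ignores the trailer and accepts anything up to the end of the line as a chunk extension.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical helper (NOT part of picohttpparser): decodes a COMPLETE
 * chunked-encoded body held in buf[0..len) in place. Returns the length of
 * the decoded data, which then occupies buf[0..ret), or -1 if the input is
 * malformed or truncated. The trailer after the last chunk is not consumed. */
static ssize_t decode_chunked_inplace(char *buf, size_t len)
{
    size_t src = 0; /* read position in the encoded input */
    size_t dst = 0; /* write position for the decoded output (dst <= src) */

    for (;;) {
        size_t chunk_size = 0;
        int ndigits = 0;

        /* parse the hexadecimal chunk size */
        for (; src < len; ++src, ++ndigits) {
            int c = (unsigned char)buf[src], v;
            if ('0' <= c && c <= '9')
                v = c - '0';
            else if ('a' <= c && c <= 'f')
                v = c - 'a' + 10;
            else if ('A' <= c && c <= 'F')
                v = c - 'A' + 10;
            else
                break;
            if (ndigits >= (int)(sizeof(size_t) * 2))
                return -1; /* the size would overflow size_t */
            chunk_size = chunk_size * 16 + (size_t)v;
        }
        if (ndigits == 0)
            return -1;

        /* skip the chunk extension, however long it is, up to the LF */
        while (src < len && buf[src] != '\n')
            ++src;
        if (src == len)
            return -1;
        ++src; /* step over the LF */

        if (chunk_size == 0)
            return (ssize_t)dst; /* last chunk: decoding is complete */

        /* move the chunk data down over the header that has been consumed;
         * memmove is used because the two regions may overlap */
        if (len - src < chunk_size)
            return -1;
        memmove(buf + dst, buf + src, chunk_size);
        dst += chunk_size;
        src += chunk_size;

        /* the chunk data must be followed by CRLF */
        if (len - src < 2 || buf[src] != '\r' || buf[src + 1] != '\n')
            return -1;
        src += 2;
    }
}
```

Because only the decoded octets are kept at the head of the buffer, the space occupied by chunk headers and extensions can be reused, no matter how long the extensions are.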

For ease-of-use, I have added examples to the README.
