Tuesday, January 20, 2015

H2O version 0.9.1 released with support for HTTP2 dependency-based prioritization

Today I am pleased to announce the release of H2O version 0.9.1. The release tarball can be found in the releases section of the GitHub repository.

H2O is an optimized HTTP server with support for HTTP/1.x and the upcoming HTTP/2 protocol.

Version 0.9.1 contains many improvements that have gone into the source tree since the initial release, including contributions from people other than the original developers. I am very happy and thankful that people have started sending pull-requests to H2O.

The full list of changes can be found in the changelog, but in this blog post I would like to concentrate on the issues regarding HTTP/2.

■Support for draft-16

H2O version 0.9.1 supports both draft-14 and draft-16. Draft-16 is the latest draft of HTTP/2, now in the final stage of becoming an RFC. Draft-14 is still supported for interoperability with older user-agents.

■Dependency-based prioritization

Dependency-based prioritization is one of the key features of HTTP/2. It allows web browsers to control the order in which the server sends its responses (e.g. send CSS files before HTML). Firefox/37 is going to start using this prioritization method, and we are happy to be able to support it. For more information about dependency-based prioritization and how it will be used, I suggest reading Bits Up!: HTTP/2 Dependency Priorities in Firefox 37.

As Mozilla has enabled HTTP/2 by default in Firefox/35, it is now possible to reap the benefits of HTTP/2 and serve content to users faster than ever. I hope that H2O can be of some help in doing so.

Thursday, January 15, 2015

[Memo] How to implement a retryable application protocol on top of TCP (or HTTP)

With HTTP/1.1 persistent connections, a client cannot tell whether the server terminated abnormally after receiving a request, or closed the connection without receiving it. As a consequence, for applications that do not guarantee idempotency, it often becomes impossible to decide automatically whether a request should be retried.

To determine whether a retry is possible (i.e. whether the peer has started processing the message), one could adopt a different kind of protocol that exchanges finer-grained information, but a complex protocol is likely to hurt performance, so we would rather avoid that.

With that in mind, on to the main topic.

An easy way to determine whether a request has reached the application layer, while keeping a simple request/response protocol like HTTP/1.1 without pipelining, is to use SO_LINGER. Concretely, the server is implemented in the following form.
while (1) {
  receive_request();
  use_RST_for_close(); // use SO_LINGER so that the connection is closed with RST
  handle_request();
  send_response();
  use_graceful_close(); // use SO_LINGER so that a graceful close is performed
}
On the client side, the request should be resent only when EPIPE (in case of write(2)) or ECONNRESET (in case of read(2)) is observed after sending the request.

An alternative approach is for the server to send a response like "HTTP/1.1 599 Going Away" when it closes the connection (even if it has not received a request); in this case, a lingering close is not performed. The client resends the request only when it receives this response from the server.

Postscript: I am considering supporting the latter approach in H2O. That would make safe hot deployment possible in deployment schemes such as the one described in "Docker と SO_REUSEPORT を組み合わせてコンテナのHot Deployにチャレンジ - blog.nomadscafe.jp" (unlike the always-retry approach illustrated there).

Note: the discussion above assumes that pipelining is not used.

Friday, December 26, 2014

[Ann] Initial release of H2O, and why HTTPD performance will matter in 2015

Happy Holidays!

Today I am delighted to announce the first release of H2O, version 0.9.0; this is a Christmas gift from me.


H2O is an optimized HTTP server with support for HTTP/1.x and the upcoming HTTP/2; it can be used either as a standalone server or a library.

Built around PicoHTTPParser (a very efficient HTTP/1 parser), H2O outperforms Nginx by a considerable margin. It also excels in HTTP/2 performance.

Why do we need a new HTTP server? Because HTTP server performance will matter in the coming years.

It is expected that the number of files being served by HTTP servers will dramatically increase as we transition from HTTP/1 to HTTP/2.

This is because the techniques currently used to decrease the number of asset files (e.g. CSS sprites and CSS concatenation) become a drag on page rendering speed under HTTP/2. Such techniques were beneficial under HTTP/1, since the protocol had difficulty utilizing all of the available bandwidth. HTTP/2 fixes that issue, so paying the overhead of transmitting every image and CSS style used by a website at once, when only some of them are needed to render a specific page, becomes a bad idea. Instead, switching back to sending a small asset file for each element of the requested page becomes the ideal approach.

Having an efficient HTTP/1 server is also a good thing: the large-scale adoption of microservices increases the number of HTTP requests transmitted within the datacenter.

As shown in the benchmark charts, H2O is designed with these facts in mind, making it (we believe) an ideal choice of HTTP server for the future.

With this first release, H2O concentrates on serving static files and working as a reverse proxy at high performance.

Together with the contributors I will continue to optimize and add features to the server, and hopefully reach a stable release (version 1.0.0) when HTTP/2 becomes standardized in the coming months.

Stay tuned.

PS. It is also great that the tools developed for H2O are having other effects: not only have we raised the bar on HTTP/2 server performance (nghttp2, a de-facto reference implementation of HTTP/2, has become much faster in recent months), the performance race of HTTP/1 parsers has once again heated up (Performance improvement and benchmark by indutny · Pull Request #200 · joyent/http-parser, Improving PicoHTTPParser further with AVX2), and @imasahiro is working on merging qrintf (a preprocessor, developed as a byproduct of H2O, that speeds up the sprintf(3) family by an order of magnitude) into Clang. Using H2O as a stepping stone, I look forward to bringing new approaches for running and maintaining websites next year.

Monday, December 22, 2014

IPv6 support in URL parsers

Programmers tend to think that IPv6 support is a topic for low-level languages like C, and irrelevant to scripting languages that handle host names as strings. In reality, however, string handling also needs attention in some places.

A typical example is the URL parser. The URL format that many engineers have in mind is "protocol://host:port/path#fragment". Looking at this format, it seems natural to split tokens on ':' and '/', and URL parsers written without consulting the RFC sometimes assume that the host never contains a ':'. It is also common to validate a host name or IPv4 address with a regular expression like /[0-9A-Za-z_-\.]+/.

However, since the string representation of an IPv6 address contains ':', it is specified that a URL containing an IPv6 address must enclose the address in square brackets (note 1), as in:

http://[::FFFF:129.144.52.38]:80/index.html

In other words, a URL parser that interprets the first ':' after "http://" as the start of the port number cannot handle IPv6 addresses correctly.

As opportunities to use IPv6 increase, we are more likely to run into bugs caused by parsers that cannot handle URLs containing IPv6 addresses.

As part of your year-end cleanup, it might be a good idea to check, and if necessary fix, the behavior of the URL parser you use (note 2).

Note 1: see RFC 3986
Note 2: What I really wanted to say is that I had forgotten to add IPv6 support to H2O's URL parser!!!!! So this article is part of the H2O Advent Calendar 2014. Oops.

Friday, December 19, 2014

Memory Management in H2O

This blog post (part of the H2O Advent Calendar 2014) provides a high-level overview of the memory management functions in H2O, which can be categorized into five groups.

h2o_mem_alloc, h2o_mem_realloc

These are wrappers of malloc(3) / realloc(3) that call abort(3) if memory allocation fails. The returned chunks should be freed by calling free(3).

h2o_mem_init_pool, h2o_mem_clear_pool, h2o_mem_alloc_pool

The functions create, clear, and allocate from a memory pool. The term memory pool has several meanings, but in the case of H2O it has been borrowed from Apache: it refers to a memory allocator that frees all associated chunks at once when the destructor (h2o_mem_clear_pool) is called.

The primary use-case of these functions is to allocate memory that relates to an HTTP request. The request object h2o_req_t has a memory pool associated with it; small chunks of memory that need to be allocated while handling a request should be obtained by calling h2o_mem_alloc_pool instead of h2o_mem_alloc, since the former is generally faster than the latter.

h2o_mem_alloc_shared, h2o_mem_link_shared, h2o_mem_addref_shared, h2o_mem_release_shared

These are the functions for handling ref-counted chunks of memory. Each shared chunk has its own dispose callback that gets called when the reference counter reaches zero. A chunk can optionally be associated with a memory pool, so that the reference counter gets decremented when the pool gets flushed.

The functions are used for handling things like headers transferred via HTTP/2, or for associating a resource that needs a custom dispose callback with an HTTP request through the use of the memory pool.

h2o_buffer_init, h2o_buffer_dispose, h2o_buffer_reserve, h2o_buffer_consume, h2o_buffer_link_to_pool

The functions provide access to buffers that can hold octets of any length. They internally use malloc(3) / realloc(3) for handling short buffers, and switch to a temporary-file-backed mmap(2) when the length of the buffer reaches a predefined threshold (default: 32MB). A buffer can also be associated with a memory pool by calling the h2o_buffer_link_to_pool function.

The primary use-case of the buffer is to store incoming HTTP requests and POST content; since it switches to temporary-file-backed memory as described, it can hold huge chunks on 64-bit systems.

h2o_vector_reserve

The function reserves the given number of slots in an H2O_VECTOR, a variable-length array of an arbitrary type. Either h2o_mem_realloc or the memory pool can be used as the underlying memory allocator (in the former case, the allocated memory should be manually freed by the caller). The structure is initialized by zero-filling it.

The vector is used everywhere, from storing a list of HTTP headers to a list of configuration directives.

For details, please refer to the doc-comments and definitions in include/h2o/memory.h and lib/memory.c.

Tuesday, December 16, 2014

Why you should use subtree instead of submodule on GitHub

GitHub has a feature that automatically releases a source package when you push a tag. For scripting languages this feature may see little use, since each language has a common package management system (note 1); but for programs written in languages like C that lack a de-facto package manager, or for standalone administrative scripts developed and distributed on GitHub, the feature is very convenient.

However, because this feature is implemented as a wrapper around the git-archive command, it has the problem that files in submodules are not included. The GitHub folks are aware of this, but for now they do not seem to be considering a GitHub-specific fix (note 2).

I learned of this problem when it was pointed out in a picojson issue. For picojson the impact was merely "the tests do not run", so it could have been postponed, but it was obvious that the same problem would hit H2O as well.

After discussing and experimenting on IRC, we found that using subtrees instead of submodules makes the referenced files appear in git-archive output, and picojson has already completed its migration to subtrees.
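
For reference, vendoring a dependency as a subtree looks like the following (the prefix path is illustrative). Because the files are committed into the parent repository itself, git-archive, and hence GitHub's auto-generated source packages, will include them:

```shell
# add picojson as a subtree under deps/picojson
git subtree add --prefix=deps/picojson \
    https://github.com/kazuho/picojson.git master --squash

# later, pull upstream changes into the subtree
git subtree pull --prefix=deps/picojson \
    https://github.com/kazuho/picojson.git master --squash
```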

Changing one's workflow because of a tool's quirks is, in a sense, a silly story, but I am planning to switch H2O over to subtrees as well before its release.

※ This article is part of the H2O Advent Calendar 2014.

Note 1: e.g. CPAN for Perl, NPM for JavaScript
Note 2: see the comments on » Github zip doesn’t include Submodules Academic Technology Group Developers Blog

Monday, December 15, 2014

PicoHTTPParser now has a chunked-encoding decoder

Today I have added phr_decode_chunked - a function for decoding chunked-encoded input - to picohttpparser.

As suggested by the doc-comment of the function (shown below), the function is designed to decode the data in place. In other words, it is not copy-less.
/* the function rewrites the buffer given as (buf, bufsz) removing the chunked-
 * encoding headers. When the function returns without an error, bufsz is
 * updated to the length of the decoded data available. Applications should
 * repeatedly call the function while it returns -2 (incomplete) every time
 * supplying newly arrived data. If the end of the chunked-encoded data is
 * found, the function returns a non-negative number indicating the number of
 * octets left undecoded at the tail of the supplied buffer. Returns -1 on
 * error.
 */
ssize_t phr_decode_chunked(struct phr_chunked_decoder *decoder, char *buf,
                           size_t *bufsz);
It is intentionally designed this way.

Consider an input like the following. The example is more than 2MB long even though it contains only two bytes of data, yet it conforms to the HTTP/1.1 specification, which does not define a maximum length for chunk extensions and requires every conforming implementation to ignore unknown extensions.
1 very-very-veery long extension that lasts ...(snip) 1MB
a
1 very-very-veery long extension that lasts ...(snip) 1MB
a
To handle such input without exhausting memory, a decoder should either a) preserve only the decoded data (which requires a copy), or b) limit the size of the chunked-encoded data it accepts.

Option b might have been easier to implement, but such a limit might be difficult to administer. So I decided to take route a, and for simplicity implemented the decoder to always adjust the position of the data in place.

Always calling memmove to adjust the position might introduce some overhead, but I expect it to be negligible for two reasons: both the source and the destination would already be in the CPU cache, and the overhead of unaligned memory access is small on recent Intel CPUs.

For ease-of-use, I have added examples to the README.