Kazuho's Weblog: December 2015

Friday, December 18, 2015

H2O version 1.6.1 released, now officially supports 64-bit ARM processors

Today I am happy to announce the release of H2O version 1.6.1. The release fixes two build-related issues found in 1.6.0 (see the release notes).

One of them is a build error on ARMv8-A, i.e. the 64-bit ARM architecture.

Yes, we now officially support 64-bit ARM!

In late 2015 we started to see 64-bit ARM-based servers and appliances hit the market, and it is likely that we will see more in the coming year.

I hope that H2O will be the best HTTP server for the platform.

EDIT. Personally I am enjoying using Geekbox, a sub-$100 microserver with an octa-core Cortex-A53 processor running Ubuntu :)

Friday, December 4, 2015

H2O HTTP/2 server version 1.6.0 released

Today, I have tagged version 1.6.0 of the H2O HTTP/2 server.

The full list of changes can be found in the release notes; the following are the ones I consider most significant.

Error pages are now customizable #606

Finally, error pages can be customized.

You can specify any URL; the contents of the specified URL will be sent to the client when returning an error response. Please consult the documentation of the errordoc handler for more information.

Support for WebSocket #581

Thanks to Justin Zhu, the H2O standalone server is now capable of proxying WebSocket connections.

The feature is considered experimental for the time being, so you need to turn it on explicitly if you want to use it. We will turn the feature on by default in a future release.

Smart file descriptor caching #596

The standalone server now caches open file descriptors, so that serving small static files is significantly faster in version 1.6 when compared to 1.5.

Website administrators need not worry about stale files being served; the cached file descriptors are closed each time the event loop runs. In other words, requests that arrive after a file has been modified are guaranteed to be served the modified version.
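The idea can be illustrated with a small sketch. This is not H2O's implementation (which is written in C); the class `FdCache` and the function `run_event_loop_once` are hypothetical names made up for the illustration, and the code assumes a POSIX system where `os.pread` is available.

```python
import os


class FdCache:
    """Hypothetical per-loop cache of read-only file descriptors."""

    def __init__(self):
        self._fds = {}  # path -> open file descriptor

    def read(self, path):
        fd = self._fds.get(path)
        if fd is None:
            # Cache miss: open the file and remember the descriptor.
            fd = os.open(path, os.O_RDONLY)
            self._fds[path] = fd
        size = os.fstat(fd).st_size
        # pread reads from offset 0 without moving a shared file position.
        return os.pread(fd, size, 0)

    def close_all(self):
        # Dropping every descriptor forces the next read to reopen the path,
        # so a file replaced in the meantime is picked up.
        for fd in self._fds.values():
            os.close(fd)
        self._fds.clear()


def run_event_loop_once(cache, requested_paths):
    """Serve the requests gathered in one loop iteration, then flush the cache."""
    responses = [cache.read(path) for path in requested_paths]
    cache.close_all()
    return responses
```

Within one iteration, repeated requests for the same path reuse a single descriptor; the flush at the end of the iteration is what keeps later requests from seeing a stale file.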

Hopefully I will post a benchmark covering the changes in an upcoming blogpost.

HTTP/2 push triggered by mruby is processed ASAP #593

In prior releases, pushed responses were not sent until the response to the request that triggered the push became ready. In 1.6.0, the behavior has been changed so that a push triggered by mruby is actually sent while the request is being processed by an application server.

The change has a noticeable positive impact on user experience; for details, please consult Optimizing performance of multi-tier web applications using HTTP/2 push.

PS. If you are interested in using H2O with a certificate issued by Let's Encrypt, please refer to Using H2O with Let's Encrypt.

Have fun!

Thursday, December 3, 2015

Optimizing performance of multi-tier web applications using HTTP/2 push

Push is a feature of HTTP/2 that allows a server to speculatively send a response to a client, anticipating that the client will use it.

In my earlier blogpost, I wrote that HTTP/2 push does not have a significant effect on web performance when serving static files from a single HTTP/2 server. While that is true, push does improve performance by a noticeable margin in other scenarios. Let's look into one common case.

The Theory

Many if not most of today's web applications are multi-tiered. Typically, an HTTP request from a client is first accepted by an httpd (operated either by the provider of the web service or by a CDN operator). The httpd serves asset files by itself, while routing requests for HTML documents to an application server through FastCGI or HTTP/1.

It is when the response from the application server takes time that HTTP/2 push gives us a big performance boost.

The chart below should make it clear why. With HTTP/2 push, it has become possible for a server to start sending assets that are going to be referenced from the HTML before the generated HTML is returned from the application running behind it.

Figure 1. Timing sequence of a multi-tiered webapp
(RTT: 50ms, processing-time: 200ms)

It is not uncommon for a web application to spend hundreds of milliseconds processing an HTTP request, querying and updating the database. It is also common for a CDN edge server to wait hundreds of milliseconds while fetching an HTTP response from a web application server through an intercontinental connection.

In the case of the chart, the RTT between httpd and client is 50ms and the processing time is 200ms. Therefore, the server can spend 4 round-trips (or typically slightly above 200KB of bandwidth¹) pushing asset files before the HTML becomes ready to be served.

And thanks to the push transactions, the connection will be warm enough by the time the HTML becomes available to the web server, so the chance of the server being able to send the whole document at once becomes greater.
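The "slightly above 200KB" figure can be reproduced with a short calculation. The sketch below assumes an idealized TCP slow start in which the congestion window starts at INITCWND packets and doubles every round-trip with no loss; the function name `sendable_in_rtts` is made up for the illustration.

```python
def sendable_in_rtts(rtts, initcwnd=10, mss=1400):
    """Return (packets, bytes) that can be sent in `rtts` round-trips,
    assuming the window doubles every RTT and no packet is lost."""
    cwnd = initcwnd
    packets = 0
    for _ in range(rtts):
        packets += cwnd  # one full window is sent per round-trip
        cwnd *= 2        # idealized slow start: window doubles
    return packets, packets * mss


packets, total_bytes = sendable_in_rtts(4)
print(packets, total_bytes)  # 150 210000 (10 + 20 + 40 + 80 packets)
```

With a processing time of 200ms and an RTT of 50ms, `rtts` is 200 / 50 = 4, which gives 150 packets or 210,000 bytes.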

Theoretically, the upper bound of time reducible by the proposed approach (i.e. push assets until the main document becomes ready) is:

time_reduced_max = processing_time + 1 RTT

The additional 1 RTT appears if the HTML being delivered is so small that it cannot grow the send window in the pull case. time_reduced_min is obviously zero, when no resource that can be pushed exists.

Cache-aware Server Push

Even if you have a time window that can be used to push a few hundred kilobytes of data, you would definitely not want to waste the bandwidth by pushing responses already cached by the client.

That is why a cache-aware server-pusher (CASPER) becomes important.

Initially implemented as an experimental feature in H2O HTTP/2 server version 1.5, CASPER tracks the cache state of the web browser using a single Cookie². The cookie contains a fingerprint of all the high-priority asset files cached by the browser, compressed using Golomb-compressed sets. H2O updates the fingerprint every time it serves a high-priority asset file, and also uses it to determine whether a given asset file should be pushed.

It should be noted that the current fingerprint maintained by the cookie is at best a poor estimate of what is being cached by the client. Without a way to peek into the web browser cache, we cannot update the fingerprint stored in the cookie to reflect evictions from the cache². Ideally, web browsers should calculate the fingerprint themselves and send the value to the server. But until then, we have to live with using cookies (or a ServiceWorker-based implementation that would give us freedom in implementing our own cache³) as a hacky workaround.

Benchmark

Let's move on to an experiment to verify if the theory can be applied in practice.

For this purpose, I am using the top page of h2o.examp1e.net. The server (H2O version 1.6.0-beta2 with CASPER enabled; see configuration) is given 50ms of simulated latency using tc qdisc, and a web application that returns index.html with 200ms latency is placed behind the server. Google Chrome 46 is used as the test client.

FWIW, the sizes of the responses being served are as follows:

Figure 2. Size of the Files by Type
File type	Size
index.html	3,619 bytes
blocking assets	319,700 bytes (5 files)
non-blocking assets	415,935 bytes (2 files)

Blocking assets are CSS and JavaScript files that block the critical rendering path (i.e. the files that need to be obtained by the browser before rendering the webpage). Non-blocking assets are asset files that do not block the critical path (e.g. images).

The next two figures are the charts shown by Chrome's Developer Tools. In the former, none of the responses were pushed. In the latter, blocking assets were pushed using CASPER.

Figure 3. Chrome Timing Chart without Push

Figure 4. Chrome Timing Chart with Push⁴

As can be seen, both the DOMContentLoaded and load events are observed around 230 msec earlier when push is used, which matches the expectation that we would see an improvement of 200 msec to 250 msec.

Figure 5. Timing Improvements with Push
Event	Without Push (msec)	With Push (msec)	Delta (msec)	Gain
DOMContentLoaded	823	595	228	38%
load	1,010	775	235	30%
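The figures in the table can be cross-checked with a few lines of code. Two points are assumptions read off the numbers rather than stated in the post: the Gain column appears to be Delta divided by the With Push time, and the expected range is taken as processing_time to processing_time + 1 RTT (200 msec to 250 msec). The function name `summarize` is made up for the illustration.

```python
def summarize(without_push_ms, with_push_ms, processing_ms=200, rtt_ms=50):
    """Return (delta in msec, gain in percent, whether delta is in the expected range)."""
    delta = without_push_ms - with_push_ms
    # Assumption: gain is measured relative to the time with push.
    gain_percent = round(delta / with_push_ms * 100)
    in_expected_range = processing_ms <= delta <= processing_ms + rtt_ms
    return delta, gain_percent, in_expected_range


print(summarize(823, 595))   # DOMContentLoaded -> (228, 38, True)
print(summarize(1010, 775))  # load             -> (235, 30, True)
```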

Conclusion

As shown in this blogpost, cache-aware server push can be used by a reverse proxy to push assets while waiting for the backend application server to provide dynamically generated content, effectively hiding the processing time of the application server. Or, in the case of a CDN, it can be used to hide the latency between the edge server and the application server.

Considering how common it is for the processing time of a web application (or the RTT between an edge server and an application server) to be greater than the RTT between the client and the reverse proxy (or the edge server in the case of a CDN), we can expect cache-aware server push to provide a noticeable improvement in web performance in many deployments.

1: in the common case where INITCWND is 10 and MSS is around 1,400 bytes, it is possible to send 150 packets in 4 RTT, reaching 210KB in total
2: fortunately, the existence of false-positives in the fingerprint is not a big issue, since the client can simply revert to using an ordinary GET request in case push is not used
3: ongoing work is explained in HTTP/2 Push を Service Worker + Cache Aware Server Push で効率化したい話 (in Japanese; roughly "Making HTTP/2 Push more efficient with Service Worker + Cache Aware Server Push") - Block Rockin' Codes
4: Chrome's timing chart shows pushed streams as being fetched at the time they are adopted, which is after they have actually been received

EDIT: This blogpost is written as part of the http2 Advent Calendar 2015 (mostly in Japanese).

Wednesday, December 2, 2015

Using H2O with Let's Encrypt

Let's Encrypt is a new certificate authority that is going to issue certificates for free using an automated validation process. They have announced that they will enter public beta on Dec. 3rd 2015.

This blogpost explains how to set up H2O using the automated process.

Step 1. Install the client

% git clone https://github.com/letsencrypt/letsencrypt.git

Step 2. Obtain the certificate

If you already have a web server listening on port 80, then run:

% cd letsencrypt
% ./letsencrypt-auto certonly --webroot \
    --webroot-path $DOCROOT \
    --email $EMAIL \
    --domain $HOSTNAME

$DOCROOT should be the path of the web server's document root. $EMAIL should be the email address of the website administrator. $HOSTNAME should be the hostname of the web server (also the name for which a new certificate will be issued)¹.

Or if you do not have a web server running on the host, then run:

% cd letsencrypt
% ./letsencrypt-auto certonly --standalone \
    --email $EMAIL \
    --domain $HOSTNAME

The issued certificate and the automatically generated private key will be stored under /etc/letsencrypt/live/$HOSTNAME.

Step 3. Configure H2O

Set up the configuration file of H2O to use the respective certificate and key files.

listen:
  port: 443
  ssl:
    certificate-file: /etc/letsencrypt/live/$HOSTNAME/fullchain.pem
    key-file: /etc/letsencrypt/live/$HOSTNAME/privkey.pem

Do not forget to replace every $HOSTNAME in the snippet with your actual hostname.
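If you prefer not to edit the snippet by hand, the substitution can be scripted. The following is only a convenience sketch using Python's standard `string.Template`; the function name `render_listen_config` and the example hostname `www.example.com` are made up for the illustration.

```python
from string import Template

# The same snippet as above, with $HOSTNAME left as a placeholder.
H2O_LISTEN_TEMPLATE = Template("""\
listen:
  port: 443
  ssl:
    certificate-file: /etc/letsencrypt/live/$HOSTNAME/fullchain.pem
    key-file: /etc/letsencrypt/live/$HOSTNAME/privkey.pem
""")


def render_listen_config(hostname):
    """Return the listen section with every $HOSTNAME replaced."""
    return H2O_LISTEN_TEMPLATE.substitute(HOSTNAME=hostname)


print(render_listen_config("www.example.com"))
```

The output can be pasted into the H2O configuration file as-is.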

That's all. Pretty simple, isn't it?
Kudos to the people behind Let's Encrypt for providing all of these (for free).

For more information, please consult the documentation on letsencrypt.org and h2o.examp1e.net.

1: you may also need to use the --server option to obtain a production-ready certificate during the beta process