Thursday, October 22, 2015

Performance improvements with HTTP/2 push and server-driven prioritization

tl;dr

HTTP/2 push only marginally improves web-site performance (even when it does). But it might provide better user experience over mobile networks with TCP middleboxes.

Introduction

Push is an interesting feature of HTTP/2.

By using push, HTTP servers can start sending certain asset files that block rendering (e.g. CSS and script files) before the web browser issues requests for such assets. I have heard (or spoken myself of) anticipations that by doing so we might be able to cut down the response time of the Web.

CASPER (cache-aware server pusher)

The biggest barrier in using HTTP/2 push has so far been considered cache validation.

For the server to start pushing asset files, it needs to be sure that the client does not already have the asset cached. You would never want to push a asset file that is already been cached by the client - doing so not only will waste the bandwidth but also cause negative effect on response time. But how can a server determine the cache state of the client without spending an RTT asking to the client?

That's were CASPER comes in.

CASPER (abbreviation for cache-aware server pusher) is a function introduced in H2O version 1.5, that tracks the cache state of the web browser using a single Cookie. The cookie contains a fingerprint of all the high-priority asset files being cached by the browser compressed using Golomb-compressed sets.

The cookie is associated with every request sent by the client1. So when the server receives a request to HTML, it can immediately determine whether or not the browser is in possession of the blocking assets (CSS and script files) required to render the requested HTML. And it can push the only files that are known not to be cached.

With CASPER, it has now become practical to use HTTP/2 push for serving asset files.

Using HTTP/2 push in H2O

This week, I have started using CASPER on h2o.examp1e.net - the tiny official site of H2O.

The configuration looks like below. The mruby handler initiates push of JavaScript and CSS files if the request is against the top page or one of the HTML documents, and then (by using 399 status code) delegates the request to the next handler (defined by file.dir directive) that actually returns a static file. http2-casper directive is used to turn CASPER on, so that the server will discard push attempts initiated by the mruby handler for assets that are likely cached. http2-reprioritize-blocking-assets is a performance tuning option that raises the priority of blocking assets to highest for web browsers that do not.
"h2o.examp1e.net:443":
    paths:
      "/":
        mruby.handler: |
          lambda do |env|
            push_paths = []
            if /(\/|\.html)$/.match(env["PATH_INFO"])
              push_paths << "/search/jquery-1.9.1.min.js"
              push_paths << "/search/oktavia-jquery-ui.js"
              push_paths << "/search/oktavia-english-search.js"
              push_paths << "/assets/style.css"
              push_paths << "/assets/searchstyle.css"
            end
            return [
              399,
              push_paths.empty? ?
                {} :
                {"link" => push_paths.map{|p| "<#{p}>; rel=preload"}.join("\n")},
              []
            ]
          end
        file.dir: /path/to/doc-root
    http2-casper: ON
    http2-reprioritize-blocking-assets: ON

The benchmark

With the setting above (with http2-reprioritize-blocking-assets both OFF and ON), I have measured unload, first-paint, and load timings2 using Google Chrome 46, and the numbers are as follows. The RTT between the server and the client was ~25 milliseconds. The results were mostly reproducible between multiple attempts.


First, let's look at the first two rows that have push turned off. It is evident that reprio:on3 starts rendering the response 50 milliseconds earlier (2 RTT). This is because unless the option is turned on, the priority tree created by Chrome instructs the web browser to interleave responses containing CSS / JavaScript with those containing image files.

Next, let's compare the first two rows (push:off) with the latter two (push:off). It is interesting that unload timings have moved towards right. This is because when push is turned on, the contents of the asset files are sent before the contents of the HMTL. Since web browsers unload the previous page when it receives the first octets of the HTML file, using HTTP/2 push actually increases the time spent until the previous page is unloaded.

The fact will have both positive and negative effects to user experience; the positive side is that time user sees a blank screen decreases substantially (the red section - time spent after unload before first-paint). The negative side is that users would need to wait longer until he/she knows that the server has responded (by the browser starting to render the next page).

It is not surprising that turning on push only somewhat improves the first-paint timing compared to both being turned off; the server is capable of sending more CSS and JavaScript before it receives request for image files, and start interleaving the responses with them.

On the other hand, it might be surprising that using push together with reprioritization did not cause any differences. The reason is simple; in this scenario, transferring the necessary assets and the <head> section of the HTML (in total about 320KB) required about 10 round trips (including overhead required by TCP, TLS, HTTP/2). With this much roundtrips, the merit of push can hardly be observed; considering the fact that push is technique to eliminate one round trip necessary for the browser to issue requests for the blocking assets4.

Conclusion

The benchmark reinforces the claims made by some that HTTP2 push will only have marginal effect on web performance5. The results have been consistent with expectations that using push will only optimize the web performance by 1 RTT at maximum, and it would be hard to observe the difference considering the effect of TCP slow start and how many roundtrips are usually required to render a web page.

This means to the users of H2O (with reprioritization turned on by default) that they can expect near-best performance without using push.

On the other hand, we may still need to look at networks having TCP proxies. As discussed in Why TCP optimisation has become more important than content optimization (devcentral.f5.com) some mobile carriers do seem to have such middlebox installed.

Existence of such device is generally a good thing since it not only reduces packet retransmits but also improves TCP bandwidth during the slow-start phase. But the downside is that their existence usually increase application-level RTT, since they expand the amount of data in-flight, which has a negative impact on the responsiveness of HTTP/2. HTTP/2 push will be a good optimization under such network conditions.


Notes:
1. the fingerprint contained in the Cookie header is efficiently compressed by HPACK
2. wpbench was used to collect the numbers; first-paint was calculated as max(head-parsed, css-loaded); in this benchmark, DOMContentLoaded timing was indifferent to first-paint
3. starting from H2O version 1.5, http2-reprioritize-blocking-assets option is turned on by default
4. at 10 RTT it is unlikely that we have hit the maximum network bandwidth, and that means that packets will be received by the browser in batch every RTT
5. there are use cases for HTTP/2 push other than pushing asset files

9 comments:

  1. "That's were CASPER comes in."
    Typo? 'were' should be 'where'?

    ReplyDelete
  2. Nice Post to share. Thanks...

    Fred Lam's ipro

    If you're aiming to find out about internet market
    ing, social networks, and so on, then Fred Lam's T
    he Better Business Mogul Blog is an exceptional lo
    cation to start. You'll learn a lot in an easy, ea
    sy to digest way that is a breath of fresh air in
    the blogging world.

    ipro Reviews

    ReplyDelete
  3. If you're aiming to find out about internet market
    ing, social networks, and so on, then Fred Lam's T
    he Better Business Mogul Blog is an exceptional lo
    cation to start.
    ipro Academy Reviews


    ReplyDelete
  4. 100k factory ultra edition reviews
    ------------------------------------------------------------------------------------------------------------------
    There are many errors that can hurt your PPC project without you even understanding it. For this factor, I wish to present to you the leading 5 factors your Pay Per Click project suffers online. look at

    part 1 of this article..
    ------------------------------------------------------------------------------------------------------------------
    100k factory ultra edition

    ReplyDelete
  5. Hello ,
    It's really a nice and helpful piece of info. I'm satisfied that you simply shared this helpful information with us. Please keep us informed like this. Thank you for sharing.
    http://listacademyanik.com/dna-wealth-blueprint-3-0-review-bonus

    ReplyDelete