Monday, September 8, 2014

The reasons I stopped using libuv for H2O

Libuv is a great cross-platform library that abstracts various types of I/O by using callbacks.

So when I started writing H2O - a high-performance HTTP server / library implementation with support for HTTP/1, HTTP/2 and websocket - using libuv seemed like a good idea. But recently, I have stopped using it for several reasons. This blog post explains them.

■No Support for TLS

Although libuv provides a unified interface for various types of streams, it does not provide access to a TLS stream through that interface, nor does it provide a way for application developers to implement their own types of streams.

This has been a disappointment to me. Initially I had created a tiny library called uvwslay that binds libuv streams to wslay - a websocket protocol implementation. But since libuv does not support TLS, it was impossible to support SSL within that tiny library.

To support the wss protocol (i.e. websocket over TLS) I needed to reimplement the websocket binding: stop using libuv as its stream abstraction layer, and instead switch to an abstraction layer defined on top of libuv that hides the differences between libuv streams and the I/O API provided by the SSL library being used (in my case, OpenSSL).
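Conceptually, such an abstraction layer boils down to a small set of callbacks that both backends implement. Below is a minimal, hypothetical sketch of what that interface could look like; the type and function names are illustrative and are not the actual H2O API.

```c
/* Hypothetical sketch of a stream abstraction hiding the difference
 * between a plain libuv TCP stream and an OpenSSL-encrypted one.
 * Names are illustrative, not the actual H2O interface. */
#include <uv.h>
#include <openssl/ssl.h>

typedef struct st_stream_t stream_t;

struct st_stream_t {
    /* virtual methods; each backend supplies its own implementation */
    void (*read_start)(stream_t *self,
                       void (*on_read)(stream_t *self, const char *buf, size_t len));
    void (*write)(stream_t *self, const char *buf, size_t len,
                  void (*on_complete)(stream_t *self, int status));
    void (*close)(stream_t *self);
};

/* plain-TCP backend: wraps a uv_tcp_t and forwards to uv_read_start / uv_write */
typedef struct {
    stream_t super;
    uv_tcp_t uv_sock;
} tcp_stream_t;

/* TLS backend: same uv_tcp_t underneath, but bytes pass through an SSL
 * object (via memory BIOs) before they reach the application */
typedef struct {
    stream_t super;
    uv_tcp_t uv_sock;
    SSL     *ssl;
    BIO     *read_bio, *write_bio;
} tls_stream_t;
```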

It would be great if libuv could provide direct support for TLS in the future (or a way for developers to implement their own types of libuv streams), so that code calling the libuv API directly can support various types of streams.

■Memory Usage is not Optimal

Libuv provides a callback-based API for writes, and I love the idea: it hides the complexity of implementing non-blocking I/O operations. The problem, however, is that libuv does not call the completion callback right after a write succeeds. Instead, the callbacks are not called until all pending I/O operations have been performed. This means that if you have 1,000 connections each sending 64KB of data, your code first allocates a 64KB buffer 1,000 times (64MB in total) and only then frees all of those memory chunks, even when none of the network operations block. IMO, it would be better if libuv called the completion callback immediately after the application returns control to the library from uv_write, so that the memory allocation pattern could be a repetition of malloc-and-then-free, 1,000 times.
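To see why the buffers pile up, here is a hedged sketch of the typical uv_write pattern, in which the buffer can only be released in the completion callback; the helper names are made up for illustration.

```c
/* Typical libuv write pattern (sketch): the buffer is allocated before
 * uv_write and can only be freed in the completion callback. Because
 * libuv defers the callbacks until its I/O phase is finished, every
 * in-flight connection keeps its buffer alive until then. */
#include <stdlib.h>
#include <string.h>
#include <uv.h>

typedef struct {
    uv_write_t req; /* must stay alive until the callback fires */
    uv_buf_t   buf;
} write_ctx_t;

static void on_write_complete(uv_write_t *req, int status)
{
    /* only now can the chunk be released */
    write_ctx_t *ctx = (write_ctx_t *)req; /* req is the first member */
    free(ctx->buf.base);
    free(ctx);
}

static int send_chunk(uv_stream_t *conn, const char *data, size_t len)
{
    write_ctx_t *ctx = malloc(sizeof(*ctx));
    ctx->buf = uv_buf_init(malloc(len), (unsigned)len); /* e.g. 64KB per connection */
    memcpy(ctx->buf.base, data, len);
    return uv_write(&ctx->req, conn, &ctx->buf, 1, on_write_complete);
}
```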

■No Support for Delayed Tasks

Libuv does not provide an interface for scheduling actions to be run just before the I/O polling method is called.

Such an interface is necessary if you need to aggregate I/O operations and send the result to a single client. In the case of HTTP/2, a number of HTTP requests need to be multiplexed over a single TCP (or TLS) connection.

Lacking such support, you would need to use a uv_timer with a zero-second timeout to implement delayed tasks.

This works fine if the intention of the application developer is to aggregate read operations, but not if the intention is to aggregate the results of write operations, since in libuv the write completion callbacks are called after the timer callbacks.
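For reference, a minimal sketch of the zero-timeout workaround might look like the following (using the libuv 1.x timer callback signature; the function names are illustrative).

```c
/* Sketch: schedule a "run before the next I/O poll" task by arming a
 * uv_timer with a 0ms timeout and no repeat. Names are illustrative. */
#include <uv.h>

static void flush_pending_streams(uv_timer_t *handle)
{
    /* aggregate whatever has been queued on this HTTP/2 connection
     * and emit it as a single write */
    (void)handle;
}

static void schedule_flush(uv_loop_t *loop, uv_timer_t *timer)
{
    uv_timer_init(loop, timer);
    uv_timer_start(timer, flush_pending_streams, 0 /* timeout ms */, 0 /* no repeat */);
}
```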

■Synchronous File I/O is Slow

Synchronous file I/O through libuv seems to be slow compared to calling the POSIX API directly.
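As a rough illustration, the sketch below contrasts reading a file chunk synchronously through libuv's filesystem API (callback set to NULL) with a direct pread(2) call. It uses the libuv 1.x-style uv_fs_read signature, which differs slightly from the 0.10 series, and the helper names are mine.

```c
/* Sketch: two ways to read a chunk of a file synchronously. The libuv
 * path goes through request setup/teardown even when it does not block. */
#include <unistd.h>
#include <uv.h>

static ssize_t read_chunk_libuv(uv_file fd, char *dst, size_t len, int64_t off)
{
    uv_fs_t req;
    uv_buf_t buf = uv_buf_init(dst, (unsigned)len);
    int ret = uv_fs_read(uv_default_loop(), &req, fd, &buf, 1, off, NULL);
    uv_fs_req_cleanup(&req);
    return ret; /* number of bytes read, or a negative error code */
}

static ssize_t read_chunk_posix(int fd, char *dst, size_t len, int64_t off)
{
    return pread(fd, dst, len, off); /* a single system call, no request object */
}
```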

■Network I/O has Some Overhead

After I found out the issues mentioned above, I started to wonder how much overhead libuv imposes for network I/O.

So I stripped the libuv-dependent parts off from the stream abstraction layer of H2O (the layer that hides the difference between TCP and TLS, as described above) and replaced them with direct calls to the POSIX APIs and various polling methods (i.e. select, epoll, kqueue).
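As a rough idea of what that replacement involves, here is a heavily simplified epoll-based loop. It illustrates the general approach only; it is not the actual H2O event loop, and error handling is omitted.

```c
/* Minimal sketch of driving non-blocking sockets directly with epoll,
 * roughly what replacing libuv with raw POSIX calls entails. */
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 256

static void run_loop(int epfd)
{
    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; ++i) {
            if (events[i].events & EPOLLIN) {
                /* socket is readable: call read(2) / SSL_read and feed the parser */
            }
            if (events[i].events & EPOLLOUT) {
                /* socket is writable: flush pending response data */
            }
        }
    }
}
```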

The chart below shows the benchmark results taken on my development environment running Ubuntu 14.04 on VMware. H2O became 14% to 29% faster by not depending on libuv.


■The Decision

Considering all these facts together, I have decided to stop using libuv in H2O, and to use the direct binding that I wrote and used for the benchmark, at least until the performance penalty vanishes and TLS support gets integrated.

OTOH I will continue to use libuv for my other projects, as well as recommend it to others, since I still think it does an excellent job of hiding the messy details of platform-specific asynchronous APIs with adequate performance.

PS. Please do not blame me for not submitting these issues as feedback or pull requests to libuv. I might do so in the future, but such a task is too heavy for me right now. I am writing this post in the hope that it will help people (including myself) understand libuv better, and as an answer to @saghul.