Utilizing HTTP/2 PUSH in a Single Page Application

Web application startup time has a big impact on user engagement. When writing client-side applications, we deal with this problem in a variety of ways, such as:

Pre-rendering the application on the server.
Minimizing the amount of JavaScript needed for a page by implementing code-splitting and smaller libraries.

HTTP/1 handles only one request at a time per connection, so browsers open several connections to fetch resources in parallel, whereas HTTP/2 allows you to use a single connection to serve as many resources as you need. This is such a big change that it warrants rethinking the strategies we use in client-oriented applications. With HTTP/1, the blocker to having a booted application is that the resources needed are spread across several requests that are not triggered until the initial HTML is loaded.

This leaves us with two options:

1. Send as little initial HTML as possible, so that the browser can start downloading the page’s resources (JS, CSS, data) in parallel, to the degree that's possible.
2. Render the page (mostly) on the server, so that when it gets to the user, they at least have something to see while the application is booting in the background.

Depending on what type of application you are building, either option can be the better choice with HTTP/1. Pick option 1 if you are building a highly interactive application like a chat client. Pick option 2 if you are building a passive application like a news website or an ecommerce site, where user retention is driven by what users can see.

HTTP/2 PUSH

The equation changes with HTTP/2 because of the PUSH capability. We are currently exploring how best to use HTTP/2 push to make DoneJS apps even faster. Above, I outlined the two main strategies for getting applications booted in HTTP/1. With HTTP/2, the strategies can change because the constraints have changed, and resource loading can look quite different.

HTTP/2 supports multiplexing, allowing multiple requests and responses to be intermingled in a single TCP connection.
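To make the push mechanism concrete, here is a minimal sketch using Node's built-in `http2` module. The asset table, the paths, and the `createPushServer` and `pushAsset` names are illustrative assumptions and are not part of done-ssr or done-serve. The sketch uses a cleartext server for brevity; browsers only speak HTTP/2 over TLS, so a real deployment would use `http2.createSecureServer` with a certificate.

```javascript
const http2 = require('node:http2');

// Hypothetical asset table: request path -> content type and body.
const ASSETS = {
  '/app.js': { type: 'application/javascript', body: 'console.log("booted");' },
  '/app.css': { type: 'text/css', body: 'body { margin: 0; }' },
};

const INDEX_HTML =
  '<!doctype html><link rel="stylesheet" href="/app.css">' +
  '<script src="/app.js"></script>';

// Promise one asset to the client on the stream that requested the page.
function pushAsset(stream, path) {
  const asset = ASSETS[path];
  stream.pushStream({ ':path': path }, (err, pushStream) => {
    if (err) return; // Push refused; the browser will request the asset itself.
    pushStream.on('error', () => {}); // The client may cancel a pushed stream.
    pushStream.respond({ ':status': 200, 'content-type': asset.type });
    pushStream.end(asset.body);
  });
}

function createPushServer() {
  const server = http2.createServer();
  server.on('stream', (stream, headers) => {
    const path = headers[':path'];
    if (path === '/') {
      // Push the JavaScript and CSS alongside the small HTML page.
      if (stream.pushAllowed) {
        Object.keys(ASSETS).forEach((assetPath) => pushAsset(stream, assetPath));
      }
      stream.respond({ ':status': 200, 'content-type': 'text/html' });
      stream.end(INDEX_HTML);
    } else if (ASSETS[path]) {
      // Ordinary request path, used when the client has disabled push.
      stream.respond({ ':status': 200, 'content-type': ASSETS[path].type });
      stream.end(ASSETS[path].body);
    } else {
      stream.respond({ ':status': 404 });
      stream.end();
    }
  });
  return server;
}
```

Calling `createPushServer().listen(8080)` would start the server. Checking `stream.pushAllowed` keeps the handler working for clients that have turned push off, since the same assets remain reachable through ordinary requests.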

To explore how we could take advantage of these new capabilities, we set out to compare two strategies we have in mind:

A traditional Single Page Application (SPA) approach where a small HTML page is sent to the client, but with the JavaScript and CSS pushed at the same time.
A hybrid server-client rendered application, where rendering happens on the server and each modification to the virtual DOM is streamed to the client and replicated. As with the SPA approach, the JavaScript and data are streamed as well, except in this case only a small amount of initial JavaScript is required. We are calling this the incremental render approach.

The advantages of the incremental render approach are:

It uses the same application code on the server that you would write for the client, so no extra effort is needed.
Rendering begins as soon as the request hits the server, and the response doesn’t wait until rendering is completely finished. This means you get some basic HTML right away. Things like your header and basic page layout will be seen by the user immediately, and contents within the head (such as stylesheets) will be loaded by the browser right away.

The traditional approach (shown below) is able to push more to the browser up front, but still relies on a back-and-forth communication with the browser.

With the incremental render approach (below) all communication is one-way once the server receives the request. And since the updates are sent as a stream in response to the initial request, no additional connection needs to be made from the browser (as would be the case with WebSockets).
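As a rough illustration of what streaming rendering instructions could involve, the sketch below replays a stream of DOM mutations on the receiving side. The wire format (one JSON instruction per line), the `append` and `text` operations, and every function name here are assumptions made for this example; they do not describe done-ssr's actual protocol. A plain object tree stands in for the real DOM so the sketch runs anywhere.

```javascript
// Hypothetical wire format, one JSON instruction per line:
//   { "op": "append", "parent": <id>, "id": <id>, "tag": <string>, "text": <string?> }
//   { "op": "text", "id": <id>, "text": <string> }

function createDocument() {
  const root = { id: 0, tag: 'body', text: '', children: [] };
  return { root, byId: new Map([[root.id, root]]) };
}

// Replicate a single server-side mutation on the client-side tree.
function applyInstruction(doc, instr) {
  switch (instr.op) {
    case 'append': {
      const parent = doc.byId.get(instr.parent);
      if (!parent) throw new Error(`Unknown parent ${instr.parent}`);
      const node = { id: instr.id, tag: instr.tag, text: instr.text || '', children: [] };
      parent.children.push(node);
      doc.byId.set(node.id, node);
      break;
    }
    case 'text': {
      const node = doc.byId.get(instr.id);
      if (!node) throw new Error(`Unknown node ${instr.id}`);
      node.text = instr.text;
      break;
    }
    default:
      throw new Error(`Unknown op ${instr.op}`);
  }
}

// Network chunks can end mid-line, so buffer until a full line has arrived.
function createInstructionParser(onInstruction) {
  let buffer = '';
  return (chunk) => {
    buffer += chunk;
    let newline;
    while ((newline = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newline);
      buffer = buffer.slice(newline + 1);
      if (line) onInstruction(JSON.parse(line));
    }
  };
}

function toHtml(node) {
  return `<${node.tag}>${node.text}${node.children.map(toHtml).join('')}</${node.tag}>`;
}

// Usage: the second instruction is split across two chunks.
const doc = createDocument();
const feed = createInstructionParser((instr) => applyInstruction(doc, instr));
feed('{"op":"append","parent":0,"id":1,"tag":"ul"}\n{"op":"app');
feed('end","parent":1,"id":2,"tag":"li","text":"first"}\n');
console.log(toHtml(doc.root)); // <body><ul><li>first</li></ul></body>
```

Because each instruction is applied as soon as its line is complete, the page can fill in while the server is still rendering, and the client never has to send anything back.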

A big note of warning here: support for HTTP/2 PUSH is just beginning to roll out in browsers and isn't at all consistent. Check out this article to learn more about these inconsistencies. To make this viable today we are making done-ssr smart: it will be able to automatically switch back to the more conservative rendering strategy if incremental rendering is likely to fail.
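One signal a server could use for that fallback decision is whether the request arrived over HTTP/2 and the client still permits push. The following sketch is an assumption about how such a check might look with the request objects of Node's `http2` compatibility API; the function names and strategy names are hypothetical and are not done-ssr's API. It does not detect browser-specific push inconsistencies, which would need additional checks.

```javascript
// Hypothetical strategy names, used only for this illustration.
const STRATEGY = Object.freeze({
  INCREMENTAL: 'incremental',
  FULL_RENDER: 'full-render',
});

// Extract the two facts the decision needs from a Node request object.
// HTTP/1 requests have no `stream` property, so push is treated as unavailable.
function describeRequest(req) {
  return {
    httpVersionMajor: req.httpVersionMajor,
    pushAllowed: Boolean(req.stream && req.stream.pushAllowed),
  };
}

// Use incremental rendering only when push can work; otherwise fall back
// to the conservative strategy of rendering the full page on the server.
function chooseRenderStrategy({ httpVersionMajor, pushAllowed }) {
  const isHttp2 = httpVersionMajor >= 2;
  return isHttp2 && pushAllowed ? STRATEGY.INCREMENTAL : STRATEGY.FULL_RENDER;
}

console.log(chooseRenderStrategy({ httpVersionMajor: 2, pushAllowed: true })); // incremental
console.log(chooseRenderStrategy({ httpVersionMajor: 1, pushAllowed: false })); // full-render
```

Keeping the decision in a pure function makes it straightforward to add further conditions later without touching the request handling code.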

The Data

To test these methods, I built a simple app that renders a list it fetches from an API. The methodology of these tests was to measure the times (in milliseconds) at different points in an app’s lifecycle:

Load: How long until the first bit of JavaScript executes.
First render: How long until the first item is rendered.
Last render: How long until the app is fully rendered.
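The three measurements above could be collected with a small timer like the one sketched here. This is not the harness used for these tests; `createLifecycleTimer` and its method names are hypothetical, and the clock is injectable only so the sketch can be exercised without a browser.

```javascript
// Records elapsed milliseconds since the timer was created.
// `now` defaults to the high-resolution clock available in browsers and Node.
function createLifecycleTimer(now = () => performance.now()) {
  const start = now();
  const marks = {};
  const elapsed = () => Math.round(now() - start);
  return {
    // Keep only the first time a milestone is reached (load, first render).
    first(name) {
      if (!(name in marks)) marks[name] = elapsed();
    },
    // Overwrite on every call so the final value is the last occurrence (last render).
    latest(name) {
      marks[name] = elapsed();
    },
    report() {
      return { ...marks };
    },
  };
}
```

In an app, `timer.first('load')` would run at the top of the first script, and the code that renders each list item would call both `timer.first('firstRender')` and `timer.latest('lastRender')`, so the report ends up with one value for each of the three milestones.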

The traditional single page application uses CanJS as its framework and contains 800k of uncompressed JavaScript. The incrementally rendered version pushes that same JavaScript bundle, but also includes a small script that handles pulling in the rendering instructions.

The project's code is available here.

Slow data

This test included a slow data request, taking 10ms to return each item (with 100 total items).
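A data source with these characteristics can be simulated with an async generator, as in the sketch below. The function names and the item shape are assumptions for illustration and are not taken from the test project. With 100 items at 10ms each, the data alone needs at least one second to arrive, whichever rendering strategy is used.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Yields `count` items, waiting `delayMs` before each one,
// to imitate a slow backend that produces the list item by item.
async function* generateItems(count, delayMs) {
  for (let i = 1; i <= count; i++) {
    await sleep(delayMs);
    yield { id: i, name: `Item ${i}` };
  }
}

// Lower bound on how long the full list takes to produce.
function minimumDataTimeMs(count, delayMs) {
  return count * delayMs;
}

console.log(minimumDataTimeMs(100, 10)); // 1000
```

A server could write each yielded item to the response as it becomes available, for example with `for await (const item of generateItems(100, 10))`, instead of waiting for the whole list.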

In this test, the incremental render method starts a bit faster but finishes at about the same time as the traditional SPA method; this is because the data is the slowest link in the chain. The SPA is able to fully load and begin rendering before the data has finished being pushed, so it is about as fast as the incremental render method.

Fast data

This test uses a very fast data request that returns each item in only 1ms.

In this case the incremental render approach is a bit faster than before. This is because the data is no longer holding it back, and therefore the difference in file size is more significant.

Slow data & slow connection

This test has slow data and also has a slow connection (3G speed).

When you slow down the network you can see significant gains with the incremental render approach. Because the amount of JavaScript required to start rendering with the incremental render approach is so small, it downloads quickly even over 3G. Network conditions affect how quickly it can begin to render, but it still finishes almost as quickly.

On the other hand, with a slow connection, needing to download a large SPA bundle is quite a burden. It takes over 18 seconds just to load!
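A back-of-the-envelope calculation shows why bundle size dominates on a slow link. The sketch below divides payload size by bandwidth and nothing more. The 400 kbit/s link speed and the 5,000 byte size for the small script are assumed figures chosen for illustration, since the throttling profile and script size are not stated here, and "800k" is treated as 800,000 bytes. The estimate ignores compression, latency, and TCP slow start, so it is not a reconstruction of the measured times.

```javascript
// Idealized transfer time: payload bits divided by link bandwidth.
function estimateTransferSeconds(bytes, bitsPerSecond) {
  if (bitsPerSecond <= 0) {
    throw new RangeError('bitsPerSecond must be positive');
  }
  return (bytes * 8) / bitsPerSecond;
}

const ASSUMED_LINK_BITS_PER_SECOND = 400000; // assumption, not a measured value
const SPA_BUNDLE_BYTES = 800000; // "800k" read as 800,000 bytes
const SMALL_SCRIPT_BYTES = 5000; // hypothetical size for the small script

console.log(estimateTransferSeconds(SPA_BUNDLE_BYTES, ASSUMED_LINK_BITS_PER_SECOND)); // 16
console.log(estimateTransferSeconds(SMALL_SCRIPT_BYTES, ASSUMED_LINK_BITS_PER_SECOND)); // 0.1
```

Under these assumptions the large bundle needs tens of seconds while the small script needs a fraction of a second, which is the same order-of-magnitude gap the test results show.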

Fast data & slow connection

Here we again have a slow connection, but this time the data requests are not a blocker.

The results here are similar to the previous test: the slow connection disproportionately affects the traditional SPA approach.

Observations

Some observations we can take from this data:

Your app is going to be as slow as the slowest resource. That could be a slow API layer or a large JavaScript bundle.
A slow network connection punishes large resources. Using progressive loading will help here, but since your libraries are usually going to be in the main bundle it’s better to have less JavaScript needed to render.
Pushing your data (from API requests) is a big win that every type of application can benefit from.

Next Steps

We are pleased to see that HTTP/2 PUSH can greatly improve load times and we’re looking for ways we can take advantage of this in DoneJS. One thing I’ve learned from this research is that different types of apps can benefit from different strategies. With that in mind, I’d like to see done-ssr have different “modes” based on what type of application you are developing. One mode might be tuned for traditional SPAs that do not send rendered HTML to the client, but rather send a small HTML page and push their scripts and data. Another mode might be the incremental rendering approach discussed in this article.

In the coming months, we’ll be prototyping the incremental rendering method in DoneJS and bringing HTTP/2 support to done-serve, and we will likely make many other changes related to HTTP/2 streaming. Please watch the DoneJS Community Hangouts for updates.
