How to improve server response time (TTFB)?

8 Jul 2020

How quickly your server responds to requests has a significant effect on user experience. This article will look at how to identify the cause of slow responses and what options there are to fix them.

The Time to First Byte (TTFB) metric describes how quickly your server responds to requests. We'll assume a TCP connection has already been set up, so the metric will include one network roundtrip plus the time it takes for your server to generate a response.

TTFB does not include any additional time spent downloading the full server response. Sometimes the time spent on the DNS lookup and on setting up the TCP connection is included in the TTFB, but in this post we'll use the definition used by Chrome DevTools and DebugBear.
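To make the two definitions concrete, here is a minimal sketch that derives both values from a browser navigation timing entry. The helper name computeTtfb is made up for this example, and treating responseStart minus requestStart as the "waiting" time is an approximation of what DevTools shows, not an exact reproduction of its calculation.

```javascript
// Hypothetical helper: derives two TTFB variants from a
// PerformanceNavigationTiming-like entry (all values in milliseconds).
function computeTtfb(entry) {
  return {
    // Time from sending the request to receiving the first response byte.
    // This roughly matches the narrower definition used in this article.
    waiting: entry.responseStart - entry.requestStart,
    // Time from the start of the navigation to the first response byte.
    // This also includes redirects, the DNS lookup and connection setup.
    sinceNavigationStart: entry.responseStart - entry.startTime,
  };
}

// In a browser console you could pass in the real navigation entry:
// const [nav] = performance.getEntriesByType("navigation");
// console.log(computeTtfb(nav));
```

The second value is usually larger, which is why TTFB numbers from different tools are not always directly comparable.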

TTFB as part of an HTTP request

The TTFB depends on how close the user is to the website's data center. If you're in Sydney and the website is hosted in New York, then the roundtrip will take at least 200ms.

Most of the time we look at the document TTFB of the initial HTML request. This is because before this request finishes the user can't see any part of your website. The document request always blocks the First Contentful Paint.

Ajax or Fetch requests can also require significant processing, so looking at their TTFB is also useful, especially if they block important content from rendering.

For static resources the TTFB is less interesting, since they can usually be served quickly with little processing required by the server.

Determining if server response time is the problem

First, you need to find out how long your server takes to respond. You can find the TTFB for your document request in the Network tab of Chrome DevTools.

TTFB (time spent waiting for server response) in Chrome Devtools

Lighthouse reports include the server response time in the Performance section. You might need to open the "Passed Audits" heading to see it.

Note that this metric does not include the network roundtrip time for the HTTP request!

TTFB in the Lighthouse report

Generally a TTFB under 500ms isn't a problem, though what exactly is acceptable will depend on the website you're working on.

What causes a slow TTFB?

These are a few factors that can cause slow server responses:

  • CPU processing on the server
  • Slow database queries
  • Accessing third-party APIs over HTTP
  • Hard drive access

For example, I once accidentally put a server in the UK and the database in the US, which meant each database lookup included a lot of latency.

I'm also using a third-party payment provider for DebugBear. While reading from my database can take as little as 1ms, making a request to the third-party can easily take 500ms.

Profiling server code

A profiler measures which parts of the code your app spends most of its time in. We'll look at profiling Node code here, but most other popular languages have a similar profiler you can use.

When launching your Node server, enable the debugger by passing in the --inspect flag.

node --inspect server.js
// Will print something like this:
// Debugger listening on ws://127.0.0.1:9229/62953438-d65e-4cf6-866a-63a26f8aa57f

Now, go to the browser and open Chrome DevTools on any page.

You'll find a green icon in the top left corner, saying "Open dedicated DevTools for Node.js". Click on it to open the inspector window for the Node process.

DevTools showing button to open Node inspector

In the inspector window, switch to the Profiler tab. Click Start and make a request to your local server, so that there's some activity in the profiler recording.

Profiler tab of Chrome DevTools for Node

Stop the recording and switch the dropdown from "Heavy (Bottom-up)" to "Chart". You'll see a flamechart showing what the server was up to at any given moment.

Flamechart of Node profile

In this case the server spent a lot of time getting the list of JavaScript bundles and rendering a Handlebars template. You can then use this information to see which of these steps you can speed up.

Add print statements when profiling isn't an option

Sometimes it's difficult to profile your code, for example when you're running production code in a Platform as a Service (PaaS) environment. You can try just printing how much time was spent to narrow down what's causing a performance issue.

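// Start a timer labeled "Request"
console.time("Request");
// ...
// timeLog prints the elapsed time without stopping the timer
console.timeLog("Request", "After authentication");
// Request: 156.22ms After authentication
// ...
console.timeLog("Request", "Template rendered");
// Request: 319.23ms Template rendered
// ...
// timeEnd prints the total time and stops the timer
console.timeEnd("Request");
// Request: 588.71ms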
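If you want to time individual steps rather than checkpoints within one request, a small wrapper can help. This is only a sketch: the timed helper and the loadUser function in the usage note are hypothetical names, and it assumes a Node version where performance is available as a global.

```javascript
// Hypothetical helper: runs an async step and logs how long it took.
// The log function is injectable so output can be redirected.
async function timed(label, fn, log = console.log) {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    // Runs whether fn resolved or threw, so failed steps are timed too
    const elapsedMs = performance.now() - start;
    log(`${label}: ${elapsedMs.toFixed(2)}ms`);
  }
}
```

You could then wrap a slow step like this: const user = await timed("Load user", () => loadUser(userId)). The return value of the wrapped function is passed through unchanged.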

Logging database queries

If server responses are slow but the profile doesn't show a lot of processing, your code might be waiting for responses from the database.

To check if this is the case, try logging every SQL query and measure how long it takes. For example, if you're using Sequelize, you can use this code to log the duration of each query.

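const { Sequelize } = require('sequelize');

const db = new Sequelize('database', 'username', 'password', {
  // With benchmark enabled, the query duration is passed to the logging function
  benchmark: true,
  logging: function (sql, timeInMs) {
    console.log(sql, timeInMs + "ms");
  }
});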

If the server is making a lot of small queries, consider whether they can be merged into one. For example, you might use joins to fetch related data, or use WHERE id IN (12, 23) to fetch multiple records.
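As a rough illustration of merging lookups, the sketch below builds one parameterized query for a list of IDs instead of running one query per ID. The users table and the function name are made up for this example, and the $1, $2 placeholder style is the one PostgreSQL uses, so other databases may need a different syntax.

```javascript
// Hypothetical helper: builds a single query that fetches many rows at once.
// The IDs are passed as parameters rather than being concatenated into the SQL.
function buildUserLookup(ids) {
  if (ids.length === 0) {
    throw new Error("ids must not be empty");
  }
  // One placeholder per ID: $1, $2, ...
  const placeholders = ids.map((_, index) => `$${index + 1}`).join(", ");
  return {
    text: `SELECT * FROM users WHERE id IN (${placeholders})`,
    values: ids,
  };
}

console.log(buildUserLookup([12, 23]).text);
// SELECT * FROM users WHERE id IN ($1, $2)
```

With this approach, the database does one roundtrip for the whole list, which matters most when the database is far away from the app server.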

Some queries might be duplicated, or altogether unnecessary.

If specific queries take a long time to run, try prepending the SQL command with EXPLAIN ANALYZE to see how the database server is spending its time.

SQL EXPLAIN ANALYZE result

Often, slow queries can be sped up by adding an index to a column that's used for sorting or filtering. This will slow down inserts into the table, but speed up lookups later on.

Avoiding processing with caching

Caching means saving a value so that you can use it again later, without having to redo the processing that was necessary to get the value originally.

For example, you can cache a response you received from a database, or the HTML of a fully-rendered page template. You can store the cached data in memory, or in a separate cache server.

A simple in-memory cache can look something like this:

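// Use a plain object, since data IDs are used as keys
const cache = {};
async function getData(dataId) {
  if (!cache[dataId]) {
    // Store the promise itself, not the resolved value
    cache[dataId] = getDataWithoutCache(dataId);
  }
  return cache[dataId];
}
async function getDataWithoutCache(dataId) {
  /* slow logic */
}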

Note that we are caching the promise, rather than the result of the getDataWithoutCache call. That way we don't end up calling getDataWithoutCache again if another getData call is made before the result is available.
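One thing to watch out for when caching promises is that a rejected promise stays in the cache, so a single failed lookup would keep failing for every later caller. Here is a minimal sketch of one way to handle that. It uses a Map instead of a plain object, and the load parameter is an assumption added so the example is self-contained.

```javascript
const cache = new Map();

// load is a function that returns a promise for the given ID
function getData(dataId, load) {
  if (!cache.has(dataId)) {
    const promise = load(dataId);
    // If loading fails, remove the entry so the next call can retry.
    // The identity check avoids deleting a newer entry for the same ID.
    promise.catch(() => {
      if (cache.get(dataId) === promise) {
        cache.delete(dataId);
      }
    });
    cache.set(dataId, promise);
  }
  return cache.get(dataId);
}
```

Callers still receive the original rejected promise and need to handle the error themselves. Only the cache entry is cleaned up.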

While an in-memory cache allows very fast access, it will also increase the memory consumption of your server. To mitigate this, you can use a cache that discards infrequently used items.
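One common way to do this is a least-recently-used (LRU) cache, which evicts the entry that hasn't been read or written for the longest time. The class below is a simplified sketch with made-up names, not a production-ready implementation. It relies on the fact that a JavaScript Map iterates its keys in insertion order.

```javascript
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map();
  }

  get(key) {
    if (!this.entries.has(key)) {
      return undefined;
    }
    // Re-insert the entry so it becomes the most recently used one
    const value = this.entries.get(key);
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in the Map is the least recently used one
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }
}
```

For example, with new LruCache(2), setting "a" and "b", reading "a", and then setting "c" would evict "b", because "a" was used more recently.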

Lazy loading

Is all content on the page necessary for the user to benefit from seeing the page? Or can you show the most important content first and then lazy-load additional information?

For example, if rendering a sidebar is slow, you could initially render the page without the sidebar and then load the sidebar via Ajax later on.
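A minimal sketch of the sidebar example could look like this. The /sidebar endpoint and the #sidebar element are hypothetical, and the code assumes the server returns trusted HTML for that endpoint.

```javascript
// Hypothetical helper: fetches the sidebar HTML after the initial page
// has rendered and inserts it into the given container element.
async function loadSidebar(container, fetchImpl = fetch) {
  const response = await fetchImpl("/sidebar");
  if (!response.ok) {
    throw new Error(`Sidebar request failed with status ${response.status}`);
  }
  // Assumes the response is trusted HTML generated by your own server
  container.innerHTML = await response.text();
}

// In the browser, once the main content has been rendered:
// loadSidebar(document.querySelector("#sidebar"));
```

The fetchImpl parameter only exists to make the function easy to test. In the browser you would normally leave it out. Keep in mind that the sidebar request has its own TTFB, so this moves the slow work out of the initial document request rather than removing it.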

Use more and faster servers

This option will cost more, but upgrading to a faster machine can be an easy way to work around performance problems. If multiple requests are competing for resources you can also increase the number of servers used to serve your website.


© 2020 DebugBear Ltd