Time to First Byte (TTFB)

TTFB measures how quickly your website responds to requests. It can have a significant impact on your overall site speed and user experience.

This article explains the Time to First Byte metric, how to measure it, and how to optimize it.

What is Time to First Byte?

The Full TTFB measures how long after navigation the first byte of the HTML document response is received by the browser. This is what Google reports as part of its CrUX dataset.

TTFB as part of an HTTP request

However, different tools use different definitions for TTFB.

When looking at the individual components of a request, TTFB often only measures the duration of the HTTP request itself. Time spent establishing a server connection is not included. We've marked this as HTTP Request TTFB in the diagram.

Chrome DevTools used to describe this as Waiting (TTFB) but now uses the term Waiting for server response to avoid ambiguity. In Lighthouse this metric is called server response time.

What does the Full TTFB consist of?

The TTFB metric measures time spent establishing a server connection, making an HTTP request, and then waiting for the server response:

  • DNS lookup
  • TCP connection
  • SSL connection
  • Document HTTP request
    • only until the first byte, excluding the full download time for the response
  • Any redirects from the initial request URL
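As a rough sketch, the Full TTFB is simply the sum of those phases. The durations below are hypothetical illustration values, not measurements:

```javascript
// Hypothetical phase durations in milliseconds -- illustration only.
const phases = {
  redirect: 0,
  dnsLookup: 30,
  tcpConnection: 40,
  tlsNegotiation: 50,
  waitingForServerResponse: 180, // the HTTP Request TTFB
};

// Full TTFB is the sum of every phase up to the first response byte.
function fullTtfb(phases) {
  return Object.values(phases).reduce((total, ms) => total + ms, 0);
}

console.log(fullTtfb(phases)); // 300
```

In the browser, the same phases are exposed on the `PerformanceNavigationTiming` entry for the document request.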

Does TTFB include redirects?

Redirects are included in the Full TTFB measurement.

In the example below, the initial server response is an HTTP redirect rather than an HTML document that the browser can display. The actual First Byte time is recorded when the second HTTP request returns an HTML document.

Full TTFB including a redirect

HTTP Request TTFB

Every HTTP request that receives a response has a Time to First Byte. When talking about requests after the initial document request, TTFB usually only refers to the time spent waiting for a response to the HTTP request.
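In code, this per-request definition maps onto the `requestStart` and `responseStart` timestamps of the Resource Timing API. A minimal sketch (the entry object here is a hand-written stand-in for a real `PerformanceResourceTiming` entry, which in the browser you'd get from `performance.getEntriesByType("resource")`):

```javascript
// Stand-in for a PerformanceResourceTiming entry -- illustration values.
const entry = {
  name: "https://example.com/api/data",
  requestStart: 120.5, // when the browser started sending the request
  responseStart: 245.0, // when the first response byte arrived
};

// HTTP Request TTFB: time spent waiting on the server,
// excluding connection setup.
function requestTtfb(entry) {
  return entry.responseStart - entry.requestStart;
}

console.log(requestTtfb(entry)); // 124.5
```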

For example, you can see the per-request TTFB in WebPageTest. Here it forms only one part of the overall request duration.

Request TTFB in WebPageTest

At DebugBear we also show TTFB as part of the request breakdown.

Request TTFB in DebugBear

We also track both the Full TTFB and HTTP Request TTFB metrics for the document request.

Full TTFB and HTTP Request TTFB

How does TTFB impact user experience?

Receiving the first byte of your page is the minimum requirement for the browser to start displaying content. By reducing TTFB you can make your website render more quickly.

However, receiving the first byte often isn't sufficient as most pages have additional render-blocking resources that are loaded after the initial document request. First Contentful Paint and Largest Contentful Paint measure when content actually becomes visible to the user.

Full TTFB and HTTP Request TTFB

Does TTFB impact SEO?

Time to First Byte is not one of the Core Web Vitals metrics and Google does not directly use it as part of its search engine rankings.

However, TTFB does impact the Largest Contentful Paint and a slow server response can still hurt your SEO.

Google does include TTFB as part of the CrUX dataset for debugging purposes. You can use PageSpeed Insights to test what TTFB looks like for real users.

TTFB in PageSpeed Insights

What is a good TTFB?

Google considers a Full TTFB of under 800 milliseconds to be good, while values above 1.8 seconds are considered poor.
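Those thresholds can be expressed as a simple classifier. The 800 ms and 1.8 s cut-offs are Google's published boundaries quoted above:

```javascript
// Classify a Full TTFB value using Google's thresholds.
function rateTtfb(ttfbMs) {
  if (ttfbMs <= 800) return "good";
  if (ttfbMs <= 1800) return "needs improvement";
  return "poor";
}

console.log(rateTtfb(600)); // "good"
console.log(rateTtfb(1200)); // "needs improvement"
console.log(rateTtfb(2500)); // "poor"
```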

Which resources are commonly affected by slow TTFB?

Requests that load dynamic content that needs to be generated for each request typically have a higher Time to First Byte. This usually applies to the initial document request or later XHR requests that load additional data.

Static resources like images and JavaScript files can generally be returned quickly by the server.

Measuring TTFB in DevTools

As mentioned above, PageSpeed Insights is a great tool to check if slow TTFB is a problem for real users. Chrome DevTools can help you test TTFB locally to see if your optimizations are working.

You can find the HTTP Request TTFB in the Network tab of Chrome DevTools. Click on the document request, open the Timing tab, and then check the Waiting for server response value.

If you are looking for the Full TTFB you can look at the sum at the bottom of the Timing tab. This number does not include redirects. If there are any, you'll need to manually check how long those requests took and add them to calculate the total TTFB value.

TTFB (time spent waiting for server response) in Chrome DevTools

TTFB in Lighthouse

Lighthouse reports include the server response time in the Performance section.

The Reduce initial server response time audit evaluates how quickly the HTML document response was provided after starting the HTTP request.

Like in DevTools, this number does not represent the Full TTFB.

Poor TTFB in the Lighthouse report

You might need to open the Passed Audits heading to see it.

Good TTFB in the Lighthouse report

What causes a slow TTFB?

Slow server processing

The more work your server has to do to generate the HTML document, the longer it will take your visitors to get a response.

For example, making a large number of complex database queries can slow down server responses.

In practice that might mean a WordPress site with many plugins. Each plugin contributes some processing time, causing a slow overall server response.

Slow server connection time

Establishing an HTTP connection requires multiple network round trips. If your web servers are located far from your users, each round trip will take longer. A Content Delivery Network (CDN) with many global locations can help with this.

Upgrading to TLS 1.3 can avoid unnecessary round trips. Avoiding Extended Validation certificates prevents expensive certificate revocation requests (OCSP) as part of the connection process.

Accessing third-party APIs

Using external APIs can slow down server response time significantly. Making these API calls means nesting HTTP requests, so your TTFB now also includes the TTFB of the request to the third party.

Choosing API providers with locations close to your own server can reduce the impact of this and lower TTFB.
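If some of those upstream calls are independent of each other, issuing them concurrently means your TTFB only pays for the slowest upstream response rather than the sum of all of them. A sketch with stubbed-out API calls (the function names and return values are hypothetical stand-ins for real third-party requests):

```javascript
// Hypothetical upstream calls -- stand-ins for real third-party APIs.
async function fetchExchangeRates() {
  return { usd: 1.08 };
}
async function fetchWeather() {
  return { tempC: 18 };
}

// Sequential: total wait is the sum of both upstream TTFBs.
async function renderPageSequential() {
  const rates = await fetchExchangeRates();
  const weather = await fetchWeather();
  return { rates, weather };
}

// Parallel: total wait is only the slowest upstream TTFB.
async function renderPageParallel() {
  const [rates, weather] = await Promise.all([
    fetchExchangeRates(),
    fetchWeather(),
  ]);
  return { rates, weather };
}
```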

Instance warm-up

If you're using cloud scaling solutions some of your requests may end up being handled by VM instances that are still being provisioned. These cold starts can mean response times of 10 seconds or more.

To avoid this, check your scaling configuration or ensure sufficient warm server capacity is always available.

Profiling server code

A profiler measures where in the code your app is spending most of its time. We'll look at profiling JavaScript code with Node here, but most popular languages have a similar profiler you can use.

When launching your Node server, enable the debugger by passing in the --inspect flag.

node --inspect server.js
# Will print something like this:
# Debugger listening on ws://127.0.0.1:9229/62953438-d65e-4cf6-866a-63a26f8aa57f

Now, go to the browser and open Chrome DevTools on any page.

You'll find a green icon in the top left corner, saying Open dedicated DevTools for Node.js. Click on it to open the inspector window for the Node process.

DevTools showing button to open Node inspector

In the inspector window, switch to the Profiler tab. Click Start and make a request to your local server, so that there's some activity in the profiler recording.

Profiler tab of Chrome DevTools for Node

Stop the recording and switch the dropdown from Heavy (Bottom-up) to Chart. You'll see a flame chart showing what the server was up to at any given moment.

Flame chart of Node profile

In this case the server spent a lot of time getting the list of JavaScript bundles and rendering a Handlebars template. You can then use this information to see which of these steps you can speed up.

Add print statements when profiling isn't an option

Sometimes it's difficult to profile your code, for example when you're running production code in a Platform as a Service (PaaS) environment. You can try just printing how much time was spent to narrow down what's causing a performance issue.

console.time("Request");
// ...
console.timeLog("Request", "After authentication");
// Request: 156.22ms After authentication
// ...
console.timeLog("Request", "Template rendered");
// Request: 319.23ms Template rendered
// ...
console.timeEnd("Request");
// Request: 588.71ms

Logging database queries

If server responses are slow but the profile doesn't show a lot of processing, your code might be waiting for responses from the database.

To check if this is the case, try logging every SQL query and measure how long it takes. For example, if you're using Sequelize, you can use this code to log the duration of each query.

const db = new Sequelize('database', 'username', 'password', {
  benchmark: true,
  logging: function (sql, timeInMs) {
    console.log(sql, timeInMs + "ms");
  }
});

If the server is making a lot of small queries, consider if they can be merged into one. For example, you might use joins to fetch related data, or use WHERE id in (12, 23) to fetch multiple records.
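For example, instead of issuing one query per record you can build a single query that fetches every record in one round trip. A minimal sketch (table and column names are hypothetical):

```javascript
// Instead of one query per id...
//   SELECT * FROM products WHERE id = 12;
//   SELECT * FROM products WHERE id = 23;
// ...build a single query that fetches all records at once.
function buildBatchQuery(table, ids) {
  return `SELECT * FROM ${table} WHERE id IN (${ids.join(", ")});`;
}

console.log(buildBatchQuery("products", [12, 23]));
// SELECT * FROM products WHERE id IN (12, 23);
```

In real application code, use your database library's parameterized queries rather than string concatenation, to avoid SQL injection.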

Some queries might be duplicated, or altogether unnecessary.

If specific queries take a long time to run, try prepending the SQL command with EXPLAIN ANALYZE to see how the database server is spending its time.

SQL EXPLAIN ANALYZE result

Often, slow queries can be sped up by adding an index to a column that's used for sorting or filtering. This will slow down inserts into the table, but speed up lookups later on.

Using caching to reduce processing time

Caching means saving a value so that you can use it again later, without having to redo the processing that was necessary to get the value originally.

For example, you can cache a response you received from a database, or the HTML for a fully-rendered page template. The cached data can be stored in memory or in a separate cache server.

A simple in-memory cache can look something like this:

const cache = {};
async function getData(dataId) {
  if (!cache[dataId]) {
    cache[dataId] = getDataWithoutCache(dataId);
  }
  return cache[dataId];
}
async function getDataWithoutCache(dataId) {
  /* slow logic */
}

Note that we are caching the promise, rather than the result of the getDataWithoutCache call. That way we don't end up calling getDataWithoutCache again if another getData call is made before the result is available.

While an in-memory cache allows very fast access, it will also increase the memory consumption of your server. To mitigate this, you can use a cache that discards infrequently used items.
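One common eviction strategy is a least-recently-used (LRU) cache. A minimal sketch using a Map, which iterates keys in insertion order so the first key is always the oldest:

```javascript
// Tiny LRU cache: evicts the least-recently-used entry once maxSize
// is exceeded.
class LruCache {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // Evict the oldest (least recently used) entry.
      this.map.delete(this.map.keys().next().value);
    }
  }
}

const lru = new LruCache(2);
lru.set("a", 1);
lru.set("b", 2);
lru.get("a"); // "a" is now the most recently used
lru.set("c", 3); // evicts "b", the least recently used
console.log(lru.get("b")); // undefined
console.log(lru.get("a")); // 1
```

Production servers often use a battle-tested implementation instead, or an external cache server such as Redis with a maxmemory eviction policy.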

Speeding up TTFB by lazy loading secondary content

Is all content on the page necessary for the user to benefit from seeing the page? Or can you show the most important content first and then lazy-load additional information?

For example, let's say that rendering a sidebar is slow. You could initially render the page without the sidebar and then load the sidebar via Ajax later on.

Use more and faster servers

This option will cost more, but upgrading to a faster machine can be an easy way to work around performance problems. If multiple requests are competing for resources you can also increase the number of servers used to serve your website.

Monitoring Time to First Byte

You can use DebugBear to keep track of TTFB and other Web Vitals metrics.

Site speed monitoring dashboard

We not only run regular lab-based performance tests, but also show how your Google CrUX metrics are changing over time.

Site speed monitoring dashboard

DebugBear is a site speed monitoring service. Start tracking Lighthouse scores and Core Web Vitals in minutes.
Start monitoring your websiteGo to app