Web Vitals Explained

Laurie - Feb 22 '21 - Dev Community

In my previous post, I talked about automated performance testing tools and how Google uses these scores to help determine page rank in their algorithm. Specifically, I ended the post by mentioning the concept of "core web vitals". So let's talk about what that means!

Google

In 2020, Google announced that site performance would start to influence page rank, and that it would measure performance using three metrics it calls core web vitals.

Those metrics are:

  • Cumulative layout shift (CLS)
  • Largest contentful paint (LCP)
  • First input delay (FID)

So what does each of those metrics mean? And what influences them?

Largest contentful paint

This metric is meant to measure user experience when loading your site. A poor score typically points to render-blocking resources or slow server response time.

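The goal is to find the biggest blocker when loading the page. LCP does this by reporting when the largest piece of content in the viewport finishes rendering. Typically, that is an image or a block of text waiting on a font file. If you're handling those well, the site itself will have a great loading experience.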

LCP correlates with an older metric called speed index. However, speed index can only be calculated by a tool that takes snapshots of the site as it loads. LCP is a faster and cheaper way to surface the same types of performance problems.
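To make the idea concrete, here is a minimal sketch of how an LCP value falls out of a series of paint candidates. The `PaintCandidate` shape and the `largestContentfulPaint` function are hypothetical names made up for illustration. In a real browser the candidates come from the browser's own instrumentation, not a hand-built array.

```typescript
// Hypothetical shape: one entry per element painted in the viewport.
interface PaintCandidate {
  element: string;    // label for the element, e.g. "hero image"
  size: number;       // visible area in square pixels
  renderTime: number; // ms since navigation start
}

// Walk the candidates in paint order and keep the largest one seen so far.
// The render time of the final winner is the LCP value.
function largestContentfulPaint(
  candidates: PaintCandidate[]
): PaintCandidate | undefined {
  const inPaintOrder = [...candidates].sort(
    (a, b) => a.renderTime - b.renderTime
  );
  let largest: PaintCandidate | undefined;
  for (const candidate of inPaintOrder) {
    if (largest === undefined || candidate.size > largest.size) {
      largest = candidate;
    }
  }
  return largest;
}

// Illustrative numbers only: small text paints first, then a large image.
const lcpExample = largestContentfulPaint([
  { element: "nav text", size: 4000, renderTime: 300 },
  { element: "hero image", size: 250000, renderTime: 1900 },
  { element: "footer text", size: 6000, renderTime: 2400 },
]);
console.log(lcpExample?.element, lcpExample?.renderTime);
```

In this made-up page the hero image is the largest element, so the moment it renders is the LCP. That is why a slow image or font tends to dominate the score.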

Cumulative layout shift

Cumulative layout shift is a metric designed to measure visual stability. Largest Contentful Paint can be great, but if the page is constantly doing layout shifts as new information comes in, it becomes less relevant. It's also not a fun user experience to have things shift around as you're trying to interact with a page.

Part of the reason Google focuses on this metric is to move against ads and sites that slam you with a bunch of pop-ups. Additionally, they don't want you to lazy load content that has a significant impact on the layout of your page, e.g. fonts. A user's first impression of your site should be a stable one.
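As a rough sketch of how individual shifts roll up into one number, the function below scores a list of layout shifts. The field names mirror what browsers report for layout-shift entries (`startTime`, `value`, `hadRecentInput`), but the `ShiftEntry` type and `cumulativeLayoutShift` function are hypothetical. It assumes the session-window definition Chrome adopted after this metric was first introduced: shifts less than one second apart are grouped, a group is capped at five seconds, and the worst group is the score. The original definition simply summed every shift.

```typescript
// Hypothetical type; field names follow browser layout-shift entries.
interface ShiftEntry {
  startTime: number;       // ms since navigation start
  value: number;           // score for this single shift
  hadRecentInput: boolean; // true if the shift followed user input
}

const MAX_GAP_MS = 1000;    // a longer pause starts a new window
const MAX_WINDOW_MS = 5000; // a window may not span more than this

// Assumes `shifts` is already sorted by startTime.
function cumulativeLayoutShift(shifts: ShiftEntry[]): number {
  let worstWindow = 0;
  let windowScore = 0;
  let windowStart = 0;
  let previousShift = 0;
  let windowOpen = false;

  for (const shift of shifts) {
    // Shifts caused by the user's own input are expected, so skip them.
    if (shift.hadRecentInput) continue;

    const continuesWindow =
      windowOpen &&
      shift.startTime - previousShift < MAX_GAP_MS &&
      shift.startTime - windowStart < MAX_WINDOW_MS;

    if (continuesWindow) {
      windowScore += shift.value;
    } else {
      windowScore = shift.value;
      windowStart = shift.startTime;
      windowOpen = true;
    }
    previousShift = shift.startTime;
    worstWindow = Math.max(worstWindow, windowScore);
  }
  return worstWindow;
}
```

A late-loading ad that pushes content down twice in quick succession lands in one window, so its shifts add together. A shift that happens right after the user clicks something does not count against the page.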

First input delay

First input delay is the most nuanced core web vital because most performance testing tools can't report it.

FID is meant to measure user experience when they first try to interact with a page. If a user presses a button, how long does the page take to respond? The tricky part is that measuring FID requires tracking how a real user interacts with a site. Let's understand why.

Imagine this: you simulate a page load and click the first button the system sees as soon as the page renders. It takes a second or more to register that click because React hasn't finished hydrating. This seems like a bad user experience. But is it? If a real user were to navigate to your site, they'd have to notice there was a button, move their cursor (or tab over to it) and then click the button. In the time it takes to do that, will they experience the same delay as the simulated test? Probably not.
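The scenario above can be sketched with a toy model. This is an illustration under simple assumptions, not how a browser measures FID: the main thread is either busy or free, the busy periods don't overlap, and an input that arrives during a busy period waits until that period ends. `BusyPeriod` and `inputDelay` are hypothetical names, and the timings are made up.

```typescript
// Hypothetical model of a stretch of time when the main thread is occupied.
interface BusyPeriod {
  start: number; // ms since navigation start
  end: number;   // ms since navigation start
}

// Returns how long an input arriving at `inputTime` waits before the
// main thread is free to handle it. Assumes periods do not overlap.
function inputDelay(inputTime: number, busyPeriods: BusyPeriod[]): number {
  for (const period of busyPeriods) {
    if (inputTime >= period.start && inputTime < period.end) {
      return period.end - inputTime;
    }
  }
  return 0; // the main thread was idle, so there is no wait
}

// Made-up numbers: hydration keeps the main thread busy for 1.2 seconds.
const hydration: BusyPeriod[] = [{ start: 0, end: 1200 }];

const simulatedClick = inputDelay(200, hydration); // script clicks at once
const realUserClick = inputDelay(2500, hydration); // human finds the button first

console.log({ simulatedClick, realUserClick });
```

The page is identical in both cases. Only the moment of the click changes, and that is why a scripted test can't stand in for real user behavior here.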

Unfortunately, real user data is expensive to gather. As a result, most testing tools estimate FID using a metric like Total Blocking Time (TBT). It's not a user-centric outcome, but it gives you an idea of how long it takes until your page can be interacted with.
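For a sense of what TBT adds up, here is a minimal sketch. Lighthouse treats any main-thread task longer than 50ms as a long task and counts only the part beyond 50ms as blocking. The function name and the sample durations are made up, and the sketch assumes the list has already been limited to the tasks that ran between First Contentful Paint and Time to Interactive.

```typescript
// Tasks at or under this length are not considered blocking.
const BLOCKING_THRESHOLD_MS = 50;

// Sums the blocking portion (anything past 50ms) of each task.
function totalBlockingTime(taskDurationsMs: number[]): number {
  return taskDurationsMs.reduce(
    (total, duration) =>
      total + Math.max(0, duration - BLOCKING_THRESHOLD_MS),
    0
  );
}

// Made-up task durations in ms. Only the 120ms and 250ms tasks block:
// (120 - 50) + (250 - 50) = 270
console.log(totalBlockingTime([30, 120, 50, 250])); // 270
```

Because it only needs task timings, a lab tool can compute this without any user present.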

In most cases, you need the page to respond to that first input in under 100ms. Anything slower than that is perceived as not working.
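The 100ms figure lines up with the "good" threshold Google publishes for FID, and LCP and CLS have similar cutoffs. The sketch below encodes those published thresholds in a small lookup. The `rateVital` function and the type names are hypothetical helpers, not part of any Google library.

```typescript
type CoreVital = "LCP" | "FID" | "CLS";
type VitalRating = "good" | "needs improvement" | "poor";

// Published cutoffs: at or under `good` is good, over `poor` is poor.
const VITAL_THRESHOLDS: Record<CoreVital, { good: number; poor: number }> = {
  LCP: { good: 2500, poor: 4000 }, // milliseconds
  FID: { good: 100, poor: 300 },   // milliseconds
  CLS: { good: 0.1, poor: 0.25 },  // unitless score
};

function rateVital(metric: CoreVital, value: number): VitalRating {
  const { good, poor } = VITAL_THRESHOLDS[metric];
  if (value <= good) return "good";
  if (value <= poor) return "needs improvement";
  return "poor";
}

console.log(rateVital("FID", 80));   // "good"
console.log(rateVital("FID", 250));  // "needs improvement"
console.log(rateVital("LCP", 4500)); // "poor"
```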

Additional metrics

While Google focuses on the three core web vitals, there are a number of other metrics that make up the larger set of web vitals.

  • Time to Interactive
    TTI is similar to TBT and is also sometimes used as an estimate for FID. It's focused on behaviors that block the browser from being interactive. However, it also measures network quiet time so it's not a 1:1 matchup with TBT.

  • First CPU Idle
    This measures the first time at which the page's main thread is quiet enough to handle input.

  • First Contentful Paint
    This is similar to LCP, but instead of measuring the time at which the largest asset paints, it measures when the first asset does.
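To show how TTI differs from TBT, here is a simplified sketch of the TTI search. Lighthouse defines TTI as the end of the last long task before a quiet window of at least five seconds, starting the search at First Contentful Paint. This version is an assumption-laden illustration: it looks only at main-thread tasks and ignores the network quiet check entirely. The `ThreadTask` type and `timeToInteractive` function are hypothetical names.

```typescript
// Hypothetical record of one main-thread task.
interface ThreadTask {
  start: number; // ms since navigation start
  end: number;   // ms since navigation start
}

const TTI_QUIET_WINDOW_MS = 5000; // required stretch with no long tasks
const TTI_LONG_TASK_MS = 50;      // tasks longer than this count as long

// Simplified: considers long tasks only, not in-flight network requests.
function timeToInteractive(fcp: number, tasks: ThreadTask[]): number {
  const longTasks = tasks
    .filter((t) => t.end - t.start > TTI_LONG_TASK_MS && t.end > fcp)
    .sort((a, b) => a.start - b.start);

  let candidate = fcp;
  for (const task of longTasks) {
    // A gap of five seconds or more means the quiet window was found.
    if (task.start - candidate >= TTI_QUIET_WINDOW_MS) break;
    candidate = Math.max(candidate, task.end);
  }
  return candidate;
}

// Made-up trace: FCP at 1000ms, long tasks ending at 1500ms and 3400ms,
// then nothing long until 9000ms, so TTI is 3400.
console.log(
  timeToInteractive(1000, [
    { start: 1200, end: 1500 },
    { start: 2000, end: 2030 },
    { start: 3000, end: 3400 },
    { start: 9000, end: 9200 },
  ])
); // 3400
```

TBT answers "how much blocking happened", while TTI answers "when did the blocking stop", so the two can move independently.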

Are we done yet?

So far we've looked at the metrics that make up performance scores and the tools that provide them. The next post will focus on what behaviors impact these scores and the best practices for improving them.
