A step-by-step guide to monitoring the competition with the Chrome UX Report

Rick Viscomi - Sep 26 '18 - Dev Community

What is CrUX?

The Chrome UX Report (AKA "CrUX") is a dataset of real user experiences measured by Chrome. Google releases monthly snapshots of over 4 million origins to BigQuery, with stats on web performance metrics like First Contentful Paint (FCP), DOM Content Loaded (DCL), and First Input Delay (FID).

PageSpeed Insights is another tool that integrates with CrUX, providing easy access to origin-level as well as page-specific performance data, along with prescriptive information about how to improve the page's performance.

The CrUX datasets have been around and growing since November 2017, so we can even see historical performance data.

In this post I walk you through the practical steps of using it to get insights into your site's performance and how it stacks up against the competition.

How to use it

Writing a few lines of SQL on BigQuery, we can start extracting insights about UX on the web.

SELECT
  SUM(fcp.density) AS fast_fcp
FROM
  `chrome-ux-report.all.201808`,
  UNNEST(first_contentful_paint.histogram.bin) AS fcp
WHERE
  fcp.start < 1000 AND
  origin = 'https://dev.to'

The raw data is organized like a histogram, with each bin having a start time, end time, and density value. The query above sums the densities of all bins that start below one second, giving the percent of "fast" FCP experiences. The results tell us that during August 2018, users on dev.to experienced a fast FCP about 59% of the time.
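If you're curious what those raw bins look like, you can dump an origin's aggregated FCP histogram directly. This is a minimal sketch: the SUM collapses the dataset's other dimensions (device and connection type, more on those later), and the backticks around end are needed because it's a reserved word in standard SQL.

SELECT
  fcp.start AS bin_start,
  fcp.`end` AS bin_end,
  SUM(fcp.density) AS density
FROM
  `chrome-ux-report.all.201808`,
  UNNEST(first_contentful_paint.histogram.bin) AS fcp
WHERE
  origin = 'https://dev.to'
GROUP BY
  bin_start,
  bin_end
ORDER BY
  bin_start

Because an origin's densities sum to 1 across all bins, summing any subset of them yields a proportion, like the 59% above.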

Let's say we wanted to compare that with a hypothetical competitor, example.com. Here's how that query would look:

SELECT
  origin,
  SUM(fcp.density) AS fast_fcp
FROM
  `chrome-ux-report.all.201808`,
  UNNEST(first_contentful_paint.histogram.bin) AS fcp
WHERE
  fcp.start < 1000 AND
  origin IN ('https://dev.to', 'https://example.com')
GROUP BY
  origin

competitive analysis results

Not much different: we just add the other origin and group by it, so we end up with a fast FCP density for each. It turns out that dev.to has a higher density of fast experiences than example.com, whose density is about 43%.

Now let's say we wanted to measure this change over time. Because the tables are all dated as YYYYMM, we can use a wildcard to capture all of them and group them:

SELECT
  _TABLE_SUFFIX AS month,
  origin,
  SUM(fcp.density) AS fast_fcp
FROM
  `chrome-ux-report.all.*`,
  UNNEST(first_contentful_paint.histogram.bin) AS fcp
WHERE
  fcp.start < 1000 AND
  origin IN ('https://dev.to', 'https://example.com')
GROUP BY
  month,
  origin
ORDER BY
  month,
  origin

By plotting the results in a chart (BigQuery can export to a Google Sheet for quick analysis), we can see that dev.to performance has been consistently good, but example.com has had a big fluctuation recently that brought it below its usual density of ~80%.

Chart of the fast FCP of the two origins

OK, but how about a solution without SQL?

I hear you. BigQuery is extremely powerful for writing custom queries to slice the data however you need, but using it for the first time can require some configuration, and if you query more than 1 TB in a month you may need to pay for the overage. It also requires some experience with SQL, and without base queries to use as a starting point, it's easy to get lost.

There are a few tools that can help. The first is the CrUX Dashboard. You can build your own dashboard by visiting g.co/chromeuxdash.

chrome ux dashboard for dev.to

This will generate a dashboard for you including the FCP distribution over time. No SQL required!

The dashboard also has a hidden superpower. Because the data in CrUX is sliced by dimensions like the users' device and connection speed, we can even get aggregate data about the users themselves:

form factor distribution

This chart shows that users on dev.to are mostly on their phones and just about never on a tablet.

connection distribution

The connection chart shows a 90/10 split between 4G and 3G connection speeds. These classifications are based on the effective connection type, meaning that a user on a 4G network experiencing speeds closer to 2G would be classified as 2G, while a desktop user on fast WiFi would be classified as 4G.
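Those same dimensions are also available on BigQuery as the form_factor.name and effective_connection_type.name columns, so you can reproduce the dashboard's device chart yourself. Here's a sketch: because an origin's densities sum to 1 across all dimensions, summing FCP density per form factor approximates each device's share of traffic.

SELECT
  form_factor.name AS form_factor,
  ROUND(SUM(fcp.density) * 100, 1) AS percent
FROM
  `chrome-ux-report.all.201808`,
  UNNEST(first_contentful_paint.histogram.bin) AS fcp
WHERE
  origin = 'https://dev.to'
GROUP BY
  form_factor
ORDER BY
  percent DESC

Swap form_factor.name for effective_connection_type.name to reproduce the connection chart instead.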

The other cool thing about this dashboard is that it's customized just for you. You can modify it however you want; for example, you can add the corresponding chart for example.com to compare stats side by side.

dashboard comparison of two origins

The dashboard is built with Data Studio, which also has convenient sharing capabilities like access control and embeddability.

What's the absolute easiest way to get CrUX data?

Using the PageSpeed Insights web UI, you can simply enter a URL or origin and immediately get a chart. For example, using the origin: prefix we can get the origin-wide FCP and DCL stats for dev.to:

dev.to on PageSpeed Insights

https://developers.google.com/speed/pagespeed/insights/?url=origin%3Ahttps%3A%2F%2Fdev.to

One important distinction of the PSI data is that it's updated daily using a rolling 30-day aggregation window, as opposed to the calendar-month releases on BigQuery. This is nice because it enables you to get the latest data. Another important distinction is the availability of page-specific performance data:

dev.to open source perf

https://developers.google.com/speed/pagespeed/insights/?url=https%3A%2F%2Fdev.to%2Fben%2Fdevto-is-now-open-source-5n1

This link provides CrUX data for @ben's popular open source announcement. Keep in mind that not all pages (or even origins) are included in the CrUX dataset. If you see the "Unavailable" response then the page may not have sufficient data, as in this case with the AMA topic page:

speed not available

Also keep in mind that a page that has data for the current 30-day window may not have sufficient data in a future window. So that link to the PSI data for @ben's post may show unavailable speed data in a few months, depending on the page's popularity. The exact threshold is kept secret to avoid saying too much about the unpopularity of pages outside of the CrUX report.
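There's no page-level data on BigQuery to fall back on, but for origins you can at least verify inclusion in a given monthly snapshot with a quick existence check, sketched here:

SELECT
  COUNT(0) > 0 AS origin_in_crux
FROM
  `chrome-ux-report.all.201808`
WHERE
  origin = 'https://dev.to'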

How do I get page-level data over time?

It's tricky because while PSI provides page-level data, it only gives you the latest 30-day snapshot, not a historical view like on BigQuery. But we can make that work!

PSI has an API. We can write a crafty little script that pings that API every day to extract the performance data.

  1. To get started, create a new Google Sheet
  2. Go into the "Tools" menu and select "Script Editor"
  3. From the Apps Script editor, give the project a name like "PSI Monitoring", go into the "Resources" menu, select "Advanced Google Services", and click the link to enable services in the "Google Cloud Platform API Dashboard"
  4. From the Google Cloud Platform console, search for the PageSpeed Insights API, enable it, and click "Create Credentials" to generate an API key
  5. From the Apps Script editor, go to "File" > "Project properties" and create a new "Script property" named "PSI_API_KEY", then paste in your new API key

Now you're ready for the script. In Code.gs, paste this script:

// Created by Rick Viscomi (@rick_viscomi)
// Adapted from https://ithoughthecamewithyou.com/post/automate-google-pagespeed-insights-with-apps-script by Robert Ellison

// Read the API key from the script properties set up earlier.
var scriptProperties = PropertiesService.getScriptProperties();
var pageSpeedApiKey = scriptProperties.getProperty('PSI_API_KEY');
// Origins (or page URLs, without the "origin:" prefix) to monitor.
var pageSpeedMonitorUrls = [
  'origin:https://developers.google.com',
  'origin:https://developer.mozilla.org'
];

// Run each URL through the PSI API for both strategies and log the results.
function monitor() {
  for (var i = 0; i < pageSpeedMonitorUrls.length; i++) {
    var url = pageSpeedMonitorUrls[i];
    var desktop = callPageSpeed(url, 'desktop');
    var mobile = callPageSpeed(url, 'mobile');
    addRow(url, desktop, mobile);
  }
}

// Fetch the CrUX data ("loadingExperience") for a URL from the PSI API.
function callPageSpeed(url, strategy) {
  // encodeURIComponent ensures the "origin:" prefix and slashes survive as a query parameter.
  var pageSpeedUrl = 'https://www.googleapis.com/pagespeedonline/v4/runPagespeed?url=' + encodeURIComponent(url) + '&fields=loadingExperience&key=' + pageSpeedApiKey + '&strategy=' + strategy;
  var response = UrlFetchApp.fetch(pageSpeedUrl);
  var json = response.getContentText();
  return JSON.parse(json);
}

// Append a row of [date, URL, desktop fast FCP, mobile fast FCP] to Sheet1.
function addRow(url, desktop, mobile) {
  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = spreadsheet.getSheetByName('Sheet1');
  sheet.appendRow([
    Utilities.formatDate(new Date(), 'GMT', 'yyyy-MM-dd'),
    url,
    getFastFCP(desktop),
    getFastFCP(mobile)
  ]);
}

// The first FCP distribution in the API response is the "fast" bin;
// its proportion is the fraction of fast FCP experiences.
function getFastFCP(data) {
  return data.loadingExperience.metrics.FIRST_CONTENTFUL_PAINT_MS.distributions[0].proportion;
}

The script reads the API key that you assigned to the script properties and comes with two origins to monitor by default. If you want to monitor specific pages instead, simply omit the origin: prefix and use the full page URLs.

The script will run each URL through the PSI API for both desktop and mobile and extract the proportion of fast FCP into the sheet (the script looks for a sheet named "Sheet1", so if you rename it, update the script to match).

You can test it out by opening the "Select function" menu, selecting "monitor", and clicking the triangular Run button. For your first run, you'll need to authorize the script to run the API. If all goes well, you can open up your sheet to see the results.

PSI API results

I've added a header row and formatted the columns (A is a date type, C and D are percentages) for easier reading. From there you can do all of the powerful things Sheets can do, like set up a pivot table or visualize the results.

But what about monitoring?

Luckily, with Apps Script you don't need to run this manually every day. You can set "triggers" to run the monitor function daily, which will append a new row for each URL every day. To set that up, go to the "Edit" menu, select "Current project's triggers" and add a new trigger with the following config:

  • run the monitor function
  • "Time-driven"
  • "Day timer"
  • select any hour for the script to run, or leave it on the default "Midnight to 1am"

After saving the trigger, you should be able to return to this sheet on a daily basis to see the latest performance stats for all of the monitored URLs or origins.

Just give me something I can clone!

Here's the sheet I made. Go to "File > Make a copy..." to clone it. Everything is already set up for you except for the API key property and the daily trigger. Follow the steps above to set those up. You can also clear out the old data and overwrite the sample URLs to customize the analysis. Voila!

Wrapping up

This post explored four different ways to get real user insights out of the Chrome UX Report:

  1. BigQuery
  2. CrUX Dashboard
  3. PageSpeed Insights
  4. PageSpeed Insights API

These tools enable developers of all levels of expertise to make use of the dataset. In the future, I hope to see even more ambitious use of the data. For example, what if the tool you use to monitor your site's real user performance could also show you how it compares to your competition? Third party integrations like that could unlock some very powerful use cases to better understand the user experience.

I've also written about how we can combine this with other datasets to better understand the state of the web as a whole. For example, I joined CrUX with HTTP Archive data to analyze the real user performance of the top CMSs.

CMS performance

The dataset isn't even a year old yet, and there's so much potential for it to be a real driver of change for a better user experience!

If you want to learn more about CrUX, you should read the official developer documentation or watch the State of the Web video I made with more info about how it works. There's also the @ChromeUXReport account, which I run, if you have any questions.
