Capturing Core Web Vitals Baselines Before Migration

Problem Statement

If you migrate without a frozen Core Web Vitals baseline you cannot prove whether a post-launch ranking dip came from your redirects or from a performance regression. Replatforming swaps render paths, hydration logic, and CDN behaviour all at once, and any one of them can degrade Largest Contentful Paint (LCP), Interaction to Next Paint (INP), or Cumulative Layout Shift (CLS). The new framework may inline critical CSS differently, defer JavaScript that previously blocked, or serve images from a different origin with different cache headers — none of which show up in a redirect audit. You must record both field and lab numbers, per template, before the freeze window closes, because once the old site is gone you can never recover its real-user data. This page is part of Crawl Baseline Generation; start there if you have not yet captured your URL inventory.

Field (CrUX) and lab (Lighthouse CI) sources merge into one per-template table that the post-launch run is diffed against.

When to Use This Approach

You are inside the freeze window before a domain change, replatform, or framework migration.
The new stack changes how pages render (server components, hydration, new CDN, new image pipeline).
The site has enough real traffic that CrUX reports field data at origin or URL level.
You need defensible evidence to separate performance regressions from redirect or indexation issues post-launch.
Stakeholders track Core Web Vitals as a KPI and will ask “did the migration make the site slower?”

Step-by-Step Instructions

1. Pull Field Data from CrUX

Field data reflects real Chrome users at the 75th percentile over a trailing 28 days, which is what search ranking signals actually consume. Pull it per origin and per key URL so single-page regressions are visible, not just the site average. URL-level CrUX only returns data for pages with enough traffic to clear the privacy threshold, so capture origin-level numbers as the guaranteed fallback and treat any URL-level data you do get as a bonus that pinpoints your most popular templates.

# CrUX API — origin-level p75 for LCP, INP, CLS (requires a Google API key)
curl -s "https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=$CRUX_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"origin":"https://oldshop.example.com","metrics":["largest_contentful_paint","interaction_to_next_paint","cumulative_layout_shift"]}' \
  > crux_origin_baseline.json   # store raw response as the field snapshot

2. Capture Lab Data with Lighthouse CI

Lab data is repeatable and controllable, so it isolates code changes without network noise. Run Lighthouse CI against one representative URL per template (home, category, product, article) with a fixed throttling profile so runs are comparable before and after. Take the median of at least three runs per URL — a single Lighthouse run can swing by hundreds of milliseconds on LCP depending on CPU contention, and the median absorbs that noise. Pin the same Chrome version and throttling preset in both the baseline and the post-launch run, otherwise you are comparing two different measuring instruments rather than two versions of your site.

# Lighthouse CI — fixed mobile throttling, 3 runs per URL, median kept
lhci collect \
  --url=https://oldshop.example.com/ \
  --url=https://oldshop.example.com/category/shoes \
  --url=https://oldshop.example.com/product/sku-1234 \
  --numberOfRuns=3 \
  --settings.preset=desktop=false   # mobile lab profile for CWV parity
lhci upload --target=filesystem --outputDir=./lhci-baseline

3. Consolidate into a Per-Template Baseline Table

Merge field and lab numbers into one table keyed by template, not by URL, because the new site will have new URLs but the same template types. Record p75 field values and median lab values side by side. Keying by template is what makes the baseline survive the migration: the product page on oldshop.example.com/product/sku-1234 becomes shop.example.com/p/1234, but both are “product template” and should be compared as such. Add a column for the representative URL used for the lab run so anyone can reproduce the measurement later, and record the capture date so the 28-day CrUX window is unambiguous.

# Build a per-template baseline CSV from CrUX + Lighthouse outputs
import json, csv
crux = json.load(open('crux_origin_baseline.json'))['record']['metrics']
rows = [{
    'template': 'origin',
    'lcp_field_p75_ms': crux['largest_contentful_paint']['percentiles']['p75'],
    'inp_field_p75_ms': crux['interaction_to_next_paint']['percentiles']['p75'],
    'cls_field_p75': crux['cumulative_layout_shift']['percentiles']['p75'],
}]
with open('cwv_baseline.csv', 'w', newline='') as f:   # versioned baseline artefact
    w = csv.DictWriter(f, fieldnames=rows[0].keys()); w.writeheader(); w.writerows(rows)

4. Commit the Baseline and Set Thresholds

Version-control the baseline so it cannot be silently overwritten, and write explicit pass thresholds so the post-launch comparison is objective rather than a judgement call. Set the thresholds to the standard “good” ceilings for each metric, but also record your actual baseline values, because a template that was already at 2,400 ms LCP should not be allowed to quietly drift to 2,499 ms just because both pass the 2,500 ms gate. Wiring the same assertions into the post-launch CI run turns the comparison into an automatic pass or fail that anyone can read, removing the temptation to rationalise a regression as “probably fine”.

# lighthouserc.yml — assert post-launch must not regress beyond these ceilings
ci:
  assert:
    assertions:
      largest-contentful-paint: ["error", { maxNumericValue: 2500 }]  # LCP good ceiling
      interaction-to-next-paint: ["error", { maxNumericValue: 200 }]   # INP good ceiling
      cumulative-layout-shift: ["error", { maxNumericValue: 0.1 }]     # CLS good ceiling

Worked Example

A retailer on oldshop.example.com is migrating to a Next.js stack at shop.example.com. The CrUX origin pull returns LCP p75 of 2,180 ms, INP p75 of 165 ms, and CLS p75 of 0.07 — all in the “good” band. Lighthouse CI lab runs on the product template show a median LCP of 2,640 ms, slower than field because the lab uses a stricter throttle.

These numbers go into cwv_baseline.csv and are committed alongside the crawl snapshot from exporting full crawl data before migration. After launch, the same Lighthouse CI command runs against shop.example.com product URLs and reports LCP of 3,410 ms — a 770 ms regression on the same template, pushing it out of the “good” band and into “needs improvement”. Because the baseline existed, the team attributes the dip to an unoptimised hero image pipeline rather than to redirects, and fixes it before rankings move.

The diagnosis is fast precisely because the baseline isolated the variable: redirects were known-good from the QA worklist, field traffic was holding, and the only metric that moved was the lab LCP on one template. The fix — switching the new image CDN to serve AVIF with a higher cache TTL — drops LCP back to 2,510 ms, within 330 ms of the original. Four weeks later the CrUX field window rolls over and confirms the recovery at p75, closing the loop. The artefact also feeds the broader Pre-Migration Auditing & Risk Assessment record.

Verification

Confirm the baseline is complete and the thresholds actually fire before you rely on it.

# Baseline must contain all three metrics with non-null values
python -c "import csv;r=list(csv.DictReader(open('cwv_baseline.csv')));print('OK' if all(r[0].values()) else 'MISSING')"

# Re-run Lighthouse assertions against the baseline URL to confirm thresholds are wired
lhci assert --config=lighthouserc.yml || echo "thresholds tripped — review before launch"

# Confirm the committed snapshot is immutable
git log --oneline -- cwv_baseline.csv | head -1

Watch for these failures: comparing lab numbers against field thresholds (they differ by design); capturing only the homepage and missing slow product templates; and letting CrUX URL-level data fall back to origin silently when a page lacks enough traffic. A subtler trap is capturing the baseline too late — once the freeze begins and the old site stops receiving traffic, its CrUX window starts to decay, so pull the field data before the freeze, not during it. Treat the committed baseline as immutable: if you discover a template you missed, add a new row rather than editing the file in place, so the diff against the post-launch run stays honest.

FAQ

Should I trust CrUX field data or Lighthouse lab data for the baseline? Use both for different jobs. CrUX p75 field data is what search systems evaluate, so it is the authoritative pass/fail signal; Lighthouse CI lab data is repeatable and isolates code changes, so it is what you diff in CI. Record both per template and never compare a lab number against a field threshold.

My new URLs differ from the old ones, so how do I compare? Key the baseline by template type, not by URL. Map each old template (category, product, article) to its new equivalent and compare like-for-like, since the rendering path is the variable that actually changed.

How long after launch will CrUX reflect the new site? CrUX reports a trailing 28-day window, so the field data fully reflects the new stack roughly four weeks post-launch. Rely on Lighthouse CI lab runs for the first month to catch regressions early, then confirm with field data once the window rolls over.

← Back to Crawl Baseline Generation