ENTRY TYPE · framework

Customer health score

By Marius Bughiu · Last updated 2026-06-06 · Customer Success

A customer health score is a single composite number, usually 0-100 or a red/yellow/green band, that rolls up five families of signal — product usage, engagement, sentiment, support, and outcomes — into one indicator of how likely an account is to renew, expand, or churn. It exists so a CSM owning 40-80 accounts can triage attention without reading every account by hand, and so CS leadership can forecast net revenue retention from something other than gut feel.

What it is not: it is not a churn prediction model, and it is not NPS. A churn model outputs a probability from a trained classifier; a health score is a transparent, hand-weighted rollup a CSM can explain to a customer. NPS is one sentiment input into health, not a substitute for it. Treating the score as ground truth rather than a prioritization aid is the most common way teams misuse it.

The five signal families

Usage — logins, feature adoption breadth, seats activated vs. provisioned, depth on the features that map to your value proposition. The strongest leading signal for most SaaS.
Engagement — QBR attendance, email open/reply, exec sponsor responsiveness, community or training participation.
Sentiment — NPS, CSAT, CES, plus qualitative CSM-logged sentiment. The softest and most gameable input.
Support — ticket volume, severity, time-to-resolution, escalations, bug counts against this account.
Outcomes — has the customer hit its success-plan milestones, realized the ROI it bought for, and reached its time-to-value (TTV) target? The hardest to instrument and the most predictive of renewal.

The formula

A health score is a weighted sum of normalized component scores:

Health = Σ (component_score_i × weight_i)   where Σ weight_i = 1.0

Each component is normalized to 0-100 first (so a raw login count and a raw NPS land on the same scale), then weighted. A defensible starting weight set for a seat-based B2B SaaS product:

Signal family	Weight
Usage	0.35
Outcomes	0.25
Engagement	0.15
Sentiment	0.15
Support	0.10
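
The formula and starting weights above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `normalize` and `health_score` are hypothetical, and min-max scaling with clamping is an assumed normalization method (the text only requires that every component land on a 0-100 scale).

```python
from typing import Mapping

# Starting weights from the table above; they must sum to 1.0.
WEIGHTS = {
    "usage": 0.35,
    "outcomes": 0.25,
    "engagement": 0.15,
    "sentiment": 0.15,
    "support": 0.10,
}


def normalize(raw: float, lo: float, hi: float) -> float:
    """Min-max scale a raw metric to 0-100, clamping values outside [lo, hi].

    Assumption: lo and hi are bounds you choose per metric
    (for example, NPS runs from -100 to 100).
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clamped = min(max(raw, lo), hi)
    return 100.0 * (clamped - lo) / (hi - lo)


def health_score(components: Mapping[str, float],
                 weights: Mapping[str, float] = WEIGHTS) -> float:
    """Weighted sum of component scores that are already on a 0-100 scale."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(components[name] * w for name, w in weights.items())
```

Rejecting weight sets that do not sum to 1.0 keeps the result on the same 0-100 scale as its inputs, so the bands stay meaningful after weights are re-fit.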

Banding: 70 and above is green, 40 up to (but not including) 70 is yellow, and under 40 is red. Calibrate the cutoffs against your own renewal data: run the score retrospectively against the last 12 months of renewals and churns, and move the green/yellow line to where it actually separates renewers from churners.
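
One way to make the calibration step concrete is sketched below. The names `band` and `best_green_cutoff` are hypothetical, the history rows are made-up illustration data, and "separates renewers from churners" is interpreted here as balanced accuracy (the average of the share of renewers at or above the cutoff and the share of churners below it). Other separation measures are equally defensible.

```python
from typing import Iterable, Sequence, Tuple


def band(score: float, green_cutoff: float = 70.0,
         yellow_cutoff: float = 40.0) -> str:
    """Map a 0-100 score to a band using the default cutoffs from the text."""
    if score >= green_cutoff:
        return "green"
    if score >= yellow_cutoff:
        return "yellow"
    return "red"


def best_green_cutoff(history: Iterable[Tuple[float, bool]],
                      candidates: Sequence[int] = range(50, 91, 5)) -> int:
    """Return the candidate cutoff with the highest balanced accuracy.

    history holds (score_before_renewal, renewed) pairs.
    """
    rows = list(history)
    renewers = [s for s, renewed in rows if renewed]
    churners = [s for s, renewed in rows if not renewed]
    if not renewers or not churners:
        raise ValueError("need at least one renewer and one churner")

    def separation(cutoff: int) -> float:
        renewers_green = sum(s >= cutoff for s in renewers) / len(renewers)
        churners_below = sum(s < cutoff for s in churners) / len(churners)
        return (renewers_green + churners_below) / 2

    # On ties, max() keeps the first (lowest) candidate.
    return max(candidates, key=separation)


# Illustrative data only: (score, renewed)
history = [(82, True), (76, True), (71, True), (66, True),
           (63, False), (55, False), (48, False), (35, False)]
print(best_green_cutoff(history))
```

With this made-up history, a cutoff of 65 puts every renewer in green and every churner below it, so the default line at 70 would be slightly too strict for that book.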

Leading vs. lagging

This is the distinction that makes a score useful. A leading signal moves before the renewal outcome and is intervenable — declining weekly active usage, a champion who left, slipping QBR attendance. A lagging signal confirms what already happened — a submitted CSAT after a bad quarter, a non-renewal notice. Weight leading signals higher: usage and outcome-progress are leading; a closed support ticket and a survey response are lagging. A score dominated by lagging inputs tells you an account is unhealthy the week it churns, which is too late to act.
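
As a rough illustration of a leading signal, the sketch below flags declining weekly active usage from a short series of weekly counts. The function names are hypothetical, and the 5%-of-mean-per-week threshold is an arbitrary assumption to tune against your own data.

```python
from typing import Sequence


def weekly_slope(values: Sequence[float]) -> float:
    """Least-squares slope of the series, in units per week."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two weeks of data")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def is_declining(values: Sequence[float], max_weekly_drop: float = 0.05) -> bool:
    """True when usage falls faster than max_weekly_drop of its mean per week.

    Assumption: the 5% default is illustrative, not a benchmark.
    """
    mean = sum(values) / len(values)
    if mean == 0:
        return False
    return weekly_slope(values) / mean < -max_weekly_drop
```

A trend flag like this can feed the usage component before the decline shows up in a survey response or a non-renewal notice.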

Worked example

An account: usage normalized to 80, outcomes to 50, engagement to 90, sentiment to 70, support to 60.

Health = 80×0.35 + 50×0.25 + 90×0.15 + 70×0.15 + 60×0.10
       = 28 + 12.5 + 13.5 + 10.5 + 6
       = 70.5  → green (barely)

The score is green, but outcomes at 50 is the load-bearing weakness — strong product usage and a happy sponsor are masking the fact that the customer has not realized the ROI they bought. This is exactly the account a usage-only score would mislabel as safe. The CSM action is a success-plan reset, not a check-in.
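
The same example can be reproduced in code, with one addition: a per-component floor that surfaces a weak component even when the composite is green. The floor value of 60 and the variable names are assumptions for illustration, not part of the framework.

```python
WEIGHTS = {
    "usage": 0.35,
    "outcomes": 0.25,
    "engagement": 0.15,
    "sentiment": 0.15,
    "support": 0.10,
}

# Normalized component scores from the worked example.
account = {
    "usage": 80,
    "outcomes": 50,
    "engagement": 90,
    "sentiment": 70,
    "support": 60,
}

# Each component's contribution to the composite score.
contributions = {name: account[name] * w for name, w in WEIGHTS.items()}
score = sum(contributions.values())

# What a usage-only score would report for the same account.
usage_only = account["usage"]

# Hypothetical floor: flag any component strictly below it, whatever the band.
COMPONENT_FLOOR = 60
weak = [name for name in WEIGHTS if account[name] < COMPONENT_FLOOR]

print(round(score, 1))  # 70.5
print(usage_only)       # 80
print(weak)             # ['outcomes']
```

Reporting the weak components next to the composite keeps a barely-green account from hiding the one input that needs a play.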

Common pitfalls

Usage-only scores. Easy to instrument, so teams ship them and stop. A heavy power user mid-onboarding can show high usage while the renewal is already lost on a missing outcome. Guard: force a non-zero outcomes weight even if you can only proxy it (success-plan milestone completion).
Set-and-forget weights. Weights drift from reality as the product and segment mix change. Guard: re-run the score against actual renewal/churn outcomes quarterly and re-fit the weights; if the green band isn’t separating renewers from churners, it’s miscalibrated.
Score laundering. When CSM-entered sentiment is a heavy input, reps inflate it to keep their book green. Guard: cap the weight on subjective inputs at 0.15-0.20 and audit sentiment against objective signals.
One score for all segments. A 12-seat SMB account and a 4,000-seat enterprise account don’t share a usage curve. Guard: maintain per-segment weight sets and bands, not one global formula.
No action mapping. A score nobody acts on is a dashboard ornament. Guard: every band transition (green→yellow, yellow→red) fires a named play with an owner, not just a color change.
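
Several of these guards can be enforced mechanically. The sketch below checks a weight set against the outcomes and subjective-input guards, and maps band transitions to named plays with owners. All names here are hypothetical, including the play names, the owners, and the choice to treat only `sentiment` as subjective.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping, Optional, Tuple

SUBJECTIVE = {"sentiment"}     # assumption: inputs a CSM can enter by hand
MAX_SUBJECTIVE_WEIGHT = 0.20   # upper end of the 0.15-0.20 cap


def weight_violations(weights: Mapping[str, float]) -> List[str]:
    """Return the guard violations for one weight set (empty list if none)."""
    problems = []
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        problems.append("weights do not sum to 1.0")
    if weights.get("outcomes", 0.0) <= 0.0:
        problems.append("outcomes weight must be non-zero")
    subjective = sum(w for name, w in weights.items() if name in SUBJECTIVE)
    if subjective > MAX_SUBJECTIVE_WEIGHT + 1e-9:
        problems.append("subjective weight exceeds cap")
    return problems


@dataclass(frozen=True)
class Play:
    name: str
    owner: str


# Hypothetical plays keyed by (previous_band, current_band).
PLAYS: Dict[Tuple[str, str], Play] = {
    ("green", "yellow"): Play("success-plan review", "account CSM"),
    ("yellow", "red"): Play("executive escalation", "CS director"),
}


def play_for_transition(previous: str, current: str) -> Optional[Play]:
    """Look up the play fired by a band change, or None if none is mapped."""
    return PLAYS.get((previous, current))
```

Running `weight_violations` over each per-segment weight set at the quarterly re-fit is one way to keep a re-fit from silently dropping the outcomes weight or inflating sentiment.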

NRR vs GRR — the retention metrics a health score is built to predict
Gainsight and Planhat — platforms with configurable health scorecards
ChurnZero and Vitally — usage-driven health and playbook automation
