claude-skill

Vergütungs-Benchmarking mit Claude

Difficulty

Fortgeschritten

Setup time

30min

For

recruiter · compensation-analyst · hiring-manager

Recruiting & TA

Stack

Eine Claude Skill, die Level, Geographie und einen Comp-Survey-Export (Radford, Pave, Carta) einer Stelle entgegennimmt und eine strukturierte Gehalts-Band-Empfehlung pro Komponente (Grundgehalt, Eigenkapital, Bonus / OTE) mit benanntem Perzentil, Quell-Survey-Zitation und den Kalibrierungsnotizen produziert, die der Recruiter zum Angebotsgespräch mitbringt. Ersetzt das offene-Reiter-Tabellenkalkulations-Jonglieren des Recruiters durch ein einzelnes Dokument, das der Hiring Manager und der Finanzgenehmiger unterzeichnen können. Gibt die öffentlich sichtbare Spanne (NYC LL 32-A, CO/CA/WA-Gehaltstransparenz-konform) als separaten Output aus.

Wann einsetzen

Sie schreiben eine neue Stelle aus und benötigen eine öffentliche Spanne, die verteidigbar begründet ist (nicht die vage „Branchenstandard”-Formulierung, nicht „75. Perzentil” ohne Survey oder Geographie zu nennen).
Sie bereiten ein Angebot vor und benötigen die Band, die der Hiring Manager ohne einen halbtägigen Finanz-Hin-und-Her-Austausch genehmigen kann.
Sie prüfen vierteljährlich bestehende Comp-Bands und wollen einen strukturierten Vergleich von „was wir zahlen” vs. „was der Survey sagt” pro Rollenfamilie.

Wann NICHT einsetzen

Einseitige Comp-Entscheidungen außerhalb einer genehmigten Genehmigungskette. Die Skill produziert eine Empfehlung. Comp-Philosophie und Genehmigungsmatrix sind Eigentum von People Ops / Finance / Comp Committee. Die Skill informiert sie; sie ersetzt sie nicht.
Eigenkapital-Comp bei Pre-Series-B-Startups. Eigenkapital-Benchmarking in sehr frühen Phasen dreht sich mehr um die spezifische Cap Table und den Verwässerungspfad der Firma als um Marktdaten. Die Survey-Zahlen tragen dort nicht.
Verhandlungsskript-Generierung. Die Skill gibt eine Band aus; sie verfasst keine Verhandlungssprache. Automatisch generierte Comp-Verhandlungssprache wirkt kalt und schadet der Candidate Experience.
Kandidatenspezifische Ausnahme-Entscheidungen. „Können wir diesem Kandidaten 15 % über der Band anbieten?” ist eine Frage für den Hiring Manager und Finance, nicht für die Skill. Die Skill informiert durch Aufzeigen der Band; sie genehmigt keine Ausnahmen.
Geographien, in denen der Survey dünne Daten hat. Surveys decken die USA, die EU und wichtige APAC-Märkte gut ab; Schwellenmarktdaten (Latam, Afrika, kleinere APAC) sind dünner. Die Skill markiert Low-N-Geographien im Output.

Setup

Bundle einspielen. apps/web/public/artifacts/compensation-benchmark-skill/SKILL.md in Ihr Claude Code Skills-Verzeichnis platzieren.
Survey-Quelle konfigurieren. Die Skill liest Exports von Radford, Pave, Carta oder einem benutzerdefinierten CSV. Das Per-Source-Schema lebt in references/1-survey-source-schemas.md. Die Skill ruft Survey-APIs nicht direkt auf – Exports laufen durch den genehmigten Zugangsweg Ihres Comp-Analysten.
Comp-Philosophie der Firma festlegen. Auf welchem Perzentil zahlt die Firma (50., 60., 75.)? Summiert Grundgehalt + Eigenkapital auf ein Ziel-Perzentil, oder wird jedes separat kalibriert? Die Philosophie lebt in references/2-comp-philosophy-template.md und ist der Input, gegen den die Skill kalibriert.
Genehmigungsketten-Output konfigurieren. Die Skill gibt die öffentlich sichtbare Spanne als separaten Output aus (NYC LL 32-A, CO/CA/WA-Gehaltstransparenz-konform). Verdrahten Sie diesen Output mit Ihrem Stellenausschreibungs-Publikationsschritt (Greenhouse / Ashby-Stellenbeschreibung), oder kopieren Sie ihn manuell, je nach Prozess Ihres Teams.
Probelauf auf einem abgeschlossenen Angebot. Benchmarken Sie eine Stelle, die Sie letztes Quartal abgeschlossen haben. Vergleichen Sie die Band der Skill damit, was das Angebot tatsächlich war. Wenn die Abweichung groß ist, ist entweder der Survey-Export off-cycle oder die Philosophiedatei der Firma entspricht nicht dem, wie Angebote tatsächlich genehmigt werden.

Was die Skill tatsächlich tut

Fünf Schritte. Die Reihenfolge hält die deterministischen Survey-Lookups vor der LLM-gesteuerten Kalibrierung, weil das Paraphrasieren von Survey-Zahlen durch das Modell Drift einführt, den der Recruiter nicht auditieren kann.

Rollendefinition validieren. Prüfen, ob Level, Geographie und Funktion der Rolle vorhanden sind und mit Werten im Survey-Export übereinstimmen. Bei fehlenden oder mehrdeutigen Feldern anhalten („Senior Engineer” ohne Level auf der Firmenstufe ist mehrdeutig).
Survey-Perzentile nachschlagen. Deterministischer Lookup, kein LLM. Für jedes Grundgehalt, Eigenkapital (annualisiert) und Bonus / OTE die 25./50./60./75./90.-Perzentile aus dem Survey-Export für die übereinstimmende (Level, Geographie, Funktion)-Zelle ziehen. Wenn die Zelle weniger als den dokumentierten Stichproben-Größen-Schwellenwert des Surveys hat (variiert je Survey: Radford typischerweise 5+, Pave typischerweise 10+), Low-N markieren und eine Perzentil-basierte Band-Empfehlung ablehnen – auf die breitere (Level, Funktion) ohne Geographie oder auf erweiterter Geographie zurückfallen (z. B. „US-weit” statt „San Francisco Bay Area”).
Gegen Firmen-Philosophie kalibrieren. Comp-Philosophie der Firma lesen. Das Ziel-Perzentil auf die Survey-Zahlen anwenden. Der Output ist eine strukturierte Band pro Komponente:
- Grundgehalt: Ziel_Pct des Surveys, mit einem ±10 %-Bereich zur Aufnahme von Kandidaten-Level-Variation.
- Eigenkapital: gleich; in Dollar-Wert zum aktuellen Strike-Preis der Firma für neue Grants konvertieren, die Mathematik dokumentieren.
- Bonus / OTE: Ziel_Pct auf dem OTE; Grundgehalt/Variabler nach der Ratio der Firma für die Funktion aufteilen.
Öffentlich sichtbare Spanne zusammenstellen. Gemäß NYC LL 32-A und CO/CA/WA-Gehaltstransparenz-Anforderungen benötigt die öffentliche Ausschreibung eine Grundgehalts-Spanne. Standard: „Minimum des unteren Rands der Band bis Maximum des oberen Rands der Band, als einzelne Gehaltsspanne ausgedrückt.” Wenn die Stelle US-Bundesstaaten mit unterschiedlichen Transparenzgesetz-Schwellenwerten überspannt, gilt die breiteste Spanne. Die Skill gibt dies als separaten Output für die direkte Verwendung in der Stellenbeschreibung aus.
Empfehlungsbericht + Audit-Datensatz ausgeben. Der Bericht enthält: Per-Komponenten-Bands mit zitiertem Perzentil und Quell-Survey, Kalibrierungsnotizen, Low-N- oder Thin-Data-Warnungen und die öffentlich sichtbare Spanne. Der Audit-Datensatz ist eine JSONL-Zeile: Rolle, Geographie, Level, Ziel-Perzentil, Survey-Quelle, Survey-Export-Datum, empfohlene Band – für die Lohngleichheits-Prüfung der Firma später im Jahr.

Kostenrealität

Pro benchmarkter Stelle auf Claude Sonnet 4.6:

LLM-Tokens — typischerweise 5–8k Input (Rollendefinition + Survey-Export-Zeilen + Philosophie + Skill-Anweisungen) und 1–2k Output (strukturierter Bericht). Rund $0,04–0,08 pro Stelle. Vernachlässigbar.
Survey-Zugriffskosten — die Survey-Abonnements selbst sind die bindenden Kosten (Radford, Pave, Carta reichen von $15K–$80K+ jährlich je nach Abdeckung). Die Skill setzt voraus, dass der Comp-Analyst bereits Zugang hat; sie ändert diese Mathematik nicht.
Recruiter / Comp-Analyst-Zeit — der Gewinn. Das manuelle Zusammenstellen einer Comp-Empfehlung dauert 30–90 Minuten pro Stelle (Survey-Lookup + Tabellenkalkulations-Jonglieren + Philosophie-Anwendung + Schreiben der Kalibrierungsnotiz). Die Skill ist 5–10 Minuten einschließlich des Sanity-Check-Probelaufs.
Setup-Zeit — 30 Minuten einmalig für die Philosophiedatei und Survey-Export-Integration. Die Philosophiedatei wird selten überarbeitet; Survey-Exports werden vierteljährlich aktualisiert.

Erfolgsmetrik

Verfolgen Sie drei Zahlen vierteljährlich:

Angebots-Annahmerate innerhalb von 3 Wochen — kalibrierte Vergütung treibt die Annahme. Unter 60 % in Ihrer Geographie und Sie zahlen zu wenig; über 90 % zahlen Sie möglicherweise zu viel. Beide Richtungen sind wichtig; die richtige Zahl hängt von der Comp-Philosophie der Firma ab (Hochkapital-Startups akzeptieren niedrigeres Grundgehalt; Hochgrundgehalt-Mid-Stage-Firmen akzeptieren höheres Grundgehalt).
Comp-Band-Bearbeitungsrate nach der Skill — Anteil der von der Skill empfohlenen Bands, die der Hiring Manager oder Finance vor der Genehmigung bearbeitet. Sollte bei 10–25 % liegen. Über 40 % bedeutet, die Philosophiedatei spiegelt das tatsächliche Genehmigungsverhalten nicht wider; unter 5 % bedeutet, das Gremium stempelt ab (der Fehlerfall, gegen den die Skill ausgelegt ist).
Lohngleichheits-Audit-Drift — Korrelieren bei der jährlichen Lohngleichheits-Überprüfung die Empfehlungen der Skill mit dem Landeplatz tatsächlicher Angebote? Wenn die Prüfung Lücken aufzeigt, die die Empfehlungen der Skill geschlossen hätten, tut die Skill ihren Job; wenn die Prüfung Lücken aufzeigt, die die Empfehlungen der Skill vergrößert hätten, ist die Philosophiedatei oder die Kalibrierung voreingenommen.

Vergleich mit Alternativen

vs. Pave / Carta / Radford / Mercer-Reports direkt. Die Reports sind die Quelldaten; die Skill setzt sie in eine Per-Rollen-Empfehlung zusammen. Wählen Sie die Reports allein, wenn Ihr Comp-Analyst in ihnen lebt und der Recruiter nur „sag mir das 75.” konsumiert. Wählen Sie die Skill, wenn der Recruiter die Kalibrierungsnotiz + öffentliche Spanne + Audit-Datensatz ohne den Analysten in der Schleife für jede Stelle braucht.
vs. ChatGPT-ähnlichem „Was sollte ich einem Senior Engineer in NYC zahlen.” Generischer Chat gibt paraphrasierte Survey-Daten ohne Audit-Trail und ohne versionsgebundene Quelle zurück – das ist bei einer Lohngleichheits-Prüfung nicht verteidigbar. Die Skill zitiert den Survey-Export nach Name und Datum.
vs. Tabellenkalkulationsvorlagen. Vorlagen sind in Ordnung, bis sich die Philosophie der Firma ändert oder der Survey-Export aktualisiert wird; dann werden jede gespeicherte Vorlage stillschweigend veraltet. Die Skill liest bei jedem Durchlauf aus aktuellen Quellen.
vs. kein Benchmarking. Der Standard bei vielen kleineren Firmen. Vorhersehbarer Fehlerfall: Lohngleichheitslücken tauchen bei der jährlichen Prüfung auf, und dem Recruiter wird für individuelle Angebote, die innerhalb der normalen Praxis der Firma lagen, die Schuld gegeben. Verteidigbares Benchmarking ist die günstigste Intervention dagegen.

Wichtige Hinweise

Survey-Export-Veralterung. Guard: Die Skill liest die datierten Metadaten des Exports und warnt, wenn der Export älter als 6 Monate ist. Survey-Daten verschieben sich schneller als jährlich; vierteljährliche Aktualisierung ist die Untergrenze.
Geographie-Fehlzuordnung. Guard: Die Skill gleicht die Geographie der Stelle explizit gegen die Geographie-Taxonomie des Surveys ab (Paves „SF Bay Area” ist nicht dieselbe Zelle wie Radfords „San Francisco MSA”). Bei mehrdeutiger Übereinstimmung hält die Skill an und bittet den Recruiter zu disambiguieren statt einen Standard zu wählen.
Low-N-Zelle. Guard: Die Skill lehnt eine Perzentil-basierte Band-Empfehlung ab, wenn die Survey-Zelle weniger Respondenten hat als der dokumentierte Schwellenwert des Surveys. Sie fällt auf eine breitere Zelle zurück (breitere Funktion, breitere Geographie) und vermerkt den Fallback.
Eigenkapital-Vergleichs-Drift. Guard: Eigenkapitalwerte werden annualisiert und zum aktuellen Strike-Preis der Firma konvertiert. Die Konvertierungsmathematik ist im Bericht dokumentiert. Der Audit-Datensatz speichert die rohen und konvertierten Werte, damit zukünftige Prüfungen sie ableiten können.
Zu enge öffentlich sichtbare Spanne. Guard: Wenn die öffentliche Spanne so eng ist, dass sie als eine einzige Zahl funktioniert, warnt die Skill. Das Posten von „$140K–$145K” ist eine Verletzung des Geistes (und wohl des Buchstabens) von NYC LL 32-A, die eine „gutgläubige” Spanne erfordert. Die Skill erzwingt eine Mindest-Band-Breite pro Geographie.
Bias-Propagierung durch historische Vergütung. Guard: Wenn die Philosophiedatei der Firma durch „passen Sie, was wir in dieser Band zuvor bezahlt haben” kalibriert wird, propagiert die Skill alle Lohnlücken, die in historischen Daten existieren. Die Skill markiert dies, wenn die Philosophieabstimmung historischer Vergütung statt Survey-Perzentilen eng folgt, und empfiehlt dem Comp-Analysten, eine separate Lohngleichheitsprüfung durchzuführen.

Stack

Das Skill-Bundle liegt unter apps/web/public/artifacts/compensation-benchmark-skill/ und enthält:

SKILL.md — die Skill-Definition
references/1-survey-source-schemas.md — per-Quell-Export-Schemas (Radford, Pave, Carta, benutzerdefiniertes CSV)
references/2-comp-philosophy-template.md — ausfüllbare per-Firma-Philosophiedatei

Tools, die der Workflow voraussetzt: Claude (das Modell), Ashby oder Greenhouse (das ATS, für das Posten der öffentlichen Spanne).

Diese Seite auf GitHub bearbeiten

Files in this artifact

Download all (.zip)

---
name: compensation-benchmark
description: Take a role's level/geography/function plus a comp-survey export (Radford, Pave, Carta, or custom CSV), and produce a structured pay-band recommendation per component (base, equity, OTE) with cited percentiles, calibration against the firm's philosophy, and a public-facing range compliant with NYC LL 32-A and CO/CA/WA pay-transparency requirements. Never approves an offer; never auto-publishes.
---

# Compensation benchmark

## When to invoke

Use this skill when a recruiter or comp analyst needs a per-role pay band based on a survey export and the firm's comp philosophy. Take a role definition, a survey export, and the philosophy file as input and return a structured benchmark report plus a public-facing range.

Do NOT invoke this skill for:

- **Unilateral comp decisions outside the firm's approval matrix.** The skill recommends; People Ops / Finance / Comp Committee approve.
- **Equity at pre-Series-B startups.** Survey data is too thin and firm-cap-table-specific at that stage.
- **Negotiation-script generation.** Different workflow.
- **Approving exception bands** ("can we go 15% above?"). The skill informs; the hiring manager and finance approve.

## Inputs

- Required: `role_definition` — JSON with `level` (firm's ladder, e.g. `L5`), `geography` (e.g. `San Francisco MSA`), `function` (e.g. `software-engineering`).
- Required: `survey_export` — path to a survey export. Schema must match one in `references/1-survey-source-schemas.md`.
- Required: `philosophy` — path to the firm's compensation philosophy file. See `references/2-comp-philosophy-template.md`.
- Optional: `candidate_signal` — free-text note about the candidate (current comp, competing offers, etc.). Used in the calibration note, NOT to skew the recommended band.

## Reference files

- `references/1-survey-source-schemas.md` — per-source schemas with field mapping.
- `references/2-comp-philosophy-template.md` — fillable philosophy file.

## Method

Five steps.

### 1. Validate the role definition

Confirm `level`, `geography`, `function` are present and match values in the survey export. If `level` is on the firm's ladder but the survey uses a different ladder, look up the mapping in the philosophy file. If no mapping exists, halt and ask the user to add it.

If `geography` is ambiguous (e.g. "Bay Area" — does that include South Bay, East Bay, North Bay, the entire MSA?), halt and ask the user to specify against the survey's geography taxonomy.

### 2. Look up survey percentiles

Deterministic lookup — do NOT paraphrase the survey. For each of `base_salary`, `equity_annualized`, `ote` (or `bonus` if non-sales), pull the 25th / 50th / 60th / 75th / 90th percentiles for the matched (level, geography, function) cell.

Check the cell's sample size. If it's below the survey's documented threshold (Radford 5+, Pave 10+, Carta 15+ for equity, custom CSV per the schema's `min_sample_size` field), flag low-N. Fall back to a broader cell:

- First fallback: same level, same function, broader geography (e.g. US-wide).
- Second fallback: same level, same function, all geographies.

Document the fallback chain in the report. Do NOT silently fall back without surfacing.

### 3. Calibrate against firm philosophy

Read the philosophy file. The philosophy specifies the target percentile per component (e.g. base at 60th percentile, equity at 75th percentile, OTE at 50th percentile for sales).

For each component, compute:

- Recommended midpoint = survey's `target_percentile` for the cell
- Band width = midpoint × ±10% (default; configurable per component in the philosophy)
- Lower edge = midpoint × 0.9, upper edge = midpoint × 1.1

If the philosophy specifies a different band-width policy (e.g. wider band for senior roles where individual variance is larger), use that instead.

For equity: convert annualized survey value to dollar grant size at the firm's current strike price. Document the math in the report (`grant_value = annualized_value × vesting_period / strike_price`).

### 4. Compose the public-facing range

Compute the public-facing base salary range:

- Lower edge of public range = lower edge of base band
- Upper edge of public range = upper edge of base band
- Format: e.g. `$170,000-$210,000 USD per year`

Validate band width against the geography's pay-transparency requirements:

- NYC (LL 32-A): "good faith" range required; band narrower than ~15% width raises legal exposure.
- CO (Equal Pay for Equal Work Act): range required, no specific width threshold but functional good-faith requirement.
- CA (SB 1162): range required for postings if the role is to be performed in CA.
- WA (Pay Transparency Act): range required.

If the role straddles multiple jurisdictions, the broadest range applies. If the range is below 15% width, emit a warning (the band is at the edge of "good faith" — consider widening before publishing).

### 5. Emit the report + audit record

Write the report to stdout (or the calling environment's report destination). Append one JSONL line to `audit/<YYYY-MM>.jsonl` with: `role`, `geography`, `level`, `function`, `survey_source`, `survey_export_date`, `philosophy_version`, `target_percentiles`, `recommended_bands`, `public_range`, `low_n_flag`, `fallback_chain` (if any).

The audit record supports the firm's annual pay-equity audit. No PII; this is about the band, not about a specific candidate.

## Output format

```markdown
# Comp benchmark — {role} — {level} — {geography}

Generated: {ISO timestamp} · Skill v1.0 · Model: claude-sonnet-4-6
Survey: {Radford 2026-Q2 / Pave 2026-04 / etc.} · Philosophy: {firm-philosophy.json v3}

{LOW-N WARNING if any component fell back}

## Recommended bands

### Base salary (target: 60th percentile)

- Survey 60th percentile: $185,000
- Recommended band: $166,500 - $203,500
- Calibration note: Tight band (±10%); widen to ±15% for cross-level candidates.

### Equity (target: 75th percentile, 4-year vest)

- Survey 75th percentile annualized value: $90,000
- Total grant value: $360,000 over 4 years
- At firm strike $5.20: 69,231 shares
- Recommended band: 62,300 - 76,200 shares (±10%)

### Cash bonus (target: 50th percentile)

- Survey 50th percentile: $20,000 (annual target)
- Recommended band: $18,000 - $22,000

## Public-facing range (NYC LL 32-A / CO/CA/WA compliant)

`$166,500 - $203,500 USD per year, plus equity grant and target bonus`

Band width: 22% — within "good faith" thresholds.

## Provenance

- Survey: Radford Q2-2026 (export dated 2026-04-15)
- Survey cell sample size: 42 (above Radford's 5+ threshold)
- Philosophy: firm-philosophy.json v3 (updated 2026-01-10)
- Geography mapping: San Francisco MSA matched directly in Radford taxonomy
- Audit record: `audit/2026-05.jsonl` line {N}

## Calibration notes

- The candidate signal noted "competing offer at top of band from peer-tier company" — this is informational; the recommended band did NOT shift in response. If an exception is needed, escalate to the comp committee with the competing offer details.
- This role's geography has a pay-equity gap of -3.2% vs. firm-wide for the same level (per last quarterly audit); recommended band is at the firm's stated philosophy. Audit will surface whether the gap closes.
```

## Watch-outs

- **Survey-export staleness.** *Guard:* warns at >6 months on the export's dated metadata.
- **Geography mis-mapping.** *Guard:* halts on ambiguous geography rather than defaulting.
- **Low-N cell.** *Guard:* refuses to use a low-N cell; falls back with the chain documented.
- **Equity drift.** *Guard:* conversion math documented in the report; raw and converted values both stored in audit.
- **Public range too tight.** *Guard:* warns at <15% band width per pay-transparency-law functional thresholds.
- **Historical-pay bias propagation.** *Guard:* if philosophy is calibrated against historical pay rather than survey percentile, flag and recommend a separate pay-equity check.

# Survey source schemas

The compensation-benchmark skill reads survey exports in one of three supported formats: Radford, Pave, Carta. A custom CSV schema is also supported for in-house surveys or for sources not on this list.

## Radford

Radford ships exports as CSV or XLSX. The skill reads the CSV form (re-export from XLSX if needed).

### Required columns

| Column | Type | Notes |
|---|---|---|
| `level_radford` | string | Radford ladder code (e.g. `P4`). The philosophy file maps this to firm levels. |
| `function_radford` | string | Radford function code (e.g. `Software Engineering`). |
| `geography_radford` | string | Radford geography (e.g. `San Francisco MSA`). |
| `sample_size` | integer | Number of survey respondents in this cell. Skill requires ≥5. |
| `base_salary_p25` | number | 25th percentile base salary, USD. |
| `base_salary_p50` | number | 50th percentile. |
| `base_salary_p60` | number | 60th percentile. |
| `base_salary_p75` | number | 75th percentile. |
| `base_salary_p90` | number | 90th percentile. |
| `equity_annual_p25` | number | 25th percentile annualized equity value, USD. |
| `equity_annual_p50` | number | ... |
| `equity_annual_p60` | number | ... |
| `equity_annual_p75` | number | ... |
| `equity_annual_p90` | number | ... |
| `bonus_target_p50` | number | Target annual cash bonus, USD. (Radford reports target, not actual.) |

### Notes

- Radford's `level` codes (P1-P8 for IC, M1-M5 for management) need a firm-level mapping in the philosophy file. The mapping lives once, used everywhere.
- Geography taxonomy: Radford uses MSAs (e.g. `San Francisco MSA`, `New York MSA`) plus international country/city combos. The skill matches by exact string; "Bay Area" does not match `San Francisco MSA`.
- Sample size <5 → low-N flag. Radford itself suppresses cells below 3.

## Pave

Pave exports as CSV via the API or via UI download.

### Required columns

| Column | Type | Notes |
|---|---|---|
| `level_pave` | string | Pave's level normalization (e.g. `Senior IC`). |
| `function_pave` | string | Pave's function (e.g. `Engineering - Software`). |
| `location` | string | Pave's location string. |
| `n_employees` | integer | Number of employees in the cell. |
| `base_p25`, `base_p50`, `base_p75`, `base_p90` | number | Base salary percentiles. (Pave does not publish p60.) |
| `equity_p25`, `equity_p50`, `equity_p75`, `equity_p90` | number | Annualized equity in USD. |
| `total_comp_p50`, `total_comp_p75` | number | Total comp percentiles, useful for OTE calibration. |

### Notes

- Pave uses its own level normalization across firms; mapping to firm levels lives in the philosophy file.
- Pave's coverage is strongest for tech in the US and EU; APAC and emerging-market data is thinner.
- Sample size <10 → low-N flag (Pave's own threshold).
- `total_comp_p50` includes base + bonus + equity at the median. Useful for the public-range sanity check.

## Carta

Carta's compensation product exports in two flavors: cash-comp report (similar to Pave) and equity-comp report (cap-table-aware).

### Required columns (cash report)

| Column | Type | Notes |
|---|---|---|
| `role` | string | Carta's normalized role label. |
| `seniority` | string | `Junior`, `Mid`, `Senior`, `Staff`, `Principal`. |
| `location` | string | Carta's location string. |
| `n_companies`, `n_employees` | integer | Cell sample sizes (both required). |
| `base_p50`, `base_p75` | number | Base salary percentiles. |
| `total_cash_p50`, `total_cash_p75` | number | Base + bonus. |

### Required columns (equity report)

| Column | Type | Notes |
|---|---|---|
| `role` | string | Same as cash report. |
| `seniority` | string | Same. |
| `location` | string | Same. |
| `company_stage` | string | `Seed`, `Series A`, `Series B`, etc. |
| `equity_pct_p25`, `equity_pct_p50`, `equity_pct_p75`, `equity_pct_p90` | number | Equity as percentage of fully diluted shares outstanding. |

### Notes

- Carta's coverage is strongest for early-stage US startups. For mid-stage and public-company benchmarking, Radford or Pave are stronger.
- Equity reported as `equity_pct_p*` (percent of company), not dollar value. The skill converts using the firm's most recent valuation.
- Sample sizes <15 for equity → low-N flag (equity is more variance-heavy than cash).

## Custom CSV

For in-house surveys or sources not on the list above. The skill reads any CSV with the following minimum columns:

| Column | Type | Required | Notes |
|---|---|---|---|
| `level` | string | yes | Whatever ladder; must map to firm ladder via philosophy file. |
| `function` | string | yes | Whatever taxonomy; must match role definition. |
| `geography` | string | yes | Free-text or coded; must match exactly. |
| `sample_size` | integer | yes | Used for low-N flag. |
| `base_p50` | number | yes | Median base salary, USD. |
| `base_p25`, `base_p75`, `base_p90` | number | recommended | More percentiles enable wider band-targeting options. |
| `equity_value_p50` | number | for equity-bearing roles | Annualized equity value, USD. |
| `bonus_p50` or `ote_p50` | number | for sales / variable-comp roles | Target. |
| `min_sample_size` | integer | yes | The threshold below which the skill flags low-N. Set per-survey based on the survey methodology. |

### Notes

- Custom CSVs are useful for mid-cycle re-benchmarks against a peer cohort (your own team's data plus a few comparable firms) or for in-house comp-committee internal reviews.
- The `min_sample_size` field is critical — without it the skill cannot calibrate the low-N threshold and falls back to a conservative default (15).

## Adding a new survey source

To add a new source:

1. Document the source's export schema in this file with the same shape as the entries above.
2. Update the skill's source detector to recognize the new format (filename pattern, header pattern, or both).
3. Add the source's documented sample-size threshold.
4. If the source uses a different geography or level taxonomy, document the mapping in the philosophy file.

## Refresh cadence

Survey data shifts faster than yearly. The benchmark skill warns at >6 months on the export's dated metadata; that's the floor. Quarterly refresh is the operating norm for serious comp programs.

For Radford: Q1, Q2, Q3, Q4 standard cycles.
For Pave: monthly refresh available via API.
For Carta: quarterly equity reports, monthly cash updates available.

# Compensation philosophy file template

The compensation-benchmark skill calibrates survey data against the firm's compensation philosophy. This file is the philosophy. Copy the JSON below to `philosophy.json` (or wherever your skill config points), fill it in, and version it in git.

The philosophy is rarely revised — usually annually at most, often less. When it changes, every benchmark recommendation post-change uses the new philosophy. The skill captures the philosophy version in the audit record so future audits can reproduce the recommendation.

## JSON shape

```json
{
  "philosophy_version": "2026.1",
  "effective_from": "2026-01-10",
  "approver": "Comp Committee minutes 2026-01-08",
  "target_percentiles": {
    "engineering": {
      "base": 60,
      "equity": 75,
      "bonus_or_ote": 50
    },
    "sales": {
      "base": 50,
      "equity": 50,
      "bonus_or_ote": 75
    },
    "go_to_market_other": {
      "base": 60,
      "equity": 60,
      "bonus_or_ote": 60
    },
    "g_and_a": {
      "base": 60,
      "equity": 50,
      "bonus_or_ote": 50
    }
  },
  "band_widths_pct": {
    "default": 10,
    "by_level": {
      "junior": 8,
      "senior": 12,
      "staff": 15,
      "principal": 18,
      "executive": 25
    }
  },
  "level_mapping": {
    "firm_to_radford": {
      "L3": "P3",
      "L4": "P4",
      "L5": "P4",
      "L6": "P5",
      "L7": "P6",
      "L8": "P7"
    },
    "firm_to_pave": {
      "L3": "Mid IC",
      "L4": "Senior IC",
      "L5": "Senior IC",
      "L6": "Staff IC",
      "L7": "Principal IC",
      "L8": "Senior Principal IC"
    }
  },
  "geography_adjustments": {
    "remote_us": 0,
    "remote_intl": -15,
    "san_francisco_msa": 0,
    "new_york_msa": 0,
    "seattle_msa": -3,
    "austin_msa": -10,
    "london": -8,
    "toronto": -12,
    "remote_latam": -35
  },
  "current_strike_price_usd": 5.20,
  "vesting_period_years": 4,
  "equity_grant_type": "RSU",
  "exception_band": {
    "max_above_top": 15,
    "approval_required": "comp_committee"
  },
  "public_range_policy": {
    "minimum_band_width_pct": 15,
    "include_equity_target_dollar": true,
    "include_bonus_target_dollar": true
  },
  "pay_equity_audit_cadence_months": 12,
  "last_pay_equity_audit": "2026-02-15"
}
```

## Field-by-field

### `philosophy_version`, `effective_from`, `approver`

Versioning. The skill captures `philosophy_version` in the audit record so the recommendation is reproducible against a specific version of this file. `effective_from` is the date the philosophy applies to NEW recommendations — recommendations made before that date used the prior philosophy. `approver` cites the approval source (Comp Committee minutes, board resolution, etc.).

### `target_percentiles`

Per function family, the target percentile per component. The most common patterns:

- **Engineering**: 60th base, 75th equity (founder-friendly equity to attract IC talent), 50th bonus.
- **Sales**: 50th base, 50th equity, 75th OTE (the variable comp is the lever).
- **G&A**: 60th base across components (predictable, market-rate).

If the firm's strategy is "we pay top of market across the board," set everything to 75th. If the firm's strategy is "we pay base at market and over-index on equity," set base to 50th and equity to 75th-90th.

### `band_widths_pct`

The recommended band width as a percentage of the midpoint. Default 10% (recommended midpoint ±10%). Per-level overrides absorb the wider individual variance at senior levels.

If a single number is too rigid, the skill respects the per-level overrides.

### `level_mapping`

Mapping from the firm's internal ladder to each survey's ladder. The skill cannot infer this — it has to be specified per survey the firm uses. If a level mapping is missing, the skill halts and asks the user to add it.

This is the single most-edited part of the philosophy; it's also the most consequential, because the wrong mapping shifts every recommendation by a percentile-band.

### `geography_adjustments`

Per-geography multiplier. `0` means use the survey's value for that geography directly. `-15` means apply a 15% reduction (e.g. for `remote_intl`). The adjustments must be defensible — random adjustments here are how pay-equity gaps creep in.

If the firm has a published location-based pay policy, this section should match it line-for-line.

### `current_strike_price_usd`, `vesting_period_years`, `equity_grant_type`

Used for converting annualized survey equity values to grant size. Strike price is the most-recent 409A or option-grant strike. Vesting is typically 4 years. Grant type matters for tax framing but not for the band math.

### `exception_band`

When the skill is asked about a band-exception ("can we offer above the top?"), the philosophy says how high (`max_above_top: 15` means up to 15% above the top of the recommended band) and who approves. The skill itself does NOT approve exceptions; it surfaces the policy.

### `public_range_policy`

Compliance posture for NYC LL 32-A, CO/CA/WA pay-transparency requirements. `minimum_band_width_pct: 15` is the firm's "good faith" floor — the skill warns if a recommended band falls below this width.

### `pay_equity_audit_cadence_months`, `last_pay_equity_audit`

For the audit-record metadata. The skill notes the cadence and last audit so the recommendation can be flagged if the firm is overdue for an equity audit.

## When to revise the philosophy

- **Strategy shift** — the firm decides to over-index on equity vs. cash. Update target percentiles.
- **New geography** — opening a new region. Add to `geography_adjustments` based on local market data.
- **New survey added** — add a level mapping for the new survey.
- **Pay-equity audit findings** — if the audit surfaces gaps, the philosophy may need revision (band widths, geography adjustments).

Each revision bumps `philosophy_version`. Old audit records remain interpretable against their respective version.

## What the philosophy is NOT

- It is NOT a candidate-by-candidate negotiation guide.
- It is NOT a one-time setup; it evolves with the firm.
- It is NOT confidential to the recruiter — the philosophy should be visible to every hiring manager, ideally documented internally for transparency.