claude-skill

Score leads against an ICP rubric using Claude

Difficulty: intermediate

Setup time: 30 min

For: RevOps

A Claude Skill that takes any lead row, runs it against your team's ICP rubric, and returns a 0-10 score, a per-criterion rationale citing the rubric, a recommended action by tier, and an escalation flag for borderline cases. It is designed to plug into a Clay AI column, a HubSpot custom-code action, or a standalone CLI run over a CSV. It replaces the spreadsheet scoring matrix nobody has updated since last year, without pretending it can also do intent or behavioral scoring, which it cannot.

The bundle ships in apps/web/public/artifacts/lead-scoring-icp-rubric-skill/ and contains SKILL.md plus three reference templates that the user adapts before the first run.

When to use it

Use this skill when inbound MQLs are piling up faster than your SDR team can triage them, and the existing scoring either does not exist ("everything is a lead") or is stale ("HubSpot scoring matrix last calibrated in 2023, nobody trusts it"). It is also useful for outbound: score an enriched cold list before assigning it, and you stop burning SDR time on out-of-ICP companies that look good on the surface.

The skill does fit scoring, not intent scoring. It answers "is this the right kind of company for us?", not "are they in-market this week?". The distinction matters: if you score on fit alone, you will sequence great-fit accounts that have no current need and ignore poor-fit accounts that are actively buying. Pair this skill with whatever signals in-market behavior (Bombora, 6sense, your own product-usage events, pricing-page hits) to route correctly.

Concretely, invoke it from:

- A Clay AI column that fires on every new row of a leads table and writes the score and rationale back to two columns.
- A HubSpot custom-code action in a workflow triggered by Lifecycle stage = MQL, which calls the skill and writes both the score and the rationale to lead properties.
- A standalone CLI over a CSV export, useful for one-off list scoring before a campaign launch.

When NOT to use it

Skip this skill when:

- You want to auto-reject leads with no human in the loop. The output is a recommendation. The skill explicitly flags borderline cases with escalate: needs_human_review, but if you wire it to delete leads scored C or below, you will silently destroy pipeline every time the rubric drifts. Always keep an SDR review path for at least tier C.
- Your "rubric" is vibes. The skill refuses to score against a rubric that lacks explicit weights and tier values. If your team has not had the fight over what really counts as a tier-A industry, have that fight first. The skill cannot make the rubric defensible if the source is not.
- You need behavioral or intent scoring. This is fit scoring only. Trying to encode "engagement score" or "last site visit" into the rubric forces you to update it constantly; use a dedicated intent tool for time-varying signals and leave this skill to the static fit signals.
- You operate in a regulated domain that requires explainability beyond the per-criterion rationale. The per-criterion outputs are auditable, but they are not the same as a model card that holds up in front of a regulator. If you need that, invest in a proper scoring service, not a Claude Skill.

Setup

Setup takes about 30 minutes once you have the rubric drafted. The rubric itself takes longer: usually a 60-minute working session with the SDR manager, an AE, and someone from RevOps to argue over weights.

1. Install the Skill. Drop apps/web/public/artifacts/lead-scoring-icp-rubric-skill/SKILL.md and the references/ folder into your .claude/skills/lead-scoring/ directory (or upload it as a Skill on claude.ai). The name and description in the frontmatter are what trigger the Skill on relevant prompts.
2. Replace the rubric template. Open references/1-icp-rubric-template.md and replace the placeholder rows under "Criteria" with your real criteria, weights (1-5), and tier values (A / B / C). Fill in the "Hard disqualifiers" section; these run as deterministic checks before any LLM call. Update "Last edited" so the output footer shows who owns the current version; that line is hashed into the SHA-256 the skill prints in every output footer.
3. Replace the tier-to-action matrix. Open references/2-tier-to-action-matrix.md and replace the example rows with what your team actually does for each (tier, source_of_lead) combination. The defaults are reasonable, but they are not yours.
4. Wire the input source. In Clay, point an AI column at the Skill and pass the enriched lead row as lead, the rubric file as rubric, and the source column as source_of_lead. In HubSpot, wrap the Skill in a custom-code action that reads the contact + company properties into a lead object and posts the structured output back. In a script, read the CSV, post each row, and write the score and rationale to two new columns.
5. Configure the destination. Both the score and the rationale go on the lead: the score in a numeric property (for routing logic), the rationale in a long-text property (for the SDR who will read it before the call). Wire the escalate field to a separate boolean or enum property so the SDR manager can filter for review.
6. Calibrate. Before turning it on, run the skill over 20 closed-won and 20 closed-lost leads from the last 6 months. The score distribution should clearly separate the two cohorts. If it does not, the rubric is the problem, not the skill; go back to step 2 and re-argue the weights.
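
One way to make the calibration step concrete is to compare the two cohorts' scores directly. The sketch below is illustrative only: the function name `separation` and the pairwise win-rate measure are assumptions, not something the skill ships, and what counts as "clearly separated" remains a team judgment.

```python
from itertools import product
from statistics import median


def separation(won_scores: list[float], lost_scores: list[float]) -> dict:
    """Summarize how well scores separate closed-won from closed-lost leads."""
    pairs = list(product(won_scores, lost_scores))
    # Share of (won, lost) pairs where the won lead outscored the lost lead.
    # Ties count half. 0.5 means no separation, 1.0 means perfect separation.
    wins = sum(1.0 if w > l else 0.5 if w == l else 0.0 for w, l in pairs)
    return {
        "median_won": median(won_scores),
        "median_lost": median(lost_scores),
        "pairwise_win_rate": wins / len(pairs),
    }
```

A win rate near 0.5 suggests the rubric is not distinguishing the cohorts, which is the signal to revisit the weights before going live.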

What the skill actually does

The skill runs four steps in a fixed order. Earlier steps gate later ones; do not parallelize.

Step 1: deterministic firmographic checks. Before any LLM call, plain code runs the rubric's hard disqualifiers (sanctioned country, disqualified industry, headcount below your floor, free-mail domain) and the required-field check (email and company_domain must both be present). Hits return immediately: disqualified with the citation, or escalate: insufficient_data with the missing fields. Why deterministic first: it is free, fast, and never hallucinates. Burning tokens to confirm that a 3-person hair salon is not in your enterprise-SaaS ICP is a waste.
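
A minimal sketch of this pre-LLM gate, assuming the lead arrives as a plain dict with string fields. The constant values, the function name `precheck`, the shape of the returned dict, and the choice to check required fields before disqualifiers are all hypothetical; the real lists come from the rubric's "Hard disqualifiers" section.

```python
from typing import Optional

# Hypothetical values for illustration; load the real ones from the rubric.
DISQUALIFIED_COUNTRIES = {"XX"}
DISQUALIFIED_INDUSTRIES = {"gambling"}
HEADCOUNT_FLOOR = 10
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}
REQUIRED_FIELDS = ("email", "company_domain")


def precheck(lead: dict) -> Optional[dict]:
    """Return an early result if the lead should not reach the LLM, else None."""
    missing = [f for f in REQUIRED_FIELDS if not lead.get(f)]
    if missing:
        return {"tier": None, "escalate": "insufficient_data", "missing": missing}

    email_domain = lead["email"].rsplit("@", 1)[-1].lower()
    headcount = lead.get("headcount")
    checks = [
        ("country", lead.get("country") in DISQUALIFIED_COUNTRIES),
        ("industry", lead.get("industry") in DISQUALIFIED_INDUSTRIES),
        # A missing headcount is a data gap, not a disqualifier.
        ("headcount", headcount is not None and headcount < HEADCOUNT_FLOOR),
        ("free_email", email_domain in FREE_MAIL_DOMAINS),
    ]
    for reason, hit in checks:
        if hit:
            return {"tier": "disqualified", "escalate": f"hard_disqualifier:{reason}"}
    return None  # nothing triggered; continue to LLM scoring
```

Only leads for which `precheck` returns `None` would go on to the per-criterion scoring step.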

Step 2: per-criterion LLM scoring with explicit weighting. For each remaining criterion, the model emits a tier (A / B / C) and a one-sentence rationale citing the rubric row. The skill multiplies the tier value (A=3, B=2, C=1) by the criterion's weight and sums. Why per-criterion rather than one holistic prompt: holistic outputs blend criteria silently and you lose the ability to debug why a lead scored 8 instead of 5. Why explicit weighting rather than letting the model balance: stated weights are the only way the rubric stays the source of truth. If the model decides its own balance, rubric reviews become theater.
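
The weighted sum can be sketched as follows. The tier values come from the text above; how the sum is scaled onto the 0-10 range is not specified in this document, so the normalization here (share of the best possible sum, times 10) is an assumption and may not match the skill's own scaling. The function name and input shape are hypothetical.

```python
TIER_VALUES = {"A": 3, "B": 2, "C": 1}


def weighted_score(criteria: list[dict]) -> float:
    """Each item carries a 'weight' (1-5) and the 'tier' the model assigned."""
    total = sum(c["weight"] * TIER_VALUES[c["tier"]] for c in criteria)
    best = sum(c["weight"] * TIER_VALUES["A"] for c in criteria)
    # Assumed normalization: fraction of the all-A sum, scaled to 10.
    return round(10 * total / best, 1)


# Illustrative three-criterion lead: 5*3 + 4*2 + 2*1 = 25 out of a possible 33.
example = [
    {"criterion": "Industry", "weight": 5, "tier": "A"},
    {"criterion": "Headcount", "weight": 4, "tier": "B"},
    {"criterion": "Funding stage", "weight": 2, "tier": "C"},
]
print(weighted_score(example))  # 7.6 under the assumed normalization
```

Because the weights are applied in code rather than by the model, changing a weight in the rubric changes the score in a way that can be traced row by row.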

Step 3: borderline fallback to human review. If the final score is within 0.5 of a tier boundary, or if more than 3 criteria were scored on missing or inferred data, the skill sets escalate: needs_human_review and names the missing fields. The most expensive scoring failure is not a wrong tier on a confident lead; it is a wrong tier on a lead that was always borderline.
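
A sketch of the escalation rule, assuming the default tier boundaries from the rubric template (4.0, 6.0, 8.0). The function name, the inclusive comparison, and the input shape (a list of criterion names scored on thin data) are assumptions.

```python
TIER_BOUNDARIES = (4.0, 6.0, 8.0)  # default thresholds from the rubric template
BORDERLINE_MARGIN = 0.5
MAX_THIN_CRITERIA = 3


def needs_human_review(score: float, thin_criteria: list[str]) -> bool:
    """True when the score sits near a tier boundary or rests on thin data."""
    near_boundary = any(
        abs(score - boundary) <= BORDERLINE_MARGIN for boundary in TIER_BOUNDARIES
    )
    return near_boundary or len(thin_criteria) > MAX_THIN_CRITERIA
```

Under these defaults a 7.4 with one data gap is not escalated (it is 0.6 below the A boundary), while a 7.8 is.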

Step 4: output assembly. The skill emits the markdown described in references/3-sample-output.md: headline score and tier, recommended action joined from the tier-to-action matrix, per-criterion table with reasons, disqualifier check, list of data gaps, and a footer with the rubric's SHA-256 and last-edited date.

Cost reality

Per-lead token cost depends on the size of the rubric, but for a typical 6-criterion rubric with structured per-criterion output, expect roughly 1,500-2,500 input tokens and 400-700 output tokens per lead. At Claude Sonnet 4.x prices (roughly $3 per million input tokens and $15 per million output tokens as of late 2026), that is about $0.01-0.02 per scored lead.

A team running 5,000 inbound MQLs per month spends roughly $50-100/month on Claude tokens. A team running 50,000 enriched outbound leads per month spends $500-1,000/month, at which point batching, prompt caching of the rubric, and pre-filtering with the deterministic step matter a lot. The skill defaults to a single structured prompt per lead (rather than 6-10 small prompts) precisely to keep token usage bounded.
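
The arithmetic behind those ranges, as a small sketch you can rerun with your own token counts. The prices are the figures quoted above, not values looked up from a live price list, and the function name is hypothetical.

```python
INPUT_PRICE_PER_M = 3.00    # USD per million input tokens (figure quoted above)
OUTPUT_PRICE_PER_M = 15.00  # USD per million output tokens (figure quoted above)


def cost_per_lead(input_tokens: int, output_tokens: int) -> float:
    """Token cost in USD for scoring one lead."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


low = cost_per_lead(1_500, 400)   # about $0.0105
high = cost_per_lead(2_500, 700)  # about $0.018

for monthly_leads in (5_000, 50_000):
    print(
        f"{monthly_leads:>6} leads/month: "
        f"${monthly_leads * low:,.2f} to ${monthly_leads * high:,.2f}"
    )
```

Output tokens account for more than half of the per-lead cost at these prices, so trimming rationale length moves the bill more than trimming the rubric does.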

The non-token costs are larger. Building the rubric is a 60-minute working session that you do once and redo quarterly. Calibrating against 20 closed-won and 20 closed-lost leads is another hour. Wiring the Clay or HubSpot integration is half a day. After that the skill is hands-off until the rubric drifts.

Success metric

The metric to watch is score-to-conversion correlation: of the leads scored A in the last 90 days, what fraction converted to opportunities? Of those scored B? C? If the curve is monotonic (A converts at a higher rate than B, and B at a higher rate than C), the rubric is doing work. If C converts at a rate similar to B, the rubric is not separating fit from non-fit and needs to be re-argued.
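
A sketch of that check, assuming you can export scored leads as dicts with a `tier` and a boolean `converted` field. Both field names and both function names are hypothetical.

```python
def conversion_by_tier(leads: list[dict]) -> dict[str, float]:
    """Conversion rate per tier; tiers with no leads are omitted."""
    rates = {}
    for tier in ("A", "B", "C"):
        cohort = [lead for lead in leads if lead["tier"] == tier]
        if cohort:
            rates[tier] = sum(lead["converted"] for lead in cohort) / len(cohort)
    return rates


def is_monotonic(rates: dict[str, float]) -> bool:
    """True when each tier converts strictly better than the tier below it."""
    ordered = [rates[t] for t in ("A", "B", "C") if t in rates]
    return all(upper > lower for upper, lower in zip(ordered, ordered[1:]))
```

With small cohorts the rates are noisy, so treat a single non-monotonic quarter as a prompt to look closer, not as proof the rubric is broken.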

Secondary metric: SDR time-to-first-touch on tier-A leads. A scoring system that works collapses this to under 1 hour for inbound. If tier-A leads still sit in a queue for 24h, the routing, not the scoring, is the bottleneck.

vs alternatives

vs HubSpot Predictive Lead Scoring. HubSpot's built-in predictive score is a black box trained on your historical conversion data. It works once you have enough closed-won volume (HubSpot recommends around 500 closed deals as a minimum). For teams below that threshold, the model has nothing to learn from and the score is noise. This skill works from day one because the rubric is written by hand, not learned. The trade-off: HubSpot's model picks up patterns a rubric author would miss; this skill only knows what you wrote. Run both if you have the volume: use the HubSpot score for "what surprises me" and this skill's per-criterion rationale for "why is this one ranked here".

vs Marketo behavioral scoring. Marketo (or HubSpot's behavioral scoring) tracks engagement signals such as email opens, page views, and form submissions, and adds up points. That is intent scoring, not fit scoring, and the two answer different questions. A great-fit account that has not opened an email is still a great-fit account. A poor-fit account that binged your blog is still a poor-fit account. Use behavioral scoring in addition to this skill, not instead of it; route on the combined signal (high fit + high intent → straight to AE; high fit + low intent → nurture; low fit + high intent → SDR fit-call before the AE).

vs manual SDR review. Below 50 inbound leads per week, manual review by an SDR manager is genuinely competitive: humans catch nuance ("this company just acquired our customer, prioritize") that the skill will miss. Above ~200 leads per week, manual review becomes the bottleneck and consistency drops. The skill scales linearly with token budget; humans do not.

Watch-outs

- Rubric drift. Someone edits the rubric markdown, ships the change, and the SDRs reading the new scores never see a diff. Six weeks later, the team realizes the headcount weight was dropped from 4 to 2 by accident and 200 stretch-tier accounts were silently downgraded to C. Guard: the skill records the rubric's SHA-256 in every output footer and prepends a "Rubric updated YYYY-MM-DD" banner whenever the hash changes between runs. A quarterly calendar reminder forces a review even if there are no edits.
- Source-bias amplification. A rubric built from your closed-won set encodes who you have already sold to. Scoring against it blinds you to the adjacent ICP, and over time your pipeline narrows to lookalikes of last year's customers. Guard: every quarter, sample 20 leads the skill scored as tier C and have an AE manually review whether any are actually a fit. If more than 3 are misclassified, add a "stretch ICP" row to the rubric and recalibrate.
- False confidence on thin data. When enrichment is missing 4 of 6 criteria fields, a score of 7.4 is mostly noise but reads as authoritative. SDRs will treat it as a confident score and skip call prep. Guard: the skill sets escalate: needs_human_review whenever more than 3 criteria are scored on missing or inferred data, and adds a "Data gaps" section listing the absent fields. SDRs are trained to read the gaps section before the headline number.
- Protected-class proxies. Even with good intent, a rubric that weights "geography" can collapse into nationality, and "industry" can collapse into proxies for company demographics in ways your legal team will not love. Guard: the skill refuses fields it recognizes as protected-class proxies (name-derived gender, photo, age signals). Review the rubric annually with someone who can spot the less obvious proxies.
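
The rubric-drift guard in the first item can be sketched with the standard library. Where the previous run's hash is stored is left to the caller, and the function names and the `today` parameter are hypothetical; only the use of SHA-256 and the banner wording come from the description above.

```python
import hashlib
from typing import Optional


def rubric_sha256(rubric_text: str) -> str:
    """Hex digest of the rubric contents, as recorded in the output footer."""
    return hashlib.sha256(rubric_text.encode("utf-8")).hexdigest()


def drift_banner(
    rubric_text: str, previous_hash: Optional[str], today: str
) -> Optional[str]:
    """Return a banner line when the rubric hash differs from the last run's."""
    current = rubric_sha256(rubric_text)
    if previous_hash is not None and current != previous_hash:
        return f"Rubric updated {today}"
    return None  # first run, or rubric unchanged
```

Hashing the whole file means any edit, including a change to the "Last edited" line, produces a new digest and therefore a banner on the next run.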

Stack

- Claude: scoring engine and rationale generator. Sonnet 4.x is the sweet spot for cost vs reasoning quality on this task; Haiku works for the deterministic-only path but loses rationale quality in the LLM step.
- Clay: preferred lead-source and enrichment layer for outbound and cold-list scoring. The AI column is a clean integration point.
- HubSpot: destination CRM for score, rationale, escalate flag, and source. Custom-code actions are the integration point for scoring inbound MQLs.
- A markdown editor and a calendar: the unglamorous pieces. The rubric lives in markdown, the quarterly review lives on someone's calendar, and both matter more than the choice of model.


Files in this artifact


---
name: lead-scoring-icp-rubric
description: Score a single lead or a batch of leads against an explicit ICP rubric. Returns a 0-10 score per lead, a per-criterion rationale citing the rubric, a recommended next action by tier, and an escalation flag for borderline cases. Use when triaging inbound or routing enriched outbound leads — not as a substitute for behavioral or intent-based scoring.
---

# Lead scoring (ICP rubric)

## When to invoke

Invoke whenever you need to score a single lead — or a CSV/JSON batch of leads — against your team's ICP rubric. Typical entry points: a Clay table column, a HubSpot custom-code action firing on a new MQL, a standalone CLI run over a marketing-list export, or a manual paste during deal-desk triage.

The skill takes structured lead data plus the rubric and returns a 0-10 score, per-criterion rationale, a recommended next action by tier, and an escalation flag when the data is too thin to score confidently.

Do NOT invoke this skill for:

- **Auto-rejecting leads.** The output is a recommendation. Disqualifying a lead from outreach without an SDR seeing the rationale silently destroys pipeline when the rubric is wrong (and the rubric is sometimes wrong).
- **Scoring on protected-class proxies.** Do not pass fields like name-derived gender, photo, age, or country-of-origin signals. Even if your rubric weights "geography" legitimately for support-hours fit, never collapse that into ethnicity or nationality. The skill refuses fields it recognizes as protected-class proxies.
- **Replacing intent-based or behavioral scoring entirely.** This is fit scoring, not intent. A great-fit account that has not visited your pricing page in 90 days is still a great fit but not a hot lead. Pair this skill with whatever signals "they are in-market right now" — Bombora, 6sense, your own product-usage events.

## Inputs

Required:

- `lead` — a structured lead record. Minimum fields: `email`, `company_domain`. Strongly preferred: `headcount`, `industry`, `country`, `job_title`, `tech_stack` (array), `funding_stage`. Pass whatever your enrichment layer (Clay, Apollo, ZoomInfo, Clearbit) returns.
- `rubric` — path to or inline contents of the ICP rubric markdown (see `references/1-icp-rubric-template.md`). Must contain explicit criterion + weight + tier-value rows. The skill refuses to score against a rubric that has no weights — vibes are not a rubric.

Optional:

- `source_of_lead` — free-text or enum: `inbound_demo`, `inbound_content`, `outbound_sequence`, `partner_referral`, `event`, `cold_list`. Used to bias the recommended-next-action mapping (a partner referral with a B-tier score still gets a human reach-out; a cold-list lead at the same tier does not).
- `batch_size_hint` — when scoring more than one lead, the caller can pass an integer so the skill paces token usage and returns progress markers. Default: process serially, no progress markers.

## Reference files

Always load these from `references/` before scoring. They are the leverage point — a tight rubric makes a defensible score, a vague rubric makes a vibes score that an AE will (correctly) ignore.

- `references/1-icp-rubric-template.md` — the rubric template. Replace placeholder rows with the actual criteria, weights, and tier values your team has agreed on.
- `references/2-tier-to-action-matrix.md` — maps the four tiers (A / B / C / disqualified) and the `source_of_lead` enum to a recommended next action. Edit this once with your team's routing reality, not the defaults.
- `references/3-sample-output.md` — a literal example of the markdown the skill produces, for one fictional lead. Use as the reference when wiring downstream parsers.

## Method

The skill runs these steps in order. Earlier steps gate later steps — do not parallelize.

### 1. Deterministic firmographic checks (no LLM)

Before any LLM call, run plain code over the lead record:

- Hard disqualifiers from the rubric (e.g. `country in ["{sanctioned-country}"]`, `industry in {disqualified-industries}`, `headcount < 10` if the rubric sets that floor) → return tier `disqualified` with the citation, no LLM call.
- Required-field check: if `email` or `company_domain` is missing, return `escalate: insufficient_data` naming the missing fields.

Why: deterministic checks are free, fast, and never hallucinate. Burning tokens to confirm that a 3-person hairdresser is not in your enterprise-SaaS ICP is wasteful and slightly embarrassing.

### 2. Per-criterion LLM scoring with explicit rubric weighting

For each remaining criterion in the rubric, prompt the model to produce a tier value (A / B / C) and a one-sentence rationale that cites the rubric row. The skill multiplies the tier-value (A=3, B=2, C=1) by the criterion's weight and sums.

Why per-criterion rather than one holistic prompt: holistic scoring blends criteria silently and you lose the ability to debug why a lead got an 8 instead of a 5. Per-criterion outputs make the score auditable. The cost is roughly 6-10 short prompts per lead (or a single prompt that emits a structured per-criterion response — both work; the skill defaults to a single structured prompt with explicit per-criterion fields to keep tokens down).

Why explicit weighting rather than "let the model balance them": stated weights are the only way the rubric stays the source of truth. If the model invents its own balance, the rubric stops being authoritative and rubric reviews become theatre.

### 3. Borderline case fallback to human review

If the final score is within `+/- 0.5` of a tier boundary, OR if more than 3 criteria were scored on missing or inferred data, set `escalate: needs_human_review` with a note naming the missing fields.

Why: the most expensive scoring failure is not a wrong tier on a confident lead — it is a wrong tier on a lead that was always borderline. Surfacing those for human review preserves trust in the confident scores.

### 4. Output assembly

Render the markdown described in "Output format" below. Score is the headline number. Rationale is the per-criterion table. Next action comes from the tier-to-action matrix, joined with `source_of_lead` if provided. Escalation flag is surfaced at the top when set.

## Output format

Literal markdown the skill emits for a single lead:

```markdown
# Lead score — jane.doe@acme.com (acme.com)

**Score:** 7.4 / 10 — Tier B
**Source:** inbound_content
**Escalate:** no

## Recommended next action

Tier B + inbound_content → SDR personalized email within 24h, no auto-sequence. Reference content piece they engaged with.

## Rationale (per criterion)

| Criterion | Weight | Tier | Reason |
|---|---|---|---|
| Industry | 5 | A | "Vertical SaaS / RevOps" matches in-ICP row in rubric. |
| Headcount | 4 | B | 240 employees — in stretch range (200-500), not core (500-2000). |
| Geo | 3 | A | HQ US-east, in supported region. |
| Tech stack | 4 | B | Salesforce + Marketo present (fit signals); no data warehouse cited. |
| Funding stage | 2 | C | Bootstrapped — out of preferred Series B-D band. |
| Job title | 4 | A | "Director, RevOps" matches champion-target pattern. |

## Disqualifier check

None triggered.

## Data gaps

- `revenue` field not provided by enrichment.
```

For batch input, the skill emits one such block per lead, separated by `\n---\n`, plus a top-level summary table (`email | tier | score | escalate`).

## Watch-outs

- **Rubric drift.** The rubric is a markdown file that someone edits. Edits are silent — no diff is shown to the SDRs reading scores. **Guard:** the skill records the rubric's SHA-256 in every output footer and prepends a "Rubric updated {date}, last verified by {name}" line if the hash differs from the previous run's. A scheduled job (or a calendar reminder, if you are not that fancy) opens a PR-style review of the rubric every quarter.
- **Source-bias amplification.** If the rubric was built from your closed-won set, it encodes who you have already sold to. Repeatedly scoring against it narrows your pipeline to lookalikes and makes you blind to adjacent ICP. **Guard:** every quarter, sample 20 leads the skill scored as C-tier and have an AE review whether any are actually fit. If more than 3 are misclassified, the rubric is over-fit and needs a "stretch ICP" row added.
- **False confidence on thin data.** When enrichment is missing 4 of the 6 criteria fields, a 7.4 score is mostly noise. **Guard:** the skill sets `escalate: needs_human_review` whenever more than 3 criteria are scored on missing/inferred data, and adds a "Data gaps" section listing the absent fields. SDRs are trained to read the gaps section before the headline number.

# ICP rubric — TEMPLATE

> Replace this template's contents with your team's actual ICP rubric.
> The lead-scoring skill scores each criterion against this rubric. Vague
> rows (no weights, no tier values) cause the skill to refuse the run.

## How the skill reads this file

- Each row in "Criteria" must have an explicit `weight` (1-5) and three tier values (A / B / C). Anything else is treated as malformed and the skill returns an error rather than guessing.
- Rows in "Hard disqualifiers" run as deterministic checks before any LLM call. Keep them tight; one wrong row here silently kills good pipeline.
- The "Last edited" line is hashed into the SHA-256 the skill records in every output footer. Update it when you make material changes so SDRs reading scores can see the rubric moved.
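
The row rules above can be sketched as a small validator. This is an illustration under assumptions, not the skill's own parser: the function name `parse_criteria`, the error messages, and the assumption that cells contain no escaped `|` characters are all hypothetical.

```python
def parse_criteria(table_lines: list[str]) -> list[dict]:
    """Parse rows of the 'Criteria' markdown table; raise on malformed rows."""
    rows = []
    for line in table_lines[2:]:  # skip the header and separator rows
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 5:
            raise ValueError(f"expected 5 columns, got {len(cells)}: {line!r}")
        name, weight, tier_a, tier_b, tier_c = cells
        if not weight.isdigit() or not 1 <= int(weight) <= 5:
            raise ValueError(f"{name}: weight must be an integer from 1 to 5")
        if not all((tier_a, tier_b, tier_c)):
            raise ValueError(f"{name}: all three tier values are required")
        rows.append(
            {"criterion": name, "weight": int(weight),
             "A": tier_a, "B": tier_b, "C": tier_c}
        )
    return rows
```

Raising on the first malformed row mirrors the stated behavior of returning an error instead of guessing.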

## Criteria

| Criterion | Weight | A (best fit) | B (stretch) | C (poor fit) |
|---|---|---|---|---|
| Industry | 5 | {industries you win in} | {adjacent industries} | {everything else} |
| Headcount | 4 | {core range, e.g. 500-2000} | {stretch range, e.g. 200-500 or 2000-5000} | {below/above stretch} |
| Geo | 3 | {primary regions} | {secondary regions} | {regions you do not support} |
| Tech stack | 4 | {tools that signal fit, e.g. Salesforce + Marketo} | {one of the fit tools present} | {competing system of record} |
| Funding stage | 2 | {preferred stages, e.g. Series B-D} | {adjacent stages} | {unfit, e.g. pre-seed or post-IPO} |
| Job title | 4 | {champion-target patterns} | {adjacent titles} | {non-buying-committee titles} |

## Hard disqualifiers

Single signals that drop a lead to `disqualified` regardless of other criteria. Run as deterministic checks before LLM scoring.

- `country in [{sanctioned-or-unsupported-list}]`
- `industry in [{disqualified-industries — e.g. adult, gambling if you do not serve them}]`
- `headcount < {floor — e.g. 10}` (if you have a floor)
- `email_domain in [{free-mail providers if your motion blocks them}]`

## Tier thresholds

The skill maps the weighted sum to a tier. Defaults shown — adjust to your team's calibration run.

| Score | Tier |
|---|---|
| 8.0 - 10.0 | A |
| 6.0 - 7.99 | B |
| 4.0 - 5.99 | C |
| < 4.0 | disqualified |

## Last edited

{YYYY-MM-DD} — by {name}

# Tier-to-action matrix — TEMPLATE

> Replace this template's contents with your team's actual routing reality.
> The lead-scoring skill joins the score's tier with the lead's
> `source_of_lead` to pick a recommended next action. Edit once with your
> SDR/AE manager so the recommendations match what your reps actually do.

## How the skill reads this file

- Rows are `(tier, source_of_lead) → action`. The skill picks the row whose tier matches the score and whose source matches the input. If the source is missing or unrecognized, it falls back to the row marked `*` (any source).
- An action is one short imperative sentence. The skill emits this verbatim under "Recommended next action" — keep it copy-pasteable.

## Matrix

| Tier | Source | Action |
|---|---|---|
| A | inbound_demo | Round-robin to AE within 5 minutes; book meeting in same business day. |
| A | inbound_content | SDR call within 1 hour; reference content piece. Auto-sequence as backup if no answer in 24h. |
| A | outbound_sequence | Move to high-touch sequence; SDR adds 2 personalized steps. |
| A | partner_referral | AE handles directly. Loop in partner manager for warm intro. |
| A | event | SDR call within 24h referencing the event session/booth conversation. |
| A | cold_list | Treat as outbound: enrich further, hand to SDR for personalized first touch. |
| A | * | SDR personalized outreach within 24h. |
| B | inbound_demo | SDR qualification call within 4 hours before AE handoff. |
| B | inbound_content | SDR personalized email within 24h, no auto-sequence. Reference content piece. |
| B | outbound_sequence | Standard outbound sequence, no escalation. |
| B | partner_referral | SDR call within 48h; loop in partner if no response. |
| B | event | SDR email + follow-up call within 48h. |
| B | cold_list | Standard outbound sequence. |
| B | * | SDR email within 48h. |
| C | inbound_demo | SDR fit-call within 24h; many will self-disqualify on the call. |
| C | inbound_content | Add to nurture; no SDR touch unless engagement signals appear. |
| C | outbound_sequence | Pause sequence; do not waste SDR cycles. |
| C | partner_referral | SDR courtesy call within 1 week (relationship cost of ignoring). |
| C | event | Add to nurture only. |
| C | cold_list | Drop. |
| C | * | Nurture only. |
| disqualified | * | Mark `Disqualified — out of ICP` with rubric citation. Do not auto-delete; archive for audit. |

## Escalation overrides

When the skill emits `escalate: needs_human_review`, the action above is replaced with:

> Hold for SDR manager review. Lead is borderline (within 0.5 of tier boundary) or scored on thin data. See "Data gaps" section.

When the skill emits `escalate: insufficient_data`, the action is:

> Re-enrich lead and re-score. Required fields missing: {list}.
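
The lookup and override rules in this file can be sketched as follows. The `MATRIX` dict holds only a few rows copied from the table above for brevity, and the function name and signature are hypothetical.

```python
# A subset of the matrix above; "*" is the any-source fallback row.
MATRIX = {
    ("B", "inbound_content"): "SDR personalized email within 24h, no auto-sequence. Reference content piece.",
    ("B", "*"): "SDR email within 48h.",
    ("C", "cold_list"): "Drop.",
    ("C", "*"): "Nurture only.",
}
HOLD_FOR_REVIEW = (
    "Hold for SDR manager review. Lead is borderline (within 0.5 of tier "
    'boundary) or scored on thin data. See "Data gaps" section.'
)


def next_action(tier, source=None, escalate="no", missing=()):
    """Pick the recommended action, applying escalation overrides first."""
    if escalate == "needs_human_review":
        return HOLD_FOR_REVIEW
    if escalate == "insufficient_data":
        fields = ", ".join(missing)
        return f"Re-enrich lead and re-score. Required fields missing: {fields}."
    action = MATRIX.get((tier, source))
    if action is None:
        # Missing or unrecognized source: fall back to the "*" row for the tier.
        action = MATRIX[(tier, "*")]
    return action
```

Checking the escalation flag before the matrix keeps a borderline lead from ever receiving an automated action.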

## Last edited

{YYYY-MM-DD} — by {SDR manager name}

# Sample output — for parser wiring

> A literal example of what the skill emits for one fictional lead. Use
> this when wiring the downstream parser (Clay AI column → property
> mapping, HubSpot custom-code action → property writeback, CSV
> post-processor). The schema below is what the skill commits to; the
> values are illustrative.

## Single-lead output

```markdown
# Lead score — jane.doe@northwind.com (northwind.com)

**Score:** 7.4 / 10 — Tier B
**Source:** inbound_content
**Escalate:** no

## Recommended next action

Tier B + inbound_content → SDR personalized email within 24h, no auto-sequence. Reference content piece they engaged with.

## Rationale (per criterion)

| Criterion | Weight | Tier | Reason |
|---|---|---|---|
| Industry | 5 | A | "Vertical SaaS / RevOps" matches in-ICP row in rubric. |
| Headcount | 4 | B | 240 employees — in stretch range (200-500), not core (500-2000). |
| Geo | 3 | A | HQ US-east, in supported region. |
| Tech stack | 4 | B | Salesforce + Marketo present (fit signals); no data warehouse cited. |
| Funding stage | 2 | C | Bootstrapped — out of preferred Series B-D band. |
| Job title | 4 | A | "Director, RevOps" matches champion-target pattern. |

## Disqualifier check

None triggered.

## Data gaps

- `revenue` field not provided by enrichment.

---

_Rubric SHA-256: 4f9c...a812 | Last edited 2025-12-15 by Sam Patel_
```

## Batch output

For a batch of N leads, the skill prepends a summary table and emits one block per lead separated by `\n---\n`:

```markdown
# Batch summary (12 leads)

| Email | Tier | Score | Escalate |
|---|---|---|---|
| jane.doe@northwind.com | B | 7.4 | no |
| ahmed@tailspintoys.io | A | 8.9 | no |
| j.smith@gmail.com | disqualified | 0 | hard_disqualifier:free_email |
| ... | ... | ... | ... |

---

# Lead score — jane.doe@northwind.com (northwind.com)
...
---
# Lead score — ahmed@tailspintoys.io (tailspintoys.io)
...
```

## Field contract for parsers

If you write a parser instead of consuming the markdown, these are the stable fields:

- `email` — string, lowercased
- `domain` — string, lowercased
- `score` — float, 0.0 to 10.0, one decimal
- `tier` — enum: `A` / `B` / `C` / `disqualified`
- `source` — pass-through of the input `source_of_lead`, or `unknown`
- `escalate` — enum: `no` / `needs_human_review` / `insufficient_data` / `hard_disqualifier:{reason}`
- `next_action` — string, single sentence
- `rationale[]` — list of `{criterion, weight, tier, reason}`
- `data_gaps[]` — list of strings (field names)
- `rubric_sha256` — string, 8-character prefix in the markdown footer; full hash available via the skill's structured-output mode
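
If you consume the markdown directly, the headline fields can be pulled out with a few regular expressions. This is a sketch against the sample block above, not a parser the skill ships; the function name is hypothetical, and the patterns use `\W+` so they do not depend on the exact dash character in the output.

```python
import re

HEADLINE = re.compile(
    r"^# Lead score\W+(?P<email>\S+@\S+)\s+\((?P<domain>[^)]+)\)", re.M
)
SCORE = re.compile(
    r"^\*\*Score:\*\*\s*(?P<score>\d+(?:\.\d+)?)\s*/\s*10\W+Tier\s+(?P<tier>\w+)",
    re.M,
)
SOURCE = re.compile(r"^\*\*Source:\*\*\s*(?P<source>\S+)", re.M)
ESCALATE = re.compile(r"^\*\*Escalate:\*\*\s*(?P<escalate>\S+)", re.M)


def parse_lead_block(markdown: str) -> dict:
    """Extract email, domain, score, tier, source and escalate from one block."""
    fields = {}
    for pattern in (HEADLINE, SCORE, SOURCE, ESCALATE):
        match = pattern.search(markdown)
        if match is None:
            raise ValueError(f"no match for pattern: {pattern.pattern}")
        fields.update(match.groupdict())
    fields["score"] = float(fields["score"])
    fields["email"] = fields["email"].lower()
    fields["domain"] = fields["domain"].lower()
    return fields
```

For batch output, split the text on the `\n---\n` separator first and pass each lead block to the function; the summary table and any footer fragments do not contain a headline line, so skip blocks that raise.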