A Claude Skill that takes a rejected candidate's interview scorecards (and, where available, BrightHire or Metaview transcripts), drafts an evidence-grounded rejection email or recruiter-call talking points, and produces the recruiter-side notes for the call. It replaces the form-letter rejection that damages candidate experience with personalized feedback the candidate can actually use, and it refuses to draft when the rubric is missing, the loop did not converge, or the case is jurisdiction-flagged.
When to use it
The candidate reached at least an onsite or a final-stage loop, the point where, per the recruiting funnel cost, they have already invested enough time to earn a real answer.
The team has at least two signed-off scorecards for the candidate (Ashby submitted: true, Greenhouse status: complete, Lever state: completed). A single scorecard is one interviewer's view; the skill refuses to synthesize feedback from a single perspective because that exposes the firm to selective-evidence claims.
A role rubric exists at rubrics/<role_id>.yaml with behavioral anchors per dimension (the same source the interview debrief skill reads). The skill scores against the rubric anchors, not against free-form scorecard prose.
The candidate explicitly asked for feedback (captured in writing in the ATS), OR the candidate's jurisdiction of residence is one where unsolicited specifics carry no documented risk per the user's HR-counsel guidance.
A recruiter reviews and edits every draft before sending. The skill writes drafts to disk and stops; it defines no send action.
When NOT to use it
Auto-sending without recruiter review. AI-drafted-and-sent rejection feedback is the single most reliable way to produce an incident under EEOC, ADA, or state employment law. The recruiter is the gate. If your goal is to take the human out of the loop, this is the wrong workflow.
Candidates who did not request feedback in deny jurisdictions. France (Code du travail risk around documented rejection reasons), Germany (the evidentiary shift under § 22 AGG), and any jurisdiction the user's HR counsel has marked unsolicited_feedback: deny in the policy file. The skill refuses specifics in those cases and writes the generic-decline template instead. Do not edit the policy file to push a deny-jurisdiction case through.
Cases legal has flagged. Active dispute, an unaddressed accommodation request, or a complaint on record. The skill returns a generic-decline draft and surfaces the flag to the recruiter. Specifics in a flagged case become evidence in the dispute.
Early-stage rejections (resume screen, recruiter screen). Templated decline is the right tool there; per-candidate model cost and recruiter review time do not pay off at top-of-funnel scale. The skill is for candidates who reached at least an onsite.
Comparative ranking (you were our second choice, we had stronger candidates). The skill will refuse to draft this: the rubric-to-feedback mapping does not contain that language and the banned-phrase blocklist filters it out. Comparative ranking is what turns a constructive rejection into a Glassdoor post.
Process-improvement asks (asking the candidate for interview feedback, a referral, or a testimonial). Reverse asks in a rejection email are an EEOC-witness-statement risk and a candidate-experience harm. The blocklist catches them.
Setup
Place the bundle. Put apps/web/public/artifacts/rejection-feedback-claude-skill/SKILL.md in your Claude Code skills directory (or in claude.ai custom Skills, with Tier A authorization for candidate data per the AI policy).
Configure the rubric source. The skill reads role rubrics from rubrics/<role_id>.yaml, the same path the interview debrief skill uses. If the rubric does not exist, the skill refuses to run. Structured interviewing is the prerequisite, not this skill.
Fill in the rubric-to-feedback mapping. Copy references/1-rubric-to-feedback-mapping.md and replace the template wording with your team's approved candidate-facing language for each rubric dimension. Get HR-counsel sign-off on the approved wording once; the audit log captures the mapping's SHA-256 per run, so revisions are visible in retro.
Write the jurisdiction policy file. A YAML file with one block per jurisdiction where your firm hires. Each block sets unsolicited_feedback: allow or deny and references the relevant HR-counsel guidance memo. The bundle ships a template; the default denies are France, Germany, and any jurisdiction with active employment-law guidance against documented rejection reasons. A minimal sketch of the file is shown below.
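A minimal sketch of what the policy file might look like, assuming only unsolicited_feedback: allow | deny is read by the skill; the counsel_memo and last_reviewed fields, and the memo paths, are illustrative bookkeeping, not part of the shipped template:

```yaml
# jurisdiction_policy.yaml — one block per hiring jurisdiction (sketch)
FR:
  unsolicited_feedback: deny        # Code du travail guidance
  counsel_memo: memos/fr-rejection-feedback.md   # hypothetical path
  last_reviewed: 2025-09-01
DE:
  unsolicited_feedback: deny        # § 22 AGG evidentiary shift
  counsel_memo: memos/de-rejection-feedback.md
  last_reviewed: 2025-09-01
US:
  unsolicited_feedback: allow
  counsel_memo: memos/us-rejection-feedback.md
  last_reviewed: 2025-09-01
```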
Configure the ATS API. An Ashby, Greenhouse, or Lever API token with read scope on scorecards and candidates. The skill pulls scorecards by candidate_id; it does not accept pasted scorecard text, because pasted text cannot be audited back to the source interviewer.
Optional: configure the transcript bundle. BrightHire or Metaview API access. When a transcript_id is passed, the skill cross-references scorecard claims against transcript turns in step 4.
Dry-run on a closed candidate. Run it on a candidate who was already rejected last quarter. Compare the skill's draft against what the recruiter actually sent. Adjust the rubric-to-feedback mapping if calibration drifts; the mapping, not the model, is usually the lever.
What the skill actually does
Six steps, in order. The order matters: jurisdiction gating and scorecard validation run before the LLM reads any candidate content, because letting the model run over scorecard text in a deny-jurisdiction case leaves a model-call log entry with candidate-identifying data the firm never needed to retain.
Validate jurisdiction policy and consent. Look up the candidate's jurisdiction in the policy file. If the policy is unsolicited_feedback: deny and the candidate did not ask for feedback in writing, halt specifics and switch to the generic-decline template. Choosing to gate on consent before pulling scorecards keeps the data-minimization story clean for GDPR Art. 5(1)(c).
Pull the scorecards (and the optional transcript). Fetch via the ATS API. Drop drafts. If the loop has fewer than two signed-off scorecards, halt: feedback synthesized from one interviewer's view is an opinion, not feedback, and exposes the firm to selective-evidence claims.
Identify dimensions and evidence. Compute the cross-interviewer mean and standard deviation per rubric dimension. Surface dimensions where the mean is ≥ 4 (strength, warm opening) and ≤ 2 (candidate gap). Refuse to surface any dimension with cross-interviewer standard deviation ≥ 1.5: the loop did not converge, and feedback on a non-converged dimension would not survive a "but interviewer X scored me 5" challenge. For each surfaced dimension, pull verbatim evidence quotes from the scorecards (or the transcript, when available). No verbatim string, no surfaced dimension.
Draft against the rubric-to-feedback mapping. Translate at most one strength and one gap into candidate-facing language using references/1-rubric-to-feedback-mapping.md. Capping at one of each keeps the draft from reading as a defensive list. The mapping's substitution slots are filled from structured fields (scorecard, rubric anchor) or from the approved-topics list; the LLM never free-texts a substitution value, which is the guard against false specifics.
Bias and false-specifics screening. Grep the draft against references/2-banned-phrase-blocklist.md. Any hit halts the run with the offending string surfaced. Verify that every specific claim maps back to a verbatim evidence string from step 3; unsourced claims halt. This is a separate pass from step 4 by design: the screening pass sees only the draft text, with no awareness of the underlying scorecards, so it cannot rationalize a banned phrase as "but the interviewer meant X".
Write to disk and to the audit log. Write drafts/<candidate-id>.md and (for route: call) drafts/<candidate-id>-call-notes.md per the format in references/3-output-format.md. Append a JSONL line to audit/<YYYY-MM>.jsonl with candidate_id_hash (SHA-256, not the raw ID), rubric_sha256, blocklist_sha256, mapping_sha256, surfaced dimensions, blocklist hits, model ID, and timestamp. No candidate-identifying free text in the audit line.
The literal email format, the generic-decline fallback, and the call-notes template live in references/3-output-format.md. The format is fixed because the downstream consumers (recruiter, candidate, and any future audit review) need predictable language without recruiter-specific drift.
Real cost
Per rejection draft, with Claude Sonnet 4.5:
LLM tokens: typically 12-25k input tokens (rubric YAML + scorecards + skill instructions + reference files) and 0.5-1.5k output tokens (the draft plus call notes). With Sonnet 4.5 that is roughly 5-10 cents per draft. A recruiting team running 200 rejection drafts a month spends $10-20 in model cost.
ATS API cost: zero on Ashby (free API), Greenhouse (included in the tier), Lever (included). Transcript fetches against BrightHire or Metaview count against the per-seat plan; rejection-feedback fetches are read-only and do not consume new transcription credits.
Recruiter time: this is where the gain is. Hand-drafting a thoughtful, evidence-grounded rejection email from the scorecards takes 20-30 minutes per candidate when the recruiter does it well, or 3 minutes when they paste a form letter (which is what most teams end up doing at scale). The skill produces the 20-minute draft in under 30 seconds; the recruiter reviews and edits in 4-7 minutes. The net saving is roughly 15-20 minutes per rejection at the thoughtful-draft standard, call it 50-60 hours a month on a team running 200 rejections.
Setup time: 30 minutes for the rubric-to-feedback mapping and the jurisdiction policy if your team already has approved candidate-facing wording somewhere; longer if HR counsel has not yet weighed in on rejection-feedback language (in which case that conversation is the prerequisite, not this skill).
The compounding return is in candidate experience. Rejected candidates who receive specific, evidence-grounded feedback are more likely to reapply, more likely to refer others, and substantially less likely to leave damaging Glassdoor reviews. The figures commonly cited in recruiting literature sit in the 30-50% range for reapply intent, though we have no primary source for those numbers and treat them as directional. The compounding return shows up in one-year pipeline density, not in the month the draft was sent.
Success metrics
Track three numbers a month, in the ATS:
Recruiter edit distance per draft. The number of characters the recruiter changes between the skill's draft and the sent message. If edit distance trends toward zero, the recruiter is rubber-stamping: raise it in retro and revisit the rubric-to-feedback mapping. If edit distance is consistently high, the mapping is mis-calibrated. A sketch of measuring it is shown after this list.
Candidate reply rate to the rejection. Replies to a rejection email are usually thank-you-and-future-application notes (good signal) or escalation notes (bad signal). Track the escalation rate as a percentage of rejections sent. A baseline team running form letters typically sees under 1% escalation; the target with this skill is to stay at or below that baseline, not above it. If the escalation rate climbs, the rubric-to-feedback mapping is producing language that lands badly: recalibrate.
Reapply rate within 12 months. Candidates rejected through this skill versus candidates rejected with the legacy form letter, measured over the following 12 months. The compounding benefit shows up here, not in model spend or even in the rejection thread.
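A minimal sketch of the edit-distance metric, assuming the draft and the sent message are available as plain-text files; the file-path arguments and the Levenshtein implementation are illustrative, not part of the bundle:

```python
from pathlib import Path

def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def recruiter_edit_distance(draft_path: str, sent_path: str) -> int:
    """Characters changed between the skill's draft and what the recruiter sent."""
    draft = Path(draft_path).read_text(encoding="utf-8")
    sent = Path(sent_path).read_text(encoding="utf-8")
    return levenshtein(draft, sent)
```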
vs alternatives
vs Ashby's native rejection templates. Ashby (and Greenhouse, Lever) ship rejection templates with merge fields for candidate name and role. They are templates, not feedback: merge fields do not pull scorecard evidence and there is no rubric-anchored language layer. Use Ashby's templates for top-of-funnel rejections where the template is honest. Use this skill for late-stage rejections where the template reads as dismissive of the time the candidate invested.
vs generic rejection emails. The generic decline is the right answer in deny-jurisdiction cases, when there is no consent, and when the rubric surfaced no defensible specific. The skill writes the generic-decline template byte-for-byte in those cases. The difference is that the skill makes the call deterministically from the jurisdiction policy and the rubric output, rather than the recruiter defaulting to generic out of fatigue.
vs hand-written recruiter notes. Hand-written notes are the gold standard for senior candidates or VIP referrals where the recruiter has the relationship context and the time. The skill earns its keep on volume: the 80% of late-stage rejections where the recruiter would otherwise paste a form letter because hand-drafting at scale does not fit the day. For the senior tier, the call-notes file gives the recruiter a structured starting point for the call, and the recruiter improvises from there.
vs an LLM with no rubric file and no blocklist. This is the failure mode the skill is built against. An LLM drafting from scorecards alone, with no rubric grounding, no banned-phrase blocklist, and no audit log, produces fast, confident, plausible-looking rejection text, and roughly one in twenty drafts will contain a hallucinated quote, a comparative ranking, or a protected-class proxy. The bundle's checklist files are what push the failure rate toward zero.
What to watch for
EEOC-implicating language. Covered by the banned-phrase blocklist in references/2-banned-phrase-blocklist.md, which runs as a separate pass in step 5 with no awareness of the underlying scorecards. Hits halt the run with the offending string surfaced. Do not edit the blocklist to make a draft pass; fix the rubric or the scorecard language instead.
False specifics from the LLM. Covered by step 3's "no synthesis without verbatim citation" rule. Every claim in the draft must trace to a verbatim string from a signed-off scorecard or a transcript. No verbatim string, no surfaced dimension. This is the guard against the most common failure mode of LLM-drafted feedback: plausible-sounding quotes no interviewer ever wrote, cited back to the candidate as fact.
Comparative ranking language. Covered by the rubric-to-feedback mapping in references/1-rubric-to-feedback-mapping.md, which contains no comparative wording, and by the step 5 blocklist, which catches it if it slips in. Comparative ranking is what turns a constructive rejection into a Glassdoor post.
Selective-evidence risk. Covered by step 2 (halt if the loop has fewer than two signed-off scorecards) and step 3 (refuse to surface dimensions with cross-interviewer standard deviation ≥ 1.5). Interviewer disagreement does not become candidate feedback.
Auto-send drift. Covered by the absence of any send action in the skill. Drafts are written to drafts/<candidate-id>.md for the recruiter to review, edit, and send from the ATS outbox. The recruiter is the gate.
Generic boilerplate harm. Covered by step 3's refusal to surface a dimension without verbatim evidence: when the rubric surfaces nothing safe to share, the skill writes the generic-decline template instead of synthesizing weak specifics. Generic decline is honest; weak specifics are worse than no specifics.
PII in the audit log. Covered because step 6 writes only candidate_id_hash (SHA-256), never the raw candidate ID, name, or scorecard text. The audit log is for run reproducibility, not candidate data retention. Candidate-facing drafts live in drafts/ under the recruiter's own retention policy.
Calibration drift across roles and seniority. Covered by per-role rubric YAMLs and by the rubric-to-feedback mapping being versioned per team. Senior-leadership rejections need different framing than entry-level ones; that adjustment lives in the mapping file, not in the skill's code.
Privacy and data residency. Verify that the skill operates within Tier A enterprise AI per the AI policy. Interview content is sensitive; the candidate did not consent to it being processed by a third-party model unless your AI policy and your scorecard-collection consent language explicitly cover it.
Stack
The skill bundle lives in apps/web/public/artifacts/rejection-feedback-claude-skill/ and contains:
SKILL.md: the skill definition
references/1-rubric-to-feedback-mapping.md: fill in per team, HR-counsel-approved wording per rubric dimension
references/2-banned-phrase-blocklist.md: pre-flight checks on the draft (do not edit it to make biased drafts pass)
references/3-output-format.md: the literal email, generic-decline, and call-notes formats
Tools the workflow assumes you already use: Claude (the model), Ashby or Greenhouse or Lever (the ATS where the scorecards live), and optionally BrightHire or Metaview (interview transcripts for richer evidence grounding). Sibling workflow that shares the rubric source: the interview debrief skill.
---
name: rejection-feedback
description: Take a rejected candidate's interview scorecards and (where available) transcripts, draft an evidence-grounded rejection email or recruiter-call talking points, and produce the recruiter-side notes for the call. Always stops at a recruiter-review gate; never sends. Refuses to draft when the rubric is missing or the case is jurisdiction-flagged.
---
# Rejection feedback
## When to invoke
Use this skill when a recruiter needs to send personalized post-interview feedback to a candidate who reached at least an onsite or final-stage loop, and the team has structured scorecards plus a role rubric on file. Take the candidate's scorecards (across all interviewers), the role rubric, the recruiter-relationship context (was feedback explicitly offered? requested?), and the candidate's residency jurisdiction as input. Produce a Markdown rejection email draft, optional recruiter-call talking-point notes, and a one-line routing recommendation.
Do NOT invoke this skill for:
- **Auto-sending without recruiter review.** The skill writes drafts to disk and stops. There is no `send` action defined anywhere in this skill. Auto-sent rejection feedback is the single most reliable way to produce an inappropriate-content incident under EEOC, ADA, or state employment law. The recruiter is the gate.
- **Candidates who have not requested feedback in jurisdictions where unsolicited feedback creates risk.** Specifically: France (Code du travail risk on documented rejection reasons), Germany (AGG §22 evidentiary shift), and any jurisdiction where the recruiter's HR-counsel guidance disallows unsolicited specifics. The skill reads the `jurisdiction_policy.yaml` file and refuses to draft specifics for any jurisdiction marked `unsolicited_feedback: deny`.
- **EEOC-implicating language or protected-class proxies.** "Cultural fit", age inferences from graduation year, family-status references, national-origin references, accent commentary, gendered descriptors ("aggressive", "abrasive", "soft"), pregnancy-status references, disability or accommodation references. The banned-phrase blocklist in `references/2-banned-phrase-blocklist.md` runs as the final check before the draft is written. Any hit halts the run with the offending string surfaced.
- **Cases legal has flagged.** If the candidate file has a flag for active dispute, accommodation request unaddressed, or a complaint on record, the skill returns "decline to provide specific feedback — legal flag present" and writes a generic-decline draft instead.
- **Rejections from earlier stages** (resume screen, recruiter screen). Templated decline is the right tool there. This skill is for candidates who invested significant time and earned a real answer, per the [recruiting funnel](/en/learn/recruiting-funnel-metrics/) cost.
## Inputs
- Required: `candidate_id` — the ATS record ID ([Ashby](/en/tools/ashby/), [Greenhouse](/en/tools/greenhouse/), or [Lever](/en/tools/lever/)). The skill pulls scorecards via the ATS API; it does not accept pasted scorecard text, because pasted text cannot be audited back to the source interviewer.
- Required: `role_id` — used to load the role's rubric from `rubrics/<role_id>.yaml` (same source the [interview debrief skill](/en/workflows/interview-debrief-summary-skill/) reads). Without a rubric the skill refuses to run; ungrounded feedback is how false specifics get drafted.
- Required: `jurisdiction` — ISO 3166 country code for the candidate's residency at time of application. Drives which jurisdiction-policy block applies.
- Required: `feedback_requested` — boolean. `true` only if the candidate explicitly asked for feedback (in writing, captured in the ATS). `false` defaults to a generic-decline draft in jurisdictions where the policy file flags unsolicited specifics as risk.
- Optional: `transcript_id` — pointer to a [BrightHire](/en/tools/brighthire/) or [Metaview](/en/tools/metaview/) transcript bundle for the loop. When present, the skill cross-references scorecard claims against transcript evidence; when absent, the skill works from scorecards alone and labels the draft accordingly.
- Optional: `route` — one of `email`, `call`, `auto`. `auto` (default) picks based on stage reached and seniority per the routing rules in `references/3-output-format.md`.
## Reference files
Always read the following from `references/` before drafting. Without them the draft is generic, ungrounded, and risks tripping a banned phrase.
- `references/1-rubric-to-feedback-mapping.md` — the mapping from rubric dimensions to safely-sharable, candidate-facing feedback language. Replace the template placeholders with your team's approved phrasing before first use.
- `references/2-banned-phrase-blocklist.md` — the blocklist the skill greps the draft against in step 5. Patterns include EEOC-implicating terms, protected-class proxies, comparative-ranking language, and unverifiable specifics. Do not edit this file to make a draft pass.
- `references/3-output-format.md` — the literal email and call-notes format, including the routing rules.
## Method
Run these six steps in order. Steps 1-3 are deterministic gating; steps 4-5 use the LLM for synthesis and screening; step 6 is the audit log. The order matters — letting the LLM draft against unchecked scorecards produces fast, confident, EEOC-implicating output.
### 1. Validate jurisdiction policy and consent
Open `references/jurisdiction_policy.yaml` (user-supplied; template shipped in the bundle). Look up the candidate's `jurisdiction`. If `unsolicited_feedback: deny` and `feedback_requested: false`, halt specifics and switch to the generic-decline template at the top of `references/3-output-format.md`. Log the reason in the audit line.
The choice to gate on consent before pulling scorecards is deliberate: specifics drafted and then discarded still leave a model-call log entry with candidate-identifying scorecard text. Gating up front keeps the data-minimization story clean for GDPR Art. 5(1)(c).
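A minimal sketch of this gate as a deterministic script outside the LLM, assuming the policy file parses into a dict keyed by jurisdiction code; the function and variable names are illustrative, only the `unsolicited_feedback` / `feedback_requested` logic comes from this step:

```python
import yaml  # PyYAML, assumed available

def resolve_route(policy_path: str, jurisdiction: str, feedback_requested: bool) -> str:
    """Return 'specific' or 'generic-decline' before any scorecard is fetched."""
    with open(policy_path, encoding="utf-8") as f:
        policy = yaml.safe_load(f)
    block = policy.get(jurisdiction)
    if block is None:
        # Unknown jurisdiction: fail safe to the generic decline.
        return "generic-decline"
    if block.get("unsolicited_feedback") == "deny" and not feedback_requested:
        return "generic-decline"
    return "specific"
```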
### 2. Pull scorecards and (optional) transcript
Fetch all scorecards for `candidate_id` via the ATS API. Validate that every scorecard is signed-off (Ashby `submitted: true`, Greenhouse `status: complete`, Lever `state: completed`). Drop drafts. If the loop has fewer than two completed scorecards, halt — feedback synthesized from one interviewer's view is not feedback, it is an opinion, and exposes the firm to selective-evidence claims.
When `transcript_id` is provided, fetch the transcript bundle. The skill will cite scorecard claims against transcript turns in step 4.
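A sketch of the completion check across the three ATSes, assuming scorecards have already been fetched as dicts; the per-ATS field names come from this step, while the `ats` argument and the halt behavior are illustrative:

```python
def signed_off(scorecard: dict, ats: str) -> bool:
    """True if the scorecard counts as signed-off for the given ATS."""
    if ats == "ashby":
        return scorecard.get("submitted") is True
    if ats == "greenhouse":
        return scorecard.get("status") == "complete"
    if ats == "lever":
        return scorecard.get("state") == "completed"
    raise ValueError(f"unknown ATS: {ats}")

def validate_loop(scorecards: list[dict], ats: str) -> list[dict]:
    complete = [s for s in scorecards if signed_off(s, ats)]  # drop drafts
    if len(complete) < 2:
        raise RuntimeError("halt: fewer than two signed-off scorecards in the loop")
    return complete
```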
### 3. Identify dimensions and evidence
For each rubric dimension, compute the cross-interviewer mean score and the standard deviation. Flag dimensions where:
- mean ≥ 4 (candidate strength, surface as the warm opening)
- mean ≤ 2 (candidate gap, candidate for feedback if safe)
- standard deviation ≥ 1.5 (interviewer disagreement — do NOT cite this dimension; the loop did not converge and the feedback would not survive a "but interviewer X scored me 5" challenge)
For each surfaced dimension, pull the verbatim evidence quotes from the scorecards (or transcript, when available). Every claim in the final draft must cite a verbatim string from the evidence pool. No verbatim string → the dimension is not surfaced.
The "no synthesis without verbatim citation" rule is the guard against false specifics. LLMs drafting feedback from scorecards will, without this rule, invent quotes that sound plausible — "the candidate struggled with system-design tradeoffs" — that no interviewer ever wrote. False specifics cited back to the candidate are how rejection-feedback workflows generate complaint emails.
### 4. Draft against the rubric-to-feedback mapping
Translate at most one strength and one gap into candidate-facing language using `references/1-rubric-to-feedback-mapping.md`. Cap at one of each so the draft does not read as a defensive list. Comparative ranking ("we had stronger candidates", "you were our second choice") is forbidden — the mapping file does not contain the language and step 5 greps it out.
For `route: call`, also draft recruiter-side talking points: bullet-point observations, the suggested phrasing for the gap, and two to three pre-prepared responses to likely candidate questions ("Was there anything I could have done differently?", "Will you keep me in mind for future roles?", "Can I get a second look?").
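A sketch of how substitution slots can be filled only from structured sources, assuming the mapping templates use Python-style `{slot}` placeholders; the function names and the `approved_topics` argument are illustrative, the no-free-text rule is from this step:

```python
import string

def fill_slots(template: str, structured_values: dict[str, str],
               approved_topics: set[str]) -> str:
    """Fill every {slot} from structured fields; never from LLM free text."""
    slots = {name for _, name, _, _ in string.Formatter().parse(template) if name}
    for slot in slots:
        if slot not in structured_values:
            raise RuntimeError(f"halt: no structured source for slot '{slot}'")
        if slot == "specific_topic" and structured_values[slot] not in approved_topics:
            raise RuntimeError("halt: topic not on the approved-topics list")
    return template.format(**structured_values)
```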
### 5. Bias and false-specifics screening
Grep the draft against `references/2-banned-phrase-blocklist.md`. Any hit halts the run with the offending string surfaced. Then verify that every specific claim in the draft maps back to a verbatim evidence string from step 3 — if a claim has no source, halt.
This is a separate pass from step 4 by design. The screening pass sees only the draft text, with no awareness of the underlying scorecards, so it cannot rationalize a banned phrase as "but the interviewer meant X".
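A sketch of the screening pass, assuming the blocklist has been reduced to a list of lowercase patterns and the step 3 evidence pool is a list of verbatim strings; the regex for quoted claims is illustrative:

```python
import re

def screen_draft(draft: str, banned_patterns: list[str], evidence_pool: list[str]) -> None:
    """Halt on any banned phrase or any quoted claim with no verbatim source."""
    lowered = draft.lower()
    for pattern in banned_patterns:
        if pattern in lowered:
            raise RuntimeError(f"halt: banned phrase in draft: '{pattern}'")
    # Every quoted string in the draft must appear verbatim in the evidence pool.
    for quoted in re.findall(r'"([^"]+)"', draft):
        if not any(quoted in evidence for evidence in evidence_pool):
            raise RuntimeError(f"halt: unsourced quoted claim: '{quoted}'")
```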
### 6. Write to disk and audit log
Write the draft to `drafts/<candidate-id>.md` per the format in `references/3-output-format.md`. Write the call notes (if applicable) to `drafts/<candidate-id>-call-notes.md`. Append one JSONL line to `audit/<YYYY-MM>.jsonl` containing: `run_id`, `candidate_id_hash` (SHA-256, not raw ID), `role_id`, `jurisdiction`, `feedback_requested`, `route`, `rubric_sha256`, `blocklist_sha256`, `mapping_sha256`, `dimensions_surfaced`, `blocklist_hits` (zero on success), `model_id`, `timestamp`. No candidate-identifying free text in this line.
Surface the path to the recruiter and exit. The recruiter reviews, edits, and sends from the ATS or their own outbox.
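A sketch of the audit line, assuming the hashes are computed with `hashlib` over the raw candidate ID and the reference files on disk; the `write_audit_line` signature is illustrative, the field names follow the audit-log format in `references/3-output-format.md`:

```python
import hashlib, json, uuid
from datetime import datetime, timezone
from pathlib import Path

def sha256_of_file(path: str) -> str:
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def write_audit_line(audit_dir: str, candidate_id: str, role_id: str, jurisdiction: str,
                     feedback_requested: bool, route: str, dimensions: list[str],
                     blocklist_hits: int, model_id: str) -> None:
    line = {
        "run_id": str(uuid.uuid4()),
        "candidate_id_hash": hashlib.sha256(candidate_id.encode("utf-8")).hexdigest(),
        "role_id": role_id,
        "jurisdiction": jurisdiction,
        "feedback_requested": feedback_requested,
        "route": route,
        "rubric_sha256": sha256_of_file(f"rubrics/{role_id}.yaml"),
        "blocklist_sha256": sha256_of_file("references/2-banned-phrase-blocklist.md"),
        "mapping_sha256": sha256_of_file("references/1-rubric-to-feedback-mapping.md"),
        "dimensions_surfaced": dimensions,
        "blocklist_hits": blocklist_hits,
        "model_id": model_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    month = datetime.now(timezone.utc).strftime("%Y-%m")
    with open(Path(audit_dir) / f"{month}.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(line) + "\n")  # no raw ID, no names, no scorecard text
```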
## Output format
Literal example of the email draft the skill writes to `drafts/<candidate-id>.md` for a candidate who reached an onsite for a Senior Backend Engineer role and explicitly requested feedback:
```markdown
Subject: Update on your Senior Backend Engineer interview at Acme
Hi Jamie,
Thank you for the time you invested in our interview process — the
take-home, the system-design loop, and the conversations with the
team. We appreciated the care you put into each stage.
After the team's debrief, we have decided not to move forward with
your candidacy for this role.
You asked for feedback, so here is what stood out from the loop:
- **What went well.** Your take-home submission was clear, well-tested,
and included a thoughtful note on the failure-mode tradeoffs. Two
interviewers cited the test coverage specifically.
- **Where the team landed differently.** In the system-design round,
the discussion of consistency-vs-availability tradeoffs at the
database layer did not surface the read-replica option that the
role frequently requires reasoning about. This was the dimension
that drove the team's decision.
This feedback is specific to the loop you ran with us; it is not a
ranking against other candidates and it is not a comment on your
overall engineering ability.
If a future role at Acme matches your background, we would welcome
your application.
Best,
{Recruiter name}
```
Literal example of the recruiter call-notes file written to `drafts/<candidate-id>-call-notes.md`:
```markdown
# Call notes — Jamie L. (Senior Backend Engineer)
## Frame
- Open with thanks for the time invested.
- Lead with the take-home strength (specific: test coverage note).
- Single gap: system-design read-replica reasoning. One sentence,
no piling on.
## Suggested phrasing for the gap
"In the system-design conversation, the team was looking for the
read-replica option as part of the consistency-availability tradeoff,
and that did not come up. That was the dimension that drove the
decision for this specific role."
## Likely candidate questions
Q: "Was there anything I could have done differently?"
A: Acknowledge the question. Refer back to the single gap. Do NOT
add new feedback dimensions on the call — anything not in the
written draft is off-script and creates inconsistency risk.
Q: "Will you keep me in mind for future roles?"
A: Yes if true; specifics on what kind of role. Do NOT promise a
timeline.
Q: "Can I get a second-look interview?"
A: No. The decision is final. The recruiter reiterates appreciation
and closes.
## Off-script
If the candidate raises a discrimination concern, comparative-ranking
question, or accommodation issue, the recruiter says "let me come
back to you on that" and routes to HR / counsel. The recruiter does
NOT improvise an answer.
```
Literal example of the routing recommendation appended to the draft file:
```markdown
---
Routing: call (stage: onsite, seniority: senior, prior referrer: yes)
Recruiter review required before send.
```
## Watch-outs
- **EEOC-implicating language.** *Guard:* the banned-phrase blocklist in `references/2-banned-phrase-blocklist.md` runs as a separate pass in step 5, with no awareness of the underlying scorecards, so it cannot rationalize a hit. Any hit halts the run with the offending string surfaced. Do not edit the blocklist to make a draft pass — fix the rubric or the scorecard language instead.
- **False specifics from the LLM.** *Guard:* the "no synthesis without verbatim citation" rule in step 3. Every claim in the draft must trace to a verbatim string from a signed-off scorecard or transcript. No verbatim string → the dimension is not surfaced. This is the guard against the most common failure mode of LLM-drafted feedback — plausible-sounding quotes that no interviewer actually wrote.
- **Comparative ranking language.** *Guard:* the rubric-to-feedback mapping in `references/1-rubric-to-feedback-mapping.md` does not contain comparative phrasing ("stronger candidates", "second choice"), and the blocklist in step 5 catches it if it slips in. Comparative ranking is what turns a constructive rejection into a Glassdoor post.
- **Selective-evidence risk.** *Guard:* step 2 halts if the loop has under two signed-off scorecards. Step 3 refuses to surface dimensions with cross-interviewer standard deviation at or above 1.5 — interviewer disagreement does not become candidate feedback.
- **Auto-send drift.** *Guard:* the skill defines no `send` action. Drafts are written to `drafts/<candidate-id>.md` for the recruiter to review, edit, and send from the ATS outbox. AI-drafted-and-sent rejection feedback without review damages [candidate experience](/en/learn/candidate-experience/) and produces incidents.
- **PII in the audit log.** *Guard:* step 6 writes only `candidate_id_hash` (SHA-256), never the raw candidate ID, name, or scorecard text. The audit line is for run reproducibility, not candidate data retention.
- **Generic boilerplate harm.** *Guard:* if step 3 cannot surface a rubric dimension that has both mean ≤ 2 and a verbatim evidence string, the skill writes the generic-decline template from `references/3-output-format.md` rather than synthesizing weak specifics. Generic decline is honest; weak specifics are worse than no specifics.
# Rubric-to-feedback mapping — TEMPLATE
> Replace this template with your team's approved candidate-facing
> phrasing per rubric dimension. The rejection-feedback skill reads
> this file in step 4 to translate scorecard language (which is
> internal, often blunt) into candidate-facing language (which must
> be specific, evidence-grounded, and EEOC-safe). Without this file
> the skill will not draft specifics — it falls back to the generic
> decline template.
## How this file is used
The skill matches each surfaced dimension (from step 3) against the `dimension_id` below, then uses the `candidate_facing_phrasing` template, substituting in the verbatim evidence string from the scorecard or transcript.
If a dimension is surfaced by step 3 but has no entry below, the skill will NOT draft specifics for it — the dimension is dropped. This forces the team to deliberate on candidate-facing phrasing once, in writing, rather than letting the LLM improvise per run.
## Dimension entries
### dimension_id: technical_depth
**internal_label**: Technical depth (1-5)
**rubric_anchors**:
- 5: Reasons fluently across multiple layers of the stack; explores tradeoffs unprompted.
- 4: Reasons clearly within their primary layer; surfaces tradeoffs when asked.
- 3: Recalls correct patterns; tradeoff reasoning needs prompting.
- 2: Recalls patterns inconsistently; tradeoff reasoning absent or shallow.
- 1: Patterns incorrect or contradicted under follow-up.
**candidate_facing_phrasing** (used for mean ≤ 2):
```
In the {round_name} round, the team was looking for {specific_topic}
as part of {specific_decision_context}, and that did not come up.
That was the dimension that drove the decision for this specific
role.
```
Substitution sources:
- `{round_name}` → from scorecard `interview_round` field
- `{specific_topic}` → from `references/2-banned-phrase-blocklist.md` approved-topics list (NEVER free-text from the LLM)
- `{specific_decision_context}` → from rubric anchor text
**candidate_facing_phrasing** (used for mean ≥ 4, opening only):
```
{Strength_observation}. {Interviewer_count_phrase} cited
{specific_evidence} specifically.
```
---
### dimension_id: system_design
**internal_label**: System design (1-5)
**rubric_anchors**:
- 5: Drives the design conversation; surfaces consistency, availability, and operational tradeoffs unprompted.
- 4: Engages with tradeoffs when prompted; covers most major axes.
- 3: Engages with tradeoffs when prompted; covers one or two axes.
- 2: Tradeoff reasoning shallow; misses major axes that the role requires.
- 1: Cannot construct a system that meets the stated requirements.
**candidate_facing_phrasing** (used for mean ≤ 2):
Same template as `technical_depth`.
---
### dimension_id: collaboration
**internal_label**: Collaboration (1-5)
**rubric_anchors**:
- 5: Specific examples of cross-functional work, named tradeoffs, named outcomes.
- 4: Specific examples, less explicit on tradeoff reasoning.
- 3: General examples, no specifics on tradeoffs or outcomes.
- 2: Vague examples or examples that do not show collaboration evidence.
- 1: No relevant examples surfaced.
**candidate_facing_phrasing** (used for mean ≤ 2):
Same template as `technical_depth`. **Constraint:** never use the words "communication", "fit", "soft skills", or "executive presence" in the candidate-facing draft for this dimension. Those terms are on the banned-phrase blocklist because they correlate with bias claims.
---
## Constraints across all dimensions
- One strength and one gap per draft, maximum. The skill caps at one of each in step 4.
- Every substitution slot is filled from a structured field (scorecard, transcript, rubric anchor) or from the approved-topics list. The LLM never free-texts a substitution value.
- Comparative ranking is not in this file and is on the blocklist. If you find yourself adding "vs other candidates" phrasing, stop and revisit the rubric anchors instead.
- Update this file when the team revises rubric anchors. The skill's audit log captures `rubric_sha256` per run, so revisions are visible in retro.
## Last edited
{YYYY-MM-DD}
# Banned-phrase blocklist
> The rejection-feedback skill greps the final draft against every
> pattern below in step 5 (bias and false-specifics screening). Any
> hit halts the run with the offending string surfaced. Do NOT edit
> this file to make a draft pass — fix the rubric, the scorecard
> language, or the rubric-to-feedback mapping instead.
## A. EEOC-implicating language
A1. **Protected-class proxies.** Any of the following terms or patterns in the draft halts the run:
- `culture fit`, `cultural fit`, `culture add` (without an accompanying behavioral-anchor citation)
- `team fit`, `not a fit` (when used as the substantive reason)
- `personality`, `chemistry`, `vibes`
- `executive presence`, `leadership presence`, `gravitas`
- `polish`, `polished`, `lacks polish`
- `aggressive`, `abrasive`, `pushy` (gendered descriptors)
- `soft`, `nice`, `quiet`, `meek` (inverse gendered descriptors)
- `mature`, `seasoned`, `young`, `energetic`, `digital native` (age proxies)
- `accent`, `articulate`, `well-spoken` (national-origin proxies)
- `family`, `kids`, `pregnant`, `maternity`, `paternity`, `parental` (family-status proxies)
- `accommodation`, `disability`, `health` (any reference to accommodation discussions in the rejection text)
- `religion`, `church`, `prayer`
- `marital`, `married`, `single`
- `name origin`, `surname` (any commentary on the candidate's name)
- `school`, `university`, `Ivy`, `tier-1`, `top-N` (when used as the substantive reason — schools may appear in factual context but not as the rejection driver)
A2. **Comparative ranking language.** Halts the run:
- `stronger candidates`, `better candidates`, `more qualified`
- `second choice`, `runner-up`, `not the top choice`
- `closer fit elsewhere`, `closer match`
- `pool was strong`, `competitive pool`
- `we found someone`, `we hired someone`, `the role is filled` (these belong in a separate sentence about the role status, not framed as a candidate ranking)
- Any phrase that implies a relative ordering of the candidate against unnamed others.
A3. **Defamation-risk language.** Halts the run:
- `dishonest`, `misleading`, `lied`, `lying`
- `unprepared`, `did not try`, `did not care`
- `arrogant`, `entitled`, `difficult`
- `concerning`, `red flag`, `worrying`
- Any subjective-character claim that could be cited against the firm in a defamation action.
## B. False-specifics patterns
B1. **Quote markers without source.** Halts the run if the draft contains any quoted string (`"…"` or `'…'`) that does not appear verbatim in the scorecard or transcript pool from step 2.
B2. **Numeric claims without source.** Halts if the draft contains a numeric claim (`scored X`, `Y out of Z`, `X% of`) — interview scores are internal calibration data, not candidate-facing content.
B3. **Interviewer-identifying claims.** Halts if the draft names an interviewer, references an interviewer's role beyond the generic "the team", or attributes a quote to a specific person. Interviewer identities are protected and naming them creates retaliation risk.
B4. **Round-identifying claims that could not have happened.** Halts if the draft references a round (`take-home`, `system design`, `behavioral`, `pair programming`) that is not present in the scorecard set for this candidate. The skill validates round names against the loop's actual structure.
## C. Process-risk language
C1. **Promises about the future.** Halts the run:
- `we will reach out`, `we'll be in touch`, `next time`
- `definitely apply again`, `you will get an offer`
- `keep your resume on file` (varies by jurisdiction whether this is permissible — neutral phrasing is "we welcome a future application")
- Any timeline commitment.
C2. **Process-improvement requests from the candidate.** Halts if the draft asks the candidate for feedback, a referral, or a testimonial. Reverse asks in a rejection email are an EEOC-witness-statement risk and a candidate-experience harm.
C3. **Unsolicited specifics in deny-jurisdiction cases.** The skill's step 1 should have caught this, but as a defense-in-depth check: if the run's `jurisdiction_policy` returned `unsolicited_feedback: deny` and `feedback_requested: false`, the draft must match the generic-decline template byte-for-byte. Any deviation halts.
## D. Approved-topics list (positive list, used by step 4)
The rubric-to-feedback mapping's `{specific_topic}` substitution slot pulls from this list. The LLM never free-texts a topic string.
- `consistency-availability tradeoffs`
- `read-replica reasoning`
- `caching layer reasoning`
- `failure-mode reasoning`
- `test coverage`
- `error-handling specificity`
- `data-modeling tradeoffs`
- `query-pattern reasoning`
- `migration sequencing`
- `deployment sequencing`
- `cross-team coordination examples`
- `tradeoff reasoning under time pressure`
Add to this list only after team review. Topics added here are permitted to appear in candidate-facing drafts.
## E. Maintenance
This file is version-controlled. The skill captures the SHA-256 of this file in the audit log per run, so the blocklist used on a given date is reproducible. If a candidate raises a claim against a specific draft, the audit log answers "was the blocklist of date X in effect at the time of the draft" — yes or no, no judgment call.
## Last edited
{YYYY-MM-DD}
# Output format
> The rejection-feedback skill writes drafts in exactly the formats
> below. The recruiter reviews and edits in their own outbox or in
> the ATS; the skill never sends.
## Routing rules
The skill picks a route per the matrix below. The recruiter can override.
| Stage reached | Seniority | feedback_requested | Default route |
|---|---|---|---|
| onsite | senior+ | true | call |
| onsite | senior+ | false | email (generic if jurisdiction denies) |
| onsite | mid / junior | true | email (specific) |
| onsite | mid / junior | false | email (generic) |
| final loop | any | any | call (overrides above) |
| referred-by-VIP | any | any | call (recruiter judgment) |
| earlier than onsite | any | any | OUT OF SCOPE — use templated decline |
`senior+` = staff, principal, manager, director. `referred-by-VIP` = candidate has a `referrer_priority: high` flag in the ATS.
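A sketch of one reading of this matrix as a function, assuming stage and seniority arrive as normalized strings from the ATS; the argument names are illustrative, and the out-of-scope cutoff is applied before the VIP row here, which is a judgment call the recruiter can override:

```python
SENIOR_PLUS = {"staff", "principal", "manager", "director"}

def default_route(stage: str, seniority: str, feedback_requested: bool,
                  referred_by_vip: bool, jurisdiction_denies: bool) -> str:
    if stage not in {"onsite", "final_loop"}:
        return "out-of-scope"                      # use the templated decline
    if stage == "final_loop" or referred_by_vip:
        return "call"
    if seniority in SENIOR_PLUS:
        if feedback_requested:
            return "call"
        return "email-generic" if jurisdiction_denies else "email"
    return "email-specific" if feedback_requested else "email-generic"
```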
## Email format — specific feedback (consent + safe jurisdiction)
```markdown
Subject: Update on your {role_title} interview at {company_name}
Hi {candidate_first_name},
Thank you for the time you invested in our interview process — the
{round_1_label}, {round_2_label}, and the conversations with the
team. We appreciated the care you put into each stage.
After the team's debrief, we have decided not to move forward with
your candidacy for this role.
You asked for feedback, so here is what stood out from the loop:
- **What went well.** {strength_phrasing_from_mapping}.
- **Where the team landed differently.** {gap_phrasing_from_mapping}.
This was the dimension that drove the team's decision.
This feedback is specific to the loop you ran with us; it is not a
ranking against other candidates and it is not a comment on your
overall engineering ability.
If a future role at {company_name} matches your background, we would
welcome your application.
Best,
{recruiter_first_name}
```
Constraints baked into this template:
- One strength, one gap. No more.
- The phrase "not a ranking against other candidates" is mandatory, because it pre-empts the most common candidate response loop ("how did I compare").
- The phrase "not a comment on your overall engineering ability" is mandatory, because it isolates the feedback to this loop and pre-empts the "you said I am bad at engineering" escalation.
- "We would welcome your application" — neutral future language. Not "we will reach out", not "next time".
## Email format — generic decline (deny jurisdiction OR no consent OR no surfacable specific)
```markdown
Subject: Update on your {role_title} interview at {company_name}
Hi {candidate_first_name},
Thank you for the time you invested in our interview process. We
appreciated the care you put into each stage.
After the team's debrief, we have decided not to move forward with
your candidacy for this role.
If a future role at {company_name} matches your background, we
would welcome your application.
Best,
{recruiter_first_name}
```
This is the safe default. The skill writes this template byte-for-byte when:
- `jurisdiction_policy` returned `unsolicited_feedback: deny` and `feedback_requested: false`
- step 3 surfaced no rubric dimension with both `mean ≤ 2` AND a verbatim evidence string
- a legal flag on the candidate file is present
- the loop has under two signed-off scorecards
Generic decline is honest. Weak specifics are worse than no specifics.
## Call-notes format
```markdown
# Call notes — {candidate_first_name} {candidate_last_initial}. ({role_title})
## Frame
- Open with thanks for the time invested.
- Lead with the strength: {strength_phrasing_from_mapping}.
- Single gap: {gap_topic_from_approved_list}. One sentence, no piling
on.
## Suggested phrasing for the gap
"{gap_phrasing_from_mapping}"
## Likely candidate questions
Q: "Was there anything I could have done differently?"
A: Acknowledge the question. Refer back to the single gap. Do NOT
add new feedback dimensions on the call — anything not in the
written draft is off-script and creates inconsistency risk.
Q: "Will you keep me in mind for future roles?"
A: Yes if true; specifics on what kind of role. Do NOT promise a
timeline.
Q: "Can I get a second-look interview?"
A: No. The decision is final. The recruiter reiterates appreciation
and closes.
Q: "Who else interviewed?"
A: Decline. Interviewer identities are protected. "I cannot share
that, but I can tell you the team weighed the input from every
round."
Q: "What did interviewer X think?"
A: Decline. Same reason. "I cannot break out individual scores; the
decision was a team decision."
## Off-script
If the candidate raises a discrimination concern, comparative-ranking
question, or accommodation issue, the recruiter says "let me come
back to you on that" and routes to HR / counsel. The recruiter does
NOT improvise an answer.
## Call duration target
10-15 minutes. Past 20 minutes, the call is no longer feedback —
it is an extended negotiation about the decision, and that is not
a useful place to be.
```
## Audit-log line format
One JSON object per line in `audit/<YYYY-MM>.jsonl`:
```json
{
"run_id": "uuid-v4",
"candidate_id_hash": "sha256-of-candidate-id",
"role_id": "role-slug",
"jurisdiction": "US-CA",
"feedback_requested": true,
"route": "email",
"rubric_sha256": "abcdef...",
"blocklist_sha256": "abcdef...",
"mapping_sha256": "abcdef...",
"dimensions_surfaced": ["technical_depth"],
"blocklist_hits": 0,
"model_id": "claude-sonnet-4-5",
"timestamp": "2026-05-03T14:00:00Z"
}
```
No raw candidate ID, no candidate name, no scorecard text, no draft text. The audit log is for run reproducibility, not data retention. Candidate-facing drafts live in `drafts/<id>.md` under the recruiter's own retention policy.
## Last edited
{YYYY-MM-DD}