A Claude Skill that takes a rejected candidate’s interview scorecards (and, when available, BrightHire or Metaview transcripts), drafts an evidence-grounded rejection email or recruiter-call talking points, and produces the recruiter-side notes for the call. Replaces the form-letter rejection that damages candidate experience with personalized feedback the candidate can actually use — and refuses to draft when the rubric is missing, the loop did not converge, or the case is jurisdiction-flagged.
## When to use

- The candidate reached at least an onsite or final-stage loop, where, per recruiting funnel cost, the candidate has invested enough time to deserve a real answer.
- The team has at least two signed-off scorecards on the candidate (Ashby `submitted: true`, Greenhouse `status: complete`, Lever `state: completed`). One scorecard is one interviewer’s view; the skill refuses to synthesize feedback from a single perspective because that exposes the firm to selective-evidence claims.
- A role rubric exists at `rubrics/<role_id>.yaml` with behavioral anchors per dimension (the same source the interview debrief skill reads). The skill scores against rubric anchors, not against free-text scorecard prose.
- The candidate explicitly requested feedback (captured in writing in the ATS), OR the candidate’s residency jurisdiction is one where unsolicited specifics carry no documented risk per the user’s HR-counsel guidance.
- A recruiter reviews and edits every draft before sending. The skill writes drafts to disk and stops; it defines no send action.
## When NOT to use

- **Auto-sending without recruiter review.** AI-drafted-and-sent rejection feedback is the single most reliable way to produce an EEOC, ADA, or state-employment-law incident. The recruiter is the gate. If your goal is to remove the human from the loop, this is the wrong workflow.
- **Candidates who have not requested feedback in deny-jurisdictions.** France (Code du travail risk on documented rejection reasons), Germany (AGG §22 evidentiary shift), and any jurisdiction the user’s HR counsel has marked `unsolicited_feedback: deny` in the policy file. The skill refuses specifics in those cases and writes the generic-decline template instead. Do not edit the policy file to make a deny-jurisdiction case pass.
- **Cases legal has flagged.** Active dispute, unaddressed accommodation request, or a complaint on record. The skill returns a generic-decline draft and surfaces the flag to the recruiter. Specifics on a flagged case become evidence in the dispute.
- **Earlier-stage rejections** (resume screen, recruiter screen). Templated decline is the right tool there; the per-candidate model cost and the recruiter review time do not pay back at top-of-funnel scale. The skill is for candidates who reached at least an onsite.
- **Comparative ranking** ("you were our second choice", "we had stronger candidates"). The skill will refuse to draft this — the rubric-to-feedback mapping does not contain the language and the banned-phrase blocklist greps it out. Comparative ranking is what turns a constructive rejection into a Glassdoor post.
- **Process-improvement asks** (asking the candidate for feedback on the interview, a referral, or a testimonial). Reverse asks in a rejection email are an EEOC-witness-statement risk and a candidate-experience harm. The blocklist catches them.
## Setup

1. **Drop the bundle.** Place `apps/web/public/artifacts/rejection-feedback-claude-skill/SKILL.md` into your Claude Code skills directory (or claude.ai custom Skills, with Tier-A authorization for candidate data per AI policy).
2. **Configure the rubric source.** The skill reads role rubrics from `rubrics/<role_id>.yaml` — same path the interview debrief skill uses. If the rubric does not exist, the skill refuses to run. Structured interviewing is the prerequisite, not this skill.
3. **Fill in the rubric-to-feedback mapping.** Copy `references/1-rubric-to-feedback-mapping.md` and replace the template phrasing with your team’s approved candidate-facing language per rubric dimension. Get HR counsel sign-off on the approved phrasing once; the audit log captures the mapping’s SHA-256 per run, so revisions are visible in retro.
4. **Write the jurisdiction-policy file.** A YAML file with one block per jurisdiction your firm hires in. Each block sets `unsolicited_feedback: allow` or `deny` and references the relevant HR-counsel guidance memo. The bundle ships a template; the deny defaults are France, Germany, and any jurisdiction with active employment-law guidance against documented rejection reasons. A sketch of the file shape and the gate it drives follows this list.
5. **Configure the ATS API.** Ashby, Greenhouse, or Lever API token with read scope on scorecards and candidates. The skill pulls scorecards by `candidate_id`; it does not accept pasted scorecard text because pasted text cannot be audited back to the source interviewer.
6. **Optional: configure the transcript bundle.** BrightHire or Metaview API access. When a `transcript_id` is provided, the skill cross-references scorecard claims against transcript turns in step 4.
7. **Dry-run on a closed candidate.** Run on a candidate who was already rejected last quarter. Compare the skill’s draft to what the recruiter actually sent. Tune the rubric-to-feedback mapping if the calibration drifts — the mapping, not the model, is usually the lever.
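A minimal sketch of the jurisdiction-policy file and the consent gate it drives. The `jurisdictions` wrapper and `guidance_memo` key are naming choices made for this sketch, not part of the bundle; `unsolicited_feedback: allow | deny` is the key the skill actually reads, and `pyyaml` is an assumed dependency.

```python
# Illustrative only: the bundle ships its own policy template. The
# `jurisdictions` wrapper and `guidance_memo` key are assumptions for this
# sketch; `unsolicited_feedback` is the key the skill actually reads.
import yaml  # third-party: pip install pyyaml

POLICY_YAML = """
jurisdictions:
  FR: {unsolicited_feedback: deny,  guidance_memo: hr-counsel/code-du-travail.md}
  DE: {unsolicited_feedback: deny,  guidance_memo: hr-counsel/agg-s22.md}
  US: {unsolicited_feedback: allow, guidance_memo: hr-counsel/us-default.md}
"""

def route_for(jurisdiction: str, feedback_requested: bool) -> str:
    """Decide specific vs generic before any scorecard is pulled (step 1)."""
    policy = yaml.safe_load(POLICY_YAML)["jurisdictions"]
    # Unknown jurisdiction: fall back to the safest default, deny.
    block = policy.get(jurisdiction, {"unsolicited_feedback": "deny"})
    if block["unsolicited_feedback"] == "deny" and not feedback_requested:
        return "generic-decline"
    return "specific-feedback"

assert route_for("FR", feedback_requested=False) == "generic-decline"
assert route_for("FR", feedback_requested=True) == "specific-feedback"
assert route_for("US", feedback_requested=False) == "specific-feedback"
```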
## What the skill actually does

Six steps, in order. The order matters: jurisdiction gating and scorecard validation come before the LLM ever reads candidate content, because letting the model loose on scorecard text in a deny-jurisdiction case leaves a model-call log entry with candidate-identifying data that the firm did not need to retain.

1. **Validate jurisdiction policy and consent.** Look up the candidate’s jurisdiction in the policy file. If the policy is `unsolicited_feedback: deny` and the candidate did not request feedback in writing, halt specifics and switch to the generic-decline template. The choice to gate on consent before pulling scorecards keeps the data-minimization story clean for GDPR Art. 5(1)(c).
2. **Pull scorecards (and optional transcript).** Fetch via the ATS API and drop draft (unsigned) scorecards. If the loop has under two signed-off scorecards, halt — feedback synthesized from one interviewer’s view is an opinion, not feedback, and exposes the firm to selective-evidence claims.
3. **Identify dimensions and evidence.** Compute the cross-interviewer mean and standard deviation per rubric dimension. Surface dimensions where mean ≥ 4 (strength, warm opening) and mean ≤ 2 (gap, candidate for feedback if safe). Refuse to surface any dimension with cross-interviewer standard deviation ≥ 1.5 — the loop did not converge, and feedback on a non-converged dimension would not survive a “but interviewer X scored me 5” challenge. For every surfaced dimension, pull verbatim evidence quotes from the scorecards (or transcript, when available). No verbatim string → the dimension is not surfaced.
4. **Draft against the rubric-to-feedback mapping.** Translate at most one strength and one gap into candidate-facing language using `references/1-rubric-to-feedback-mapping.md`. Cap at one each so the draft does not read as a defensive list. The mapping’s substitution slots are filled from structured fields (scorecard, rubric anchor) or the approved-topics list — the LLM never free-texts a substitution value, which is the guard against false specifics.
5. **Bias and false-specifics screening.** Grep the draft against `references/2-banned-phrase-blocklist.md`. Any hit halts the run with the offending string surfaced. Verify every specific claim maps back to a verbatim evidence string from step 3 — claims without source halt. This is a separate pass from step 4 by design; the screening pass sees only the draft text, with no awareness of the underlying scorecards, so it cannot rationalize a banned phrase as “but the interviewer meant X”.
6. **Write to disk and audit log.** Write `drafts/<candidate-id>.md` and (for `route: call`) `drafts/<candidate-id>-call-notes.md` per the format in `references/3-output-format.md`. Append one JSONL line to `audit/<YYYY-MM>.jsonl` with `candidate_id_hash` (SHA-256, not raw ID), `rubric_sha256`, `blocklist_sha256`, `mapping_sha256`, dimensions surfaced, blocklist hits, model ID, timestamp. No candidate-identifying free text in the audit line. A sketch of how the hashes are computed follows this list.
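The hash fields in that audit line could be computed along these lines; the helper names, file paths, and candidate ID below are illustrative, and the only hard rule from the skill is SHA-256 of the candidate ID rather than the raw ID.

```python
# Sketch of the step-6 hashing. Paths and the candidate ID are placeholders;
# the skill's rule is that only hashes, never raw identifiers, reach the log.
import hashlib, json
from datetime import datetime, timezone

def sha256_text(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def sha256_file(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

audit_line = {
    "candidate_id_hash": sha256_text("cand_12345"),  # hypothetical ATS record ID
    "rubric_sha256":     sha256_file("rubrics/senior-backend-engineer.yaml"),
    "blocklist_sha256":  sha256_file("references/2-banned-phrase-blocklist.md"),
    "mapping_sha256":    sha256_file("references/1-rubric-to-feedback-mapping.md"),
    "blocklist_hits":    0,
    "model_id":          "claude-sonnet-4-5",
    "timestamp":         datetime.now(timezone.utc).isoformat(),
}
print(json.dumps(audit_line))  # appended as one line to audit/<YYYY-MM>.jsonl
```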
The literal email format, generic-decline fallback, and call-notes template live in `references/3-output-format.md`. The format is fixed because downstream consumers — recruiter, candidate, and any future audit reviewer — need predictable language with no recruiter-specific drift.
## Cost reality

Per rejection draft, on Claude Sonnet 4.5:

- **LLM tokens** — typically 12-25k input tokens (rubric YAML + scorecards + skill instructions + reference files) and 0.5-1.5k output tokens (the draft plus call notes). On Sonnet 4.5 that is roughly 5-10 cents per draft; the arithmetic is sketched after this list. A recruiter team running 200 rejection drafts a month spends 10-20 dollars in model cost.
- **ATS API cost** — zero on Ashby (free API), Greenhouse (included in tier), Lever (included). Transcript fetches against BrightHire or Metaview count against the per-seat plan; rejection-feedback fetches are read-only and do not consume new transcript credits.
- **Recruiter time** — the win is here. Manual drafting of a thoughtful, evidence-grounded rejection email from scorecards is 20-30 minutes per candidate when the recruiter does it well, or 3 minutes when they paste a form letter (which is what most teams end up doing at scale). The skill produces the 20-minute draft in under 30 seconds; the recruiter reviews and edits in 4-7 minutes. Net saving is roughly 15-20 minutes per rejection at the thoughtful-draft quality bar — call it 50-60 hours a month on a team running 200 rejections.
- **Setup time** — 30 minutes for the rubric-to-feedback mapping and jurisdiction policy if your team already has approved candidate-facing phrasing somewhere; longer if HR counsel has not yet weighed in on rejection-feedback language (in which case that conversation is the prerequisite, not this skill).
- **The candidate-experience compounding return** — candidates rejected with specific, evidence-grounded feedback are more likely to re-apply, more likely to refer others, and substantially less likely to leave damaging Glassdoor reviews. Claims commonly cited in the recruiting literature are in the 30-50% range for re-apply intent, though we do not have a primary source for those numbers and treat them as directional. The compounding return shows up in pipeline density a year out, not in the month the draft was sent.
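A back-of-envelope check on the per-draft figure above. The per-million-token rates are pinned here as an assumption for the arithmetic, not a price quote; substitute your actual contract rates.

```python
# Back-of-envelope check on the per-draft cost range quoted above.
# Rates below are assumptions pinned for the arithmetic, not a price quote.
INPUT_PER_TOKEN = 3.00 / 1_000_000    # assumed USD per input token
OUTPUT_PER_TOKEN = 15.00 / 1_000_000  # assumed USD per output token

def draft_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_PER_TOKEN + output_tokens * OUTPUT_PER_TOKEN

low, high = draft_cost(12_000, 500), draft_cost(25_000, 1_500)
print(f"per draft: ${low:.2f}-${high:.2f}")                      # ~ $0.04-$0.10
print(f"200 drafts/month: ${200 * low:.0f}-${200 * high:.0f}")   # ~ $9-$20
```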
## Success metric

Track three numbers per month, in the ATS:

- **Recruiter edit-distance per draft.** The number of characters the recruiter changes between the skill’s draft and the sent message; one way to compute it is sketched after this list. If edit distance trends to zero, the recruiter is rubber-stamping — surface this in retro and revisit the rubric-to-feedback mapping. If edit distance is consistently high, the mapping is miscalibrated.
- **Candidate response rate to the rejection.** Replies to a rejection email are usually thanks-and-future-application notes (good signal) or escalation notes (bad signal). Track the escalation rate as a percentage of rejections sent. A baseline team running form letters typically sees under 1% escalation; the goal with this skill is to stay at or below that baseline. If escalation rate climbs, the rubric-to-feedback mapping is producing language that lands wrong — re-tune.
- **Re-application rate within 12 months.** Candidates rejected through this skill versus candidates rejected through the legacy form letter, measured over the next 12 months. The compounding benefit shows up here, not in model spend or even in the rejection thread itself.
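One way to operationalize the edit-distance number is a plain character-level Levenshtein distance between the draft on disk and the message the recruiter actually sent. The snippet below is a self-contained sketch; in practice the two strings would come from `drafts/<candidate-id>.md` and an export of the ATS outbox.

```python
# Character-level Levenshtein distance between the skill's draft and the
# sent message. String literals stand in for the two texts so this runs as-is.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]

draft_text = "the team was looking for read-replica reasoning"
sent_text  = "the panel was looking for read-replica reasoning"
print(levenshtein(draft_text, sent_text))  # near-zero month after month = rubber-stamping signal
```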
## vs alternatives

- **vs Ashby’s built-in rejection templates.** Ashby (and Greenhouse, Lever) ship rejection templates with merge fields for candidate name and role. They are templates, not feedback — the merge fields do not pull scorecard evidence and there is no rubric-grounded language layer. Use Ashby templates for top-of-funnel rejections where templated is honest. Use this skill for late-stage rejections where templated reads as dismissive of the time the candidate invested.
- **vs generic decline emails.** Generic decline is the right answer in deny-jurisdiction cases, when consent was not given, and when the rubric did not surface a defensible specific. The skill writes the generic-decline template byte-for-byte in those cases. The difference is the skill makes the choice deterministically per jurisdiction policy and rubric output, rather than the recruiter defaulting to generic out of fatigue.
- **vs manual recruiter-written notes.** Manual notes are the gold standard for senior or VIP-referred candidates where the recruiter has the relationship context and the time. The skill earns its keep on volume — the 80% of late-stage rejections where the recruiter would otherwise paste a form letter because manual drafting at scale does not fit the day. For the senior tier, the call-notes file gives the recruiter a structured starting point for the call, and the recruiter improvises from there.
- **vs an LLM with no rubric file and no blocklist.** This is the failure mode the skill is built against. An LLM drafting from scorecards alone, with no rubric grounding, no banned-phrase blocklist, and no audit log, produces fast, confident, plausible-sounding rejection text — and roughly one in twenty drafts will contain a hallucinated quote, a comparative ranking, or a protected-class proxy. The bundle’s checklist files are what move the failure rate to near zero.
## Watch-outs

- **EEOC-implicating language.** Guarded by the banned-phrase blocklist in `references/2-banned-phrase-blocklist.md`, which runs as a separate pass in step 5 with no awareness of the underlying scorecards. Hits halt the run with the offending string surfaced. Do not edit the blocklist to make a draft pass — fix the rubric or the scorecard language instead.
- **False specifics from the LLM.** Guarded by the “no synthesis without verbatim citation” rule in step 3. Every claim in the draft must trace to a verbatim string from a signed-off scorecard or transcript. No verbatim string → the dimension is not surfaced. This is the guard against the most common failure mode of LLM-drafted feedback — plausible-sounding quotes that no interviewer actually wrote, cited back to the candidate as fact.
- **Comparative ranking language.** Guarded by the rubric-to-feedback mapping in `references/1-rubric-to-feedback-mapping.md`, which does not contain comparative phrasing, and by the blocklist in step 5, which catches it if it slips in. Comparative ranking is what turns a constructive rejection into a Glassdoor post.
- **Selective-evidence risk.** Guarded by step 2 (halt if the loop has under two signed-off scorecards) and step 3 (refuse to surface dimensions with cross-interviewer standard deviation at or above 1.5). Interviewer disagreement does not become candidate feedback.
- **Auto-send drift.** Guarded by the absence of any send action in the skill. Drafts are written to `drafts/<candidate-id>.md` for the recruiter to review, edit, and send from the ATS outbox. The recruiter is the gate.
- **Generic-boilerplate harm.** Guarded by step 3’s refusal to surface a dimension without verbatim evidence — when the rubric surfaces nothing safe to share, the skill writes the generic-decline template rather than synthesizing weak specifics. Generic decline is honest; weak specifics are worse than no specifics.
- **PII in the audit log.** Guarded by step 6 writing only `candidate_id_hash` (SHA-256), never the raw candidate ID, name, or scorecard text. The audit log is for run reproducibility, not candidate data retention. Candidate-facing drafts live in `drafts/` under the recruiter’s own retention policy.
- **Calibration drift across roles and seniority.** Guarded by per-role rubric YAMLs and by the rubric-to-feedback mapping being versioned per team. Senior-leadership rejections need different framing than entry-level; the mapping file is where that lives, not the skill code.
- **Privacy and data residency.** Verify the skill operates within Tier A enterprise AI per AI policy. Interview content is sensitive; the candidate did not consent to it being processed by a third-party model unless your AI policy and your scorecard-collection consent language explicitly cover it.
## Stack

The skill bundle lives at `apps/web/public/artifacts/rejection-feedback-claude-skill/` and contains:

- `SKILL.md` — the skill definition
- `references/1-rubric-to-feedback-mapping.md` — fill in per team, HR-counsel approved phrasing per rubric dimension
- `references/2-banned-phrase-blocklist.md` — pre-flight checks on the draft (do not edit to make biased drafts pass)
- `references/3-output-format.md` — the literal email, generic-decline, and call-notes formats
Tools the workflow assumes you already use: Claude (the model), Ashby or Greenhouse or Lever (the ATS the scorecards live in), and optionally BrightHire or Metaview (interview transcripts for richer evidence grounding). Sibling workflow that shares the rubric source: the interview debrief skill.
---
name: rejection-feedback
description: Take a rejected candidate's interview scorecards and (where available) transcripts, draft an evidence-grounded rejection email or recruiter-call talking points, and produce the recruiter-side notes for the call. Always stops at a recruiter-review gate; never sends. Refuses to draft when the rubric is missing or the case is jurisdiction-flagged.
---
# Rejection feedback
## When to invoke
Use this skill when a recruiter needs to send personalized post-interview feedback to a candidate who reached at least an onsite or final-stage loop, and the team has structured scorecards plus a role rubric on file. Take the candidate's scorecards (across all interviewers), the role rubric, the recruiter-relationship context (was feedback explicitly offered? requested?), and the candidate's residency jurisdiction as input. Produce a Markdown rejection email draft, optional recruiter-call talking-point notes, and a one-line routing recommendation.
Do NOT invoke this skill for:
- **Auto-sending without recruiter review.** The skill writes drafts to disk and stops. There is no `send` action defined anywhere in this skill. Auto-sent rejection feedback is the single most reliable way to produce an inappropriate-content incident under EEOC, ADA, or state employment law. The recruiter is the gate.
- **Candidates who have not requested feedback in jurisdictions where unsolicited feedback creates risk.** Specifically: France (Code du travail risk on documented rejection reasons), Germany (AGG §22 evidentiary shift), and any jurisdiction where the recruiter's HR-counsel guidance disallows unsolicited specifics. The skill reads the `jurisdiction_policy.yaml` file and refuses to draft specifics for any jurisdiction marked `unsolicited_feedback: deny`.
- **EEOC-implicating language or protected-class proxies.** "Cultural fit", age inferences from graduation year, family-status references, national-origin references, accent commentary, gendered descriptors ("aggressive", "abrasive", "soft"), pregnancy-status references, disability or accommodation references. The banned-phrase blocklist in `references/2-banned-phrase-blocklist.md` runs as the final check before the draft is written. Any hit halts the run with the offending string surfaced.
- **Cases legal has flagged.** If the candidate file has a flag for active dispute, accommodation request unaddressed, or a complaint on record, the skill returns "decline to provide specific feedback — legal flag present" and writes a generic-decline draft instead.
- **Rejections from earlier stages** (resume screen, recruiter screen). Templated decline is the right tool there. This skill is for candidates who invested significant time and earned a real answer, per the [recruiting funnel](/en/learn/recruiting-funnel-metrics/) cost.
## Inputs
- Required: `candidate_id` — the ATS record ID ([Ashby](/en/tools/ashby/), [Greenhouse](/en/tools/greenhouse/), or [Lever](/en/tools/lever/)). The skill pulls scorecards via the ATS API; it does not accept pasted scorecard text, because pasted text cannot be audited back to the source interviewer.
- Required: `role_id` — used to load the role's rubric from `rubrics/<role_id>.yaml` (same source the [interview debrief skill](/en/workflows/interview-debrief-summary-skill/) reads). Without a rubric the skill refuses to run; ungrounded feedback is how false specifics get drafted.
- Required: `jurisdiction` — ISO 3166 country code for the candidate's residency at time of application. Drives which jurisdiction-policy block applies.
- Required: `feedback_requested` — boolean. `true` only if the candidate explicitly asked for feedback (in writing, captured in the ATS). `false` defaults to a generic-decline draft in jurisdictions where the policy file flags unsolicited specifics as risk.
- Optional: `transcript_id` — pointer to a [BrightHire](/en/tools/brighthire/) or [Metaview](/en/tools/metaview/) transcript bundle for the loop. When present, the skill cross-references scorecard claims against transcript evidence; when absent, the skill works from scorecards alone and labels the draft accordingly.
- Optional: `route` — one of `email`, `call`, `auto`. `auto` (default) picks based on stage reached and seniority per the routing rules in `references/3-output-format.md`.
## Reference files
Always read the following from `references/` before drafting. Without them the draft is generic, ungrounded, and risks tripping a banned phrase.
- `references/1-rubric-to-feedback-mapping.md` — the mapping from rubric dimensions to safely-sharable, candidate-facing feedback language. Replace the template placeholders with your team's approved phrasing before first use.
- `references/2-banned-phrase-blocklist.md` — the blocklist the skill greps the draft against in step 5. Patterns include EEOC-implicating terms, protected-class proxies, comparative-ranking language, and unverifiable specifics. Do not edit this file to make a draft pass.
- `references/3-output-format.md` — the literal email and call-notes format, including the routing rules.
## Method
Run these six steps in order. Steps 1-3 are deterministic gating; steps 4-5 use the LLM for synthesis and screening; step 6 is the audit log. The order matters — letting the LLM draft against unchecked scorecards produces fast, confident, EEOC-implicating output.
### 1. Validate jurisdiction policy and consent
Open `references/jurisdiction-policy.yaml` (user-supplied; template shipped in the bundle). Look up the candidate's `jurisdiction`. If `unsolicited_feedback: deny` and `feedback_requested: false`, halt specifics and switch to the generic-decline template at the top of `references/3-output-format.md`. Log the reason in the audit line.
The choice to gate on consent before pulling scorecards is deliberate: specifics drafted and then discarded still leave a model-call log entry with candidate-identifying scorecard text. Gating up front keeps the data-minimization story clean for GDPR Art. 5(1)(c).
### 2. Pull scorecards and (optional) transcript
Fetch all scorecards for `candidate_id` via the ATS API. Validate that every scorecard is signed-off (Ashby `submitted: true`, Greenhouse `status: complete`, Lever `state: completed`). Drop drafts. If the loop has fewer than two completed scorecards, halt — feedback synthesized from one interviewer's view is not feedback, it is an opinion, and exposes the firm to selective-evidence claims.
When `transcript_id` is provided, fetch the transcript bundle. The skill will cite scorecard claims against transcript turns in step 4.
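As an illustrative sketch of the sign-off gate (scorecard payloads reduced to plain dicts; only the three sign-off field names above are taken from the ATS conventions):

```python
# Sketch of the step-2 gate: normalize the three ATS sign-off conventions,
# drop drafts, and halt below two completed scorecards. Payload shapes are
# simplified for illustration.
SIGNED_OFF = {
    "ashby":      lambda sc: sc.get("submitted") is True,
    "greenhouse": lambda sc: sc.get("status") == "complete",
    "lever":      lambda sc: sc.get("state") == "completed",
}

def completed_scorecards(ats: str, scorecards: list[dict]) -> list[dict]:
    is_done = SIGNED_OFF[ats]
    done = [sc for sc in scorecards if is_done(sc)]  # drafts dropped here
    if len(done) < 2:
        raise RuntimeError("halt: fewer than two signed-off scorecards")
    return done

loop = completed_scorecards("greenhouse", [
    {"status": "complete", "scores": {"system_design": 2}},
    {"status": "complete", "scores": {"system_design": 1}},
    {"status": "draft"},   # not signed off, not counted
])
assert len(loop) == 2
```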
### 3. Identify dimensions and evidence
For each rubric dimension, compute the cross-interviewer mean score and the standard deviation. Flag dimensions where:
- mean ≥ 4 (candidate strength, surface as the warm opening)
- mean ≤ 2 (candidate gap, candidate for feedback if safe)
- standard deviation ≥ 1.5 (interviewer disagreement — do NOT cite this dimension; the loop did not converge and the feedback would not survive a "but interviewer X scored me 5" challenge)
For each surfaced dimension, pull the verbatim evidence quotes from the scorecards (or transcript, when available). Every claim in the final draft must cite a verbatim string from the evidence pool. No verbatim string → the dimension is not surfaced.
The "no synthesis without verbatim citation" rule is the guard against false specifics. LLMs drafting feedback from scorecards will, without this rule, invent quotes that sound plausible — "the candidate struggled with system-design tradeoffs" — that no interviewer ever wrote. False specifics cited back to the candidate are how rejection-feedback workflows generate complaint emails.
### 4. Draft against the rubric-to-feedback mapping
Translate at most one strength and one gap into candidate-facing language using `references/1-rubric-to-feedback-mapping.md`. Cap at one of each so the draft does not read as a defensive list. Comparative ranking ("we had stronger candidates", "you were our second choice") is forbidden — the mapping file does not contain the language and step 5 greps it out.
For `route: call`, also draft recruiter-side talking points: bullet-point observations, the suggested phrasing for the gap, and two to three pre-prepared responses to likely candidate questions ("Was there anything I could have done differently?", "Will you keep me in mind for future roles?", "Can I get a second look?").
### 5. Bias and false-specifics screening
Grep the draft against `references/2-banned-phrase-blocklist.md`. Any hit halts the run with the offending string surfaced. Then verify that every specific claim in the draft maps back to a verbatim evidence string from step 3 — if a claim has no source, halt.
This is a separate pass from step 4 by design. The screening pass sees only the draft text, with no awareness of the underlying scorecards, so it cannot rationalize a banned phrase as "but the interviewer meant X".
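A sketch of the screening pass over the draft text alone; the pattern list here is a small excerpt, and the authoritative set lives in `references/2-banned-phrase-blocklist.md`.

```python
# Sketch of the step-5 screening pass. It sees only the draft text; the
# pattern list below is an excerpt of references/2-banned-phrase-blocklist.md.
import re

BANNED_EXCERPT = [
    r"cultur(e|al) fit", r"executive presence", r"\bpolish(ed)?\b",
    r"stronger candidates", r"second choice", r"\barticulate\b",
]

def screen(draft_text: str) -> None:
    for pattern in BANNED_EXCERPT:
        hit = re.search(pattern, draft_text, flags=re.IGNORECASE)
        if hit:
            raise RuntimeError(f"halt: banned phrase {hit.group(0)!r} in draft")

screen("In the system-design round, the team was looking for read-replica reasoning.")
```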
### 6. Write to disk and audit log
Write the draft to `drafts/<candidate-id>.md` per the format in `references/3-output-format.md`. Write the call notes (if applicable) to `drafts/<candidate-id>-call-notes.md`. Append one JSONL line to `audit/<YYYY-MM>.jsonl` containing: `run_id`, `candidate_id_hash` (SHA-256, not raw ID), `role_id`, `jurisdiction`, `feedback_requested`, `route`, `rubric_sha256`, `blocklist_sha256`, `mapping_sha256`, `dimensions_surfaced`, `blocklist_hits` (zero on success), `model_id`, `timestamp`. No candidate-identifying free text in this line.
Surface the path to the recruiter and exit. The recruiter reviews, edits, and sends from the ATS or their own outbox.
## Output format
Literal example of the email draft the skill writes to `drafts/<candidate-id>.md` for a candidate who reached an onsite for a Senior Backend Engineer role and explicitly requested feedback:
```markdown
Subject: Update on your Senior Backend Engineer interview at Acme
Hi Jamie,
Thank you for the time you invested in our interview process — the
take-home, the system-design loop, and the conversations with the
team. We appreciated the care you put into each stage.
After the team's debrief, we have decided not to move forward with
your candidacy for this role.
You asked for feedback, so here is what stood out from the loop:
- **What went well.** Your take-home submission was clear, well-tested,
and included a thoughtful note on the failure-mode tradeoffs. Two
interviewers cited the test coverage specifically.
- **Where the team landed differently.** In the system-design round,
the discussion of consistency-vs-availability tradeoffs at the
database layer did not surface the read-replica option that the
role frequently requires reasoning about. This was the dimension
that drove the team's decision.
This feedback is specific to the loop you ran with us; it is not a
ranking against other candidates and it is not a comment on your
overall engineering ability.
If a future role at Acme matches your background, we would welcome
your application.
Best,
{Recruiter name}
```
Literal example of the recruiter call-notes file written to `drafts/<candidate-id>-call-notes.md`:
```markdown
# Call notes — Jamie L. (Senior Backend Engineer)
## Frame
- Open with thanks for the time invested.
- Lead with the take-home strength (specific: test coverage note).
- Single gap: system-design read-replica reasoning. One sentence,
no piling on.
## Suggested phrasing for the gap
"In the system-design conversation, the team was looking for the
read-replica option as part of the consistency-availability tradeoff,
and that did not come up. That was the dimension that drove the
decision for this specific role."
## Likely candidate questions
Q: "Was there anything I could have done differently?"
A: Acknowledge the question. Refer back to the single gap. Do NOT
add new feedback dimensions on the call — anything not in the
written draft is off-script and creates inconsistency risk.
Q: "Will you keep me in mind for future roles?"
A: Yes if true; specifics on what kind of role. Do NOT promise a
timeline.
Q: "Can I get a second-look interview?"
A: No. The decision is final. The recruiter reiterates appreciation
and closes.
## Off-script
If the candidate raises a discrimination concern, comparative-ranking
question, or accommodation issue, the recruiter says "let me come
back to you on that" and routes to HR / counsel. The recruiter does
NOT improvise an answer.
```
Literal example of the routing recommendation appended to the draft file:
```markdown
---
Routing: call (stage: onsite, seniority: senior, prior referrer: yes)
Recruiter review required before send.
```
## Watch-outs
- **EEOC-implicating language.** *Guard:* the banned-phrase blocklist in `references/2-banned-phrase-blocklist.md` runs as a separate pass in step 5, with no awareness of the underlying scorecards, so it cannot rationalize a hit. Any hit halts the run with the offending string surfaced. Do not edit the blocklist to make a draft pass — fix the rubric or the scorecard language instead.
- **False specifics from the LLM.** *Guard:* the "no synthesis without verbatim citation" rule in step 3. Every claim in the draft must trace to a verbatim string from a signed-off scorecard or transcript. No verbatim string → the dimension is not surfaced. This is the guard against the most common failure mode of LLM-drafted feedback — plausible-sounding quotes that no interviewer actually wrote.
- **Comparative ranking language.** *Guard:* the rubric-to-feedback mapping in `references/1-rubric-to-feedback-mapping.md` does not contain comparative phrasing ("stronger candidates", "second choice"), and the blocklist in step 5 catches it if it slips in. Comparative ranking is what turns a constructive rejection into a Glassdoor post.
- **Selective-evidence risk.** *Guard:* step 2 halts if the loop has under two signed-off scorecards. Step 3 refuses to surface dimensions with cross-interviewer standard deviation at or above 1.5 — interviewer disagreement does not become candidate feedback.
- **Auto-send drift.** *Guard:* the skill defines no `send` action. Drafts are written to `drafts/<candidate-id>.md` for the recruiter to review, edit, and send from the ATS outbox. AI-drafted-and-sent rejection feedback without review damages [candidate experience](/en/learn/candidate-experience/) and produces incidents.
- **PII in the audit log.** *Guard:* step 6 writes only `candidate_id_hash` (SHA-256), never the raw candidate ID, name, or scorecard text. The audit line is for run reproducibility, not candidate data retention.
- **Generic boilerplate harm.** *Guard:* if step 3 cannot surface a rubric dimension that has both mean ≤ 2 and a verbatim evidence string, the skill writes the generic-decline template from `references/3-output-format.md` rather than synthesizing weak specifics. Generic decline is honest; weak specifics are worse than no specifics.
# Rubric-to-feedback mapping — TEMPLATE
> Replace this template with your team's approved candidate-facing
> phrasing per rubric dimension. The rejection-feedback skill reads
> this file in step 4 to translate scorecard language (which is
> internal, often blunt) into candidate-facing language (which must
> be specific, evidence-grounded, and EEOC-safe). Without this file
> the skill will not draft specifics — it falls back to the generic
> decline template.
## How this file is used
The skill matches each surfaced dimension (from step 3) against the `dimension_id` below, then uses the `candidate_facing_phrasing` template, substituting in the verbatim evidence string from the scorecard or transcript.
If a dimension is surfaced by step 3 but has no entry below, the skill will NOT draft specifics for it — the dimension is dropped. This forces the team to deliberate on candidate-facing phrasing once, in writing, rather than letting the LLM improvise per run.
## Dimension entries
### dimension_id: technical_depth
**internal_label**: Technical depth (1-5)
**rubric_anchors**:
- 5: Reasons fluently across multiple layers of the stack; explores tradeoffs unprompted.
- 4: Reasons clearly within their primary layer; surfaces tradeoffs when asked.
- 3: Recalls correct patterns; tradeoff reasoning needs prompting.
- 2: Recalls patterns inconsistently; tradeoff reasoning absent or shallow.
- 1: Patterns incorrect or contradicted under follow-up.
**candidate_facing_phrasing** (used for mean ≤ 2):
```
In the {round_name} round, the team was looking for {specific_topic}
as part of {specific_decision_context}, and that did not come up.
That was the dimension that drove the decision for this specific
role.
```
Substitution sources:
- `{round_name}` → from scorecard `interview_round` field
- `{specific_topic}` → from `references/2-banned-phrase-blocklist.md` approved-topics list (NEVER free-text from the LLM)
- `{specific_decision_context}` → from rubric anchor text
**candidate_facing_phrasing** (used for mean ≥ 4, opening only):
```
{Strength_observation}. {Interviewer_count_phrase} cited
{specific_evidence} specifically.
```
---
### dimension_id: system_design
**internal_label**: System design (1-5)
**rubric_anchors**:
- 5: Drives the design conversation; surfaces consistency, availability, and operational tradeoffs unprompted.
- 4: Engages with tradeoffs when prompted; covers most major axes.
- 3: Engages with tradeoffs when prompted; covers one or two axes.
- 2: Tradeoff reasoning shallow; misses major axes that the role requires.
- 1: Cannot construct a system that meets the stated requirements.
**candidate_facing_phrasing** (used for mean ≤ 2):
Same template as `technical_depth`.
---
### dimension_id: collaboration
**internal_label**: Collaboration (1-5)
**rubric_anchors**:
- 5: Specific examples of cross-functional work, named tradeoffs, named outcomes.
- 4: Specific examples, less explicit on tradeoff reasoning.
- 3: General examples, no specifics on tradeoffs or outcomes.
- 2: Vague examples or examples that do not show collaboration evidence.
- 1: No relevant examples surfaced.
**candidate_facing_phrasing** (used for mean ≤ 2):
Same template as `technical_depth`. **Constraint:** never use the words "communication", "fit", "soft skills", or "executive presence" in the candidate-facing draft for this dimension. Those terms are on the banned-phrase blocklist because they correlate with bias claims.
---
## Constraints across all dimensions
- One strength and one gap per draft, maximum. The skill caps at one of each in step 4.
- Every substitution slot is filled from a structured field (scorecard, transcript, rubric anchor) or from the approved-topics list. The LLM never free-texts a substitution value.
- Comparative ranking is not in this file and is on the blocklist. If you find yourself adding "vs other candidates" phrasing, stop and revisit the rubric anchors instead.
- Update this file when the team revises rubric anchors. The skill's audit log captures `rubric_sha256` per run, so revisions are visible in retro.
## Last edited
{YYYY-MM-DD}
# Banned-phrase blocklist
> The rejection-feedback skill greps the final draft against every
> pattern below in step 5 (bias and false-specifics screening). Any
> hit halts the run with the offending string surfaced. Do NOT edit
> this file to make a draft pass — fix the rubric, the scorecard
> language, or the rubric-to-feedback mapping instead.
## A. EEOC-implicating language
A1. **Protected-class proxies.** Any of the following terms or patterns in the draft halts the run:
- `culture fit`, `cultural fit`, `culture add` (without an accompanying behavioral-anchor citation)
- `team fit`, `not a fit` (when used as the substantive reason)
- `personality`, `chemistry`, `vibes`
- `executive presence`, `leadership presence`, `gravitas`
- `polish`, `polished`, `lacks polish`
- `aggressive`, `abrasive`, `pushy` (gendered descriptors)
- `soft`, `nice`, `quiet`, `meek` (inverse gendered descriptors)
- `mature`, `seasoned`, `young`, `energetic`, `digital native` (age proxies)
- `accent`, `articulate`, `well-spoken` (national-origin proxies)
- `family`, `kids`, `pregnant`, `maternity`, `paternity`, `parental` (family-status proxies)
- `accommodation`, `disability`, `health` (any reference to accommodation discussions in the rejection text)
- `religion`, `church`, `prayer`
- `marital`, `married`, `single`
- `name origin`, `surname` (any commentary on the candidate's name)
- `school`, `university`, `Ivy`, `tier-1`, `top-N` (when used as the substantive reason — schools may appear in factual context but not as the rejection driver)
A2. **Comparative ranking language.** Halts the run:
- `stronger candidates`, `better candidates`, `more qualified`
- `second choice`, `runner-up`, `not the top choice`
- `closer fit elsewhere`, `closer match`
- `pool was strong`, `competitive pool`
- `we found someone`, `we hired someone`, `the role is filled` (these belong in a separate sentence about the role status, not framed as a candidate ranking)
- Any phrase that implies a relative ordering of the candidate against unnamed others.
A3. **Defamation-risk language.** Halts the run:
- `dishonest`, `misleading`, `lied`, `lying`
- `unprepared`, `did not try`, `did not care`
- `arrogant`, `entitled`, `difficult`
- `concerning`, `red flag`, `worrying`
- Any subjective-character claim that could be cited against the firm in a defamation action.
## B. False-specifics patterns
B1. **Quote markers without source.** Halts the run if the draft contains any quoted string (`"…"` or `'…'`) that does not appear verbatim in the scorecard or transcript pool from step 2.
B2. **Numeric claims without source.** Halts if the draft contains a numeric claim (`scored X`, `Y out of Z`, `X% of`) — interview scores are internal calibration data, not candidate-facing content.
B3. **Interviewer-identifying claims.** Halts if the draft names an interviewer, references an interviewer's role beyond the generic "the team", or attributes a quote to a specific person. Interviewer identities are protected and naming them creates retaliation risk.
B4. **Round-identifying claims that could not have happened.** Halts if the draft references a round (`take-home`, `system design`, `behavioral`, `pair programming`) that is not present in the scorecard set for this candidate. The skill validates round names against the loop's actual structure.
## C. Process-risk language
C1. **Promises about the future.** Halts the run:
- `we will reach out`, `we'll be in touch`, `next time`
- `definitely apply again`, `you will get an offer`
- `keep your resume on file` (varies by jurisdiction whether this is permissible — neutral phrasing is "we welcome a future application")
- Any timeline commitment.
C2. **Process-improvement requests from the candidate.** Halts if the draft asks the candidate for feedback, a referral, or a testimonial. Reverse asks in a rejection email are an EEOC-witness-statement risk and a candidate-experience harm.
C3. **Unsolicited specifics in deny-jurisdiction cases.** The skill's step 1 should have caught this, but as a defense-in-depth check: if the run's `jurisdiction_policy` returned `unsolicited_feedback: deny` and `feedback_requested: false`, the draft must match the generic-decline template byte-for-byte. Any deviation halts.
## D. Approved-topics list (positive list, used by step 4)
The rubric-to-feedback mapping's `{specific_topic}` substitution slot pulls from this list. The LLM never free-texts a topic string.
- `consistency-availability tradeoffs`
- `read-replica reasoning`
- `caching layer reasoning`
- `failure-mode reasoning`
- `test coverage`
- `error-handling specificity`
- `data-modeling tradeoffs`
- `query-pattern reasoning`
- `migration sequencing`
- `deployment sequencing`
- `cross-team coordination examples`
- `tradeoff reasoning under time pressure`
Add to this list only after team review. Topics added here are permitted to appear in candidate-facing drafts.
## E. Maintenance
This file is version-controlled. The skill captures the SHA-256 of this file in the audit log per run, so the blocklist used on a given date is reproducible. If a candidate raises a claim against a specific draft, the audit log answers "was the blocklist of date X in effect at the time of the draft" — yes or no, no judgment call.
## Last edited
{YYYY-MM-DD}
# Output format
> The rejection-feedback skill writes drafts in exactly the formats
> below. The recruiter reviews and edits in their own outbox or in
> the ATS; the skill never sends.
## Routing rules
The skill picks a route per the matrix below. The recruiter can override.
| Stage reached | Seniority | feedback_requested | Default route |
|---|---|---|---|
| onsite | senior+ | true | call |
| onsite | senior+ | false | email (generic if jurisdiction denies) |
| onsite | mid / junior | true | email (specific) |
| onsite | mid / junior | false | email (generic) |
| final loop | any | any | call (overrides above) |
| referred-by-VIP | any | any | call (recruiter judgment) |
| earlier than onsite | any | any | OUT OF SCOPE — use templated decline |
`senior+` = staff, principal, manager, director. `referred-by-VIP` = candidate has a `referrer_priority: high` flag in the ATS.
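As a sketch, the matrix above evaluates roughly like this; the input field names are illustrative, and the final-loop and VIP rows take precedence over the onsite rows per the table.

```python
# Illustrative evaluation of the routing matrix above. Field names on the
# candidate record are assumptions; the row precedence follows the table.
SENIOR_PLUS = {"staff", "principal", "manager", "director"}

def default_route(stage: str, seniority: str, feedback_requested: bool,
                  referrer_priority_high: bool = False) -> str:
    if stage not in {"onsite", "final_loop"}:
        return "OUT OF SCOPE - use templated decline"
    if referrer_priority_high:
        return "call"                      # recruiter judgment row
    if stage == "final_loop":
        return "call"                      # overrides the onsite rows
    if seniority in SENIOR_PLUS:
        return "call" if feedback_requested else "email (generic if jurisdiction denies)"
    return "email (specific)" if feedback_requested else "email (generic)"

print(default_route("onsite", "staff", feedback_requested=True))   # -> call
print(default_route("onsite", "mid", feedback_requested=False))    # -> email (generic)
```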
## Email format — specific feedback (consent + safe jurisdiction)
```markdown
Subject: Update on your {role_title} interview at {company_name}
Hi {candidate_first_name},
Thank you for the time you invested in our interview process — the
{round_1_label}, {round_2_label}, and the conversations with the
team. We appreciated the care you put into each stage.
After the team's debrief, we have decided not to move forward with
your candidacy for this role.
You asked for feedback, so here is what stood out from the loop:
- **What went well.** {strength_phrasing_from_mapping}.
- **Where the team landed differently.** {gap_phrasing_from_mapping}.
This was the dimension that drove the team's decision.
This feedback is specific to the loop you ran with us; it is not a
ranking against other candidates and it is not a comment on your
overall engineering ability.
If a future role at {company_name} matches your background, we would
welcome your application.
Best,
{recruiter_first_name}
```
Constraints baked into this template:
- One strength, one gap. No more.
- The phrase "not a ranking against other candidates" is mandatory, because it pre-empts the most common candidate response loop ("how did I compare").
- The phrase "not a comment on your overall engineering ability" is mandatory, because it isolates the feedback to this loop and pre-empts the "you said I am bad at engineering" escalation.
- "We would welcome your application" — neutral future language. Not "we will reach out", not "next time".
## Email format — generic decline (deny jurisdiction OR no consent OR no surfacable specific)
```markdown
Subject: Update on your {role_title} interview at {company_name}
Hi {candidate_first_name},
Thank you for the time you invested in our interview process. We
appreciated the care you put into each stage.
After the team's debrief, we have decided not to move forward with
your candidacy for this role.
If a future role at {company_name} matches your background, we
would welcome your application.
Best,
{recruiter_first_name}
```
This is the safe default. The skill writes this template byte-for-byte when:
- `jurisdiction_policy` returned `unsolicited_feedback: deny` and `feedback_requested: false`
- step 3 surfaced no rubric dimension with both `mean ≤ 2` AND a verbatim evidence string
- a legal flag on the candidate file is present
- the loop has under two signed-off scorecards
Generic decline is honest. Weak specifics are worse than no specifics.
## Call-notes format
```markdown
# Call notes — {candidate_first_name} {candidate_last_initial}. ({role_title})
## Frame
- Open with thanks for the time invested.
- Lead with the strength: {strength_phrasing_from_mapping}.
- Single gap: {gap_topic_from_approved_list}. One sentence, no piling
on.
## Suggested phrasing for the gap
"{gap_phrasing_from_mapping}"
## Likely candidate questions
Q: "Was there anything I could have done differently?"
A: Acknowledge the question. Refer back to the single gap. Do NOT
add new feedback dimensions on the call — anything not in the
written draft is off-script and creates inconsistency risk.
Q: "Will you keep me in mind for future roles?"
A: Yes if true; specifics on what kind of role. Do NOT promise a
timeline.
Q: "Can I get a second-look interview?"
A: No. The decision is final. The recruiter reiterates appreciation
and closes.
Q: "Who else interviewed?"
A: Decline. Interviewer identities are protected. "I cannot share
that, but I can tell you the team weighed the input from every
round."
Q: "What did interviewer X think?"
A: Decline. Same reason. "I cannot break out individual scores; the
decision was a team decision."
## Off-script
If the candidate raises a discrimination concern, comparative-ranking
question, or accommodation issue, the recruiter says "let me come
back to you on that" and routes to HR / counsel. The recruiter does
NOT improvise an answer.
## Call duration target
10-15 minutes. Past 20 minutes, the call is no longer feedback —
it is an extended negotiation about the decision, and that is not
a useful place to be.
```
## Audit-log line format
One JSON object per line in `audit/<YYYY-MM>.jsonl`:
```json
{
"run_id": "uuid-v4",
"candidate_id_hash": "sha256-of-candidate-id",
"role_id": "role-slug",
"jurisdiction": "US-CA",
"feedback_requested": true,
"route": "email",
"rubric_sha256": "abcdef...",
"blocklist_sha256": "abcdef...",
"mapping_sha256": "abcdef...",
"dimensions_surfaced": ["technical_depth"],
"blocklist_hits": 0,
"model_id": "claude-sonnet-4-5",
"timestamp": "2026-05-03T14:00:00Z"
}
```
No raw candidate ID, no candidate name, no scorecard text, no draft text. The audit log is for run reproducibility, not data retention. Candidate-facing drafts live in `drafts/<id>.md` under the recruiter's own retention policy.
## Last edited
{YYYY-MM-DD}