A Claude Skill that ingests an AE’s last ten Gong calls and outputs a coaching note in a fixed three-things-working / two-things-to-tighten / one-specific-exercise shape. Built for managers who want consistent weekly coaching but do not have ten hours a week to listen back to calls. The skill writes the draft; the manager edits, contextualizes with off-call signal, and delivers in the next one-on-one. Auto-send is explicitly out of scope.
When to use
You are a sales manager with 4-10 direct AE reports. You want every rep to get the same depth of coaching every week, not just the loud ones with messy deals. You have Gong, your team has a call rubric, and you can carve thirty minutes a week per rep to review and edit the draft. The skill collapses the “listen to ten calls plus take notes plus structure feedback” loop from ~3 hours per rep to ~30 minutes.
Use it weekly. Daily is too much (no rep changes behavior in 24 hours). Monthly is too rare (the calls being cited are stale and the patterns harder to remember). Friday afternoon delivery is the sweet spot — the AE has the weekend to absorb without it eating their selling day.
When NOT to use
Performance Improvement Plans, formal HR processes, comp decisions, or termination paperwork. This is a coaching note, not a personnel record. PIPs require HR involvement, signed acknowledgment, and a defensibility bar this Skill is not designed to meet. The artifact bundle’s references/03-escalation-criteria.md defines the binary signals that route a rep out of the coaching path entirely.
A rep you do not directly manage. The Skill verifies manager-of-record against Gong and refuses if there is no match. Wrong-manager output would expose call content the invoker should not see — the highest-impact failure mode this Skill could enable if unguarded.
Brand-new ramping reps in their first 30 days. Ramp coaching is its own conversation: shadowing, role-play, certification scoring. A weekly rubric-scored note from ten thin calls produces noise. Wait until the rep has three weeks of real-customer call volume.
A week with fewer than three usable calls. The Skill returns Insufficient call data rather than padding. Two calls do not establish a pattern; pretending they do is hindsight bias laundered through markdown.
Setup
Drop the bundle in. The full Skill is in apps/web/public/artifacts/ae-rep-coaching-skill/SKILL.md along with three reference templates. Copy the directory into ~/.claude/skills/ae-rep-coaching/ (or your team’s project-level .claude/skills/) so Claude Code picks it up.
Auth Gong. API key with read access to the AE’s calls and transcripts. The Skill respects Gong’s permission model; if the AE’s calls are restricted, the Skill cannot see them, which is the correct behavior.
Replace the templates with your real artifacts. The bundle ships three placeholder reference files. Each one is generic until you fill it in with your team’s actual content:
references/01-coaching-rubric-template.md — your per-call-type rubric (discovery, demo, negotiation, closing). The pilot ships with five criteria per type as a starting shape.
references/02-coaching-note-format.md — the exact Markdown shape every weekly note uses. Fixed format is deliberate so the AE scans for what changed week-over-week.
references/03-escalation-criteria.md — the binary signals that cause the Skill to refuse to write a coaching note and surface a performance-conversation flag instead.
Set the rep ID and the date range. Default is the last ten calls or the last fourteen days, whichever is shorter. Cap the window at thirty days; older calls reflect a different version of the rep.
Decide the delivery cadence and channel. Weekly, Friday afternoon, in the next 1:1 or as a DM the rep can read before the 1:1. Get the AE’s buy-in on the channel before the first note lands cold.
What the skill actually does
Six steps, in order, no parallelization:
Verify manager-of-record. Hard refusal if the invoking user is not the rep’s direct manager in Gong.
Pull recent calls filtered for length (≥5 min), attendee count (≤5 external), and audio quality. Stop at fewer than three usable calls.
Classify each call as discovery / demo / negotiation / closing. Per-call classification (not a blanket assumption) is the engineering choice that prevents the noisiest early failure mode: applying a discovery rubric to a negotiation call.
Score against the matching rubric section with a transcript citation per score. No score without a citation — this guard rules out gut-feel feedback dressed up as analysis.
Run the escalation check against 03-escalation-criteria.md. If any criterion fires (misrepresentation of pricing, hostile tone, side-deal promises, etc.) the Skill stops and returns the escalation block instead of the coaching note.
Aggregate into three / two / one with a fixed-format render. Patterns require ≥2 supporting calls each; the Skill returns one “tighten” instead of two when only one is supported. Never pads.
The exercise at the end is the most important output. “Improve discovery” is useless. “On your next three discovery calls, ask the budget question before minute twenty” is actionable, observable in Gong next week, and auditable. The Skill is prompted hard to make the exercise concrete and measurable.
Cost reality
Per coaching note (one rep, ten calls, ~10k tokens of transcript per call after filtering, Claude Sonnet 4.5):
~120k input tokens (transcripts + rubric + format + escalation reference) → ~$0.36 at current Sonnet pricing.
~2k output tokens (the note itself) → ~$0.03.
~$0.40 per rep per week, $1.60 per rep per month.
For a manager of eight reps, that is ~$13/month in token cost.
Time saved per manager per week: roughly 2.5 hours per rep collapses to 30 minutes (review + edit). For eight reps, that is ~16 hours back per week. The realistic floor is closer to ~10 hours back once you account for the additional 1:1 time spent talking through the note with each rep — which is the point.
Gong cost is not incremental; if you have Gong already (the Skill’s hard dependency) the API access is bundled.
Success metric
Watch one number for one quarter: percentage of weekly 1:1s where the AE references their previous week’s exercise unprompted. If that number is not above 50% by week 6, the exercises are not specific enough — go back and tighten the wording in 02-coaching-note-format.md so “exercise” cannot collapse into “general advice.”
Secondary signals (slower, noisier): close-rate movement on the specific call type the rep is being coached on, ramp time for new reps once the loop is established, manager-rated 1:1 quality.
vs alternatives
Manager-written notes from scratch. Better fidelity if the manager actually does it. The catch is consistency: under load, manager-written notes either skip a week, get reduced to “good job, keep going,” or get written for the deal-in-trouble rep while the steady performers go uncoached. The Skill does not produce better notes than a great manager with infinite time; it produces better notes than the same manager under realistic load.
Gong’s built-in scorecards and coaching features. Gong scores individual calls against a rubric and surfaces aggregate trends. Useful, complementary, and cheaper than this skill if you only need scoring. What it does not do well is synthesize across calls into the three / two / one shape with a specific exercise. You can stack: use Gong scorecards for per-call grading, use this Skill for the weekly synthesis.
Membrain, Spekit, or other dedicated sales coaching platforms. Heavier setup, additional license, broader capability (skill libraries, learning paths). Right answer for large enterprise sales orgs with dedicated enablement headcount. Wrong answer for a single sales manager who just wants weekly notes that do not eat Friday.
Status quo. “I’ll get to coaching after pipeline review.” Pipeline review is never done.
Watch-outs
Bias amplification. Rubric scoring against transcripts can encode reviewer bias — verbose reps score higher on “rapport,” non-native English speakers score lower on “discovery flow,” reps with louder customers score higher on “asks for reaction” even when the rep did nothing. Guard: every score requires a transcript citation, not a vibe; the rubric is reviewed quarterly with the full team rather than maintained by one manager in isolation; the Skill warns when the rubric is older than 90 days. See apps/web/public/artifacts/ae-rep-coaching-skill/references/01-coaching-rubric-template.md.
Wrong-manager data leakage. A manager pulls a coaching note on a rep they do not manage and ends up reading call content they should not see. Guard: step 1 of the Skill verifies manager-of-record against Gong’s permission model before any transcript loads. Hard refusal on mismatch — no partial output, no workaround suggestion.
Stale rubric. A rubric written 18 months ago for cold outbound discovery applied to today’s product-led inbound demos produces irrelevant feedback that the AE quietly stops reading. Guard: each rubric file carries a last_edited date; the Skill prepends a warning to the coaching note when the matching rubric section is older than 90 days, and a quarterly rubric review is part of the rollout plan, not optional.
Tone calibration. Coaching notes that read like a written warning get ignored or escalate the relationship. Guard: the Skill enforces “trusted peer” voice in 02-coaching-note-format.md, the escalation check in step 5 routes performance-conversation signals out of the coaching path entirely, and the rollout plan calls for testing the tone with one of your strongest AEs before sending one cold.
Manager-of-record dependency. This is a tool for the manager, not a replacement. The output is a draft. The manager edits, contextualizes with off-call observations (1:1 history, ramp stage, deal context the call did not show), and delivers in person. Auto-send is intentionally not in the bundle.
Stack
Gong — call source, transcript layer, manager-of-record verification, optional per-call scorecards as a complementary layer
Claude (Sonnet 4.5 or higher) — rubric scoring, escalation check, synthesis into the three / two / one shape
Coaching rubric, note format, escalation criteria — the three reference files in apps/web/public/artifacts/ae-rep-coaching-skill/references/ that turn a generic “score this transcript” into a coaching loop your team actually owns
---
name: ae-rep-coaching
description: Generate a weekly coaching note for a single AE from their last ten Gong calls. Output is a three-things-working / two-things-to-tighten / one-specific-exercise note that the manager edits and delivers — never auto-sent. Use weekly per direct report.
---
# AE rep coaching
## When to invoke
Invoke when a sales manager wants a structured coaching draft for one direct report based on recent call activity. Take a Gong rep ID and a date window as input and produce a Markdown coaching note grounded in specific call moments.
Do NOT invoke for:
- Performance Improvement Plans (PIPs), formal HR processes, or any document that becomes part of an employee's official record. This skill writes a coaching note, not a personnel file. PIPs require HR involvement, signed acknowledgment, and a defensibility bar this skill is not designed to meet.
- Compensation, promotion, or termination decisions. The skill scores call behaviors against a rubric. It does not assess overall contribution, ramp trajectory, or pipeline coverage.
- Peer-to-peer feedback (no manager-of-record context, no authorization to read calls).
- A rep the invoking user does not directly manage. The Skill checks the requesting user's manager-of-record status against the rep ID and refuses if there is no match. Wrong-manager data leakage is the highest-impact failure mode this skill could enable.
- Calls that are not internal sales activity (customer-success QBRs, partner calls, internal syncs miscategorized in Gong).
## Inputs
- Required: `rep_id` — the Gong user ID for the AE being coached.
- Required: `manager_id` — the Gong user ID of the invoking manager. Used to verify the manager-of-record relationship before any transcript is read.
- Optional: `window_days` — how far back to pull. Default 14. Cap at 30; older calls reflect a different version of the rep.
- Optional: `max_calls` — cap on number of calls analyzed. Default 10. Set lower (5-7) for high-volume SDRs to keep token cost bounded.
- Optional: `call_types` — restrict to one or more of `discovery|demo|negotiation|closing`. Default: all four.
## Reference files
Read all of the following from `references/` before generating the note. These are the user's own coaching artifacts. Without them, the output is generic feedback that any sales-coaching blog could produce.
- `references/01-coaching-rubric-template.md` — the per-call-type rubric the Skill scores against. Replace the template with your team's actual rubric.
- `references/02-coaching-note-format.md` — the literal Markdown format and tone the manager wants. The Skill matches this style rather than inventing one per run.
- `references/03-escalation-criteria.md` — the signals that mean "stop, this is not a coaching moment, this is a performance conversation." When any criterion fires, the Skill refuses to produce the coaching note and surfaces the criteria instead.
## Method
Run these steps in order. Do not parallelize — each step depends on data from the previous one, and the escalation check must run before any analysis is committed to the output.
### 1. Verify manager-of-record
Query Gong for the rep's manager-of-record. If it does not match `manager_id`, refuse the request and return:
```
Refused: <manager_id> is not the manager of record for <rep_id>. Coaching notes are written by the direct manager only.
```
This is a hard refusal. Do not produce a partial note, do not suggest workarounds. Wrong-manager output is the most damaging failure mode this Skill could enable.
### 2. Pull recent calls
Use `pull_recent_calls(rep_id, window_days, max_calls, call_types)`. Filter out:
- Calls under five minutes (no signal, mostly logistics).
- Calls with more than five external attendees (group dynamics drown out the rep's behavior).
- Calls flagged in Gong as `bad_audio` or `transcription_failed`.
If fewer than three usable calls remain after filtering, stop and return: `Insufficient call data: <N> usable calls in window. Extend window_days or wait for more activity.` Producing a coaching note from one or two calls is hindsight bias amplification.
### 3. Classify call type
For each remaining call, take the Gong stage tag if present. If absent, classify the call into `discovery|demo|negotiation|closing` based on transcript content. Engineering choice: classification is explicit and per-call rather than blanket because applying a discovery rubric to a negotiation call produces irrelevant scoring (the noisiest failure mode in early rollouts).
### 4. Score against rubric
For each call, load the matching rubric section from `01-coaching-rubric-template.md`. Score each criterion 1-5 with a specific transcript citation (timestamp + 1-2 sentence quote). No score without a citation — this guard rules out gut-feel feedback dressed up as analysis.
### 5. Run escalation check
Before aggregating, evaluate every criterion in `03-escalation-criteria.md` against the scored calls. If any criterion fires (e.g. repeated misrepresentation of pricing, deal terms invented on the fly, hostile tone toward customer), stop and return the escalation block instead of the coaching note. Coaching is the wrong intervention for these signals; a performance conversation with HR involvement is.
### 6. Aggregate into three / two / one
Across the scored calls, identify:
- **Three patterns working.** Behaviors that scored 4-5 on multiple calls. Cite at least two calls per pattern.
- **Two patterns to tighten.** Behaviors that scored 1-2 on multiple calls. Cite at least two calls per pattern. If only one weak pattern is supported by ≥2 calls, return one — never pad to two.
- **One specific exercise.** A concrete, measurable behavior change for the upcoming week, tied to the strongest "tighten" pattern.
### 7. Render the note
Use the format in `02-coaching-note-format.md` exactly. Engineering choice: the format is fixed (not regenerated per run) so the AE sees the same shape every week and can scan for what changed.
## Output format
```markdown
# Coaching note — {Rep name}, week of {YYYY-MM-DD}
Calls analyzed: {N} ({list of call types and dates})
Window: last {window_days} days
## Three things working
1. **{Pattern}.** Cited in: {Call A — timestamp}, {Call B — timestamp}.
"{short quote}"
2. **{Pattern}.** Cited in: {Call C — timestamp}, {Call D — timestamp}.
"{short quote}"
3. **{Pattern}.** Cited in: {Call E — timestamp}, {Call F — timestamp}.
"{short quote}"
## Two things to tighten
1. **{Pattern}.** Cited in: {Call G — timestamp}, {Call H — timestamp}.
"{short quote}". Why it matters: {one sentence linked to deal outcomes}.
2. **{Pattern}.** Cited in: {Call I — timestamp}, {Call J — timestamp}.
"{short quote}". Why it matters: {one sentence linked to deal outcomes}.
## One exercise for next week
On your next {N} {call type} calls, {specific measurable behavior}.
Success looks like: {observable outcome the manager can verify in Gong}.
---
Draft by ae-rep-coaching skill. Manager edits and delivers; this note
is not auto-sent.
```
## Watch-outs
- **Coaching is not a performance review.** A coaching note that reads like a written warning gets ignored or escalates the relationship. Guard: the prompt forces "trusted peer" voice and the escalation check in step 5 routes performance issues out of the coaching path entirely.
- **Wrong-manager data leakage.** If a manager pulls a coaching note on a rep they do not manage, the Skill exposes call content the invoker should not see. Guard: step 1 verifies manager-of-record against Gong's permission model before any transcript is loaded; hard refusal on mismatch.
- **Bias amplification.** Rubric scoring against transcripts can encode reviewer bias (verbose reps score higher; non-native English speakers score lower on "discovery rapport"). Guard: every score requires a transcript citation, not a vibe; the rubric is reviewed quarterly with the full team, not maintained by one manager in isolation.
- **Stale rubric.** A rubric written for cold outbound applied to warm inbound demos produces irrelevant feedback. Guard: each rubric file carries a `last_edited` date; the Skill prepends a warning to the coaching note if the matching rubric section is older than 90 days.
- **Hindsight bias.** Two calls do not establish a pattern. Guard: step 2 refuses to produce a note with fewer than three usable calls; "patterns" require ≥2 supporting calls each.
- **Manager-of-record dependency.** This is a tool for the manager, not a replacement. The output is a draft. The manager edits, contextualizes with off-call observations (1:1 history, ramp stage, deal context), and delivers in person. Auto-sending is explicitly out of scope.
# Coaching rubric — TEMPLATE
> Replace this template's contents with your team's actual rubric per
> call type. The ae-rep-coaching skill loads the matching section
> based on each call's classification. Without your real rubric, the
> coaching note reflects generic sales-coaching wisdom rather than
> what your team has decided "good" looks like.
## How to use this rubric
Each call type has 4-6 criteria. Each criterion is scored 1-5 against a transcript with a citation (timestamp + 1-2 sentence quote). The Skill aggregates scores across calls; this file defines what is being scored, not how the aggregation works.
Update `last_edited` at the bottom every time you change the rubric. The Skill warns the manager when the rubric is older than 90 days.
## Discovery rubric
| # | Criterion | What 1-2 looks like | What 4-5 looks like |
|---|---|---|---|
| 1 | Opens with explicit agenda | No agenda; jumps to demo or pitch | States agenda, asks for additions, gets explicit buy-in |
| 2 | Uncovers measurable pain | Asks "any pain points?" generically | Pain quantified — money, time, headcount, risk |
| 3 | Maps the buying committee | Talks only to the loudest person | Names roles + economic buyer, asks who else weighs in |
| 4 | Tests budget and timeline early | Avoids both, hopes for the best | Both surfaced before minute 25, no awkwardness |
| 5 | Confirms next step before ending | "We'll be in touch" | Specific next step + date + named owner on each side |
## Demo rubric
| # | Criterion | What 1-2 looks like | What 4-5 looks like |
|---|---|---|---|
| 1 | Demo is anchored to discovery | Generic feature tour | Tied to ≥3 pains surfaced in discovery, named explicitly |
| 2 | Talks less than the customer overall | Monologue, ratio > 70/30 rep | Ratio ~50/50 or customer-heavier |
| 3 | Asks for reaction every 5-7 min | Talks for 20 min uninterrupted | Asks "does this match what you described?" repeatedly |
| 4 | Surfaces objections proactively | Hopes objections do not come up | Names the likely objection before the buyer does |
| 5 | Closes with mutual action plan | "Let me know what you think" | Confirms next step, decision criteria, and decision date |
## Negotiation rubric
| # | Criterion | What 1-2 looks like | What 4-5 looks like |
|---|---|---|---|
| 1 | Anchors on value before price | Leads with discount | Recaps quantified pain before discussing terms |
| 2 | Trades, never gives | Gives discount unilaterally | Every concession paired with an ask (term length, refs, timing) |
| 3 | Confirms the real decision criteria | Assumes price is the issue | Surfaces and confirms the actual blocker |
| 4 | Multi-threads the close | Single contact carrying the deal | Engages economic buyer + at least one other |
| 5 | Documents agreement same-day | Verbal only | Recap email out within 24h with terms + next step |
## Closing rubric
| # | Criterion | What 1-2 looks like | What 4-5 looks like |
|---|---|---|---|
| 1 | Confirms the path-to-signature | Asks "are we good?" | Walks through procurement, legal, security as a sequence |
| 2 | Owns the close date | Lets the buyer set the timeline | Names the date and gets explicit agreement |
| 3 | Pre-empts last-minute asks | Surprised by procurement requests | Has surfaced the redlines and asks before this call |
| 4 | Maintains champion engagement | Champion goes silent in week 2 | Champion is co-steering with the rep, has internal coverage |
## Last edited
{YYYY-MM-DD}
# Coaching note format — TEMPLATE
> The ae-rep-coaching skill renders its weekly note in this exact
> shape. Adjust the wording (tone, signoff, emoji policy) to match
> how your team writes; do not change the section count or order
> without updating the Skill's `Output format` section in lockstep.
## Why a fixed format
The AE reads one of these every week. A fixed shape lets them scan for what changed week over week instead of re-parsing the structure each time. The same reason quarterly business reviews use a fixed deck template — the cognitive load belongs on the content, not the container.
## The literal format
```markdown
# Coaching note — {Rep first name}, week of {YYYY-MM-DD}
Calls analyzed: {N} ({list of call dates and types})
Window: last {window_days} days. Rubric version: {YYYY-MM-DD}.
## Three things working
1. **{Crisp one-line pattern name}.** Cited in: {Call name — HH:MM},
{Call name — HH:MM}.
> "{1-2 sentence transcript quote}"
What you did: {one sentence in trusted-peer voice}.
2. **{Pattern}.** Cited in: {Call — HH:MM}, {Call — HH:MM}.
> "{quote}"
What you did: {one sentence}.
3. **{Pattern}.** Cited in: {Call — HH:MM}, {Call — HH:MM}.
> "{quote}"
What you did: {one sentence}.
## Two things to tighten
1. **{Crisp one-line pattern name}.** Cited in: {Call — HH:MM},
{Call — HH:MM}.
> "{quote}"
Why it matters: {one sentence linked to deal outcome — cycle
length, conversion, expansion potential, churn risk}.
The shift: {one sentence describing the alternative behavior}.
2. **{Pattern}.** Cited in: {Call — HH:MM}, {Call — HH:MM}.
> "{quote}"
Why it matters: {one sentence}.
The shift: {one sentence}.
## One exercise for next week
On your next {N} {call type} calls, {specific measurable behavior —
e.g. "ask the budget question before minute 20" or "surface one
objection proactively before the buyer raises it"}.
Success looks like: {one sentence — observable signal in Gong the
manager will check next week, not "feel more confident"}.
---
Draft generated by the ae-rep-coaching skill from the last
{window_days} days of calls. Your manager edits this before
delivering. If anything in here does not match your read of the
calls, push back — the rubric serves the conversation, not the
other way around.
```
## Voice rules
- Trusted peer, not performance reviewer. Read aloud — if it sounds like a written warning, rewrite.
- Specific over flattering. "You opened with an agenda on three of four calls" beats "great job on agendas."
- No corporate hedging. "You did" / "tighten this" / "try this." No "you might consider perhaps."
- One exercise, not five. Five exercises is zero exercises.
## Last edited
{YYYY-MM-DD}
# Escalation criteria — TEMPLATE
> Replace this template's contents with the criteria your team has
> agreed mark a "stop coaching, start a performance conversation"
> moment. The ae-rep-coaching skill evaluates every criterion before
> producing the weekly note. If any criterion fires, the Skill
> refuses to render the coaching note and returns this list with the
> matching evidence instead.
## Why a hard separation
Coaching notes are formative, low-stakes, weekly. Performance conversations are summative, high-stakes, on-record, and involve HR. Confusing the two damages the rep both ways: small issues become existential, real issues get softened into "patterns to tighten" and go unaddressed. The Skill enforces the separation by refusing to write a coaching note when any of the criteria below fires.
## Criteria
Each criterion is binary (fires or does not). The Skill quotes the transcript evidence when reporting a fire, so the manager can verify before acting.
### 1. Misrepresentation of product capability
The rep claims a capability the product does not have, in a way a reasonable buyer would rely on. Example: "Yes, we support SOC 2 Type 2 out of the box" when the product has Type 1 only.
### 2. Misrepresentation of pricing or contract terms
The rep states pricing, discount authority, term length, or termination terms inconsistent with what is in the actual contract template or pricing book. One-off slips happen; a pattern across multiple calls is escalation.
### 3. Hostile or disrespectful tone toward the customer
Sarcasm, dismissiveness, raised voice, interrupting repeatedly. The Skill cites timestamps and quotes; the manager makes the judgment call on intent, but a pattern triggers the criterion regardless of intent.
### 4. Side deal or off-contract promise
The rep promises something — feature, timeline, credit, side letter — that is not part of the standard contract and was not approved by deal desk or the manager. This is a legal exposure issue, not a coaching issue.
### 5. Discovery of harassment, discrimination, or compliance issue
Anything that surfaces in a call (rep behavior or customer behavior) that triggers HR or legal review. The Skill never tries to write this up itself; it surfaces the timestamp and stops.
### 6. Visible signs of burnout or distress
Repeated mentions of being overwhelmed, audible distress on calls, patterns suggesting a mental-health concern. Coaching is not the right intervention; a 1:1 conversation, EAP referral, or workload review is.
## What the Skill returns when a criterion fires
```markdown
# Escalation — not a coaching moment
The ae-rep-coaching skill stopped before writing a coaching note for
{Rep name} because the following criterion fired:
**{Criterion name}**
Evidence:
- {Call name — HH:MM}: "{quote}"
- {Call name — HH:MM}: "{quote}"
Recommended next step: {appropriate path — HR conversation, deal
desk review, EAP referral, etc.}. Do not paste this output into a
performance document; it is a flag, not a finding. Verify the
evidence yourself before acting.
```
## Last edited
{YYYY-MM-DD}