A Claude Skill that audits outside-counsel invoices line-by-line against the firm’s billing guidelines (block-billing, vague time entries, partner staffing on associate work, double-billing of expenses, premium-time charges that violate the cap). It returns a structured audit report with per-line citations, the guideline violated, the recommended adjustment, and a confidence band, but it never reduces the bill automatically. The legal-ops lead reviews the report and decides which adjustments to negotiate. This replaces the manual line-by-line read of a 20-page invoice with a 20-40 minute review of a structured report.
When to use
The firm has written outside-counsel billing guidelines (block-billing prohibition, time-entry minimum detail, partner/associate staffing ratios, expense reimbursement rules). Without guidelines the skill has nothing to audit against.
Monthly invoices from one or more named outside firms, in LEDES 1998B / 1998BI / 2000 format, or as PDF / Excel that can be parsed line-by-line.
The legal-ops lead or senior in-house counsel reviews the audit report before any adjustment is communicated to the outside firm. The skill writes; humans decide.
When NOT to use
Auto-reducing the bill without review. The skill flags; it does not adjust. Auto-deducting based on the skill’s findings damages the outside-counsel relationship, may violate the engagement letter’s dispute procedure, and exposes the firm to a fee-arbitration counterclaim. The skill’s output is decision support.
Disputing every flagged line. The skill catches volume that a human reviewer might miss. Picking battles is the legal-ops lead’s job; not every block-billing instance is worth the relationship cost of disputing.
Bills with a complex flat-fee or alternative-fee structure. The skill is calibrated to hourly billing. AFA-billed engagements need a different audit shape (deliverable completion, scope creep) that this skill does not cover.
Replacing e-billing platform validators. Onit, SimpleLegal, Brightflag, etc. have rule-engine validators built in. The skill is a layer ON TOP of those, for the prose-judgment findings the rule engine can’t catch: a “0.6h research” entry, for example, technically follows the format but is too vague to evaluate.
Setup
Drop the bundle. Place apps/web/public/artifacts/outside-counsel-bill-audit-skill/SKILL.md into your Claude Code skills directory.
Author the firm’s billing guidelines. Copy references/1-billing-guidelines-template.md, replace every placeholder. The guidelines should include: time-entry detail requirements, block-billing prohibition, staffing ratios (partner / senior associate / junior associate / paralegal), expense reimbursement rules, premium-time caps, scope-of-work boundaries.
Configure the LEDES parser, or prepare a pre-parsed CSV for non-LEDES invoices. The bundle includes a parser that handles LEDES 1998B, 1998BI, and 2000; for PDF or Excel invoices, the skill expects a pre-parsed line-item CSV.
Set the per-firm calibration. Different outside firms have different baseline behavior. The skill’s output includes a per-firm calibration note (“this firm’s block-billing rate has dropped from 18% to 9% over six months”) that helps the legal-ops lead read findings in context.
Dry-run on a closed month. Audit last month’s invoices. Compare the skill’s findings to the legal-ops lead’s manual review of the same invoices. Tune the guidelines if the skill flags things the lead doesn’t care about, or misses things the lead does.
What the skill does
Six steps. Deterministic checks come before the LLM evaluation, because deterministic violations (block-billing format, expense double-entry) are reproducible and don’t need model judgment.
Validate the invoice format — confirm the LEDES file parses, or the CSV has the expected columns. Halt on parse failure.
Run deterministic checks — block-billing detection (single time entry covering >1 task description), expense double-entry (same expense ID on two lines), staffing-ratio breach (partner hours on a task type the guidelines name as associate work), premium-time cap breach (off-hours surcharge above the contractual cap).
Read the firm’s billing guidelines and the engagement letter’s specific terms. The guidelines are the comparison anchor; engagement-letter overrides are noted per matter.
Per-line LLM evaluation for the cases deterministic checks can’t cover: vague time-entry descriptions, scope-creep signals, work that should have been included in a flat fee but appears as a separate line. For each finding, cite the guideline section and the specific line.
Aggregate by guideline category — total hours / dollars by violation type. The aggregate is the negotiation lever, not individual lines.
Emit report + audit log — structured Markdown report for the legal-ops lead, plus a JSONL audit log entry per audit run for the firm’s spend-tracking system.
Cost reality
Per invoice (typical 200-800 line items), on Claude Sonnet 4.6:
LLM tokens — typically 30-80k input (invoice + guidelines + skill instructions) and 5-10k output. Roughly $0.30-0.80 per invoice. Heavy invoices (>1,000 lines) may need chunking.
Legal-ops lead time — the win. Manual line-by-line audit of a complex invoice is 2-4 hours. Reviewing the skill’s report and deciding which findings to dispute is 20-40 minutes.
Setup time — 60 minutes once for the guidelines authoring; per-firm calibration is 15 minutes per firm.
Success metric
Adjustment rate per audit — share of audited invoices that result in a negotiated reduction. Should sit at 5-15% (above 25% means too aggressive on findings; below 2% means the skill is missing real overcharges or the guidelines are too soft).
Average reduction per audit — dollar value of negotiated reductions. Should be 1-4% of total invoice value at most outside firms.
Legal-ops lead time per invoice — should drop from 2-4 hours to 30-60 minutes (review + negotiation prep).
vs alternatives
vs e-billing platform rule engines. Onit / SimpleLegal / Brightflag handle the deterministic part well and are the right tool for high-volume firms. The skill is a layer ON TOP for the prose-judgment findings. Use both.
vs manual review. Manual is right for the smallest legal departments where invoice volume is low. The skill earns its setup cost at >5 invoices per month.
vs ChatGPT-style “audit this invoice.” Generic chat returns generic findings. The skill is structured: cited guideline section per finding, deterministic checks first, no auto-adjustment.
Watch-outs
Auto-adjustment drift. Guard: the skill’s output ends with the structured report. There is no “adjusted bill” output. The legal-ops lead is the sole adjustment authority.
Confidentiality of invoice content. Guard: outside-counsel invoices typically contain attorney work-product descriptions that are privileged. The skill runs wherever the calling Claude session runs; for SaaS Claude, the privilege posture is the firm’s responsibility (most BigLaw and corporate-legal departments use API access with zero-retention configuration).
Firm-relationship cost of over-disputing. Guard: the skill’s per-firm calibration note tracks dispute volume over time. If disputes climb, the report flags it for the legal-ops lead.
Guideline drift. Guard: the audit log captures the guidelines-file SHA per run. Guideline changes are visible across invoices.
Hallucinated guideline citations. Guard: every finding cites a specific guideline section by ID; findings without a citable section are flagged as “no matching guideline” rather than asserted.
Stack
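The bundle lives at apps/web/public/artifacts/outside-counsel-bill-audit-skill/. Its files are reproduced below: SKILL.md, then references/1-billing-guidelines-template.md, then references/2-ledes-parser-notes.md.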
---
name: outside-counsel-bill-auditor
description: Audit an outside-counsel invoice line-by-line against the firm's billing guidelines. Returns a structured Markdown report with per-line citations, the guideline violated, and a recommended adjustment per finding. Never adjusts the bill automatically — the legal-ops lead reviews and decides.
---
# Outside-counsel bill auditor
## When to invoke
Use this skill when a legal-ops lead has an outside-counsel invoice (LEDES 1998B/1998BI/2000 or pre-parsed CSV) and the firm's billing guidelines, and wants a structured audit before the invoice is approved or disputed.
Do NOT invoke this skill for:
- **Auto-reducing the bill.** The skill flags; humans decide. Auto-deducting based on findings damages the outside-counsel relationship and may violate the engagement letter's dispute procedure.
- **AFA-billed engagements.** The skill is calibrated to hourly billing.
- **Bills the firm has already approved.** Audit happens before approval, not after.
## Inputs
- Required: `invoice_path` — path to the LEDES file or pre-parsed CSV.
- Required: `guidelines_path` — path to the firm's billing guidelines file (see `references/1-billing-guidelines-template.md`).
- Optional: `engagement_letter_overrides_path` — per-matter overrides to the firm guidelines.
- Optional: `firm_name` — outside firm name, for the per-firm calibration section of the report.
## Reference files
- `references/1-billing-guidelines-template.md` — the firm's guidelines shape.
- `references/2-ledes-parser-notes.md` — LEDES format parsing notes.
## Method
Six steps.
### 1. Validate the invoice format
Parse the LEDES file or CSV. Halt with a parse-error report if the format is malformed. Check that required columns are present: `line_id`, `date`, `timekeeper`, `timekeeper_role`, `task_code`, `activity_code`, `hours`, `rate`, `amount`, `narrative`.
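A minimal sketch of the column check for the CSV path, assuming the invoice has already been normalized to the column names listed above. The function name `missing_columns` is hypothetical and is not part of the bundle.

```python
import csv
import io

# Required columns as named in step 1.
REQUIRED_COLUMNS = [
    "line_id", "date", "timekeeper", "timekeeper_role", "task_code",
    "activity_code", "hours", "rate", "amount", "narrative",
]

def missing_columns(csv_text: str) -> list[str]:
    """Return the required columns absent from the CSV header (empty list = OK)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [col for col in REQUIRED_COLUMNS if col not in header]

# Illustrative invoice fragment with three columns left out.
sample = (
    "line_id,date,timekeeper,hours,rate,amount,narrative\n"
    "1,2026-05-04,JD,0.6,400,240,Research\n"
)
print(missing_columns(sample))
```

A non-empty result is the cue to halt with a parse-error report instead of continuing to the checks.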
### 2. Run deterministic checks
Without invoking the LLM, flag:
- **Block-billing**: a single time entry with a narrative containing multiple distinct task verbs (e.g. "Reviewed contract; drafted letter; called client" in 2.4 hours). Flag at >2 verbs unless the engagement letter explicitly permits.
- **Expense double-entry**: the same `line_id` or `expense_id` referenced twice.
- **Staffing-ratio breach**: partner hours on a `task_code` the guidelines name as associate-or-below work (default: legal research, deposition summaries, document review).
- **Premium-time cap breach**: off-hours surcharge entries totaling more than the engagement letter's monthly cap.
- **Rate variance**: timekeeper rate that deviates from the engagement letter's rate sheet by more than ±2%.
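Three of the checks above can be sketched in a few lines. This is an illustration under stated assumptions, not the bundle's implementation: the verb list is a hypothetical starting set, the helper names are invented for the example, and a production check would need a fuller verb lexicon.

```python
import re
from collections import Counter

# Hypothetical verb list: the guidelines template gives these five as
# examples and leaves the full list to the firm.
TASK_VERBS = {"drafted", "reviewed", "called", "attended", "researched"}

def distinct_task_verbs(narrative: str) -> set[str]:
    """Collect the distinct task verbs that appear in a narrative."""
    words = re.findall(r"[a-z]+", narrative.lower())
    return {w for w in words if w in TASK_VERBS}

def is_block_billed(narrative: str, max_verbs: int = 2) -> bool:
    """Flag when the narrative holds more than `max_verbs` distinct verbs."""
    return len(distinct_task_verbs(narrative)) > max_verbs

def repeated_ids(ids: list[str]) -> list[str]:
    """Return IDs that occur more than once (expense double-entry)."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)

def rate_variance_exceeded(billed: float, sheet: float,
                           tolerance: float = 0.02) -> bool:
    """Flag when the billed rate deviates from the rate sheet by more than 2%."""
    return abs(billed - sheet) / sheet > tolerance

print(is_block_billed("Reviewed contract; drafted letter; called client"))
print(repeated_ids(["E1", "E2", "E1"]))
print(rate_variance_exceeded(850, 825))
```

Because none of these need model judgment, the same invoice always produces the same flags, which is what makes them safe to run before the LLM step.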
### 3. Read the guidelines
Load `guidelines_path` and the engagement-letter overrides if present. The guidelines define the comparison anchors for steps 4 and 5. SHA-256 the guidelines for the audit log.
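Hashing the guidelines file is a one-liner with the standard library. The sketch below hashes the raw bytes, which is an assumption: any whitespace or line-ending change produces a new SHA. The function names are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def guidelines_sha(guidelines_path: str) -> str:
    """SHA-256 of the guidelines file's raw bytes, for the audit log."""
    return sha256_hex(Path(guidelines_path).read_bytes())
```

Calling `guidelines_sha("firm-billing-guidelines.md")` returns a 64-character hex string; the report's Provenance section can show a shortened prefix of it.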
### 4. Per-line LLM evaluation
For lines not flagged by deterministic checks, evaluate against the guidelines for prose-judgment violations:
- **Vague time entries** — "0.6h research" without naming the issue researched, the source consulted, or the deliverable.
- **Scope-creep signals** — work on a topic outside the engagement letter's matter scope.
- **Should-be-flat-fee work** — work that the engagement letter explicitly bundled into a flat fee but appears as a separate hourly line.
- **Internal-conference billing** — multiple timekeepers on the same internal conference where the guidelines limit attendees.
For each finding, cite:
- `line_id`
- `guideline_section_id` (or "no matching guideline" if the finding is intuitive but not codified)
- `recommended_adjustment` — dollar amount or "negotiate" tag
- `confidence` — `high` (clear violation), `medium` (likely violation, judgment call), `low` (signal worth surfacing, may be defensible)
### 5. Aggregate by guideline category
Group findings by violation category. Total hours and dollars by category. The aggregate is the negotiation lever — "block-billing on 23 lines totaling $4,200" is a stronger negotiation point than 23 individual line disputes.
Per-firm calibration: if `firm_name` is provided AND the audit log has prior runs for the same firm, surface the trend ("block-billing rate this month: 9%; prior 6-month average: 12%; trending down").
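The aggregation itself is a group-by over findings. A minimal sketch, assuming each finding is a dict carrying `category`, `hours`, and `dollars` keys; those key names are hypothetical and chosen to mirror the audit-log shape.

```python
from collections import defaultdict

def aggregate(findings: list[dict]) -> dict[str, dict]:
    """Total count, hours, and dollars per violation category."""
    totals = defaultdict(lambda: {"count": 0, "hours": 0.0, "dollars": 0.0})
    for finding in findings:
        bucket = totals[finding["category"]]
        bucket["count"] += 1
        bucket["hours"] += finding["hours"]
        bucket["dollars"] += finding["dollars"]
    return dict(totals)

# Illustrative findings, not real invoice data.
findings = [
    {"category": "block_billing", "hours": 2.5, "dollars": 1000.0},
    {"category": "block_billing", "hours": 1.5, "dollars": 600.0},
    {"category": "vague_entries", "hours": 0.5, "dollars": 200.0},
]
print(aggregate(findings))
```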
### 6. Emit report + audit log
Write the report to stdout in the format below. Append one JSONL line to `audit/<YYYY-MM>.jsonl`:
```json
{
  "audit_id": "uuid",
  "timestamp": "ISO-8601",
  "invoice_id": "...",
  "firm_name": "...",
  "invoice_total_usd": 0,
  "guidelines_sha": "...",
  "findings_by_category": {
    "block_billing": { "count": 0, "hours": 0, "dollars": 0 },
    "vague_entries": { "count": 0, "hours": 0, "dollars": 0 },
    ...
  },
  "skill_version": "1.0",
  "model": "claude-sonnet-4-6"
}
```
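Appending the record could look like the sketch below. It assumes the log month is taken from the current UTC time and that the caller supplies the remaining fields; `append_audit_record` is a hypothetical helper name.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

def append_audit_record(audit_dir: str, record: dict) -> Path:
    """Append one JSON object as a single line to audit/<YYYY-MM>.jsonl."""
    now = datetime.now(timezone.utc)
    path = Path(audit_dir) / f"{now:%Y-%m}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    entry = {
        "audit_id": str(uuid.uuid4()),
        "timestamp": now.isoformat(),
        **record,  # invoice_id, firm_name, findings_by_category, ...
    }
    # JSONL: one compact JSON object per line, opened in append mode.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return path
```

Each run adds exactly one line, so the file can be read back with a plain line-by-line `json.loads` loop.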
## Output format
```markdown
# Bill audit — {firm_name} — {invoice_id}
Audited: {ISO timestamp} · Invoice total: ${total} · Skill v1.0
{PER-FIRM TREND if firm_name has prior runs}
## Aggregate findings
| Category | Lines | Hours | Dollars |
|---|---|---|---|
| Block-billing | 23 | 47.2 | $14,160 |
| Vague entries | 18 | 12.6 | $3,780 |
| Staffing-ratio breach | 4 | 8.0 | $4,800 |
| Premium-time cap breach | 2 | 6.0 | $2,250 |
| Rate variance | 1 | 0.8 | $40 |
Total flagged: $25,030 of $187,400 (13.4%).
## Findings — high confidence
### Block-billing — line 142
- **Narrative:** "Reviewed contract; drafted comments; called opposing counsel" — 2.4h
- **Guideline:** §3.2.a (no time entry shall combine more than one task)
- **Recommended adjustment:** request narrative split into 3 entries; adjust if split shifts hours
### Staffing-ratio breach — line 287
- **Narrative:** "Document review for production" — 4.0h, partner timekeeper rate $850/hr
- **Guideline:** §4.1 (document review staffed at associate level or below)
- **Recommended adjustment:** rebill at junior associate rate ($375/hr); $1,900 reduction
## Findings — medium confidence
(per-line entries with `confidence: medium`)
## Findings — low confidence (informational)
(per-line entries with `confidence: low`)
## Provenance
- Guidelines: `firm-billing-guidelines.md` — SHA `{short}`
- Engagement letter: `engagement-letters/<matter>.md` (if used)
- Audit record: `audit/2026-05.jsonl` line {N}
```
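The figures in the sample report can be reproduced by hand. The sketch below recomputes the staffing-ratio reduction for line 287 and the total-flagged share from the aggregate table; the variable and function names are illustrative only.

```python
def staffing_reduction(hours: float, billed_rate: float, guideline_rate: float) -> float:
    """Dollar reduction when hours are rebilled at the guideline rate."""
    return hours * (billed_rate - guideline_rate)

# Dollar column of the aggregate table in the sample report.
category_dollars = {
    "block_billing": 14_160,
    "vague_entries": 3_780,
    "staffing_ratio_breach": 4_800,
    "premium_time_cap_breach": 2_250,
    "rate_variance": 40,
}
invoice_total = 187_400

flagged = sum(category_dollars.values())
share = round(100 * flagged / invoice_total, 1)

print(staffing_reduction(4.0, 850, 375))  # 1900.0
print(flagged, share)                     # 25030 13.4
```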
## Watch-outs
- **Auto-adjustment drift.** *Guard:* output ends with the structured report; no "adjusted bill" output.
- **Confidentiality of attorney work-product.** *Guard:* invoice narratives are privileged. Process via API access with zero-retention; do not paste invoices into shared chat surfaces.
- **Over-disputing damages firm relationship.** *Guard:* per-firm trend tracking surfaces dispute-volume drift.
- **Hallucinated guideline citations.** *Guard:* every finding cites `guideline_section_id`; findings without a citable section get "no matching guideline" tag rather than fake citations.
# Outside-counsel billing guidelines template
The bill auditor reads guidelines in this shape. Copy this file to `firm-billing-guidelines.md` (or wherever your skill config points), fill in the firm-specific values, and version it in git.
The guidelines are the comparison anchor. Without them the skill has nothing to audit against. Most law departments revise these annually; the skill captures the file's SHA per audit so changes are visible.
## Section IDs
Every guideline carries an ID (`§1.1`, `§3.2.a`, etc.). The skill's findings cite the ID. If you renumber, the audit log's prior findings still reference the old IDs — keep a renumbering mapping or treat renumbering as a guideline-version bump.
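One way to keep citations honest is to extract the IDs from the guidelines file and check every cited ID against that set. A minimal sketch, assuming IDs appear in bold as in this template (`**§3.2.a**`); the regex and function names are hypothetical.

```python
import re

# Matches bolded IDs such as **§1.1** or **§3.2.a**.
SECTION_ID = re.compile(r"\*\*(§\d+(?:\.\d+)*(?:\.[a-z])?)\*\*")

def known_section_ids(guidelines_text: str) -> set[str]:
    """Every section ID defined in the guidelines file."""
    return set(SECTION_ID.findall(guidelines_text))

def resolve_citation(cited: str, known: set[str]) -> str:
    """Keep a citation only if the ID exists; otherwise use the fallback tag."""
    return cited if cited in known else "no matching guideline"

text = (
    "- **§3.1** Every time entry shall name the task verb.\n"
    "- **§3.2.a** No single time entry shall combine more than one distinct task.\n"
)
known = known_section_ids(text)
print(resolve_citation("§3.2.a", known))
print(resolve_citation("§9.9", known))
```

Running the same check against old audit-log entries after a renumbering shows which prior citations no longer resolve.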
## §1 — Engagement scope and pre-approval
- **§1.1** All matters require a written engagement letter naming the matter, the timekeepers, the rate sheet, the budget, and any flat-fee components.
- **§1.2** Work outside the named scope requires written pre-approval from the assigned in-house attorney.
- **§1.3** Budget overruns >10% require written notice within 5 business days; overruns >25% require pre-approval before further work.
## §2 — Timekeeper composition
- **§2.1** Use of new timekeepers (not on the engagement letter's rate sheet) requires written approval before time is billed.
- **§2.2** The rate sheet is fixed for the engagement's duration unless an annual rate adjustment is provided in writing 60 days in advance.
- **§2.3** Rate variance from the rate sheet on any line is a violation regardless of intent.
## §3 — Time entry detail
- **§3.1** Every time entry shall name (a) the task verb, (b) the specific deliverable or document, (c) the issue or matter focus.
  - *Example pass:* "Drafted §4 (Indemnification) of MSA between Acme and BetaCo, focused on caps and carve-outs."
  - *Example fail:* "Worked on contract."
- **§3.2.a** No single time entry shall combine more than one distinct task.
  - Block-billing detection threshold: the deterministic check flags a narrative containing more than 2 distinct task verbs (drafted, reviewed, called, attended, researched, etc.).
- **§3.2.b** Minimum increment: 0.1 hour. No 0.05h or smaller entries; no rounded-up entries (e.g. 0.5h for what was actually 0.2h work).
- **§3.3** No "review" entry without naming what was reviewed and the outcome of the review.
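The minimum-increment rule in §3.2.b lends itself to a mechanical check. The sketch below uses `Decimal` so that values like 0.3 are not distorted by binary floating point; it assumes hours arrive as strings and that expense lines (hours of 0) are filtered out beforehand. It cannot detect rounded-up entries, since that needs information the invoice does not carry.

```python
from decimal import Decimal

def violates_minimum_increment(hours: str, increment: str = "0.1") -> bool:
    """True when hours are below the increment or not a whole multiple of it."""
    value = Decimal(hours)
    step = Decimal(increment)
    return value < step or value % step != 0

for entry in ["0.6", "2.4", "0.05", "0.25"]:
    print(entry, violates_minimum_increment(entry))
```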
## §4 — Staffing ratios
- **§4.1** Document review shall be staffed at associate level or below. Partner time on document review requires written justification.
- **§4.2** Legal research shall be staffed at associate level or below. Partner time on research requires written justification (novel issue, conflict).
- **§4.3** Deposition summaries shall be staffed at paralegal or associate level. Partner / senior counsel time only when the substance requires.
- **§4.4** Client conferences shall not have more than 2 outside-firm timekeepers attending unless pre-approved.
- **§4.5** Internal outside-firm conferences shall not bill more than 2 outside timekeepers unless the conference involves substantive case strategy and is documented as such.
## §5 — Expenses
- **§5.1** Expenses shall be billed at cost. No markup on photocopying, courier, transcript fees, computer research (Westlaw / Lexis), or travel.
- **§5.2** Travel: economy class for flights under 6 hours; business class permitted for transoceanic. Hotel at standard corporate rate, no luxury class without approval.
- **§5.3** Meals: $50 per person per day cap unless client-facing meal with documented rationale.
- **§5.4** No charge for in-firm administrative time (secretarial, file management, billing-related work).
- **§5.5** Computer research: at-cost only. No proration of monthly subscription fees onto matters.
## §6 — Premium time
- **§6.1** Standard hourly rates apply Monday-Friday 7am-9pm in the timekeeper's local time zone.
- **§6.2** Off-hours premium (1.25× standard) requires explicit pre-approval per matter.
- **§6.3** Off-hours premium cap: $5,000 per matter per month, absent written waiver.
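A sketch of how §6.1 and §6.3 might be checked, assuming each premium entry carries `matter`, an ISO `date`, and a `premium_usd` surcharge amount. Those field names are hypothetical, and the timestamp passed to `is_standard_hours` is assumed to already be in the timekeeper's local time zone.

```python
from collections import defaultdict
from datetime import datetime

def is_standard_hours(ts: datetime) -> bool:
    """Monday-Friday, 7am up to (not including) 9pm, per §6.1."""
    return ts.weekday() < 5 and 7 <= ts.hour < 21

def premium_cap_overages(entries: list[dict], cap: float = 5000.0) -> dict:
    """Dollars above the cap, keyed by (matter, YYYY-MM)."""
    totals = defaultdict(float)
    for entry in entries:
        month = entry["date"][:7]  # "2026-05-14" -> "2026-05"
        totals[(entry["matter"], month)] += entry["premium_usd"]
    return {key: total - cap for key, total in totals.items() if total > cap}

# Illustrative entries, not real invoice data.
entries = [
    {"matter": "M1", "date": "2026-05-04", "premium_usd": 3000.0},
    {"matter": "M1", "date": "2026-05-18", "premium_usd": 2750.0},
    {"matter": "M1", "date": "2026-06-02", "premium_usd": 1000.0},
]
print(premium_cap_overages(entries))
```

Only the May total crosses the cap here, so the result holds a single key with the overage for that matter and month.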
## §7 — Discounts and adjustments
- **§7.1** A 10% timely-payment discount applies if the firm pays within 30 days of invoice receipt.
- **§7.2** Disputed lines remain disputed until resolved; the firm may pay the undisputed portion within terms without losing the discount on that portion.
## §8 — LEDES and submission format
- **§8.1** All invoices shall be submitted in LEDES 1998BI or 2000 format unless the firm has approved a paper or PDF alternative in writing.
- **§8.2** Each line shall include `task_code` (UTBMS) and `activity_code` (UTBMS) values from the engagement letter's permitted list.
- **§8.3** Invoices shall be submitted within 30 days of month-end. Late invoices may be rejected.
## How the skill uses each section
- **Deterministic checks**: §3.2.a (block-billing verb count), §3.2.b (minimum increment), §4.1-§4.4 (staffing ratios when timekeeper roles are tagged), §5.1 (expense double-entry detection), §6.3 (premium-time monthly cap), §2.3 (rate variance).
- **LLM evaluation**: §3.1 (time-entry detail quality — pass/fail per narrative), §3.3 (review-entry quality), §5.4 (administrative-time detection in narratives), §7 / §8 (compliance posture).
## Customizing the template
When you adapt this template:
1. Add or remove sections to match the firm's actual guidelines. Don't keep template language that doesn't reflect the firm's policy.
2. Renumber IDs only when necessary; cross-reference old IDs in the changelog so audit-log entries stay interpretable.
3. Document the engagement-letter override path — most law departments allow per-matter exceptions to specific sections.
# LEDES parser notes
The bill auditor's deterministic checks operate on parsed line-item records. Most outside-counsel firms in the US ship invoices in LEDES (Legal Electronic Data Exchange Standard) format. This file documents the formats the skill handles and the per-format quirks.
## Supported formats
### LEDES 1998B (legacy)
- Pipe-delimited flat file. Header row + data rows.
- Each row represents one billed item (time or expense).
- Columns are positional, not named — the parser maps by position per the LEDES 1998B spec.
- Limited expense-detail granularity; expense category is one of ~20 codes.
### LEDES 1998BI (international)
- Same shape as 1998B with currency-code and tax fields added.
- Used by firms billing outside the US or in multiple currencies.
- The skill normalizes amounts to the engagement-letter base currency before deterministic checks.
### LEDES 2000 (XML, less common)
- XML format; richer schema including matter / sub-matter hierarchy and structured timekeeper records.
- The skill parses the timekeeper section once per invoice and joins to time entries by `timekeeper_id`.
- Most US firms still ship 1998B/1998BI; LEDES 2000 is more common in EU.
## Required columns (after parsing)
The skill expects each line, regardless of source format, to land in this normalized shape:
| Column | Type | Notes |
|---|---|---|
| `line_id` | string | Unique within the invoice. |
| `date` | ISO-8601 date | Date the work was performed. |
| `timekeeper_id` | string | LEDES timekeeper ID. |
| `timekeeper_name` | string | Display name. |
| `timekeeper_role` | string | `partner`, `senior_associate`, `associate`, `paralegal`, `other`. The skill needs role to apply staffing-ratio guidelines. |
| `task_code` | string | UTBMS task code (e.g. `L110` for fact investigation/development). |
| `activity_code` | string | UTBMS activity code (e.g. `A101` for plan and prepare for). |
| `hours` | number | 0 for expense lines. |
| `rate` | number | The hourly rate billed. 0 for expense lines. |
| `amount` | number | Line total. |
| `narrative` | string | The free-text time-entry description. |
| `is_expense` | boolean | True for expense lines. |
| `expense_category` | string | UTBMS expense code if `is_expense`; null otherwise. |
## Per-format quirks
### LEDES 1998B narrative width
The 1998B spec doesn't cap narrative width, but some submission portals truncate at 250-500 chars. Firms occasionally submit truncated narratives that look vague when they were originally detailed. The skill flags very-short narratives but does not auto-assume truncation; the legal-ops lead checks the source.
### Timekeeper role inference
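LEDES doesn't carry the normalized `timekeeper_role` values directly (the 1998B `TIMEKEEPER_CLASSIFICATION` code is coarser and has no senior/junior associate split). The skill infers role from the engagement letter's rate sheet (timekeeper rates tier into roles) OR from a per-firm `timekeeper_roles.csv` mapping if provided.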
If neither is available, the skill flags every line where role-dependent guidelines (§4.x) apply as "role unknown — staffing check skipped" rather than guessing.
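The inference order could be sketched as follows. The precedence (explicit mapping first, then rate tiers) and the tier boundaries are assumptions made for the example; the bundle may resolve conflicts differently.

```python
from typing import Optional

def infer_role(
    timekeeper_id: str,
    rate: float,
    role_map: dict[str, str],
    rate_tiers: dict[str, tuple[float, float]],
) -> Optional[str]:
    """Return a role, or None when the staffing check should be skipped."""
    # 1. Explicit per-firm mapping (e.g. loaded from timekeeper_roles.csv).
    if timekeeper_id in role_map:
        return role_map[timekeeper_id]
    # 2. Rate-sheet tiers: the first tier whose range contains the rate.
    for role, (low, high) in rate_tiers.items():
        if low <= rate <= high:
            return role
    # 3. Unknown: the caller reports "role unknown" instead of guessing.
    return None

# Hypothetical mapping and tiers for illustration.
role_map = {"TK01": "partner"}
rate_tiers = {"partner": (700.0, 1200.0), "associate": (300.0, 699.0)}

print(infer_role("TK01", 500.0, role_map, rate_tiers))
print(infer_role("TK02", 375.0, role_map, rate_tiers))
print(infer_role("TK03", 150.0, role_map, rate_tiers))
```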
### Expense detail granularity
LEDES 1998B has a coarse expense category. For finer detail (e.g. distinguishing photocopying from CD-ROM duplication), the skill reads the `narrative` field of expense lines. Firms vary in how detailed expense narratives are.
### Partial-hour rounding
LEDES preserves the actual hours billed; rounding (or non-rounding) is the firm's policy. The skill doesn't enforce rounding policy — that's the engagement letter's job. The skill does flag suspicious patterns (every entry ending in .0 or .5, suggesting the firm rounds aggressively).
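The suspicious-pattern signal can be approximated as the share of time entries whose hours are a whole multiple of 0.5. This is a sketch only: the function name is hypothetical, and the threshold at which the share becomes worth surfacing is left to the firm.

```python
from decimal import Decimal

def half_hour_share(hours: list[str]) -> float:
    """Fraction of time entries whose hours end in .0 or .5."""
    values = [Decimal(h) for h in hours]
    time_entries = [v for v in values if v > 0]  # skip expense lines (0 hours)
    if not time_entries:
        return 0.0
    rounded = sum(1 for v in time_entries if v % Decimal("0.5") == 0)
    return rounded / len(time_entries)

print(half_hour_share(["1.0", "2.5", "0.5", "3.0"]))
print(half_hour_share(["1.0", "0.3", "0.7", "2.5"]))
```

A share near 1.0 across a large invoice suggests rounding to half-hours, while a mixed distribution is what tenth-of-an-hour timekeeping usually looks like.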
## CSV fallback
For invoices not in LEDES format (small firms, paper invoices, PDF-extracted), the skill accepts a CSV with the same normalized columns above. The CSV must:
- Use comma delimiter, double-quote text qualifier, UTF-8 encoding.
- Include header row.
- Use ISO-8601 dates.
A pre-parsed CSV is the recommended format when a PDF invoice has been OCR'd — manual cleanup of the CSV is more reliable than auto-extraction from PDF, which often loses table alignment.
## Audit-log line storage
The audit log captures `findings_by_category` (aggregated) and per-line finding IDs, NOT the full invoice. Rationale: invoice content is privileged; the audit log should be retainable longer than the invoice and shouldn't carry the privileged content.
For full reproducibility of a finding, the legal-ops lead can re-run the skill against the original invoice file (which lives in the e-billing platform's record).
## What the skill does NOT do
- Calculate the dispute total (the legal-ops lead picks which findings to dispute).
- Communicate findings to the outside firm (the legal-ops lead handles the conversation).
- Enforce a fixed dispute response window (the engagement letter governs).
- Decide whether the finding is worth disputing relative to the firm relationship.
The skill is decision support, not negotiation automation.