
Interview question bank — competency-tagged prompt pack

Difficulty: beginner
Setup time: 15 minutes
For: recruiter · hiring-manager · interviewer (Recruiting & TA)

A pack of structured prompts for Claude that turn a role rubric into a tiered set of interview questions: behavioral (probe past behavior under named conditions), situational (response to a hypothetical), technical deep-dive (drill into a claimed competency), and reverse questions (what to expect the candidate to ask back, and what each question signals). Every question is tagged with the rubric dimension it probes, the anchors it differentiates between, and the follow-up to ask if the answer is too rehearsed. Replaces the “we just wing it” interview with a question library the panel actually opens before the call.
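
The tagging scheme is easiest to see as a record shape. Below is a minimal sketch in Python; the class and field names are illustrative, not part of the bundle.

```python
from dataclasses import dataclass

@dataclass
class TaggedQuestion:
    """One entry in a generated question library (illustrative shape, not the bundle's format)."""
    tier: str                 # "behavioral" | "situational" | "technical" | "reverse"
    dimension: str            # rubric dimension the question probes
    anchors: tuple[int, int]  # pair of rubric anchors (1-5) the answer should separate
    question: str             # what the panelist asks, verbatim
    rehearsed_followup: str   # drill-down to use if the first answer sounds prepped
```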

When to use

  • The role has a written rubric (structured interviewing prerequisite).
  • The interview panel includes interviewers who don’t run interviews regularly — engineers, hiring managers, IC leads — and need to walk in with prepped questions calibrated to the rubric.
  • You want consistency across panelists. Every panelist asks variants of the same anchor questions, so the debrief compares notes on the same dimensions.
  • You’re calibrating a junior interviewer. The pack’s “follow-up if rehearsed” annotations make the deeper signal visible.

When NOT to use

  • Unstructured cultural-add interviews where the goal is rapport, not signal. Different conversation. The pack is for signal-collection rounds.
  • Live coding interviews. Different artifact (code-and-talk format). The take-home evaluator workflow handles artifact-evaluation; live-coding is its own workflow.
  • Rubrics that haven’t passed a fairness check — the pack’s prompts will produce questions that probe the rubric dimensions, including the bad ones. Run the rubric through the diversity slate auditor framing or the Boolean search builder fairness pre-flight first.
  • Questions you want to lock down for the year. The pack regenerates per-role per-rubric. If your firm needs frozen, reviewed questions for legal compliance (some industries do), use the pack as a starter and lock the output, not the prompts themselves.

Setup

  1. Drop the bundle. Place apps/web/public/artifacts/interview-question-bank-prompt-pack/interview-question-bank-prompt-pack.md somewhere your interviewers can read (Notion, the team wiki, an internal Claude project’s knowledge files).
  2. Author the role rubric. Same rubric the screen and reference workflows use. Without it, the prompts have nothing to probe.
  3. Create a Claude project per role. Drop the rubric in as project knowledge. Save each prompt in the pack as a saved prompt within the project.
  4. Generate the questions. Run each prompt against the rubric. Copy the questions to the panel’s interview-prep doc. Tag each question with the panelist who’ll ask it. (A scripted variant of this step is sketched after this list.)
  5. Review for tone and fit. The prompts produce competent questions. The hiring manager edits them for the firm’s voice and the role’s specifics.
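
If the panel prefers a scripted version of step 4 over pasting prompts by hand, the generation can be driven through the Anthropic Python SDK. A minimal sketch, assuming the pack’s prompts have been split into one file each; the file paths, layout, and model id are assumptions, not part of the bundle.

```python
# Sketch: run each prompt in the pack against the role rubric and collect the
# output into one interview-prep doc. Assumes ANTHROPIC_API_KEY is set and that
# the prompts live one per file; all paths below are illustrative.
from pathlib import Path

import anthropic

client = anthropic.Anthropic()

rubric = Path("rubrics/senior-backend-engineer.md").read_text()    # illustrative path
prompt_files = sorted(Path("question-bank-prompts").glob("*.md"))  # illustrative layout

sections = []
for prompt_file in prompt_files:
    response = client.messages.create(
        model="claude-sonnet-4-5",  # substitute whichever Sonnet version your team standardizes on
        max_tokens=4096,
        system="Generate interview questions from the role rubric provided by the user.",
        messages=[{
            "role": "user",
            "content": f"<rubric>\n{rubric}\n</rubric>\n\n{prompt_file.read_text()}",
        }],
    )
    sections.append(f"## {prompt_file.stem}\n\n{response.content[0].text}")

Path("interview-prep-doc.md").write_text("\n\n".join(sections))
```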

What the pack contains

Ten prompts, in four tiers.

Tier 1 — Behavioral (probe past behavior under named conditions)

Behavioral questions are the workhorses of structured interviewing. The pack generates questions in the STAR shape (Situation, Task, Action, Result) per rubric dimension, with a follow-up for each that drills past the rehearsed answer.

  • B1. Produce 3 behavioral questions per rubric dimension. Each tagged with the dimension and the rubric anchors (1-5) it discriminates between.
  • B2. For each behavioral question, produce one drill-down for the case where the answer is too rehearsed (the panelist can tell the candidate has prepped this exact story). The drill-down asks for a different example, a counter-factual, or a step the candidate skipped.
  • B3. Produce 3 behavioral questions that probe the negative — when the candidate failed at the dimension. Pre-empts the “I’m a perfectionist” style of non-answer.

Tier 2 — Situational (response to a hypothetical)

Situational questions probe how the candidate would handle a scenario. Less reliable than behavioral but useful for senior-scope questions where the candidate may not have a directly-comparable past situation.

  • S1. Produce 2 situational scenarios per rubric dimension at the role’s level. Each scenario is calibrated to the level (Senior IC scope problems, not Staff scope; Manager scope problems, not Director scope).
  • S2. For each scenario, list the answer dimensions the panelist should listen for (specific decision criteria, what they ask before deciding, what they avoid).

Tier 3 — Technical / craft deep-dive

For roles where there’s a craft (engineering, design, sales methodology), this tier produces questions that drill into the candidate’s claimed competency.

  • T1. Given the rubric’s must_have skills, produce 5 deep-dive questions per skill. Each labeled “shallow” (sanity check the candidate has the skill at all) or “deep” (probe the edges of the skill).
  • T2. For each deep-dive question, list 3 follow-ups that the panelist asks if the candidate’s first answer is correct but surface-level.
  • T3. Produce 2 questions that surface a gap in the skill rather than confirm presence. (“Tell me about a time you had to use X but didn’t have Y.” Probes whether the candidate notices the limit.)

Tier 4 — Reverse questions (what the candidate asks back)

Strong candidates ask substantive questions. Weak candidates ask “what’s the culture like.” This tier helps the panelist read the candidate’s questions.

  • R1. Produce a list of 10 substantive questions a strong candidate might ask, grouped by what each question signals (the candidate is thinking about X, prefers Y, is looking for Z).
  • R2. Produce a list of 10 weak / generic questions and what each signals (candidate didn’t research, is anxious about basics, is fishing for a specific answer).

Cost reality

For one role’s question generation, on Claude Sonnet 4.6:

  • LLM tokens — typically 5-10k input (rubric + prompt + skill instructions) and 3-6k output (the generated question library) per prompt invocation. Total per role: roughly $0.30-0.60 if running all ten prompts.
  • Interviewer time — the win. Hand-authoring a behavioral question library per role is 4-8 hours; the pack delivers a starter library in 30 minutes of prompt-and-edit.
  • Setup time — 15 minutes to set up the Claude project per role. Per-firm setup of the pack (saving prompts, integrating with team wiki) is a one-time 30-60 minute task.

Success metric

Track three things, monthly:

  • Cross-panelist question overlap — share of questions asked by ≥2 panelists in the same loop. Should be ≥40% on a calibrated pack (the rubric dimensions ARE the through-line); below 25% means panelists are improvising. A sketch of the calculation follows this list.
  • Debrief time — wall-clock from “last interview ends” to “decision recorded.” Should drop by ~30% because debriefs are anchored on the same dimensions.
  • Panelist confidence in their notes — qualitative; ask the panelists “did you walk in with a question library?” The honest answer at most firms is “no, we improvised” — the pack’s success metric is moving that to “yes, and it helped.”
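
If each panelist’s asked questions are logged by question ID, the overlap metric is a few lines to compute. A minimal sketch with made-up data; the log shape is an assumption, not something the pack produces.

```python
# Sketch: cross-panelist question overlap for one interview loop.
# `asked` maps panelist -> set of question IDs they actually asked (made-up data).
asked = {
    "panelist_a": {"B1.2", "B3.1", "S1.1", "T1.4"},
    "panelist_b": {"B1.2", "B2.3", "R1.6", "T1.4"},
    "panelist_c": {"B1.2", "S1.1", "T1.4", "T2.2"},
}

all_questions = set().union(*asked.values())
shared = {q for q in all_questions if sum(q in qs for qs in asked.values()) >= 2}

overlap = len(shared) / len(all_questions)
print(f"Cross-panelist overlap: {overlap:.0%}")  # 43% here; target is >=40% on a calibrated pack
```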

vs alternatives

  • vs hand-authored question library. Hand-authoring is the right call for a small fast-iterating team where the rubric and the questions co-evolve in the founders’ heads. The pack earns its setup cost on teams that hire across multiple panelists per loop.
  • vs ATS-native question banks (Greenhouse Interview Plans, Ashby Interview Templates). ATS-native is the right call if your team lives in the ATS and wants questions surfaced in-context. Pick the pack if you want the question library version-controlled in your own repo and re-generatable as the rubric evolves.
  • vs ChatGPT-style “give me interview questions for senior engineer.” Generic chat returns generic questions. The pack is structurally different: every question is tagged with a rubric dimension, an anchor, and a follow-up.
  • vs no prep at all. Predictable failure mode: panelists ask different questions, the debrief compares apples and oranges, the decision drifts to whoever spoke first.

Watch-outs

  • Bias inheritance from rubric. Guard: the pack generates questions FROM the rubric. If the rubric has biased dimensions (“culture fit” without anchors, school-prestige scoring), the questions probe the bias. Audit the rubric upstream — see the diversity slate auditor.
  • Question rehearsal. Guard: the pack’s B2 prompt explicitly produces drill-downs for rehearsed answers. The drill-down asks for a different example or a counter-factual; it does not let the candidate re-run the prepped script.
  • Generic questions slipping through. Guard: every generated question must reference the rubric dimension and the anchor it discriminates between. Questions that don’t reference an anchor are flagged in the prompt’s output for the panelist to drop or rewrite. A validation sketch follows this list.
  • Inconsistent question difficulty across panelists. Guard: the prompts are tagged with the rubric anchor (1-5) they’re calibrated to. Two panelists asking different questions about the same dimension are still calibrated to the same anchors.
  • Length blowup. Guard: the pack’s prompts cap output at “3 per dimension, 12 dimensions max” — a typical role’s library lands at ~50-80 questions, not 500. The hiring manager picks 8-15 to actually use per panel slot.
  • Outdated questions on stale rubrics. Guard: re-run the pack when the rubric changes (the pack is fast — 30 minutes is cheap). Old question libraries linked from interview-prep docs go stale silently otherwise.
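
The “generic questions slipping through” guard is mechanical to enforce if the library is kept in a structured format. A minimal sketch, reusing the illustrative field names from the record shape earlier; this is not part of the bundle.

```python
# Sketch: flag library entries that don't reference a rubric dimension and an
# anchor pair, so the panelist can drop or rewrite them before the loop.
def flag_untagged(questions: list[dict]) -> list[dict]:
    """Return entries missing a dimension or anchors field (illustrative field names)."""
    return [q for q in questions if not q.get("dimension") or not q.get("anchors")]

# Example:
library = [
    {"question": "Walk me through a migration you owned.", "dimension": "ownership", "anchors": (3, 4)},
    {"question": "What's your greatest weakness?"},  # generic, untagged
]
print(flag_untagged(library))  # -> [{'question': "What's your greatest weakness?"}]
```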

Stack

The artifact bundle lives at apps/web/public/artifacts/interview-question-bank-prompt-pack/ and contains:

  • interview-question-bank-prompt-pack.md — the ten prompts, ready to paste into Claude

Tools the workflow assumes you use: Claude (the model). The output drops into Notion, the team wiki, or an ATS interview-plan template.

Related concepts: structured interviewing, behavioral interviewing, interview loop design, quality of hire.
