# Interview Question Bank — Eleven Prompts for Claude
A pack of structured prompts for generating a tiered interview question library from a role rubric. Paste a rubric, paste a prompt, get a question library tagged with rubric dimensions, anchors, and follow-ups.
## How to use this pack
1. Create a Claude project named `interview-questions-<role-slug>` per role.
2. Drop the role rubric in as project knowledge. Every prompt below assumes the rubric is loaded and reads from it.
3. Save each prompt below in the project's saved prompts, tagged by tier.
4. Run them in order. Most produce 30-90 second outputs at Sonnet 4.6 speed.
5. Review the outputs. Edit for the firm's voice. The pack delivers a competent starter; the hiring manager owns the final library.
## Rubric input shape
The pack assumes a rubric with this shape (the same shape used by the screening, take-home, and reference workflows):
```json
{
  "role": "Senior Backend Engineer",
  "level": "Senior IC (L5)",
  "dimensions": [
    {
      "id": "skill_match",
      "label": "Production Go or Rust experience and distributed-systems depth",
      "anchors": {
        "1": "...",
        "2": "...",
        "3": "...",
        "4": "...",
        "5": "..."
      }
    },
    ...
  ]
}
```
If the rubric doesn't load, the prompts halt and ask for the rubric file.
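Before loading a rubric into a project, it can help to sanity-check its shape. A minimal sketch in Python — the function name and problem messages are illustrative, not part of the pack:

```python
def validate_rubric(rubric: dict) -> list[str]:
    """Return a list of shape problems; an empty list means the rubric is usable."""
    problems = []
    # Top-level keys the pack's prompts read from.
    for key in ("role", "level", "dimensions"):
        if key not in rubric:
            problems.append(f"missing top-level key: {key}")
    # Each dimension needs an id, a label, and all five anchors.
    for i, dim in enumerate(rubric.get("dimensions", [])):
        for key in ("id", "label", "anchors"):
            if key not in dim:
                problems.append(f"dimensions[{i}] missing key: {key}")
        anchors = dim.get("anchors", {})
        missing = [a for a in ("1", "2", "3", "4", "5") if a not in anchors]
        if missing:
            problems.append(f"dimensions[{i}] missing anchors: {missing}")
    return problems
```

Run it against the rubric file before creating the project; any non-empty result means the prompts will halt at the same gaps.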
---
# Tier 1 — Behavioral
## B1. Three behavioral questions per rubric dimension
```
Role: You are a structured-interviewing question author. Write behavioral
questions in STAR shape (Situation, Task, Action, Result) calibrated to
a specific rubric dimension and anchor.
Context: The rubric is loaded as project knowledge. Every dimension has
five anchors (1-5) describing observable behaviors at increasing depth.
Task: For each dimension in the rubric, write THREE behavioral questions.
Each question must:
- Probe past behavior (NOT hypothetical — that's tier 2)
- Be calibrated to discriminate between two named anchors (e.g. "this
question discriminates anchor 3 from anchor 4")
- Be answerable in 3-5 minutes by a candidate at the role's level
- Avoid leading language ("tell me about a successful project" is leading;
"tell me about a project that didn't go as planned" is neutral)
For each question, output:
- The question text
- The dimension it probes
- The anchors it discriminates between
- The signal you're listening for in a strong answer
- The signal you're listening for in a weak answer
Things to avoid:
- Questions that probe traits ("are you a team player?") — probe behavior.
- Questions that have a single right answer — prefer questions that probe
  the candidate's framing of the problem.
- "Culture fit" questions without behavioral anchors.
- Questions about protected-class topics or proxies (school name, employment
gaps, family status, etc.).
Output format: Markdown, grouped by dimension, with a level-2 heading per
dimension and a level-3 heading per question.
```
## B2. Drill-down questions for rehearsed answers
```
Role: You are a structured-interviewing question author. The candidate has
clearly prepped the behavioral question — they answered too fluidly, with
named outcomes and a clean STAR structure. The drill-down asks for a
different example, a counter-factual, or a step they skipped.
Context: The rubric is loaded as project knowledge. Tier-1 behavioral
questions (B1 output) are loaded too.
Task: For each B1 question, produce ONE drill-down. Each drill-down must:
- Ask for a different example of the same dimension ("walk me through a
  different time you had to do that")
- Or ask for a counter-factual ("what would you have done differently?")
- Or probe a step the candidate skipped ("what happened between steps 2
and 3? You moved fast there")
- Land in 30-60 seconds — drill-downs are quick probes, not new questions
For each drill-down, output:
- The drill-down question
- The B1 question it pairs with
- What the rehearsed-answer pattern looks like (so the panelist knows
when to use the drill-down)
Things to avoid:
- Aggressive or interrogator-style framings — the drill-down is curiosity,
not confrontation
- Drill-downs that probe a different dimension than the B1 question
- Drill-downs that ask the candidate to defend their original answer
Output format: Markdown, paired with the B1 question.
```
## B3. Behavioral questions probing failure / negative cases
```
Role: You are a structured-interviewing question author. Standard behavioral
questions probe success. Strong candidates have practiced "tell me about a
weakness" answers ("I'm a perfectionist"). The negative-case questions probe
failure without the rehearsal-friendly framing.
Context: Rubric loaded as project knowledge.
Task: For each rubric dimension, write THREE behavioral questions that probe
the candidate's behavior when they FAILED at the dimension. Each question
must:
- Probe a real failure ("tell me about a time you misjudged X")
- NOT be the "weakness" question
- Calibrate to the rubric — failure on a dimension at level 3 looks
different from failure at level 5 (a junior engineer's worst day is
a senior engineer's normal day)
- Be answerable without the candidate having to disclose a confidential
incident — the question should work even if the example is sanitized
For each question, output:
- The question text
- The dimension it probes
- The signal of a healthy failure narrative (named cause, named
correction, named lesson)
- The signal of an unhealthy narrative (blame on others, no specific
cause, no lesson, "I'm a perfectionist" pattern)
Things to avoid:
- Asking the candidate to disclose a regulated or proprietary incident
- Probing for failures that are protected-class adjacent (gaps, etc.)
- Stress-test framings ("describe your biggest professional regret")
Output format: Markdown, grouped by dimension.
```
---
# Tier 2 — Situational
## S1. Two situational scenarios per rubric dimension
```
Role: You are a structured-interviewing question author. Situational
questions probe how the candidate would handle a hypothetical, calibrated
to the role's level.
Context: Rubric loaded as project knowledge. Pay attention to the role's
level — Senior IC scope problems are different from Manager scope problems.
Task: For each rubric dimension, write TWO situational scenarios. Each
scenario must:
- Be calibrated to the role's level (a Senior IC scenario probes
cross-team-system tradeoffs; a Staff IC scenario probes org-wide
architectural tradeoffs)
- Be answerable in 5-8 minutes
- Have multiple defensible answers — the scenario probes the candidate's
framing, not a "right answer"
- Stay grounded in real situations the candidate would plausibly hit —
not contrived puzzles
For each scenario, output:
- The scenario as a paragraph (set the stage in 50-100 words)
- The opening question
- Two follow-up probes (drill-downs based on the candidate's first
response — "if they answer X, ask Y; if they answer Z, ask W")
Things to avoid:
- Trick questions or gotchas
- Scenarios that depend on the candidate having seen this exact stack
- Scenarios with a single "smart" answer — those probe pattern-matching,
not judgment
Output format: Markdown, grouped by dimension.
```
## S2. Listening dimensions per scenario
```
Role: You are a structured-interviewing primer author. Panelists listen
for specific things in each scenario answer. Without the listening
dimensions, panelists collect anecdotes; with them, they collect signal.
Context: Rubric loaded as project knowledge. S1 scenarios loaded too.
Task: For each S1 scenario, list the FIVE answer dimensions the panelist
should listen for. Each dimension must:
- Be observable in the candidate's spoken answer (NOT something only
visible in their resume or in code)
- Map to a specific rubric anchor
For each scenario, output:
- The five listening dimensions, each mapped to its rubric anchor
- "What strong answers do" (3 bullets — name the moves, the questions
  asked back, the criteria stated)
- "What weak answers do" (3 bullets — name the failure patterns: jumping
  to solution without clarifying, ignoring constraints, generic framing)
- "What to write down" (the concrete notes the panelist takes that anchor
  the debrief later)
Output format: Markdown, paired with each S1 scenario.
```
---
# Tier 3 — Technical / craft deep-dive
## T1. Five deep-dive questions per must-have skill
```
Role: You are a craft-deep-dive question author. The role's rubric names
must-have skills. The deep-dive probes whether the candidate has the skill
at depth — not just whether they can name it.
Context: Rubric loaded as project knowledge. Focus on the must_have skills.
Task: For each must-have skill in the rubric, write FIVE deep-dive
questions. Label each as:
- SHALLOW: sanity check the candidate has the skill at all (e.g. "explain
how Go's goroutines differ from threads")
- DEEP: probe the edges of the skill (e.g. "walk me through the time you
debugged a goroutine leak — what symptoms led you to suspect it, what
tools did you use, what was the fix")
The mix should be 1 shallow + 4 deep per skill. The deep questions are
the differentiators; shallow questions are gates.
For each question, output:
- The question text
- SHALLOW or DEEP label
- The skill it probes
- The signal of a 4-anchor answer (depth, but not edge-case awareness)
- The signal of a 5-anchor answer (depth + named edge cases + cited
failure modes the candidate has personally hit)
Things to avoid:
- Trivia questions ("what's the keyword in Rust for X?") — probe usage,
not memorization
- Questions whose answer changes between language versions (probe
fundamentals, not the latest framework's API)
- Whiteboard-coding questions framed as discussion — those are a
different format
Output format: Markdown, grouped by skill.
```
## T2. Three follow-ups per deep-dive question
```
Role: You are a deep-dive follow-up author. The candidate's first answer
to a deep question is correct but surface-level. The follow-ups drill
into the edges of the skill where deeper signal lives.
Context: Rubric loaded. T1 questions loaded.
Task: For each T1 deep question, produce THREE follow-ups. Follow-ups
must:
- Drill into a specific edge of the skill (the failure mode, the limit
of the technique, the case where the standard answer breaks)
- Be open-ended — the candidate constructs the answer
- Be answerable in 2-4 minutes each
- Surface the candidate's ability to ENGAGE with not-knowing — partial
answers are signal too
For each follow-up, output:
- The follow-up question
- What edge of the skill it probes
- What an "I don't know but here's how I'd find out" answer looks like
Things to avoid:
- Follow-ups that assume the candidate already gave a wrong answer
- Stacked follow-ups that pile pressure rather than probe depth
Output format: Markdown, paired with each T1 question.
```
## T3. Gap-finding questions per skill
```
Role: You are a gap-finding question author. Most technical questions
probe whether the candidate HAS a skill. Gap-finding probes whether the
candidate notices the LIMIT of the skill — when to use a different tool.
Context: Rubric loaded.
Task: For each must-have skill, write TWO questions that probe whether
the candidate notices when the skill is the wrong fit. Each question must:
- Describe a situation where the candidate's claimed skill would be
over-applied or mis-applied
- Ask the candidate to identify the alternative tool
- Probe judgment, not memorization
For each question, output:
- The question text
- The skill it probes
- The strong-answer pattern (named alternative tool, named criteria for
when each fits)
- The weak-answer pattern (defends the original tool, doesn't notice
the limit, claims the original tool covers the case)
Output format: Markdown, grouped by skill.
```
---
# Tier 4 — Reverse questions
## R1. Substantive questions a strong candidate might ask
```
Role: You are a reverse-question reader. Strong candidates ask substantive
questions about the role, the team, the firm. The questions they ask
signal what they prioritize.
Task: Produce a list of 10 substantive questions a strong candidate at
this role's level might ask, grouped by what each question signals.
For each question, output:
- The question
- What it signals (the candidate is thinking about ramp time, about
technical decision authority, about the trajectory of the team, etc.)
- How the panelist should respond — honestly, with specifics, not with
the recruiter pitch
Things to avoid:
- Generic questions ("what's the culture like?") — those belong in R2
- Questions that fish for the panelist's commitment ("would you
recommend joining?") rather than probe substance
Output format: Markdown, grouped by signal.
```
## R2. Generic / weak questions and what they signal
```
Role: You are a reverse-question reader. Weak candidates ask generic
questions. Generic questions are not necessarily disqualifying — they
might be early-career, or anxious — but they are signal.
Task: Produce a list of 10 generic / weak questions and what each
signals.
For each question, output:
- The question
- The signal it sends (didn't research, anxious about basics, fishing
for a specific answer the candidate already has, etc.)
- How the panelist should react — usually, redirect to a more specific
question to give the candidate another chance
Output format: Markdown, grouped by signal.
```
## R3. Synthesizing the reverse-question read
```
Role: You are an interview debrief facilitator. The candidate asked
N questions across the loop. The synthesis pulls the signal from the
pattern, not the individual questions.
Context: Rubric loaded. R1 + R2 outputs loaded.
Task: Produce a one-page synthesis template the panel uses in the debrief
to read the candidate's reverse questions as a pattern. The template
captures:
- Which substantive areas the candidate probed (matching to R1 signals)
- Which generic patterns showed up (matching to R2 signals)
- The cross-panelist read — different panelists got different questions;
the pattern is the read
Output format: A markdown template the panel fills in during the debrief.
```