01 · A curated starter pack, not a survey
Welcome — here's seven skills + a house style
This site is a curated set of seven Claude Code skills — small, reusable workflows you can drop into ~/.claude/skills/ and invoke from any Claude Code session — plus the design-system reference Kareem uses for documentation websites. The skills were picked specifically for someone whose main task is digitizing old PDFs with Gemini: getting structured tables out of scans, validating extractions, prompting Gemini well, and not over-trusting one model's view of a hard call. The design-system reference is what makes any documentation site you build for Kareem feel "right" to him at a glance.
SKILL.md describes one possible way of doing a thing — a default protocol someone else found useful in a different context. Your Claude has much more context on this project than this site does: which diagnostics you've already found load-bearing for the registers, which prompt patterns actually work for which schema, which methodology choices were already settled in earlier years and don't need re-debating. Before installing or invoking any skill literally, ask your Claude to adapt it. Something like: "Read this SKILL.md. Given what we've actually learned from the 1905 run and the earlier years, how would you modify this skill so it fits our register pipeline? Then save the adapted version to ~/.claude/skills/<name>/SKILL.md." The skills are most valuable when they're tailored to your work, not when they're applied as-is.
Each skill panel has:
- A one-line description and a copy-paste invocation
- Why it's useful specifically for the digitization task
- The key gotchas (things you'd otherwise learn the hard way)
- A download button that gives you the full
SKILL.mdsource - The full
SKILL.mdembedded in a collapsible block so you can scroll through it without leaving the page
The 30-minute plan
- Read panel 02 (Setting up Codex). Three of the most useful skills here (Audit, Council
--mixed, Deep-research) depend on having Codex installed. Do this once and forget it. - Read panels 03–06 carefully. First-look, Prompt, Audit, and Council are the four essentials. ~18 min total.
- Skim panels 07–09. Handoff, Deep-research, and Read-paper. Note which apply — you can come back. Read-paper is not for the register scans; it's for the occasional reference paper.
- Read panel 10 (Web aesthetics). Your current
results.htmlstyle is fine for self-contained reports. The house style is for new shareable doc sites. - Download the 4 essentials. Drop each into
~/.claude/skills/<skill-name>/SKILL.md. The directory has to match the skill name. - Hand the markdown handout to your Claude. Ask it: "Read this. For each skill, tell me how I'd adapt it to the next register year I work on." Then iterate.
Installing Claude Code (if you haven't yet)
Claude Code is the CLI that runs these skills. On macOS:
brew install --cask claude-code
# then sign in via your Anthropic / Claude.ai account
claude
Installing a skill
A skill is just a folder under ~/.claude/skills/ with a SKILL.md file. To install council, for example:
mkdir -p ~/.claude/skills/council
# then save the downloaded council.md as SKILL.md inside that folder:
mv ~/Downloads/council.md ~/.claude/skills/council/SKILL.md
Restart Claude Code (or run /help) and the skill is invocable as /council. Same pattern for every skill on this site — the folder name has to match the name: field in the skill's frontmatter.
The skills, at a glance
| Skill | Tier | One-line | Needs Codex? | Best fit for your work |
|---|---|---|---|---|
| first-look | Essential | diagnostic protocol on a new extraction before interpreting it | — | after each 50-row QA sample |
| prompt | Essential | restructure an informal ask into Role/Task/Context/Constraints | — | every new schema-specific Gemini prompt |
| audit | Essential | send recent code changes to Codex for independent review | yes | before committing any change to build_inherited.py, clean.py, extract.py |
| council | Essential | parallel critic agents on a topic/file, separate synthesis | for --mixed | methodology forks ("re-extract whole register vs. just footnote pages?") |
| handoff | Occasional | generate a tight continuation prompt at end of a session | — | end of a long Claude Code session |
| deep-research | Occasional | parallel multi-vendor research (Claude + Codex + Gemini) | yes | once per new register year, for prior-art scoping |
| read-paper | Reference only | read an academic PDF safely via 4-page chunks + sub-agent | — | NOT for your register scans — only for reference papers Kareem sends you |
| web aesthetics | Occasional | design-system reference for documentation websites | — | new doc sites; your existing results.html reports are fine as-is |
Pipeline map — when to reach for which skill
Your per-year work cycle on the Official Register has the same shape every time: extract a full volume with Gemini, QA-sample 50 rows, find failure modes, patch prompts and post-processing, targeted re-extract, validate. Here's where each skill fits into that loop.
| Stage in the year's pipeline | Reach for | Why |
|---|---|---|
| Drafting a new schema-specific Gemini prompt (e.g. a House-Reps or military variant for a different year) | /prompt | Structures Role/Task/Context/Constraints/Output Format. Surfaces assumptions you forgot to state ("did I tell Gemini what to do with footnote superscripts?"). |
Just finished extraction; about to read rows_inherited.csv for the first time |
/first-look | Distributions, missingness, mass points, sentinel codes. Your "8.8% compensation_type=unknown" and "96.9% agency-aligned" lines are literally first-look output. |
About to commit a change to build_inherited.py, clean.py, or extract.py |
/audit | Catches silent data bugs Claude introduces — off-by-one ditto walks, sample restrictions that drop legitimate rows, column-misalign heuristics that fire on the wrong pattern. The 1905 ditto-chain bug (phantom Bryan row poisoning Clement's title) is exactly the class of issue. |
| Methodology fork-in-the-road ("re-extract whole 1907 register, or just footnote-exposed pages?") | /council --mixed codex | Five parallel critics with one non-Claude voice. Use sparingly — only for irreversible or compounding methodology calls. |
| Closing a multi-hour Claude Code session | /handoff | Max-5-bullet continuation prompt so the next session picks up at the right file/page. |
| Starting on a new register year you've never touched | /deep-research | "Has anyone digitized the 1879 schedule? What's the canonical column set? What footnote conventions does this year use?" One pass per year, then archive. |
| Building a new shareable documentation site for the project | design-system.html | Use the house palette + fonts (panel 10). Your existing internal results.html reports do not need to be rewritten. |
02 · A one-time setup that unlocks 3 skills
Setting up Codex
Three of the most useful skills on this site — Council (with --mixed), Audit, and Deep-research — rely on the OpenAI Codex CLI as a second-vendor brain. The whole point is that you don't want every critical opinion in your workflow to come from Claude. Codex (an OpenAI model with a CLI wrapper) gives an independent read on the same input.
What Codex is, briefly
Codex CLI is OpenAI's coding-assistant CLI. It runs locally, authenticates against your ChatGPT account, and accepts a single prompt via codex exec. The Council skill uses it as one of the critic agents; the Audit skill uses it to review code diffs; Deep-research dispatches research tasks to it in parallel with Claude.
Install
On macOS, install via Homebrew or npm:
# option A — Homebrew
brew install codex
# option B — npm (works everywhere)
npm install -g @openai/codex
Authenticate
Run codex once. It will print a URL, you'll log into ChatGPT in your browser, and it will store credentials locally.
codex
# follow the prompt — sign in via your ChatGPT account (a Plus or Pro plan is required)
# you should land on a "Codex ready" message; quit with Ctrl-C
Test it
Once authed, this one-liner should print a short response:
env -u OPENAI_API_KEY codex exec --skip-git-repo-check -C /tmp "Say hello in one sentence."
env -u OPENAI_API_KEY? If your shell has an OPENAI_API_KEY set (e.g., for some other workflow), Codex will try to use it as an API key — but the Council/Audit/Deep-research skills assume Codex is authed via your ChatGPT account, not by API key. The env -u OPENAI_API_KEY prefix strips the variable for just that one command. The skills already include this prefix in their invocations.
What changes after Codex is set up
/council --mixedautomatically swaps one Claude critic slot for Codex. You'll see one critic that reasons very differently from the others — that's the point./auditworks out of the box. Diffs your latest changes against the last commit and sends them to Codex for an independent code review./deep-researchwith--tool both(the default) runs Claude WebSearch and Codex in parallel on the same research question.
Optional: Gemini CLI
The Council and Deep-research skills also support Gemini as a third vendor. If you want this:
npm install -g @google/gemini-cli
gemini auth # one-time login
Not required. Council degrades gracefully — if Gemini CLI is missing, it falls back to an all-Claude panel and prints a note.
06 · For methodology calls, not daily decisions
Council — parallel critics, no echo chamber
Dispatches up to 5 critic agents in parallel, each with a different lens, then runs a separate synthesis pass. --mixed swaps one Claude critic for Codex (or Gemini) so you get a non-Claude opinion in the mix.
Why this matters for the methodology calls in your work
When Claude tells you a pipeline change looks good, that's one opinion. For a multi-year digitization project the cost of a silent methodology error compounds across every register year: a wrong column-detection rule, a Gemini prompt that loses footnote markers, a sampling assumption that under-represents military pages. Council is how you stop Claude from being the only voice in the room — spin up critics with explicit personas (a footnote-skeptic, a "this year's schema is more heterogeneous than 1905" critic, a budget hawk) and use --mixed codex so one critic isn't another Claude.
The pattern that matters: parallel dispatch, separate synthesis, single round. No iterative debate (which drifts toward conformity). Hard cap of 5 critics (beyond that you're just adding noise). Use this for methodology forks, not daily decisions — it's 5 model calls instead of 1.
Realistic invocations for the next register year
# Council the footnote-fix strategy from the 1905 error report
/council should I (a) re-extract the entire 1907 register with a footnote-aware prompt, or (b) detect footnote-exposed pages first and re-extract only those, given that the 1905 run cost $48 total? --type decision --mixed codex
# Council a plan file before committing methodology to a new year
/council file:.claude/plans/1907_extraction_plan.md --type plan --mixed codex
# Council the sampling strategy for cross-year accuracy comparison
/council I'm reusing the same 50-row stratified sample format used in 1905_try. For multi-year accuracy comparisons, should I stratify by schema (standard / military / other) or by page-density bucket? --type decision --n 4 --mixed codex
Important flags
--mixed codex— swap ONE Claude critic for Codex. This is the flag your boss insists on. Without it, you're getting four Claudes who all share a worldview.--type plan|paper|decision|grant— picks a default panel of personas suited to the task type.decisiondefaults to skeptic, pre-mortem, chief-of-staff.planadds a budget-hawk. Match the type to what you're doing.--n K— number of critics, default 3, max 5.file:<path>— Council a written document instead of a free-text topic. Useful for reviewing your own plan files.--chef-skill— special hardcoded 3-role panel for reviewing the design of a skill or tool. Works without persona files.
Gotchas
- The default panels need persona files. The skill expects files like
~/.claude/agents/skeptic-agent.md. If you don't have them, the skill aborts with a clean error (it does NOT silently fall back to a generic agent). Use--chef-skillfor the no-persona path, or ask Kareem for his persona files. - Council is not free. It's 5 model calls instead of 1. Don't reach for it on simple decisions. It pays off on irreversible or compounding choices — methodology, extraction prompts, sampling strategy.
- Single round only. The skill never runs Round 2. If the synthesis verdict surprises you, re-formulate the question and run Council again with a different topic, not the same topic with "now respond to the critics."
How to ask your Claude to set this up for the project
/council for the first realistic fork in the 1907 pipeline."
Full SKILL.md source council/SKILL.md · ~200 lines
---
name: council
description: Dispatch N parallel critic agents (hard cap 5) on a topic or file, then run a separate synthesis pass. Built-in panels for plan/paper/grant/decision review. Use --chef-skill flag to review a skill/tool design with a hardcoded 3-role panel that requires no persona files. Use --mixed to swap one Claude critic for Codex or Gemini.
argument-hint: "<topic> | file:<path> [--type plan|paper|grant|decision] [--n K] [--panel a,b,c] [--mixed codex|gemini] [--chef-skill]"
---
# /council — Parallel Critics + Separate Synthesis
(See the Download button above for the full file.)
Dispatches N critic agents in parallel, collects raw outputs, and runs a separate synthesis pass. Hard cap of 5 critics. Single-round only. Never majority-votes on narrative output.
## Config
```
COUNCIL_CRITIC_MODEL=opus # or sonnet for cheaper runs
COUNCIL_SYNTHESIS_MODEL=opus
COUNCIL_MAX_CRITICS=5
COUNCIL_DEFAULT_N=3
```
## Default panels (resolve from --type or keyword inference)
| Task type | Default panel | N |
|-----------|---------------|---|
| Plan / architecture review | skills-engineer, skeptic, pre-mortem, budget-hawk | 4 |
| Paper / proposal review | academic-editor, harsh-referee, methodologist | 3 |
| Grant proposal | + grant-strategist, funder-officer | 5 |
| Decision support | skeptic, pre-mortem, chief-of-staff | 3 |
## --mixed (peer swap)
If `--mixed codex|gemini` was passed, swap ONE Claude critic for a cross-vendor peer. Preserves the 5-critic cap.
| Peer vendor | Best swap target | Rationale |
|---|---|---|
| Codex | skeptic, pre-mortem, harsh-referee | Codex strong on empirical rigor + failure modes |
| Gemini | completeness-checker, domain-expert, academic-editor | Gemini strong on long-context synthesis |
## Dispatch phases (summary)
1. Parse topic + flags, resolve panel
2. Meta-reference guard (skip if --chef-skill)
3. Persona file existence check — abort if missing, no silent fallback
4. Rate-limit advisory
5. Parallel Task dispatch (one message, N Task calls)
6. Collect raw outputs (no inline synthesis)
7. Separate synthesis dispatch (general-purpose, fresh context)
8. Emit synthesis + raw critic outputs in <details> collapsible
9. Log to ~/.claude-assistant/logs/council-invocations.csv
For the full skill text including the --chef-skill short-circuit, mixed-mode dispatch details, and the synthesis template, download the SKILL.md file via the button above.
03 · Before you interpret, look at the data
First-look — distributions before theories
Diagnostic protocol for any new dataset or puzzling result. Runs .describe(), checks mass points at 0 and at scale maxima, hunts for sentinel missing codes, and reads the data dictionary before you build a regression on top.
Why this is essential for your work
This is the skill you already do manually after every extraction — the 1905 results.html reports compensation-type counts (45.1% per_year, 30.7% per_day, 8.8% unknown), 96.9% agency-path alignment, and 698 column-misalign rows. That whole "headline numbers" section is first-look output. Formalizing it as a skill means it happens consistently and surfaces things you'd otherwise miss.
Concretely: before you trust an extraction — before you say "we got 96% on 1907" — look at distributions. Which columns are zero or empty for too many rows because Gemini treated blanks as zeros? Which compensation values cluster suspiciously (a mass of "1.00" rows because the package's default fallback is 1)? Which name fields are 100% non-null because Gemini hallucinated names rather than admit it couldn't read a cell? First-look catches all of these in 30 seconds.
The cardinal rule: when a result surprises you, your first move is .describe(), not a theory. If you find yourself writing "one possible explanation is..." before you've looked at the distribution, stop and go back to step 1.
results.html already includes register-specific checks the default doesn't know about: compensation-type breakdown by schema, agency-path alignment audit (Gemini's visible_agency vs. the ToC-derived bureau_path), share of ditto-resolved cells per bureau, column-misalign repair count, vessel-page exclusion sanity check, schema-mix vs. prior years. Some of those are more useful for the registers than the generic mass-point checks. The goal is one standardized post-extraction protocol you run after every year — not adopting someone else's diagnostic set wholesale. Before installing, ask your Claude to take the default SKILL.md, merge in your register-specific diagnostics, and possibly add others Claude suggests that you haven't tried yet but make sense for tabular OCR output. Save the adapted version. The skill is a template; the version that ends up in ~/.claude/skills/first-look/SKILL.md should be your tailored one.
The default protocol in one minute
- Describe every variable.
df.info(),df.describe(include='all').T. Check dtype, count (missingness), mean vs. median, min/max. - Compute mass points. For each numeric column: share at 0, share at max, share missing. >5% mass at 0 or at the scale max is a structural feature you need to diagnose before using the column.
- Plot the distribution (or print quantiles). Histograms or
df[col].quantile([0, .01, .05, .25, .5, .75, .95, .99, 1]). Look for bimodality, long tails, unit errors. - Hunt for sentinel missing codes. -999, -99, 9999, "NA", "N/A", "null", empty strings, year=1900 in historical data, 0 where 0 can't legitimately mean anything.
- Read the documentation. The README, codebook, or — for an extraction package — the actual prompt template the package ships with. Don't trust your mental model.
- Compare expected vs. observed. Write down (in one sentence) what you expected and what you saw. If they don't match, explain the mismatch distributionally before reaching for a substantive story.
When to trigger it (without being asked)
- Loading or referencing a new dataset, even just to "check" something
- A result is puzzling — unexpected sign, magnitude, or correlation
- Before running a regression or correlation on a column you haven't inspected this session
- When using an unfamiliar package — read its source/config first
Two prompts: one to adapt the skill, one to run it
The skill is most valuable if you customize it once, then invoke the customized version on every new year. Two prompts:
SKILL.md. I've been doing post-extraction diagnostics on the U.S. Official Register (1865–1959) for a while. Here's what my 1905 results.html currently checks that the default SKILL.md doesn't: compensation-type breakdown by schema (standard / military / other / legacy_simple / minimal); agency-path alignment audit (rows where Gemini's visible_agency matches some node of the ToC-derived bureau_path — aligned / transition / mismatch_low); share of ditto-resolved cells per bureau; column-misalign repair count; vessel-page exclusion check; schema-mix vs. prior years. Some of these are probably more load-bearing for the registers than the default's generic mass-point checks. Propose an adapted SKILL.md that (a) keeps the cardinal rule and the trigger conditions, (b) replaces or augments the default protocol with my register-specific checks, and (c) adds anything else you think would catch register-specific failure modes I haven't already coded. Show me the diff before saving. Save the final adapted version to ~/.claude/skills/first-look/SKILL.md."
pipeline/extract_full/1907_try/rows_inherited.csv. Before I write the results.html report, run my customized /first-look. Then on top of the standardized protocol, flag anything that looks like a footnote-marker contaminant in compensation (a value starting with a small isolated digit followed by a dollar amount) since that's a known 1905 failure mode I want to monitor across years. One sentence on what you expected vs. observed. Don't theorize about substantive patterns yet."
Full SKILL.md source first-look/SKILL.md · ~120 lines
(See the Download button above for the full file. Highlights below.)
# First-Look Diagnostics
The cardinal rule: when a result surprises you, your first move is .describe(), not a theory.
## When to trigger (proactively, without being asked)
1. User loads or references a new dataset.
2. A result is puzzling (unexpected sign, magnitude, or correlation).
3. Before running a regression or correlation on a column you haven't inspected.
4. When using an unfamiliar package — read its source/config first.
## The protocol
Step 1 — Describe every variable: df.info(); df.describe(include='all').T
Step 2 — Mass points: for each col, share at 0, share at max, share NaN
Step 3 — Distribution: histogram or quantiles
Step 4 — Sentinel codes: -999, -99, 9999, "NA", empty, year=1900
Step 5 — Read the documentation (package source, codebook, prompt templates)
Step 6 — Expected vs. observed: one sentence, then diagnose distributionally before theorizing
## Common failure modes
- Interpreted a correlation matrix without checking one column was 56% zeros.
- Wrote a historical explanation for a sign reversal without checking row counts.
- Trusted a mental model of a package's scale convention without reading the source.
- Claimed a pipeline was correct because a log said "100% match" without checking what was being matched.
04 · Structure beats stream-of-consciousness
Prompt — format then execute
Takes a dictated/stream-of-consciousness request and restructures it into Role/Task/Context/Constraints/Output Format, then either executes it directly or routes it through /council for review.
Why it's essential for the Gemini prompts in your pipeline
Your extract.py already ships five schema-specific Gemini prompts (standard, military, 1881, minimal, other). Each new register year may introduce a new sub-variant (the House-Reps prompt was a 1905 special case; future years will have their own). The single biggest determinant of extraction quality is whether those prompts are tight. Vague prompts → hallucinated rows (like the phantom Bryan duplicate in 1905). Over-specified prompts → refusals on ambiguous pages.
The /prompt skill is a structural discipline: it forces you to articulate Role / Task / Context / Constraints / Output Format. Even when you don't end up using the formatted version verbatim in Gemini, the act of running it surfaces the assumptions you were making — "did I specify what to do with footnote superscripts on compensation cells?", "did I tell it to ignore dot-leaders in titles?", "did I tell it that 'do' is ditto, never a name?"
Three depth levels
- Light (default) — formatting only, no extra rationale. Use for everyday asks.
- Standard — formatting + assumptions/rationale block. Use when the request has implicit assumptions you want surfaced.
- Deep — formatting + research/compare/verify block. Use when you want Claude to actually go look something up before answering.
Useful tokens
hold— show the formatted prompt but don't execute. Useful when you want to copy the structured prompt into Gemini, not run it through Claude.council— after formatting, route through/councilinstead of executing directly. Combines structure + multi-critic review in one step.depth:deep council— chain them. Format deeply, then council the formatted prompt.
Example uses for the register work
# Format the next iteration of the standard-schema Gemini prompt (with the 1905 footnote fix layered in), don't execute
/prompt I want gemini to extract every row from this 1907 register page in the standard 6-column schema (Name, Official title, Where born, Whence appointed [state/county/cong_dist], Where employed, Compensation), preserving original spelling, treating 'do' as ditto (never as a name), dropping any leading superscript digit on the compensation cell since that's a footnote marker not part of the number, and returning JSON with one object per row hold
# Format the House-Reps prompt variant for a new year (3 reps per district line, state header above)
/prompt I need a gemini prompt that handles the House of Representatives layout, where the leftmost cell is 'STATE. Nth.' (state + district ordinal) and each row should set whence_appointed_state=state, whence_appointed_cong_dist=district, official_title='Representative', where_employed='Washington, D. C.' hold
# Format a methodology question, then have a council review it
/prompt for the 1907 register should the ditto resolver scope be bureau_path or visible_agency given that some pages have two agencies depth:standard council
Important
- Don't over-engineer. Light is the default for a reason. A 1-sentence ask doesn't need a 20-line structured prompt.
counciltoken is opt-in only./prompt Xnever auto-routes to council. You have to add the literal word.- Use with dictation. If you dictate your asks (Wispr Flow, etc.),
/promptis especially helpful — it cleans up the run-on phrasing.
Full SKILL.md source prompt/SKILL.md · ~50 lines
---
name: prompt
description: Restructure an informal/dictated request into a properly-formatted prompt (Role/Task/Context/Constraints/Output Format) and execute it. Use when the user has a stream-of-consciousness ask, or wants to dispatch the formatted prompt via /council. Light/standard/deep depth.
argument-hint: "<your ask> [depth:light|standard|deep] [council] [hold]"
---
# /prompt — Format and Execute
Format an informal request into a structured prompt, then execute it.
You are a prompt formatter. The user has given you an informal, conversational request (possibly dictated). Your job:
1. Parse the intent: extract the core task, audience, and desired output.
2. Calibrate depth: Light (default), Standard (+ assumptions), Deep (+ research/verify).
3. Format into a structured prompt using the formatting elements in formatting-core.md.
4. Inject depth directives if Standard or Deep.
5. Show the formatted prompt in a fenced code block.
6. Tool-routing check: if another tool would serve better, flag it.
7. Council opt-in: if input contains literal token `council`, dispatch via /council.
8. Execute the prompt immediately (unless step 7's council token was present, or `hold` was passed).
9. Ask ONE clarifying question ONLY if ambiguity would change the output significantly.
Important:
- Do NOT over-engineer simple requests.
- Match complexity of formatting to complexity of task.
- Light depth is the default.
- If the user says "hold" or "don't run" or "just format", show but do not execute.
- `council` token: opt-in only. /prompt does NOT default-wrap in council.
09 · For reference papers, not your source corpus
Read-paper — 4-page chunks, sub-agent
Reads an academic paper PDF safely by splitting it into 4-page chunks and delegating to a sub-agent. Produces both curated notes (for slides) and searchable full-text markdown. Prevents context overflow and forces careful reading.
/read-paper on a register scan, the sub-agent will skim it as if it were prose, miss every column structure, and waste 10 minutes of Claude tokens producing useless notes. This panel is here only for the occasional reference paper use case described below.
When this applies to your work
You'll occasionally need to read reference papers — methodology papers about OCR/HTR, papers describing prior digitizations of historical registers, IPUMS / census linkage methodology, or papers Kareem sends to contextualize the project. For those, this skill is the right tool. Expect to invoke it a handful of times across the whole project, not weekly.
Why the 4-page chunking matters
If you hand Claude a 50-page paper PDF, it skims. The 4-page chunking forces a sub-agent to read carefully, three chunks at a time, updating notes after each batch. The output is two files:
paper_summaries/Author_Year.md— curated structured notes (research question, method, findings, key figures, quotable passages, limitations)paper_markdown/Author_Year.md— full searchable text (good for "did this paper mention X?")
Critical gotchas (read these before invoking)
- Never bypass this skill with an ad-hoc "read this paper" agent. The 4-page chunking is what makes the output good; a regular agent will skim.
- The sub-agent must read PDF splits, not just the pymupdf4llm markdown. The markdown loses table formatting, figures, and layout. The splits preserve them.
- Always extract limitations and open questions. The discussion section is where you'll find what the authors couldn't do — high-value for understanding what's not in their dataset.
- Prefer the published version over working papers. Author websites usually have the journal version freely. Better figures, authoritative.
Example uses for the register project
# Read a methodology paper Kareem sent you about historical-record linkage
/read-paper ~/papers/Abramitzky_Boustan_Eriksson_linkage_methods.pdf
# Read a paper by search query (skill will find it on the web)
/read-paper "Goldin Sokoloff 1982 women's relative wages historical census"
Full SKILL.md source read-paper/SKILL.md · ~210 lines
(See the Download button above for the full file. Highlights below.)
# Read Academic Paper
Read an academic paper PDF by splitting it into 4-page chunks and extracting structured notes.
This prevents context window overflow and produces better extraction than reading a full PDF at once.
## Pipeline
Step 1 (main context): Acquire the PDF
- If local: verify exists, copy to reference_papers/
- If search query: WebSearch, prefer published version over working paper
- CRITICAL: always preserve the original PDF
Step 2 (main context): Convert full text to markdown via pymupdf4llm
- Saves to paper_markdown/Author_Year.md
- For scanned PDFs (low text ratio), skips this step
Step 3 (main context): Split into 4-page chunks via PyPDF2
- Saves to splits_<name>/<name>_pp1-4.pdf, _pp5-8.pdf, ...
Step 4 (sub-agent): Read all splits, 3 at a time, updating notes after each batch
- Must read every PDF split via the Read tool (no shortcuts to markdown grepping)
- Extracts: research question, data, method, findings, figures, quotables, lecture relevance, limitations
- Outputs: paper_summaries/Author_Year.md
Step 5 (main context): Read summary, present 5-10 line brief to user
## When NOT to split
- Papers under ~15 pages: sub-agent can read directly
- Quick triage: read just pages 1-4 for abstract + intro
- Slides/presentations: read directly in main context
08 · Three independent takes on the same question
Deep-research — three vendors, parallel
Dispatches parallel research on a topic across Claude WebSearch, Codex CLI, and Gemini CLI. Archives raw reports per-project. Default --tool both = Claude + Codex.
When this helps
Before you commit time to a new register year, spend 10 minutes finding out whether someone has already done it. Deep-research dispatches the same question to two or three independent search-capable models in parallel and gives you their reports side-by-side. For the Federal Register project, this is roughly a one-time pass per new year (and maybe once at the start of a new methodology question):
- Prior-art scoping — "Has the 1879 Official Register been digitized at scale beyond the existing efforts? With what tool? Where's the dataset?"
- Vendor/method comparison — "Gemini 3 Flash vs. Claude vision vs. AWS Textract on dense 19th-century federal employment tables — what does the recent literature say?"
- Domain context — "What footnote conventions does the 1893 Official Register use? Did the salary-rate footnote markers exist in pre-1900 volumes?"
Three vendors is the right number. One vendor → you're trusting one search index. Two vendors → you see disagreement but can't break the tie. Three → you can spot the consensus and the outlier.
How it routes
By default the skill writes reports into a project folder rather than dumping them into your conversation. You can pass --project <name> to override; otherwise it picks a route based on a config file.
Example invocations
# Default: Claude + Codex (two vendors)
/deep-research what are the best practices for extracting tabular data from 19th-century federal employment registers using vision-LLMs in 2026
# All three vendors, deep depth
/deep-research has the U.S. Official Register been digitized at scale by any prior research group? --tool all-auto --depth deep
# Quick mode (one vendor only, fastest)
/deep-research what footnote conventions appear in the pre-1900 Official Register, and how should they be distinguished from compensation digits --quick
Gotchas
- Codex must be set up first. See panel 02.
- Gemini CLI is optional. Skill degrades to Claude+Codex if Gemini isn't installed.
- It pre-structures your topic via
/promptbefore dispatching. Even tight questions go through the prompt formatter; the latency (5–15s) is worth the quality lift. Override only with--no-prestructure. - Reports get archived to the project folder. Read them; don't just skim Claude's summary. The whole point is to see the raw outputs side-by-side.
Full SKILL.md source deep-research/SKILL.md · ~400 lines
(See the Download button above for the full file. This skill is large — ~400 lines — because it has two modes (dispatch + absorb) and detailed routing config. The highlights:)
# /deep-research
Federated deep-research workflow. Reports live in their project folder; a thin pointer INDEX in a central archive directory enables cross-project discovery.
## Two modes
1. Dispatch: /deep-research <topic> [--tool ...]
Builds a DR-tuned prompt via /prompt, routes to a project folder, dispatches to one or more research tools, and archives results.
2. Absorb: /deep-research --absorb [<file>]
Scans Projects/*/raw-inputs/ for unindexed files (drag-dropped from external tools like ChatGPT or Gemini Deep Research browser) and processes them into canonical form.
## Automatic tools (no paste-loop)
- Claude WebSearch subagent
- OpenAI Codex CLI
- Gemini 2.5 Pro CLI
## Paste-loop tools (opt-in only)
ChatGPT, Grok, Perplexity, Gemini Deep Research (browser).
## Phases (dispatch mode)
0.5 - Pre-structure via /prompt (mandatory; no skip on well-formed inputs)
1 - Prompt build via the 8-element DR Prompt Schema
1.4 - Project routing (pick PROJECT_DIR before dispatch)
1.5 - Tool selection
2 - Codex quota check (RED blocks unless --force)
3 - Parallel dispatch
4 - Archive reports to project folder + index pointer
5 - Optional synthesis pass
05 · Codex as a second pair of eyes on your data pipeline
Audit — independent code review
Sends your recent code diff to OpenAI Codex for an independent review. Catches bugs, logic errors, wrong variable references, sample-restriction issues, and silent data-quality problems that Claude might miss.
Why this is essential for your pipelines specifically
Your scripts (extract.py, build_inherited.py, clean.py, reextract_other.py, validate_agency.py) are pure data-quality code. The bugs they hide are silent — nothing throws an error; you just get wrong numbers downstream.
results.html sample (Alfred B. C. Clement, p.41, State Department clerks) was a ditto-chain failure: a phantom duplicate "Henry L. Bryan — Assistant law clerk — $1,800" row in Gemini's raw JSON poisoned the ditto resolver, so every subsequent "do" row inherited the wrong title. The fix — a chain-sanity check that rejects an anchor whose name appeared earlier on the same page — would have been surfaced by an /audit pass on build_inherited.py: "the resolver never validates that the anchor it inherits from isn't a duplicate." This is the exact class of bug Audit catches and code review by Claude often misses.
When to invoke
Any time you've written non-trivial pipeline code with Claude and you're about to commit. The skill diffs your latest changes against the last commit, sends them to Codex with a focused review prompt, and surfaces:
- Bugs, logic errors, silent data issues
- Wrong variable references or typos Claude introduced
- Sample restrictions that may silently drop or include the wrong observations (the most dangerous bug in data pipelines — a wrong filter doesn't error, it just gives wrong answers)
- Off-by-one and boundary issues (relevant for ditto walks, page-range slicing, strip-merge dedup)
- Inconsistencies between what the code does and what comments say
How to invoke
# Review all uncommitted changes (most common)
/audit
# Review only staged changes
/audit staged
# Review the last commit (after you've committed but before you push)
/audit last-commit
# Review a specific file — useful when you've touched many
/audit pipeline/extract_full/1907_try/build_inherited.py
Important
- This is a second opinion, not the truth. Codex operates with limited context and will sometimes flag intentional design choices as bugs. Always validate against the actual codebase before changing anything. If Codex is wrong, say why.
- Codex must be set up. See panel 02.
- The skill uses
env -u OPENAI_API_KEYso Codex uses its ChatGPT-account auth, not an API key.
Full SKILL.md source audit/SKILL.md · ~55 lines
---
name: audit
description: Send recent code changes to OpenAI Codex for independent review. Catches bugs, logic errors, and silent issues that Claude might miss.
argument-hint: "[optional: file path, 'staged', 'last-commit', or 'v28 v29']"
---
# Codex Audit
## Steps
1. Determine what to review (no arg = HEAD diff; 'staged' = --cached; 'last-commit' = HEAD; file path = file-specific diff).
2. Get the diff. Save to /tmp/codex_audit_diff.txt. If empty, stop.
3. Send to Codex:
env -u OPENAI_API_KEY codex exec --skip-git-repo-check -C /tmp \
"Review the code diff in /tmp/codex_audit_diff.txt. Look for:
- Bugs, logic errors, or silent data issues
- Wrong variable references or typos
- Sample restrictions that may silently drop/include wrong observations
- Off-by-one errors or boundary issues
- Inconsistencies between what the code does and what comments say
- Any change that could produce wrong results without an obvious error
Be specific: cite line numbers and explain WHY something is a problem.
If you find no real issues, say so briefly — do not invent problems." \
--full-auto -o /tmp/codex_audit_result.txt
4. Read and present results. For each finding: state issue, cite file:line, explain impact, add your own assessment (agree or false positive).
5. Clean up temp files.
## Important
- Codex is a SECOND OPINION — limited context, may flag intentional choices as bugs.
- Always validate Codex's findings against the actual codebase.
- env -u OPENAI_API_KEY is required (so Codex uses ChatGPT auth, not API key).
07 · End a session, start the next one fast
Handoff — continuation prompt
Generates a tight (max 5 bullet points), structured continuation prompt that a fresh Claude Code session can paste in to pick up where this session left off.
Why this matters for a long digitization project
Multi-month projects have lots of context that gets lost between Claude Code sessions. The skill produces a fenced block with exactly four sections: what was accomplished, what remains, blockers/gotchas, and key files touched. Max 5 bullet points across all sections. The discipline is enforced by the skill — vague entries are not allowed.
How to use it
At the end of a work session — when you're about to close Claude Code or your context is getting long — run /handoff. Copy the fenced block. When you start the next session, paste it as your first message.
What it produces
## Session Handoff — 2026-05-18
**What was accomplished:**
- Added chain-sanity check to pipeline/extract_full/1907_try/build_inherited.py (rejects ditto anchors whose name appeared earlier on same page)
- /first-look on rows_inherited.csv flagged 132 rows with compensation starting with a small isolated digit — likely footnote-marker contamination
**What remains:**
- Build footnote-detector script; identify pages with footnoted salary rates
- Re-extract just those pages with the updated footnote-aware prompt
- Re-run the 50-row QA sample, stratified to over-sample footnote pages
**Blockers / gotchas:**
- pp.412-418 in the 1907 michigan scan are double-exposed; need an alternate source from version_downloads/
**Key files touched:**
- pipeline/extract_full/1907_try/build_inherited.py
- pipeline/extract_full/1907_try/rows_inherited.csv
- pipeline/extract_full/1907_try/audit_footnote_compensation.csv
Rules the skill enforces
- Max 5 bullet points total across all sections — be ruthless.
- Every bullet must include a specific file path or concrete detail. No vague statements.
- Absolute dates, not "today" or "yesterday."
- Also updates the project's
MEMORY.mdif one exists.
Full SKILL.md source handoff/SKILL.md · ~40 lines
---
name: handoff
description: Generate a concise session continuation prompt for the next session. Use at the end of a work session.
argument-hint: ""
---
# Session Handoff Generator
Generate a plain-text continuation prompt that a fresh Claude Code session can use to pick up where this session left off.
## Format
Output a fenced block with exactly this structure (no more, no less):
## Session Handoff — [DATE]
**What was accomplished:**
- [1-3 bullet points, each one line]
**What remains:**
- [1-3 bullet points with specific next steps]
**Blockers / gotchas:**
- [Any issues the next session should know about, or "None"]
**Key files touched:**
- [List of file paths that were created or modified]
## Rules
- Max 5 bullet points total across all sections. Be ruthless about brevity.
- Every bullet must include a specific file path or concrete detail.
- Use absolute dates, not relative ("2026-03-12", not "today").
- Do NOT include code blocks, verbose context, or explanations.
- Also update the project's MEMORY.md with any new information from this session.
10 · So your documentation websites match the rest of the team's
Web aesthetics — the house style
results.html reports are fine. The current style (Playfair Display + Source Sans 3, navy/gold, sticky nav + long scroll) is well-suited to a linear narrative report and doesn't need to be rewritten. This panel is about new shareable doc sites — not retroactive cleanup. The marginal readability win comes from the palette + fonts, not the layout: your long-scroll structure already matches the house style's "idiom B" linear-report pattern.
When you build a new documentation website that Kareem will share more broadly — explaining a methodology, summarizing a multi-year extraction, walking a co-author through the pipeline — matching the visual language of his other sites makes it land at a glance. The other internal sites (Indian Census Rolls, Agricultural Census, Indian Schedules, DAR Lineage Books) all share a palette, three Google Fonts, and one of two layout idioms. That's not an accident; it's a design system worth reusing for new work.
This panel hands you the spec in a form your Claude can read. The goal: when Kareem asks for a new doc site, you don't have to invent an aesthetic — you point your Claude at this spec and at the right idiom for the content type.
The canonical project-agnostic spec. Covers palette, fonts, the two layout idioms (A: tabbed-dark default, B: longscroll-light alternative), section headers, card patterns, status pills, anti-patterns, and the usage checklist. Read this first.
Or open the spec in a new tab to browse it (it's a website itself).
The palette and fonts, in one block
/* Canonical palette */
--ivory: #FAF9F5; /* page background, warm */
--paper: #FFFFFF; /* card background */
--slate: #141413; /* body text */
--clay: #D97757; /* accent — links, emphasis, the only "color" */
--oat: #E3DACC; /* warm divider */
--olive: #788C5D; /* affirmative pill / "yes" state */
--rust: #B04A3F; /* negative pill / "no" state */
--g100..g700; /* warm-gray scale, for body, borders, eyebrow text */
--rail-bg: #1c1a17; /* dark charcoal — only used for Idiom A's sidebar */
--rail-ink: #f3ede0; /* cream text on dark rail */
/* Fonts (Google Fonts; load all three) */
--serif: 'Instrument Serif'; /* h1/h2 display, italic for emphasis */
--sans: 'Inter'; /* body, UI, sub-20px text — weight 600 not faux-bold */
--mono: 'JetBrains Mono'; /* code, eyebrows, numbers, status pills */
Two layout idioms — pick by content type, not by default
The design system defines two layout patterns. They share everything (palette, fonts, components) except layout. Pick by what the content is, not by habit:
#1c1a17), 6–9 hash-routed panels with a soft fade transition. This is what the RA-skills site you're reading uses. Pick when content is navigable by topic (reference docs, methodology indexes, multi-feature dashboards).#F4EDDD) with scroll-spy table of contents, all content in continuous scroll, anchor-linked. This is the natural fit for your results.html per-year reports — one story, top to bottom. Pick when content is a single linear narrative, when ctrl+F across the whole doc matters, or when print-friendly is required.Your existing results.html already has the idiom-B structure (sticky nav + long scroll). Swapping the palette and fonts to the house style is roughly a 1-hour change for any new report; doing it retroactively on older reports is optional.
Reference files (download or browse)
| File | What it is | Open |
|---|---|---|
| design-system.html | The spec itself — palette, type scale, section header pattern, card components, status pills, anti-patterns, usage checklist. Read this first. | view · download |
| idiom-tabbed-dark.html | Reference implementation of idiom A — a full working tabbed-panel site to copy as the starting template for browse-by-topic doc sites. | view · download |
Three "don't relitigate" gotchas
These three points have been hard-won; please don't have your Claude re-discover them by trial and error:
- Rail color. The dark sidebar is
#1c1a17, NOT#2a2622. The lighter variant looks fine in isolation but fails contrast against cream text. - Sub-20px serif. Instrument Serif is single-weight (400 only). For card titles, stat values, table-header names — anything under 20px that needs to feel "bold" — use Inter weight 600, not Instrument Serif weight 500. The latter triggers browser faux-bold and looks mushy.
- Mobile overflow-wrap. At
≤880px, scopeoverflow-wrap: anywheretocode,.t-ex,.field-table td:first-child,.schema,.src .src-url a— NOT plain<td>. The blanket-td version over-wraps prose ("Endpoint" → "Endp/oint/prob/e"). Plushtml, body { overflow-x: hidden }as a safety net.
How to prompt your Claude
results.html report for the 1907 Official Register extraction (linear narrative: summary → pipeline steps → improvements → schemas → compensation → accuracy → remaining errors → files). Read design-system.html first for the palette tokens, type scale, and anti-patterns. Use idiom B (long scroll + light rail / scroll-spy ToC) since this is one linear story top-to-bottom. Match my existing 1905 results.html structure but with the house palette (ivory/clay) and fonts (Instrument Serif / Inter / JetBrains Mono) instead of navy/gold/Playfair."
design-system.html first, then open idiom-tabbed-dark.html and use it as the starting template — tabbed dark-rail layout exactly. Don't invent new colors, don't change the fonts."
Anti-patterns (don't do these)
- Pure black or pure white backgrounds. Tint them warm — ivory or paper.
- Default system fonts. Load at least one Google Font (Instrument Serif for display, Inter for body, JetBrains Mono for code).
- Purple-to-blue or cyan-on-dark gradients. The accent color is
--clay; the only "color" on the page. - Neon glow, bounce/elastic easing, glass-morphism / frosted blur.
- Emoji as section headers. Use SVG illustrations or eyebrow lines.
- Four shades of indigo. The palette is what it is.
- Icon-only buttons without
aria-label.