A subscribable calendar of dates is useful. A subscribable calendar that also tells you what each case is about in a couple of sentences is more useful — especially when you’re tracking 30 cases and can’t remember which Wang is which.

When enabled, Case Calendar generates a 2-4 sentence prose summary for each docket and renders it on the index page under the case row. Summaries are opt-in, off by default, and only run on dockets where the source documents actually support a confident answer (the LLM is instructed to refuse rather than fabricate when they don’t).

What gets summarized

The summary pipeline draws on up to four kinds of input for each docket (the last two are optional):

  1. Primary document — the latest indictment / superseding indictment / information for criminal dockets; the latest amended complaint / complaint / petition for civil. Establishes who’s involved and what the case is about.
  2. Disposition documents — judgments, plea agreements, verdict forms, orders of dismissal, dispositive memoranda. Anything that materially changes “where does the case stand”. A dispositive order in a busy civil case can be a hundred pages back from the latest entry, so the pipeline walks several pages of the docket newest-first to find it.
  3. Operator-provided documents (optional, see extra_documents below) — anything you’ve manually pointed the pipeline at to fill a CourtListener data gap.
  4. Operator-provided aggregation note (optional, see aggregation_note below): a short framing note, shown only to the summarizer, that explains how the dockets in a multi-docket case relate to one another.

Those documents — plus a structured scaffold of the hearings and deadlines the extractor already recorded — go into a single LLM call. The model returns prose; Case Calendar persists it to the case_summaries table.

The page-rendered output looks like:

Mr. Jones is charged in the Northern District of Texas with one count of wire fraud conspiracy and five counts of wire fraud for his alleged role in a $2.6 million online romance-scam scheme. He pled guilty to the conspiracy count on January 14, 2025 pursuant to a plea agreement; sentencing is scheduled for May 28, 2026. Co-defendant Smith remains a fugitive abroad.

A live deployment with real summaries on real federal-court dockets is at casecalendar.net.

Summaries hyperlink the words themselves, the way a news article does. In the example above, is charged would link to the indictment, pled guilty to the plea agreement, and (on a concluded case) was sentenced to the judgment. The link points to the supporting document’s PDF — CourtListener’s own copy (storage.courtlistener.com) when available, with the Internet Archive mirror as a fallback (the same URL the calendar event bodies link to). There are no footnote numbers or “(see Doc 1)” markers — just the phrase a reader would naturally tap.

The model decides which phrase each document supports (it is the one that read them), so any document the pipeline feeds it can be linked — primary documents, dispositions, and operator-provided extra_documents alike, not a fixed list of phrases. Under the hood, each document is shown to the model with a short reference token that never reaches the page; the model links a phrase to a token, and Case Calendar resolves the token to a real URL before storing the summary. A phrase the model can’t tie to a document it was actually given — or one whose document has no openable URL (a paperless minute order, a not-yet-uploaded or sealed PDF) — is left as plain, unlinked text. A summary can never link to a document that wasn’t in the set the model summarized from.
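
The token-to-URL step described above can be sketched as follows. This is a minimal illustration, not Case Calendar’s actual code: the `[[phrase|TOKEN]]` markup, the `resolve_links` function, and the `token_urls` mapping are all hypothetical names chosen for the example, since the real token format is not documented here.

```python
import html
import re
from typing import Optional

# Hypothetical markup: the model wraps a linked phrase as [[phrase|TOKEN]].
# The real reference-token format is an assumption made for this sketch.
LINK_PATTERN = re.compile(r"\[\[([^\]|]+)\|([A-Za-z0-9_-]+)\]\]")


def resolve_links(summary: str, token_urls: dict[str, Optional[str]]) -> str:
    """Turn token links into HTML anchors; fall back to plain text."""

    def replace(match: re.Match) -> str:
        phrase, token = match.group(1), match.group(2)
        url = token_urls.get(token)
        if not url:
            # Unknown token, or a document with no openable URL:
            # keep the phrase as plain, unlinked text.
            return html.escape(phrase)
        return f'<a href="{html.escape(url)}">{html.escape(phrase)}</a>'

    return LINK_PATTERN.sub(replace, summary)
```

Because only tokens present in `token_urls` can produce an anchor, a summary processed this way cannot link to a document outside the set the model was given.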

There is nothing to configure; links appear automatically once summaries are enabled.

Enabling summaries

Add a top-level block to config.yaml:

case_summaries:
  enabled: true
  # provider: anthropic
  # model: claude-sonnet-4-6
  # allow_ocr: true
  # debounce_seconds: 300

The example above shows every key. The field-by-field reference — defaults and the provider-precedence cascade — lives in Configuration → Case summaries.

With enabled: true, summaries refresh automatically as part of sync and serve. The syncer flips a row’s stale flag whenever it sees a new primary document or disposition, or whenever a hearing or deadline changes posture (marked held or cancelled, or rescheduled), even when no new document accompanies the change. At the end of the sync (or after the debounce timer fires in serve), the pipeline regenerates every stale row before re-emitting the index, so the page reflects the case’s current posture without you running anything manually.
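
The stale-flag cycle can be pictured with a small in-memory sketch. The `SummaryRow` class, the `mark_stale` and `regenerate_stale` helpers, and the `generate` callback are hypothetical stand-ins; the real pipeline persists rows in the case_summaries table and calls an LLM where this example calls a plain function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SummaryRow:
    # Hypothetical shape; real column names are not documented here.
    docket_key: str
    summary: str = ""
    stale: bool = False


def mark_stale(rows: dict[str, SummaryRow], docket_key: str) -> None:
    """Flag a row when a new document or a posture change is seen."""
    rows[docket_key].stale = True


def regenerate_stale(
    rows: dict[str, SummaryRow], generate: Callable[[str], str]
) -> list[str]:
    """Regenerate every stale row and return the keys that were refreshed."""
    refreshed = []
    for row in rows.values():
        if row.stale:
            row.summary = generate(row.docket_key)
            row.stale = False
            refreshed.append(row.docket_key)
    return refreshed
```

Batching the regeneration at the end of a sync means several changes to the same docket within one run cost a single summary call rather than one call per change.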

Cost

Summaries add calls to a higher-tier model on top of the always-on extraction cost. Measured per-provider backfill numbers, the CourtListener API rate limits, the price-table caveats, and how to read the llm-tokens log lines all live on the dedicated Cost page. That page covers the whole pipeline, since extraction and CourtListener quota cost something even with summaries off.

The “insufficient documents” refusal

The summary LLM is instructed to refuse rather than fabricate when its inputs are too sparse to support a confident summary. If the primary document text is empty (image-only PDF that didn’t OCR), garbled (custom font subsets — see the installation page), or otherwise lacks the substance needed to identify the parties and the gist of the charges or claims, the model emits this exact sentence verbatim:

Documents available for this docket are insufficient to generate a reliable summary.

That gets stored and rendered like any other summary. Subscribers see the honest acknowledgement instead of a plausible-sounding hallucination, and operators can grep for the sentence in the database to find dockets that need attention (typically: install poppler/tesseract for local OCR, or point extra_documents at an out-of-band source).

This refusal is one of several truthfulness guardrails the summary prompt enforces, all backed by a deterministic post-generation guard (prompt rules alone are soft). What the model is told, and how the guard’s layers fit together, are covered in Data quality guardrails in the architecture overview.

Multi-docket aggregation

For cases that span multiple logical PACER dockets (district + appellate; co-defendants on separate dockets; parallel filings in different venues), the AI summary is generated per logical docket — one summary per distinct (docket_number, court_id) pair — then rendered as a labeled paragraph block on the index page:

3:24-cv-00100 (N.D. Cal.): The district court suit alleges …

24-12345 (9th Cir.): The Ninth Circuit appeal challenges …

To frame the litigation strategy for the model, add an aggregation_note on the case:

- id: anthropic-v-dow
  name: "Anthropic v. DOW"
  calendar: tech
  dockets: [72380208, 72379655, 73136734]
  aggregation_note: >-
    Parallel suits challenging separate Department of War actions taken
    under distinct statutory authorities, each filed in the proper venue
    for the action it targets.

The note is only shown to the summarizer. It’s not rendered to subscribers. Keep it short and factual — the model uses it as framing, not as text to copy.

CourtListener sibling dockets pool into one summary

When multiple CourtListener docket_id values resolve to the same (docket_number, court_id) — typically because the upstream pacer_case_id changed mid-life and CourtListener stored the docket under two or more IDs — Case Calendar treats them as one logical PACER docket. The summary pipeline pools entries across every sibling docket_id in the group (deduplicating by PACER entry_number), generates a single summary, and renders one paragraph in the index. So the Akhter case listed as dockets: [71989485, 73333500, 73320754] where all three share docket number 1:25-cr-00307 in E.D. Va. produces one Sonnet call, one stored summary, and one paragraph — not three near-duplicate slices. See CourtListener sibling dockets for how to spot when this is happening and how to list them in config.yaml.
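
The pooling rule can be sketched in a few lines. The dictionary shapes and the `pool_sibling_entries` name are hypothetical, and the sketch assumes every entry carries an integer entry_number; it only illustrates grouping by (docket_number, court_id) and deduplicating by entry number.

```python
from collections import defaultdict


def pool_sibling_entries(dockets: list[dict]) -> dict[tuple, list[dict]]:
    """Group dockets by (docket_number, court_id) and merge their entries.

    Each docket dict is assumed to have "docket_number", "court_id",
    and "entries"; each entry dict is assumed to have "entry_number".
    """
    pooled: dict[tuple, dict[int, dict]] = defaultdict(dict)
    for docket in dockets:
        key = (docket["docket_number"], docket["court_id"])
        for entry in docket["entries"]:
            # Keep the first copy seen of each PACER entry number.
            pooled[key].setdefault(entry["entry_number"], entry)
    return {
        key: [entries[number] for number in sorted(entries)]
        for key, entries in pooled.items()
    }
```

Each key in the result corresponds to one logical docket, and therefore to one summary call and one rendered paragraph.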

extra_documents

CourtListener and PACER sometimes don’t surface documents the public should be able to see. Two real failure modes the project has hit:

For those cases, point Case Calendar at the document directly. Each entry needs three fields:

- id: us-v-zewei
  name: "United States v. Zewei"
  calendar: cybercrime
  dockets: [70789744]
  extra_documents:
    - docket: 70789744
      url: https://www.justice.gov/opa/media/1407196/dl
      note: >-
        This PDF is the unsealed indictment in S.D. Tex. case
        4:23-cr-00523 (United States v. Xu Zewei).

The three fields — docket, url, and the required note — are documented in Configuration → extra_documents. The note rides into the prompt as trusted operator metadata; the document text itself is still treated as untrusted (the same way CourtListener / PACER text is).

Case Calendar fetches the bytes and runs them through the same pypdf → OCR fallback chain it uses for CourtListener documents, then feeds the text to the LLM as its own labeled section. Each entry’s LLM block is headed OPERATOR-PROVIDED DOCUMENT (sourced outside CourtListener) with the operator’s note line beneath it.
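
A rough sketch of that flow is shown below. To stay self-contained it takes the extraction steps as plain callables instead of calling pypdf or an OCR tool directly; `extract_text`, `format_operator_block`, and the exact spacing of the block are assumptions for illustration, and only the header string is taken from this page.

```python
from typing import Callable

HEADER = "OPERATOR-PROVIDED DOCUMENT (sourced outside CourtListener)"

# An extractor takes raw PDF bytes and returns whatever text it can recover.
Extractor = Callable[[bytes], str]


def extract_text(pdf_bytes: bytes, extractors: list[Extractor]) -> str:
    """Try each extractor in order; return the first non-blank result."""
    for extract in extractors:
        try:
            text = extract(pdf_bytes)
        except Exception:
            continue  # a failed step falls through to the next one
        if text.strip():
            return text
    return ""  # nothing usable; the summary may end up as a refusal


def format_operator_block(note: str, document_text: str) -> str:
    """Build the labeled prompt section for one extra_documents entry."""
    note_line = " ".join(note.split())  # collapse a folded YAML note to one line
    return f"{HEADER}\n{note_line}\n\n{document_text.strip()}\n"
```

In a real deployment the list passed to `extract_text` would hold a text-layer extractor first and an OCR step second, matching the order named above.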

Keep the note short — one sentence that identifies the document by name plus a case citation. The note is data fed to the summary LLM. Bug numbers, workaround details, or “remove this once CourtListener fixes it” all belong in a # comment in config.yaml, not in note. The LLM is summarizing the case for public subscribers; any mention of CourtListener internals or tooling state in its output would be both off-topic and a leak of internal context.

Remove each extra_documents entry once the upstream gap closes.

The index page renders a static <footer> block. Two of its lines are legally sensitive disclaimers that are kept out of the LLM’s output:

Case descriptions and calendar entries are generated by AI from public court filings and may contain mistakes — consult the linked dockets for authoritative information.

Criminal defendants are presumed innocent unless and until convicted in a court of law.

Both are rendered by the page template, not by the LLM, so the legally loaded text stays stable regardless of model output or prompt revision. The summary prompt explicitly tells the model NOT to include them; they’re the renderer’s responsibility.

Next steps