How we merged two overlapping court order trackers, enriched missing fields with Claude Haiku for a few cents, replaced 200 paywalled links with free CourtListener alternatives, and shipped a searchable explorer — all through conversational prompting with Claude Code.
Public data can source prosecution leads. An open-source fraud-scoring system, run against the full SBA PPP dataset, flagged the same lenders, geographies, and loan populations that DOJ prosecutions targeted, using nothing but a downloadable CSV and a standard laptop.
The PPP fraud pipeline worked because the SBA released everything. Medicare's public data is fragmented, de-identified, and missing the features that fraud detection needs. Here's what exists on GitHub, where it falls short, and what CMS would need to release to let outside analysts do for healthcare fraud what one Python repo did for PPP.