Methodology
This page explains how The Epstein Index collects, processes, and categorizes information. Understanding our methodology helps you evaluate the quality and reliability of the data presented.
Evidence Levels
Every entity and relationship in the graph is assigned an evidence level based on its strongest supporting source:
| Level | Description |
|---|---|
| Court Record | Directly sourced from court filings, verdicts, depositions, or legal proceedings. This is the highest tier of evidence. |
| Official Document | Government records, flight logs, financial filings, law enforcement reports, and official press releases. |
| Credible Reporting | Reporting from established news outlets (NYT, WSJ, BBC, Miami Herald, etc.) with named sources and editorial oversight. |
| Multiple Sources | Information corroborated by two or more independent sources, but not yet confirmed by official records. |
| Single Source | Reported by only one source, without independent confirmation. Treat with appropriate skepticism. |
| Alleged | Claims made in legal filings or by individuals that have not been corroborated. These are allegations, not established facts. |
| Rumored | Speculation, anonymous tips, social media claims, or unverified reports. Included for completeness but should not be treated as reliable. |
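The "strongest supporting source" rule can be sketched as an ordered enumeration. This is a minimal illustration rather than the project's actual code: the names `EvidenceLevel` and `strongest_level` are hypothetical, and the numeric ranking assumes that the table lists levels from strongest to weakest.

```python
from enum import IntEnum


class EvidenceLevel(IntEnum):
    """Hypothetical ranking; a lower value means a stronger tier (assumed from table order)."""
    COURT_RECORD = 1
    OFFICIAL_DOCUMENT = 2
    CREDIBLE_REPORTING = 3
    MULTIPLE_SOURCES = 4
    SINGLE_SOURCE = 5
    ALLEGED = 6
    RUMORED = 7


def strongest_level(source_levels):
    """Return the strongest (lowest-valued) level among an item's sources."""
    levels = list(source_levels)
    if not levels:
        raise ValueError("an entity or relationship needs at least one source")
    return min(levels)
```

Under these assumptions, an entity backed by both a rumor and a news report would be labeled Credible Reporting, because only its strongest source determines the level.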
Data Sources
Our data is drawn from these categories of public sources:
- Court Records: Unsealed filings from Giuffre v. Maxwell, SDNY criminal case, Palm Beach County proceedings, and related civil litigation
- Government Documents: DOJ press releases, FBI records, FOIA responses, SEC filings, and congressional testimony
- Official Records: Flight logs (FAA records), property records, corporate registrations, and financial disclosures
- News Reporting: Investigative journalism from established outlets including the Miami Herald, New York Times, and others
- Depositions & Testimony: Sworn statements from court proceedings and investigations
- Public Archives: DocumentCloud, PACER, CourtListener, and government archives
Collection & Extraction Process
Data enters the system through eight automated collectors that run on scheduled intervals:
- News Collector — Monitors news sources for Epstein-related reporting (every 6 hours)
- Court Documents Collector — Checks for new unsealed filings and legal documents (daily)
- FOIA Collector — Monitors government FOIA responses and releases (daily)
- Wikipedia Collector — Extracts structured data from relevant Wikipedia articles (daily)
- Epstein Files Collector — Processes documents from public Epstein-related archives (daily)
- Deep Research Collector — AI-powered thread pulling that follows connections across sources (daily)
- Source Discovery Collector — Proactively discovers new sources via link harvesting, semantic search, and graph topology analysis (every 8 hours)
- Social Media Collector — Monitors social media for verified, credible reports (every 12 hours)
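The schedule above could be expressed as a simple interval table. The sketch below is illustrative only: the dictionary keys and the `due_collectors` helper are hypothetical names, "daily" is read as a 24-hour interval, and the real system may use a different scheduling mechanism entirely.

```python
from datetime import datetime, timedelta

# Intervals copied from the list above; the keys are hypothetical identifiers.
COLLECTOR_INTERVALS = {
    "news": timedelta(hours=6),
    "court_documents": timedelta(days=1),
    "foia": timedelta(days=1),
    "wikipedia": timedelta(days=1),
    "epstein_files": timedelta(days=1),
    "deep_research": timedelta(days=1),
    "source_discovery": timedelta(hours=8),
    "social_media": timedelta(hours=12),
}


def due_collectors(last_run, now):
    """Return names of collectors whose interval has elapsed since their last run.

    Collectors missing from last_run (never run) are always considered due.
    """
    due = []
    for name, interval in COLLECTOR_INTERVALS.items():
        previous = last_run.get(name)
        if previous is None or now - previous >= interval:
            due.append(name)
    return due
```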
Collected documents are processed by an AI extraction pipeline that identifies entities (people, organizations, locations, events, documents) and their relationships. Each extraction includes:
- Relevance Scoring: Every entity is scored from 1 to 5 for relevance to the Epstein case. Entities that score below 2 are automatically rejected.
- Source Context: The exact text from the source document that supports each claim is preserved.
- Evidence Level Assignment: Each entity and relationship is classified using the evidence hierarchy above.
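As an illustration of what one extraction might carry, the sketch below models a single entity and the relevance cutoff. The class name, field names, and integer scores are assumptions made for this example; only the 1-5 range and the "below 2 is rejected" rule come from the description above.

```python
from dataclasses import dataclass

MIN_RELEVANCE = 2  # scores below this are rejected, per the rule above


@dataclass(frozen=True)
class ExtractedEntity:
    """Hypothetical record shape; field names are illustrative only."""
    name: str
    entity_type: str      # e.g. "person", "organization", "location", "event", "document"
    relevance: int        # 1-5 (integer scores are an assumption)
    source_context: str   # exact supporting text from the source document
    evidence_level: str   # one of the levels in the Evidence Levels table


def passes_relevance_gate(entity):
    """Validate the 1-5 range, then keep only entities scoring 2 or higher."""
    if not 1 <= entity.relevance <= 5:
        raise ValueError(f"relevance must be 1-5, got {entity.relevance}")
    return entity.relevance >= MIN_RELEVANCE
```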
Quality Assurance
- Deduplication: A daily automated sweep identifies and merges duplicate entities using exact name matching, middle name variants, alias matching, and fuzzy string similarity.
- Orphan Cleanup: Entities with zero relationships are automatically removed daily, because an unconnected entity adds nothing to the graph.
- Relevance Gate: The extraction AI is specifically instructed to reject entities with no direct Epstein case connection. Historical events, pop culture references, and tangential mentions are filtered out.
- Batch Monitoring: Each extraction batch is monitored for orphan rate. Sources that consistently produce orphaned entities are flagged for review.
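The deduplication and orphan checks described above can be sketched with the Python standard library. This is a rough illustration under stated assumptions, not the production logic: all function names are hypothetical, the 0.9 similarity threshold is an arbitrary placeholder, and `difflib.SequenceMatcher` stands in for whatever fuzzy-matching method the system actually uses.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop periods and commas, collapse whitespace."""
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def drop_middle_names(name):
    """Keep only the first and last tokens, so 'jane q doe' becomes 'jane doe'."""
    parts = name.split()
    return f"{parts[0]} {parts[-1]}" if len(parts) > 2 else name


def is_probable_duplicate(a, b, aliases_a=(), aliases_b=(), threshold=0.9):
    """Apply four checks in order: exact, middle-name variant, alias, fuzzy."""
    na, nb = normalize(a), normalize(b)
    if na == nb:
        return True
    if drop_middle_names(na) == drop_middle_names(nb):
        return True
    names_a = {na} | {normalize(x) for x in aliases_a}
    names_b = {nb} | {normalize(x) for x in aliases_b}
    if names_a & names_b:
        return True
    # threshold is an assumed value; tune it against real data
    return SequenceMatcher(None, na, nb).ratio() >= threshold


def find_orphans(entity_ids, relationships):
    """Return entity ids that appear in no (source, target) relationship pair."""
    connected = {endpoint for pair in relationships for endpoint in pair}
    return set(entity_ids) - connected


def orphan_rate(entity_ids, relationships):
    """Share of entities in a batch with no relationships (0.0 for an empty batch)."""
    ids = set(entity_ids)
    return len(find_orphans(ids, relationships)) / len(ids) if ids else 0.0
```

Note that dropping middle names this aggressively would also merge two different people who share a first and last name, so a real pipeline would likely require additional evidence before merging.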
Important Disclaimer
Presence does not imply guilt. The Epstein Index documents connections and associations found in public records. An individual or organization appearing in this database does not mean they are guilty of any wrongdoing. Many people documented here are victims, witnesses, legal professionals, or individuals mentioned in context.
Evidence levels are assigned based on source quality, not on the likelihood that a claim is true. A "Court Record" evidence level means the information comes from a court filing; it does not mean a court has validated the claim.