DISCLAIMER: This website aggregates publicly available information from court documents, government records, and established news reports. We make no independent claims or accusations. The presence of any individual or organization does not imply guilt or wrongdoing. All information is categorized by evidence level and includes source references. This project is for research and public interest purposes only.

Methodology

This page explains how The Epstein Index collects, processes, and categorizes information. Understanding our methodology helps you evaluate the quality and reliability of the data presented.

Evidence Levels

Every entity and relationship in the graph is assigned an evidence level based on its strongest supporting source:

Level	Description
Court Record	Directly sourced from court filings, verdicts, depositions, or legal proceedings. This is the highest tier of evidence.
Official Document	Government records, flight logs, financial filings, law enforcement reports, and official press releases.
Credible Reporting	Reporting from established news outlets (NYT, WSJ, BBC, Miami Herald, etc.) with named sources and editorial oversight.
Multiple Sources	Information corroborated by two or more independent sources, but not yet confirmed by official records.
Single Source	Reported by only one source, without independent confirmation. Treat with appropriate skepticism.
Alleged	Claims made in legal filings or by individuals that have not been corroborated. These are allegations, not established facts.
Rumored	Speculation, anonymous tips, social media claims, or unverified reports. Included for completeness but should not be treated as reliable.
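
The "strongest supporting source" rule can be illustrated with a short sketch. This is not the site's actual code: the names EvidenceLevel and strongest_level are hypothetical, and the sketch assumes the table order above is a strict ranking from strongest to weakest.

```python
from enum import IntEnum


class EvidenceLevel(IntEnum):
    # Assumption: lower value means stronger evidence, following the table order.
    COURT_RECORD = 1
    OFFICIAL_DOCUMENT = 2
    CREDIBLE_REPORTING = 3
    MULTIPLE_SOURCES = 4
    SINGLE_SOURCE = 5
    ALLEGED = 6
    RUMORED = 7


def strongest_level(source_levels):
    """Return the strongest evidence level among an item's supporting sources."""
    levels = list(source_levels)
    if not levels:
        raise ValueError("an entity or relationship needs at least one source")
    # The smallest value is the strongest tier under the assumed ordering.
    return min(levels)
```

Under these assumptions, an item backed by both a rumor and an official document would be labeled Official Document, because the stronger source determines the level.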

Data Sources

Our data is drawn from these categories of public sources:

Court Records: Unsealed filings from Giuffre v. Maxwell, the SDNY criminal case, Palm Beach County proceedings, and related civil litigation
Government Documents: DOJ press releases, FBI records, FOIA responses, SEC filings, and congressional testimony
Official Records: Flight logs (FAA records), property records, corporate registrations, and financial disclosures
News Reporting: Investigative journalism from established outlets including the Miami Herald, New York Times, and others
Depositions & Testimony: Sworn statements from court proceedings and investigations
Public Archives: DocumentCloud, PACER, CourtListener, and government archives

Collection & Extraction Process

Data enters the system through eight automated collectors that run at scheduled intervals:

News Collector — Monitors news sources for Epstein-related reporting (every 6 hours)
Court Documents Collector — Checks for new unsealed filings and legal documents (daily)
FOIA Collector — Monitors government FOIA responses and releases (daily)
Wikipedia Collector — Extracts structured data from relevant Wikipedia articles (daily)
Epstein Files Collector — Processes documents from public Epstein-related archives (daily)
Deep Research Collector — AI-powered thread pulling that follows connections across sources (daily)
Source Discovery Collector — Proactively discovers new sources via link harvesting, semantic search, and graph topology analysis (every 8 hours)
Social Media Collector — Monitors social media for verified, credible reports (every 12 hours)
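
The schedule above could be expressed as a simple interval table. The following is a hedged sketch only: the dictionary keys and the is_due helper are hypothetical names, "daily" is assumed to mean every 24 hours, and the real scheduler may work differently.

```python
from datetime import datetime, timedelta

# Interval in hours for each collector, taken from the list above.
# Assumption: "daily" is treated as a 24-hour interval.
COLLECTOR_INTERVAL_HOURS = {
    "news": 6,
    "court_documents": 24,
    "foia": 24,
    "wikipedia": 24,
    "epstein_files": 24,
    "deep_research": 24,
    "source_discovery": 8,
    "social_media": 12,
}


def is_due(collector, last_run, now):
    """Return True if the collector's interval has elapsed since its last run."""
    interval = timedelta(hours=COLLECTOR_INTERVAL_HOURS[collector])
    return now - last_run >= interval


def due_collectors(last_runs, now):
    """List the collectors that should run now, given a name -> last-run mapping."""
    return sorted(name for name, ran in last_runs.items() if is_due(name, ran, now))
```

Keeping the intervals in one table makes it easy to check that the schedule matches the published list.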

Collected documents are processed by an AI extraction pipeline that identifies entities (people, organizations, locations, events, documents) and their relationships. Each extraction includes:

Relevance Scoring: Every entity is scored from 1 to 5 for relevance to the Epstein case. Entities that score below 2 are automatically rejected.
Source Context: The exact text from the source document that supports each claim is preserved.
Evidence Level Assignment: Each entity and relationship is classified using the evidence hierarchy above.
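
To make the three parts of an extraction concrete, here is a minimal sketch of what one extracted record and the relevance threshold might look like. The ExtractedEntity class, its field names, and passes_relevance_gate are hypothetical, and the sketch assumes whole-number scores; the actual pipeline's schema is not published here.

```python
from dataclasses import dataclass

# Scores below this value are rejected, per the rule described above.
MIN_RELEVANCE = 2


@dataclass
class ExtractedEntity:
    name: str
    entity_type: str     # e.g. "person", "organization", "location", "event", "document"
    relevance: int       # assumed to be a whole number from 1 to 5
    source_context: str  # exact text from the source document supporting the claim
    evidence_level: str  # one of the level names from the Evidence Levels table


def passes_relevance_gate(entity):
    """Return True if the entity's relevance score meets the minimum threshold."""
    if not 1 <= entity.relevance <= 5:
        raise ValueError(f"relevance must be between 1 and 5, got {entity.relevance}")
    return entity.relevance >= MIN_RELEVANCE
```

In this sketch an entity scored 1 is dropped and anything scored 2 or higher is kept, along with the supporting text and evidence level it was extracted with.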

Quality Assurance

Deduplication: A daily automated sweep identifies and merges duplicate entities using exact name matching, middle name variants, alias matching, and fuzzy string similarity.
Orphan Cleanup: Entities with zero relationships are automatically removed daily, as they provide no graph value.
Relevance Gate: The extraction AI is specifically instructed to reject entities with no direct Epstein case connection. Historical events, pop culture references, and tangential mentions are filtered out.
Batch Monitoring: Each extraction batch is monitored for orphan rate. Sources that consistently produce orphaned entities are flagged for review.
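
The deduplication and orphan checks above can be sketched in a few lines. This is an illustration under stated assumptions, not the production logic: the function names, the 0.9 similarity threshold, and the use of Python's difflib for fuzzy matching are all assumptions, and the placeholder names in the usage note are fictional.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, strip periods, and collapse whitespace."""
    return " ".join(name.lower().replace(".", "").split())


def drop_middle(name):
    """Keep only the first and last name parts, to catch middle-name variants."""
    parts = normalize(name).split()
    return " ".join([parts[0], parts[-1]]) if len(parts) > 2 else " ".join(parts)


def likely_duplicates(a, b, aliases_a=(), aliases_b=(), threshold=0.9):
    """Apply the four checks in order: exact, alias, middle-name, fuzzy."""
    names_a = {normalize(a)} | {normalize(x) for x in aliases_a}
    names_b = {normalize(b)} | {normalize(x) for x in aliases_b}
    if names_a & names_b:  # exact name or alias match
        return True
    if drop_middle(a) == drop_middle(b):  # middle-name variant
        return True
    # Fuzzy string similarity; the threshold is an assumed value.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def find_orphans(entity_ids, relationships):
    """Return the entities that appear in no relationship (source, target) pair."""
    connected = {entity for pair in relationships for entity in pair}
    return set(entity_ids) - connected


def orphan_rate(batch_entity_ids, relationships):
    """Fraction of a batch's entities that have zero relationships."""
    ids = set(batch_entity_ids)
    return len(find_orphans(ids, relationships)) / len(ids) if ids else 0.0
```

For example, likely_duplicates("Jane Q. Doe", "Jane Doe") matches on the middle-name check, while a batch of three entities with a single relationship between two of them has an orphan rate of one third. A fuzzy threshold set too low risks merging distinct people, so any real value would need tuning against reviewed examples.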

Important Disclaimer

Presence does not imply guilt. The Epstein Index documents connections and associations found in public records. An individual or organization appearing in this database does not mean they are guilty of any wrongdoing. Many people documented here are victims, witnesses, legal professionals, or individuals mentioned in context.

Evidence levels are assigned based on source quality, not on the likelihood that a claim is true. A "Court Record" evidence level means the information comes from a court filing; it does not mean a court has validated the claim.