Data Set 9 — Detailed Notes, Sources, and History

This document contains all detailed information related to Data Set 9 (DS09), including:

  • historical and deprecated sources
  • community torrent magnets
  • reconstruction and recovery notes
  • NATIVEs / placeholder analysis
  • status tracking and context

For current recommended download sources, see:

Main repository README, Data Set 9 section


⚠️ Scope & Intent

This file is archival and explanatory in nature.

  • It preserves historical context and provenance
  • It documents how DS09 circulated and evolved
  • It is not intended as a quick-start or primary download page

The main README is kept intentionally concise. This document exists so nothing important is lost.



Overview

Data Set 9 has historically been the largest, most unstable, and most complex of the Epstein Files releases.

Unlike other datasets, DS09:

  • was distributed via an incomplete and unreliable ZIP
  • circulated in multiple partial community reconstructions
  • required reconciliation using metadata (.DAT / .OPT) files
  • included both PDFs and a large number of NATIVEs (media files)
  • was the largest EFTA release to date

This document exists to preserve context, evolution, and verification paths without overloading the main index.


Current Status Summary

As of the latest update:

  • DS09 is not yet 100% complete

  • Community reconciliation indicates that DS09 is ~99.9% reconstructable by file count

  • Remaining gaps primarily affect:

    • a small number of PDFs
    • unrecovered or placeholder NATIVEs

⚠️ Completeness is measured by file presence, not byte-for-byte parity.


Recommended Sources (Current)

The following represent the currently recommended DS09 sources.

These are the sources linked and maintained in the main README.

  • Community flattened archive (PDFs only)
  • Community ~140 GB archive (broader coverage)

See the main README for live links, sizes, and hashes.

This section intentionally avoids duplicating magnets to ensure a single authoritative surface.


Historical Sources & Magnets

This section preserves all known historical DS09 sources that circulated during early preservation and reconstruction efforts.

These sources are not recommended for new downloads, but are retained for archival and forensic completeness.


Early Partial Community Magnet (~45 GB)

  • Status: Deprecated
  • Observed Size: ~45 GB
  • Completeness: Severely incomplete
  • Notes: One of the earliest circulating community magnets. Missing large portions of PDFs and all known NATIVEs.

Magnet:

magnet:?xt=urn:btih:0a3d4b84a77bd982c9c2761f40944402b94f9c64&dn=DataSet9-incomplete.zip&xl=48995762176&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce

Known Issues:

  • Incomplete
  • Not suitable for reconstruction
  • Superseded by later composites

Deduplicated Community Composite (~89.5 GB)

  • Status: Superseded
  • Observed Size: ~89.54 GB
  • Completeness: Partial
  • Notes: Deduplicated merge of multiple partial sources. Useful historically for overlap analysis.

Magnet:

magnet:?xt=urn:btih:7ac8f771678d19c75a26ea6c14e7d4c003fbf9b6&dn=dataset9-more-complete.tar.zst&xl=96148724837&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=udp%3A%2F%2Fopen.demonii.com%3A1337%2Fannounce&tr=udp%3A%2F%2Fexodus.desync.com%3A6969%2Fannounce&tr=http%3A%2F%2Fopen.tracker.cl%3A1337%2Fannounce&tr=udp%3A%2F%2Fopen.stealth.si%3A80%2Fannounce&tr=udp%3A%2F%2Fzer0day.ch%3A1337%2Fannounce&tr=udp%3A%2F%2Fwepzone.net%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker1.myporn.club%3A9337%2Fannounce&tr=udp%3A%2F%2Ftracker.torrent.eu.org%3A451%2Fannounce&tr=udp%3A%2F%2Ftracker.theoks.net%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.srv00.com%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.qu.ax%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.dler.org%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker.bittor.pw%3A1337%2Fannounce&tr=udp%3A%2F%2Ftracker.alaskantf.com%3A6969%2Fannounce&tr=udp%3A%2F%2Ftracker-udp.gbitt.info%3A80%2Fannounce&tr=udp%3A%2F%2Frun.publictracker.xyz%3A6969%2Fannounce&tr=udp%3A%2F%2Fopentracker.io%3A6969%2Fannounce&tr=udp%3A%2F%2Fopen.dstud.io%3A6969%2Fannounce&tr=https%3A%2F%2Ftracker.zhuqiy.com%3A443%2Fannounce

ym's Compiled Flattened (~140 GB)

  • Status: Archived (historical reference)
  • Observed Size: ~94.58 GB
  • Completeness: Incomplete
  • Notes: Compiled flat DS09 containing 532

Magnet:

magnet:?xt=urn:btih:5b50564ee995a54009fec387c97f9465eb18ba00&dn=dataset-9_by_fuckthissite3.tar&xl=148072017920

Larger Community Archive (~140 GB)

  • Status: Archived (historical reference)
  • Observed Size: ~140 GB
  • Completeness: Broad but incomplete
  • Notes: Represented the most complete community archive at the time of circulation; later replaced by current recommended sources.

Magnet:

magnet:?xt=urn:btih:5b50564ee995a54009fec387c97f9465eb18ba00&dn=dataset-9_by_fuckthissite3.tar&xl=148072017920
  • Provided by community contributor u/FuckThisSite3
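
Each magnet above carries an `xl` (exact length) parameter that gives the payload size in bytes, which is useful for cross-checking the approximate sizes quoted in this section. The following is a minimal Python sketch using only the standard library. The function name `describe_magnet` is illustrative and is not part of any tooling in this repository.

```python
from urllib.parse import parse_qs


def describe_magnet(magnet: str) -> dict:
    """Extract the info hash, display name, and exact length from a magnet URI."""
    prefix = "magnet:?"
    if not magnet.startswith(prefix):
        raise ValueError("not a magnet URI")
    # parse_qs also decodes percent-encoded values such as tracker URLs.
    params = parse_qs(magnet[len(prefix):])
    xt = params["xt"][0]  # e.g. urn:btih:<info hash>
    size = int(params["xl"][0]) if "xl" in params else None
    return {
        "info_hash": xt.rsplit(":", 1)[-1],
        "name": params.get("dn", [None])[0],
        "bytes": size,
        "gib": None if size is None else round(size / 2**30, 2),
        "trackers": params.get("tr", []),
    }


# The early ~45 GB magnet listed above (tracker parameter omitted for brevity).
early = describe_magnet(
    "magnet:?xt=urn:btih:0a3d4b84a77bd982c9c2761f40944402b94f9c64"
    "&dn=DataSet9-incomplete.zip&xl=48995762176"
)
print(early["name"], early["bytes"], early["gib"])
```

Dividing by 2**30 reports the size in GiB. Whether a given "GB" figure in this document was measured in decimal or binary units is not stated, so treat small differences as rounding rather than as evidence of a different payload.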

Reconstruction & Analysis Notes

Early DS09 recovery efforts involved:

  • merging multiple partial community archives
  • deduplicating by filename and size
  • reconciling expected file counts using .DAT manifests
  • identifying missing PDFs and NATIVEs

These workflows are preserved here for historical reference.
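
The deduplication step listed above keyed on filename and size. The following is a minimal Python sketch of that idea, assuming each partial archive has already been extracted to its own directory. The function names and the directory names in the usage comment are hypothetical, and a match on name and size alone does not prove that two files have identical contents.

```python
import os
from typing import Dict, Iterable, Iterator, Tuple

Record = Tuple[str, int, str]  # (filename, size in bytes, full path)


def iter_files(root: str) -> Iterator[Record]:
    """Yield one record for every regular file under root."""
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            yield name, os.path.getsize(full), full


def dedupe_by_name_and_size(records: Iterable[Record]) -> Dict[Tuple[str, int], str]:
    """Keep the first path seen for each (filename, size) pair."""
    kept: Dict[Tuple[str, int], str] = {}
    for name, size, path in records:
        kept.setdefault((name, size), path)
    return kept


# Usage (directory names are placeholders):
# merged = dedupe_by_name_and_size(
#     rec for root in ("partial_a", "partial_b") for rec in iter_files(root)
# )
```

Files that share a name but differ in size are kept as separate entries, so they can be reviewed by hand instead of being silently discarded.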

Current recovery efforts focus on:

  • publicly accessible DOJ endpoints
  • direct verification of missing entries
  • incremental recovery rather than bulk recombination

NATIVEs & Placeholder Analysis

DS09 includes a significant number of NATIVEs (media files).

Observed behaviors include:

  • placeholder files of consistent small sizes
  • missing DOJ endpoints for some native entries
  • overlap with known NATIVEs from other datasets

Key notes:

  • Placeholder sizes observed: 4670 bytes, 2433 bytes
  • Estimated expected NATIVEs: 2,542
  • Confirmed recoverable NATIVEs: 1,983

Detailed lists and notes are maintained in accompanying files under notes/DS09/.


Deprecated / Removed Sources

Some DS09 sources and workflows were removed from the main README to:

  • reduce confusion for new users
  • avoid recommending obsolete or inferior sources
  • maintain a clear “current state” index

Those sources are preserved here intentionally.

Removal from the main README does not imply invalidation or suppression.


Timeline of Availability

  • DOJ releases Data Set 9 ZIP (incomplete / unstable)
  • Early partial community magnets circulate (~45 GB)
  • Deduplicated community composites appear (~89.5 GB)
  • Larger community archive circulates (~140 GB)
  • Current recommended sources consolidated in main README

This document captures that evolution.


Relationship to Main README

  • Main README: Stable, forward-facing index with current recommended sources

  • This document: DS09-specific deep dive, history, and archival record

The two are intentionally separated to balance clarity and transparency.


Maintenance Notes

  • This document is append-only
  • Historical entries are not removed once documented
  • Updates may add context, annotations, or clarifications

This file exists so DS09 can be understood even years later, regardless of how sources change.


Archive from ../README.md


Data Set 9 — Status, Reconstruction, and Analysis

Data Set 9 has historically been incomplete and unstable when accessed via DOJ direct download. Multiple users have reported download cutoffs (commonly at ~49 GB of ~180 GB), HTTP 404s, paginator failures, and IP-based blocking.

These issues have been observed across multiple regions and do not appear to be user-specific.


Reconstruction Status (Current)

Data Set 9 reconstruction efforts are ongoing, but the methodology has evolved.

Earlier recovery work focused on consolidating multiple partial community archives. That process is no longer actively documented here, as the current focus is on:

  • Recovering remaining files directly from DOJ endpoints
  • Verifying expected file counts against official .DAT / .OPT manifests
  • Enumerating missing, placeholder, or inaccessible NATIVEs
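
The manifest check in the list above can be sketched as a set comparison between the file names a manifest expects and the file names actually present. This Python sketch assumes the .OPT file follows the common Opticon layout of comma-separated rows with the file path in the third column. That layout is an assumption and has not been confirmed against the DS09 manifests. The function names and the sample rows in the usage comment are hypothetical.

```python
import csv
from pathlib import PureWindowsPath
from typing import Iterable, List, Set, Tuple


def expected_names_from_opt(lines: Iterable[str]) -> Set[str]:
    """Collect lower-cased file names from Opticon-style rows.

    Assumed column order:
    key, volume, path, doc_break, folder_break, box_break, page_count
    """
    names: Set[str] = set()
    for row in csv.reader(lines):
        if len(row) < 3 or not row[2].strip():
            continue  # skip blank or malformed rows
        # PureWindowsPath splits on both backslashes and forward slashes.
        names.add(PureWindowsPath(row[2].strip()).name.lower())
    return names


def reconcile(expected: Set[str], present: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) file names, compared case-insensitively."""
    have = {name.lower() for name in present}
    return sorted(expected - have), sorted(have - expected)


# Usage with made-up rows:
# rows = [r"EFTA00000001,VOL00009,IMAGES\0001\EFTA00000001.pdf,Y,,,2"]
# missing, unexpected = reconcile(expected_names_from_opt(rows), ["EFTA00000001.pdf"])
```

The comparison is by file name only, which matches how this document measures completeness: by file presence, not byte-for-byte parity.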

The most complete publicly available DS09 compilations are documented in the main README.

Historical reconstruction methods, intermediate sources, and deprecated magnets are preserved in:

  • this document
  • the Git commit history
  • issue discussions

This separation is intentional and exists to maintain clarity while preserving transparency.


Reconstruction Completeness Estimate (~99.9%)

Update: As of early February 2026, reconciliation using official dataset metadata (.DAT / .OPT) indicates that Data Set 9 is now ~99.9% reconstructable by file count from currently circulating public sources.

What “~99.9%” Means (High-Level)

Based on manifest analysis:

  • ~531,307 expected IMAGES entries (PDFs)
  • ~531,282 PDFs currently recovered
  • ~25 PDFs remain missing

In addition:

  • ~2,542 expected NATIVES (media files)

  • 1,983 NATIVEs confirmed as directly downloadable from DOJ sources

  • Remaining NATIVEs appear as:

    • small placeholder files
    • or entries lacking publicly accessible DOJ endpoints

Taken together, this places DS09 at ~99.9% complete by file count, where completeness reflects:

PDFs + recoverable native/media files

This does not imply byte-for-byte parity or canonical completeness.
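
As a worked check, the headline figure can be reproduced from the counts quoted above. This Python sketch treats those approximate counts as exact for the purpose of the arithmetic. The variable and function names are illustrative.

```python
# Counts as quoted in this section (all approximate in the source).
expected_pdfs = 531_307
recovered_pdfs = 531_282
expected_natives = 2_542
recoverable_natives = 1_983


def completeness(recovered: int, expected: int) -> float:
    """Percentage of expected files that are present."""
    return 100.0 * recovered / expected


pdf_only = completeness(recovered_pdfs, expected_pdfs)
combined = completeness(
    recovered_pdfs + recoverable_natives,
    expected_pdfs + expected_natives,
)

print(f"PDFs only:      {pdf_only:.3f}%")   # 99.995%
print(f"PDFs + NATIVEs: {combined:.3f}%")   # 99.891%
```

The combined figure rounds to the ~99.9% quoted here. Because PDFs outnumber NATIVEs by roughly 200 to 1, the percentage is dominated by PDF recovery and says little about how many NATIVEs remain outstanding.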

Important: This document does not claim Data Set 9 is fully complete or authoritative. It records the best-known public reconstruction status based on community analysis at the time of writing.


NATIVEs Placeholder Analysis & Recovery Status

Placeholder Characteristics

Two distinct placeholder file sizes have been consistently observed:

  • 4670 bytes
  • 2433 bytes

This behavior was identified by comparing placeholder files in Data Set 9 against known NATIVEs from Data Set 10.
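
Because both placeholder sizes are fixed, candidate placeholders can be flagged by file size alone. The following is a minimal Python sketch, assuming the NATIVEs have been extracted to a local directory. The function names are hypothetical. A size match only marks a file for review, since a genuine media file could happen to be exactly 4670 or 2433 bytes.

```python
import os
from typing import List

# Placeholder sizes in bytes, as observed in Data Set 9.
PLACEHOLDER_SIZES = frozenset({4670, 2433})


def is_placeholder_size(size: int) -> bool:
    """True if a file of this size matches a known placeholder size."""
    return size in PLACEHOLDER_SIZES


def find_placeholder_candidates(root: str) -> List[str]:
    """Return sorted paths under root whose size matches a placeholder size."""
    hits: List[str] = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if is_placeholder_size(os.path.getsize(path)):
                hits.append(path)
    return sorted(hits)
```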


Revised Estimates

Based on this analysis, the estimated number of missing NATIVEs was revised:

  • Previous estimate: ~135 files
  • Revised estimate: ~2,542 files

Additionally:

  • ~25 image (PDF) files remain missing
  • Some missing images may overlap with native placeholder entries

Recovery Progress

As of the latest update:

  • 1,983 / 2,542 NATIVEs have been confirmed as directly downloadable from DOJ sources
  • Remaining NATIVEs are still under active investigation

Observed Native File Extensions

(Non-exhaustive; sourced primarily from Data Set 10)

.3gp
.amr
.m4a
.m4v
.mov
.mp3
.mp4
.opus
.pluginpayloadattachment
.wav

Additional extensions tested during recovery efforts include:

.avi
.wmv
.ts
.vob
.csv
.xlsx
.xls
.docx
.doc
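
When the native extension for an entry is unknown, the two lists above can be expanded into candidate file names to check one at a time. The Python sketch below only builds names and contacts no server. The function name is illustrative and the entry ID in the usage comment is made up.

```python
from typing import List, Sequence

# Extensions observed in practice (primarily from Data Set 10).
OBSERVED_EXTENSIONS = (
    ".3gp", ".amr", ".m4a", ".m4v", ".mov", ".mp3", ".mp4",
    ".opus", ".pluginpayloadattachment", ".wav",
)
# Additional extensions tested during recovery efforts.
ADDITIONAL_EXTENSIONS = (
    ".avi", ".wmv", ".ts", ".vob", ".csv", ".xlsx", ".xls", ".docx", ".doc",
)


def candidate_filenames(
    entry_id: str,
    extensions: Sequence[str] = OBSERVED_EXTENSIONS + ADDITIONAL_EXTENSIONS,
) -> List[str]:
    """Return one candidate name per extension, in order, without repeats."""
    seen = set()
    names: List[str] = []
    for ext in extensions:
        ext = ext.lower()
        if not ext.startswith("."):
            ext = "." + ext
        if ext not in seen:
            seen.add(ext)
            names.append(entry_id + ext)
    return names


# Usage with a made-up entry ID:
# candidate_filenames("EFTA00000001")[:3]
```

Listing the observed extensions first means the most likely candidates are tried before the speculative ones.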

Reference & Analysis Files (Data Set 9)

The following files are provided for transparency, reconstruction, and community verification.

All files are located under:

/notes/DS09/


Missing / Incomplete Content

These files document content believed to be missing or incomplete relative to expected DS09 manifests.


Duplicate Detection Results

These files document duplicate detection runs. One copy of each duplicate was removed during cleanup.


Broken File Checks


Invalid / Unexpected File Extensions (No Action Taken)

These files document files with extensions inconsistent with expected DS09 content types. No action was taken; results are provided for review only.


Additional Analysis

  • Non-PDF files present in Data Set 9: Non-PDF in Epstein Files - Data Set 9.csv

  • Corrupt media analysis (informational): corrupt.txt

    Contains extracted metadata / partial information from a corrupt video file. Included for transparency; file integrity could not be recovered.


Ongoing Status

Efforts are ongoing to enumerate, test, and recover remaining NATIVEs from DOJ sources.

Progress updates, discussion, and verification notes are tracked in:

https://github.com/yung-megafone/Epstein-Files/issues/4