Tadabur

A large-scale Quran audio dataset featuring ... reciters and word-level alignment — built for Arabic speech research.

Overview

Tadabur covers the entire Qur’an, pairing each audio file with automatically generated word-level timestamps and structured JSON metadata. It is the largest and most diverse publicly available Qur’anic speech dataset.


🕐 Audio	1400+ hours
🎙️ Reciters	600+
🔗 Alignment	Word-level timestamps

Dataset Structure

Each example consists of a WAV audio file and a JSON metadata file with the following fields:

{
  "reciter_id": 88,
  "surah_id": 3,
  "ayah_id": 82,
  "word_alignments": [
    {
      "word": "أفلا",
      "start": 0.089,
      "end": 0.87
    },
    ...
  ],
  "text_ar_simple": "افلا يتدبرون القران ولو كان من عند غير الله لوجدوا فيه اختلافا كثيرا",
  "text_ar_uthmani": "أَفَلَا يَتَدَبَّرُونَ ٱلْقُرْءَانَ ۚ وَلَوْ كَانَ مِنْ عِندِ غَيْرِ ٱللَّهِ لَوَجَدُوا۟ فِيهِ ٱخْتِلَـٰفًا كَثِيرًا",
  "ayah_duration_s": 10.9,
  "audio_filename": "tadabur_spk0088_S3_A82_db1f8e71_000003.wav"
}
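
As a rough illustration of how the word-level timestamps might be used, the sketch below parses a metadata record and converts each word's `start` and `end` (in seconds) to sample indices, which is the usual first step when slicing a waveform per word. The helper name `word_sample_ranges` and the 16000 Hz sample rate are assumptions made for this example; the README does not state the sample rate of the audio, so check it on the actual files. The record is abbreviated to the single alignment entry shown above.

```python
import json

# Abbreviated copy of the example record above; only the one alignment
# entry given in the README is included.
SAMPLE_METADATA = """
{
  "reciter_id": 88,
  "surah_id": 3,
  "ayah_id": 82,
  "word_alignments": [
    {"word": "أفلا", "start": 0.089, "end": 0.87}
  ],
  "ayah_duration_s": 10.9,
  "audio_filename": "tadabur_spk0088_S3_A82_db1f8e71_000003.wav"
}
"""


def word_sample_ranges(metadata, sample_rate):
    """Convert each word's start/end times (seconds) to sample indices.

    Hypothetical helper, not part of the dataset tooling.
    """
    ranges = []
    for item in metadata["word_alignments"]:
        start = int(round(item["start"] * sample_rate))
        end = int(round(item["end"] * sample_rate))
        ranges.append((item["word"], start, end))
    return ranges


metadata = json.loads(SAMPLE_METADATA)

# 16000 Hz is an assumed sample rate; the README does not specify one.
ranges = word_sample_ranges(metadata, sample_rate=16000)
for word, start, end in ranges:
    print(word, start, end)
```

With the waveform loaded as an array, `waveform[start:end]` would then give the segment for a single word, provided the array uses the same sample rate passed to the helper.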

Models

Whisper models fine-tuned on Tadabur for Qur'anic ASR are available on Hugging Face:

Model	Base	Status	Link
Tadabur-Whisper-Small	Whisper Small	✅ Available	🤗 HuggingFace

Citation

@misc{alherran2026tadabur,
  author = {Alherran, Faisal},
  title  = {Tadabur: A Large-Scale Quran Audio Dataset},
  year   = {2026},
  url    = {https://github.com/fherran/tadabur}
}

Contact

For questions or collaboration, feel free to reach out on LinkedIn.

Released under CC BY-NC 4.0 · For research and educational use only · Please engage with Qur'anic content respectfully.
