Hello Sunnah.com team,
Thank you for your amazing work on this platform. I noticed that the Musnad Ahmad collection is currently only around 4% complete.
As a software engineer, I wanted to help accelerate this process. I have extracted the complete Musnad Ahmad dataset (Darussalam numbering, more than 27,000 hadiths) and written an ETL script that converts the raw text into structured JSON closely matching your data schema (collection, bookNumber, chapterTitleArabic, hadithNumber, etc.).
I have divided the massive dataset into smaller JSON files by Book/Chapter to make it manageable. You can review my proof of concept, the raw data, and the formatted output on my repository here:
https://github.com/mahmoudkalimero1100-rgb/musnad-ahmad-json
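For context, the formatting and splitting steps can be sketched roughly as follows. This is a simplified illustration, not the actual script in the repository: the raw input keys (`book`, `chapter_ar`, `number`), the `"ahmad"` collection value, and the `book_<n>.json` file naming are assumptions made for the example. Only the output field names come from the description above.

```python
import json
from collections import defaultdict
from pathlib import Path


def format_record(raw: dict) -> dict:
    """Map one raw record onto the output field names.

    The raw keys used here are hypothetical placeholders.
    """
    return {
        "collection": "ahmad",  # assumed collection identifier
        "bookNumber": int(raw["book"]),
        "chapterTitleArabic": raw["chapter_ar"].strip(),
        "hadithNumber": int(raw["number"]),
    }


def split_by_book(raw_records: list[dict]) -> dict[int, list[dict]]:
    """Format every record and group the results by book number."""
    books = defaultdict(list)
    for raw in raw_records:
        record = format_record(raw)
        books[record["bookNumber"]].append(record)
    return dict(books)


def write_books(books: dict[int, list[dict]], out_dir: str) -> list[Path]:
    """Write one JSON file per book; the file naming is an assumption."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for book_number, records in sorted(books.items()):
        path = target / f"book_{book_number}.json"
        # ensure_ascii=False keeps the Arabic text readable in the file
        path.write_text(
            json.dumps(records, ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
        paths.append(path)
    return paths
```

Writing one file per book keeps each file small enough to review on its own, which is what would make book-by-book submissions practical.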
I would love to contribute this data. Before submitting any Pull Requests, could you let me know:
1. Does the JSON structure in my /data/formatted/ folder align with your ingestion requirements?
2. Would you prefer that I submit incremental PRs, book by book, to make code review easier?
Looking forward to your feedback and to helping complete this collection!
Best regards,