This project aims to validate email addresses by checking their syntax, MX records, whether they belong to disposable email provider domains, catchall status, and if the email exists (is deliverable). The project utilizes various Python packages and scripts to perform these validations.
- Validates email addresses for syntax and domain existence.
- Checks MX records to ensure the domain can receive emails.
- Identifies disposable email provider domains.
- Determines catchall status to verify if the domain accepts all emails.
- Validates if the email exists and is deliverable.
- Provides logging functionality to track validation processes.
- Python
- Requests library for making HTTP requests
- BeautifulSoup for HTML parsing
- Regular expressions for email extraction
- Scrapy framework for efficient web scraping
- DNSPython for DNS-related operations
- Smtplib for SMTP operations
- PiaPy for VPN connectivity
- Openpyxl for reading and manipulating Excel files
- Clone the repository to your local machine.
- Install the required Python packages by running
pip install -r requirements.txt. - Ensure you have Python 3 installed on your system.
To use the project, follow these steps:
- Prepare a text file (
emails.txt) containing the email addresses to validate, with one email per line. - Run the
bulk_verifier.pyscript to validate the email addresses and save the results to a CSV file (emails_validated.csv). - Optionally, review the log files generated in the
logsdirectory for detailed validation process information.
In addition to the core Python packages, the project utilizes the following libraries and frameworks:
- dns.resolver for DNS-related operations
- disposable_email_domains for identifying disposable email provider domains
- smtplib for SMTP operations
- Openpyxl for reading and manipulating Excel files
The project presents several takeaways and challenges:
- Integration of Multiple Python Packages: Combining various Python packages and frameworks to handle different aspects of email validation required careful integration and management.
- Handling DNS Lookups and SMTP Connections: Dealing with DNS lookups and SMTP connections involved understanding network protocols and error handling for robust validation.
- Logging and Error Handling: Implementing logging functionality and error handling mechanisms were crucial for monitoring and troubleshooting validation processes effectively.
- Scalability and Performance: Ensuring the project's scalability and performance for processing large volumes of email addresses while maintaining accuracy posed significant challenges.
This project is distributed under the MIT License. See the LICENSE file for details.