@Mirza-Samad-Ahmed-Baig

This pull request adds concurrent scraping to the spider_some_note function in main.py using a ThreadPoolExecutor. This speeds up scraping of multiple notes, especially when retrieving all notes from a user.

Problem
The previous implementation of spider_some_note scraped notes sequentially, which was:
Slow for users with many notes
Inefficient, as each request had to finish before the next one started, leaving the program idle while it waited on the network

Solution
Refactored the spider_some_note function to use Python's concurrent.futures.ThreadPoolExecutor, enabling:
Concurrent scraping of notes
Overlapping network requests across worker threads, reducing overall scraping time
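
For reviewers, here is a minimal sketch of the pattern described above. It is not the code in this PR. The names fetch_note and scrape_notes_concurrently, the max_workers value, and the simulated delay are all assumptions made for illustration; the real spider_some_note signature and per-note scraping call may differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def fetch_note(note_url):
    """Hypothetical stand-in for the per-note scraping call."""
    time.sleep(0.05)  # simulate network latency (assumption, not real I/O)
    return {"url": note_url, "ok": True}


def scrape_notes_concurrently(note_urls, fetch=fetch_note, max_workers=5):
    """Run fetch() for each URL on a thread pool.

    Returns (results, errors). Results arrive in completion order,
    not input order. A failure in one note does not stop the others.
    """
    results = []
    errors = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so failures can be reported.
        futures = {pool.submit(fetch, url): url for url in note_urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results.append(future.result())
            except Exception as exc:
                errors.append((url, exc))
    return results, errors


if __name__ == "__main__":
    urls = [f"https://example.com/note/{i}" for i in range(10)]
    done, failed = scrape_notes_concurrently(urls)
    print(f"{len(done)} notes scraped, {len(failed)} failed")
```

Threads suit this workload because scraping is mostly waiting on the network. A small max_workers also limits how many requests hit the target site at once, which may matter for rate limiting.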

Benefits
Improved Performance: Faster scraping of multiple notes, since notes are fetched concurrently rather than one at a time
Increased Efficiency: Makes use of time otherwise spent waiting on the network by running scraping tasks concurrently
