🐧 Slack Search Scraper

A Python tool for scraping Slack search results and saving them to a file. Built with Playwright for reliable web automation.

✨ Features

🔍 Search across your Slack workspace and export results
💾 Automatically saves messages as they're found
🔐 Handles authentication seamlessly (saves auth state after first login)
⏱️ Smart timeout handling for partial result pages
🛟 Graceful interrupt handling (Ctrl+C safe)
📝 Exports messages in text or JSON format
🧊 Cool as ice - gentle scrolling for reliable data collection
🐠 Goes fishing for those hard-to-find messages
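
As an illustration of the "saves as found" and "Ctrl+C safe" features, here is a minimal sketch of one way to write messages incrementally and stop cleanly on an interrupt. The helper name save_incrementally is hypothetical and is not taken from slack_search_scraper.py; the real script may handle this differently.

```python
from typing import Iterable, TextIO


def save_incrementally(messages: Iterable[str], out: TextIO) -> int:
    """Write each message as soon as it arrives; return the number written.

    Hypothetical helper for illustration, not the scraper's actual code.
    """
    written = 0
    try:
        for message in messages:
            out.write(message + "\n")
            out.flush()  # flush right away so an interrupt loses nothing already found
            written += 1
    except KeyboardInterrupt:
        # Ctrl+C: stop collecting, but keep everything written so far
        pass
    return written
```

Flushing after every message trades a little speed for the guarantee that a partial run still leaves a usable file.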

Installation

Clone the repository:

git clone https://github.com/jguice/penguin.git
cd penguin

Install Poetry (if you haven't already):

curl -sSL https://install.python-poetry.org | python3 -

Install dependencies:

poetry install
poetry run playwright install

Usage

Basic usage:

poetry run python slack_search_scraper.py "your search query"

All available options:

poetry run python slack_search_scraper.py [options] "search query"

Options:
  --workspace WORKSPACE  Slack workspace URL (default: https://app.slack.com/client)
  --format {text,json}   Output format (default: text)
  --output OUTPUT        Output file (default: slack_export_[timestamp].txt)
  --auth-file AUTH_FILE  Path to save/load authentication (default: slack_auth.json)
  --verbose              Enable verbose debug output
  -h, --help             Show this help message and exit
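
For readers who want to see how options like these map onto code, the following is a sketch of an argparse parser that would accept the same flags and defaults. It is reconstructed from the help text above, not copied from the script, and the function name build_parser is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="Scrape Slack search results")
    parser.add_argument("query", help="Search query")
    parser.add_argument(
        "--workspace",
        default="https://app.slack.com/client",
        help="Slack workspace URL",
    )
    parser.add_argument(
        "--format", choices=["text", "json"], default="text", help="Output format"
    )
    # None here means "derive a timestamped name later"
    parser.add_argument("--output", default=None, help="Output file")
    parser.add_argument(
        "--auth-file",
        default="slack_auth.json",
        help="Path to save/load authentication",
    )
    parser.add_argument(
        "--verbose", action="store_true", help="Enable verbose debug output"
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```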

Example with options:

poetry run python slack_search_scraper.py --format json --output my_search.json "from:@user after:2023-01-01"

First Run Authentication

On first run, you'll need to log in to your Slack workspace. The script will:

Open a browser window
Navigate to Slack's login page
Wait for you to complete authentication
Save the authentication state for future runs

After the first successful login, authentication will be automatic for subsequent runs.
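
The flow above can be sketched with Playwright's storage state support. This is a minimal illustration under stated assumptions, not the script's actual implementation: the function names are hypothetical, and the URL pattern used to detect a finished login ("**/client/**") is a guess that may need adjusting for your workspace.

```python
from pathlib import Path


def context_kwargs(auth_file: str) -> dict:
    """Return new_context() keyword arguments, reusing saved state if it exists."""
    return {"storage_state": auth_file} if Path(auth_file).exists() else {}


def login_and_save_state(workspace_url: str, auth_file: str = "slack_auth.json") -> None:
    # Imported lazily so context_kwargs() works without Playwright installed
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)  # visible window for manual login
        kwargs = context_kwargs(auth_file)
        context = browser.new_context(**kwargs)
        page = context.new_page()
        page.goto(workspace_url)
        if not kwargs:
            # Assumption: the URL contains "/client/" once login has completed
            page.wait_for_url("**/client/**", timeout=0)  # 0 disables the timeout
            context.storage_state(path=auth_file)  # persist cookies and local storage
        browser.close()
```

The saved file contains session cookies, so it is worth keeping it out of version control.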

Search Query Notes

Important: Slack's search interface applies its own autocomplete/query processing. This means:

Queries with spaces might be modified by Slack's autocomplete
The actual search performed might differ from your exact input
You may need to adjust your query to achieve the desired search
Slack also caps search results at 100 pages (about 2,000 messages), so large exports may need multiple runs over different date ranges

For example, searching for "John Smith" might be processed differently than expected. Try variations or check Slack's web interface to see how your query is interpreted.

Tip: For large exports, break up your search into smaller date ranges:

# Example: Export messages for each quarter
poetry run python slack_search_scraper.py "from:@user after:2023-01-01 before:2023-04-01"
poetry run python slack_search_scraper.py "from:@user after:2023-04-01 before:2023-07-01"
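
If you need many date ranges, generating the queries can be less error-prone than typing them. The sketch below builds one query per quarter using the same after:/before: boundaries as the example above; the function name quarterly_queries is hypothetical. Whether Slack treats the boundary dates as inclusive is worth checking in the web interface.

```python
from datetime import date
from typing import Iterator


def quarterly_queries(base: str, year: int) -> Iterator[str]:
    """Yield one search query per quarter of the given year."""
    # Quarter start dates, plus the first day of the next year as the final bound
    starts = [date(year, month, 1) for month in (1, 4, 7, 10)]
    starts.append(date(year + 1, 1, 1))
    for start, end in zip(starts, starts[1:]):
        yield f"{base} after:{start.isoformat()} before:{end.isoformat()}"


if __name__ == "__main__":
    for query in quarterly_queries("from:@user", 2023):
        print(f'poetry run python slack_search_scraper.py "{query}"')
```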

📦 Output

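By default, results are saved to a text file with a timestamp in the filename. Use --output to choose a different file and --format json to write JSON instead of text.

Default filename format: slack_export_YYYYMMDD_HHMMSS.txt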
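
A filename in this pattern can be produced with datetime.strftime. The sketch below is an illustration only; the helper name default_output_name is an assumption and the script may build the name differently.

```python
from datetime import datetime
from typing import Optional


def default_output_name(now: Optional[datetime] = None) -> str:
    """Return a name like slack_export_YYYYMMDD_HHMMSS.txt for the given time."""
    now = now or datetime.now()  # fall back to the current local time
    return f"slack_export_{now.strftime('%Y%m%d_%H%M%S')}.txt"


if __name__ == "__main__":
    print(default_output_name())
```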

Development

This project uses:

Conventional Commits for commit messages
Semantic versioning
GitHub Actions for CI/CD

Contributing

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'feat: add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
