WhoIsHiring

Overview

WhoIsHiring is an automated pipeline that scrapes job postings from different sources using Playwright, stores the raw data in MongoDB, and processes it with an AI parser. The parsed data is then delivered via Redpanda (Kafka) to various clients, including Telegram and Discord bots and a web frontend.

Architecture

Website with the content: The starting point where job postings are sourced.
Playwright Parser: An automation tool that collects job posting data.
MongoDB: A NoSQL database where raw job posting data is stored.
MongoDB Connector: Connects MongoDB to the Kafka stream.
Kafka (Redpanda): A distributed streaming platform that manages data flow between components.
AI Parser: Processes and analyzes raw job data.
Clients:
- Telegram Bot: Sends updates and notifications to Telegram users.
- Discord Bot: Sends updates and notifications to Discord users. (TODO)
- Frontend: A web interface for users to interact with the parsed job data. (TODO)

Features

Automated Scraping: Seamlessly scrape job postings from websites.
Data Storage: Efficiently store raw data in MongoDB.
Stream Processing: Utilize Kafka (Redpanda) for data streaming.
AI Parsing: Leverage AI to process and analyze job data.
Multi-Client Support: Deliver processed data to Telegram, Discord, and a web frontend.

Installation

Clone the repository:
```
git clone https://github.com/JulyJ/whoishiring.git
cd whoishiring
```
Install dependencies:
```
npm install
npx playwright install --with-deps
```

Copy the .env.example files to .env for each project and update them with your specific configuration.
```
cp .env.example .env
```
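The copy step could be scripted for all services at once. The following is a minimal sketch, assuming each service lives in its own directory under apps/ and ships its own .env.example. Only apps/playwright-parser is named in this README, so the layout of the other services is an assumption. The function name copy_env_files is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# copy_env_files ROOT: for every ROOT/apps/<service>/.env.example,
# create a sibling .env unless one already exists.
copy_env_files() {
  local root="${1:-.}"
  local example target
  for example in "$root"/apps/*/.env.example; do
    # If the glob matched nothing, the literal pattern is returned; skip it.
    [ -f "$example" ] || continue
    # Strip the ".example" suffix to get the target path.
    target="${example%.example}"
    if [ -e "$target" ]; then
      echo "skip: $target already exists"
    else
      cp "$example" "$target"
      echo "created: $target"
    fi
  done
}
```

Sourcing the script and calling `copy_env_files .` from the repository root would leave existing .env files untouched, so local configuration is not overwritten on a second run. Each new .env still needs to be edited by hand.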
Run the build:
```
npx turbo build
```
Each service in the project that has its own Dockerfile must be built and run as a Docker container. For example:
```
cd apps/playwright-parser
docker build -t whoishiring-playwright-parser .
docker run -d --name whoishiring-playwright-parser whoishiring-playwright-parser
```
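Building every service by hand can be repetitive, so the build step could be wrapped in a loop. The following is a sketch under two assumptions: services with a Dockerfile sit directly under apps/, and images follow the whoishiring-<directory name> pattern seen in the example above. The function name build_images and the DRY_RUN variable are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# build_images ROOT: build one image per directory under ROOT/apps
# that contains a Dockerfile. With DRY_RUN=1 the commands are only printed.
build_images() {
  local root="${1:-.}"
  local dockerfile dir name
  for dockerfile in "$root"/apps/*/Dockerfile; do
    # Skip the unexpanded pattern when no Dockerfile is found.
    [ -f "$dockerfile" ] || continue
    dir="$(dirname "$dockerfile")"
    # Assumed naming scheme: whoishiring-<service directory name>.
    name="whoishiring-$(basename "$dir")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker build -t $name $dir"
    else
      docker build -t "$name" "$dir"
    fi
  done
}
```

Running `DRY_RUN=1 build_images .` from the repository root prints the commands without invoking Docker, which is a cheap way to check which services would be built before committing to a full build.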
To run the run-parser.sh script with a specific parameter using Docker, pass the parameter as an argument:
```
docker run --rm whoishiring-playwright-parser ./run-parser.sh gh
```

To stop and remove the container once you are done, use:
```
docker stop whoishiring-playwright-parser
docker rm whoishiring-playwright-parser
```

