newbee_scraper

Taking the Sting Out of the Job Hunt

Running the project locally

clone the repo
navigate to the project root
create your venv

python -m venv venv

activate and enter your venv

source venv/bin/activate

install dependencies

pip install -r requirements.txt

create your .env file (if you don't have a secret key, ask someone on the team for one)

SECRET_KEY=[YOUR SECRET KEY HERE]
DATABASE_URL=postgresql:///job_crawler
TEST_DATABASE_URL=postgresql:///job_crawler
DATABASE_NAME=job_crawler
OPEN_AI_API_KEY=[ROBOT API KEY]
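
As a quick sanity check after creating the .env file, the sketch below verifies that the variables listed above are set and that the database name in DATABASE_URL matches DATABASE_NAME. It is a hypothetical helper, not part of the repo: `check_env` and `REQUIRED_KEYS` are illustrative names, and it assumes the variables have already been loaded into a mapping such as `os.environ`.

```python
from urllib.parse import urlparse

# Variable names taken from the .env example above.
REQUIRED_KEYS = (
    "SECRET_KEY",
    "DATABASE_URL",
    "TEST_DATABASE_URL",
    "DATABASE_NAME",
    "OPEN_AI_API_KEY",
)


def check_env(env):
    """Return a list of human-readable problems found in `env` (a mapping)."""
    problems = [
        f"{key} is missing or empty" for key in REQUIRED_KEYS if not env.get(key)
    ]

    url = env.get("DATABASE_URL", "")
    name = env.get("DATABASE_NAME", "")
    if url and name:
        # For postgresql:///job_crawler the URL path is "/job_crawler".
        db_in_url = urlparse(url).path.lstrip("/")
        if db_in_url != name:
            problems.append(
                f"DATABASE_URL points at {db_in_url!r} but DATABASE_NAME is {name!r}"
            )
    return problems
```

Calling `check_env(os.environ)` after activating your environment should return an empty list when everything is configured; any returned strings describe what to fix.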

Create your local job_crawler database using one of these two options:
1. If you already have a job_crawler DB and want to use the backup data, drop the database, recreate it, and import backup_database.sql:
  1. dropdb job_crawler
  2. createdb job_crawler
  3. psql -d job_crawler -f data/backup_database.sql
2. Or start from scratch: create the database and tables, then run the scraper (this will take a while):
  1. createdb job_crawler
  2. psql -d job_crawler -f data/migrate.sql
  3. python full_scrape.py
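
The two setup paths above can be sketched as a small Python helper that returns the commands in order. `setup_commands` and `run_setup` are hypothetical names, not part of the repo; the commands themselves are copied from the steps above, and the sketch assumes dropdb, createdb, and psql are on your PATH and that it runs from the project root.

```python
import subprocess

DB_NAME = "job_crawler"


def setup_commands(use_backup):
    """Return the database setup commands from the steps above, in order."""
    if use_backup:
        # Replace an existing database with the backup data.
        return [
            ["dropdb", DB_NAME],
            ["createdb", DB_NAME],
            ["psql", "-d", DB_NAME, "-f", "data/backup_database.sql"],
        ]
    # Start from scratch: create the tables, then run the full scrape.
    return [
        ["createdb", DB_NAME],
        ["psql", "-d", DB_NAME, "-f", "data/migrate.sql"],
        ["python", "full_scrape.py"],
    ]


def run_setup(use_backup):
    """Run each command in turn, stopping at the first failure."""
    for command in setup_commands(use_backup):
        subprocess.run(command, check=True)
```

Keeping the command list separate from the function that executes it makes it possible to print or review the commands before anything is dropped.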

Project Overview:

Who's it for? - Bootcamp grads & junior devs
What are we looking for? - Job descriptions that fit our needs (bootcamp grads, no degree)
Where are we looking for it? - (for now) https://stillhiring.today/

There will be three scrapers, followed by a GPT step:

1. Scrape for the URLs
  - Get company "career" URLs
2. Scrape those job URLs (company websites) for jobs in our field
  - top 5 job boards
    1. jobs.lever.co: 191
    2. boards.greenhouse.io: 113
    3. jobs.ashbyhq.com: 37
    4. jobs.jobvite.com: 8
    5. careers.smartrecruiters.com: 7
3. Scrape and save the job descriptions
4. Run the job descriptions through GPT
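
A ranking like the top 5 job boards above could be produced by counting the host of each scraped career URL. This is a minimal sketch, not the project's scraper: `top_job_boards` is a hypothetical name and the sample URLs are placeholders made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def top_job_boards(urls, n=5):
    """Count URLs per host and return the n most common (host, count) pairs."""
    hosts = Counter(urlparse(url).netloc.lower() for url in urls)
    # Drop the empty host produced by URLs that lack a scheme.
    hosts.pop("", None)
    return hosts.most_common(n)


# Hypothetical sample input; the company paths are placeholders.
sample_urls = [
    "https://jobs.lever.co/example-a",
    "https://jobs.lever.co/example-b",
    "https://boards.greenhouse.io/example-c",
]
print(top_job_boards(sample_urls))
```

Grouping by host first makes it easier to decide which job boards deserve a dedicated scraper, since a handful of hosts account for most of the career URLs.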

REMINDER:

Please make sure to run the following commands before you start working on the codebase to ensure you have the most up-to-date packages and code:

git pull

pip install -r requirements.txt

