A deterministic, end-to-end pipeline that automatically tailors LaTeX resumes to LinkedIn job postings while maintaining transparency and resume integrity.
Job applications often require tailoring resumes to specific roles, but manual customization is time-consuming and inconsistent. This pipeline automates the process while ensuring:
- Transparency: Every change is explained and reviewable
- Integrity: No fabricated skills or inflated experience
- Consistency: Systematic keyword optimization across applications
- Control: Human oversight at every step
- Automated LinkedIn Job Scraping - Extract job descriptions from LinkedIn URLs
- TF-IDF Keyword Extraction - Identify high-signal keywords without LLM bias
- AI-Powered Resume Tailoring - Intelligently modify resume content using OpenAI
- Change Tracking - Detailed explanations for every modification
- Safety First - Strict rules prevent skill fabrication or experience inflation
- LaTeX Output - Professional, ATS-friendly PDF generation
- Setup
    git clone <this-repository>
    cd linkedin-resume-pipeline
    pip install -r requirements.txt
- Configure
    cp .env.example .env
    # Add your OpenAI API key to .env
- Add Your Resume
    # Place your LaTeX resume as resume/resume.tex
- Add Jobs
    # Edit data/jobs.csv
    job_id,company,role,job_url
    google-swe-001,Google,Software Engineer,https://linkedin.com/jobs/view/123456
- Run Pipeline
    python run_pipeline.py
- Review Results
  - Check outputs/ for tailored resumes
  - Review changes_explained.txt for all modifications
- Python 3.8+
- Chrome browser
- OpenAI API key
- LaTeX resume file
pip install -r requirements.txt
Create a .env file:
OPENAI_API_KEY=your_openai_api_key_here
CHROME_PROFILE_PATH=/Users/yourname/chrome-selenium-profile # Optional
LinkedIn aggressively blocks automated scraping. To work reliably and ethically, this project uses Selenium with a persistent Chrome user profile.
- Allows manual login to LinkedIn once
- Reuses your real browser session (cookies, auth, JS execution)
- Avoids brittle username/password automation
- Dramatically reduces bot detection
- Selenium launches Chrome using a custom user-data directory
- You log in to LinkedIn normally
- The session is reused for all future runs
- Create a Chrome profile directory
    mkdir -p ~/chrome-selenium-profile
- Set environment variable
    CHROME_PROFILE_PATH=/Users/yourname/chrome-selenium-profile
- Run scraper for first time
    python scripts/scrape_jobs.py
- Log into LinkedIn when Chrome opens
  - This is required only once
  - Do NOT close Chrome while scraping is running
After this, future runs will reuse the authenticated session automatically.
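The persistent-profile approach can be sketched roughly as follows. This is a minimal illustration, not the project's actual scraper code: the helper names `build_chrome_args` and `launch_chrome` are hypothetical, and the fallback to `~/chrome-selenium-profile` when `CHROME_PROFILE_PATH` is unset is an assumption.

```python
import os
from typing import List


def build_chrome_args(profile_path: str) -> List[str]:
    """Return Chrome arguments that point the browser at a persistent profile."""
    expanded = os.path.expanduser(profile_path)
    return [f"--user-data-dir={expanded}"]


def launch_chrome():
    """Start Chrome through Selenium, reusing the profile's cookies and session."""
    # Imported lazily so build_chrome_args() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    # Assumed fallback: the directory created in the setup steps above.
    profile = os.environ.get("CHROME_PROFILE_PATH", "~/chrome-selenium-profile")
    options = Options()
    for arg in build_chrome_args(profile):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Because Chrome stores cookies inside the user-data directory, any later run that passes the same directory starts already logged in.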
- macOS (fully tested)
- Windows (works, path format differs)
- Linux (works, Chrome must be installed)
- Credentials are never stored in this repo
- Login happens in a real Chrome browser
- No password automation is used
- You can delete the profile directory at any time to reset
Place your LaTeX resume file at resume/resume.tex. The pipeline will only modify content inside \resumeItem{...} commands.
Add job postings to scrape and tailor for:
job_id,company,role,job_url
4165741696,Rogers Communications,Solution Architect - Managed Services,https://www.linkedin.com/jobs/view/4165741696/
4343808522,DataStealth.io,Cloud Architect,https://www.linkedin.com/jobs/view/4343808522/
Fields:
- job_id: Unique identifier for this application
- company: Company name
- role: Job title
- job_url: LinkedIn job posting URL
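One possible way to read and validate this file is sketched below. The function name `load_jobs` and the duplicate-ID check are assumptions for illustration; the pipeline's own loader may behave differently.

```python
import csv
import io
from typing import Dict, List

REQUIRED_FIELDS = ("job_id", "company", "role", "job_url")


def load_jobs(csv_file) -> List[Dict[str, str]]:
    """Parse job rows from an open CSV file and check the required columns."""
    reader = csv.DictReader(csv_file)
    missing = [f for f in REQUIRED_FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"jobs.csv is missing columns: {missing}")

    jobs, seen_ids = [], set()
    for row in reader:
        job = {field: (row[field] or "").strip() for field in REQUIRED_FIELDS}
        # Assumed rule: job_id must be unique because it identifies the application.
        if job["job_id"] in seen_ids:
            raise ValueError(f"duplicate job_id: {job['job_id']}")
        seen_ids.add(job["job_id"])
        jobs.append(job)
    return jobs


# In-memory sample using one of the rows shown above.
sample = io.StringIO(
    "job_id,company,role,job_url\n"
    "4165741696,Rogers Communications,Solution Architect - Managed Services,"
    "https://www.linkedin.com/jobs/view/4165741696/\n"
)
jobs = load_jobs(sample)
print(jobs[0]["company"])  # Rogers Communications
```

For the real file, pass `open("data/jobs.csv", newline="", encoding="utf-8")` instead of the in-memory sample.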
Scrape Jobs Only:
python scripts/scrape_jobs.py
Extract Keywords Only:
python scripts/extract_keywords.py
Rewrite Resume Only:
python scripts/rewrite_resume.py
Run Complete Pipeline:
python run_pipeline.py
Each job creates a folder: outputs/<Company> - <Role> - <JobID>/
Contents:
- job.json - Scraped job data
- keywords.json - Extracted keywords
- resume_tailored.tex - Tailored LaTeX resume
- changes_explained.txt - Detailed change log
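The folder naming scheme could be produced along these lines. This is a sketch only: `output_dir` is a hypothetical helper, and replacing characters that are unsafe in folder names with `_` is an assumption, not documented behavior.

```python
import re
from pathlib import PurePosixPath


def output_dir(company: str, role: str, job_id: str, root: str = "outputs") -> PurePosixPath:
    """Build outputs/<Company> - <Role> - <JobID> for one job."""

    def clean(part: str) -> str:
        # Assumed sanitisation: swap path separators and reserved characters for "_".
        return re.sub(r'[\\/:*?"<>|]', "_", part).strip()

    return PurePosixPath(root) / f"{clean(company)} - {clean(role)} - {clean(job_id)}"


path = output_dir("DataStealth.io", "Cloud Architect", "4343808522")
print(path)  # outputs/DataStealth.io - Cloud Architect - 4343808522
```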
- Uses Selenium with persistent Chrome profile
- One-time LinkedIn login required
- Extracts full job description text
- Handles dynamic content loading
- Saves raw HTML on failures for debugging
- Applies TF-IDF analysis to job descriptions
- Identifies multi-word technical terms
- Filters out generic business language
- No LLM bias in keyword selection
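As a rough illustration of the TF-IDF step, the sketch below scores single words and two-word phrases from one job description against a small background corpus, using only the standard library. The function names, the short stop list, the background corpus, and the smoothed IDF formula are all assumptions for illustration; scripts/extract_keywords.py may work differently.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

# Assumed, deliberately short stop list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "to", "for", "with", "in", "of",
             "we", "are", "is", "our", "on", "you", "will"}


def terms(text: str) -> List[str]:
    """Return unigrams and bigrams, skipping any that contain a stopword."""
    words = re.findall(r"[a-z][a-z0-9+#]*", text.lower())
    unigrams = [w for w in words if w not in STOPWORDS]
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])
               if a not in STOPWORDS and b not in STOPWORDS]
    return unigrams + bigrams


def top_keywords(job_text: str, background: List[str], k: int = 5) -> List[Tuple[str, float]]:
    """Rank the terms of job_text by TF-IDF against a background corpus."""
    docs = [terms(job_text)] + [terms(d) for d in background]
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(t for doc in docs for t in set(doc))
    tf = Counter(docs[0])
    total = sum(tf.values())
    scores = {
        t: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
        for t, count in tf.items()
    }
    # Ties are broken alphabetically so repeated runs give the same order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


job = "We are looking for a Senior Software Engineer with Python and AWS experience"
background = [
    "We are looking for a Senior Accountant with Excel experience",
    "Senior Software Engineer for embedded systems",
]
ranked = top_keywords(job, background, k=50)
```

Terms that appear in every posting, such as "senior" here, receive a lower weight than terms specific to this posting, such as "python". The same input always yields the same ranking.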
- Uses OpenAI API for intelligent rewriting
- Strict Safety Rules:
  - Only modifies \resumeItem{...} content
  - Preserves all metrics and achievements
  - No skill fabrication
  - No experience inflation
  - No structural changes
- Generates detailed change explanations
- Every change is logged and explained
- Original and modified versions side-by-side
- Reasoning provided for each modification
- Human review strongly recommended
├── data/
│   └── jobs.csv               # Job input file
├── resume/
│   └── resume.tex             # Your LaTeX resume (you provide this)
├── scripts/
│   ├── scrape_jobs.py         # LinkedIn scraping
│   ├── extract_keywords.py    # TF-IDF keyword extraction
│   └── rewrite_resume.py      # AI resume tailoring
├── outputs/                   # Generated results
├── run_pipeline.py            # Main orchestrator
├── requirements.txt           # Python dependencies
└── .env.example               # Environment template
- Professional Output: Superior typography and formatting
- ATS Compatibility: Consistent, parseable structure
- Version Control: Text-based format for easy diffing
- Customization: Programmatic modifications possible
- Objectivity: Mathematical keyword extraction without AI bias
- Transparency: Explainable algorithm vs. black-box selection
- Cost Efficiency: No API calls for keyword identification
- Reproducibility: Deterministic results
- Accountability: Every modification is justified
- Learning: Understand what makes resumes effective
- Safety: Catch inappropriate changes before submission
- Compliance: Maintain truthfulness in applications
"We're looking for a Senior Software Engineer with expertise in
Python, React, and AWS to build scalable web applications..."
{
"technical_keywords": ["Python", "React", "AWS", "scalable web applications"],
"skill_keywords": ["software engineering", "full-stack development"],
"domain_keywords": ["cloud infrastructure", "microservices"]
}
ORIGINAL: Built a web application using modern frameworks
UPDATED: Built a scalable web application using React and Python
REASON: Added "scalable" and specific technologies (React, Python)
mentioned in job requirements
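A change log in this ORIGINAL / UPDATED / REASON layout could be rendered as in the sketch below. The `Change` dataclass and `format_changes` function are hypothetical names, and skipping unchanged bullets is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Change:
    original: str
    updated: str
    reason: str


def format_changes(changes: List[Change]) -> str:
    """Render changes in the ORIGINAL / UPDATED / REASON layout."""
    blocks = [
        f"ORIGINAL: {c.original}\nUPDATED: {c.updated}\nREASON: {c.reason}"
        for c in changes
        if c.original != c.updated  # assumed: unchanged bullets are not logged
    ]
    return "\n\n".join(blocks) + "\n"


log_text = format_changes([
    Change(
        original="Built a web application using modern frameworks",
        updated="Built a scalable web application using React and Python",
        reason='Added "scalable" and specific technologies (React, Python) '
               "mentioned in job requirements",
    ),
    Change(original="Led a team of 4", updated="Led a team of 4", reason="No change"),
])
```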
What the pipeline does:
- Optimizes existing experience for relevance
- Adds appropriate technical keywords
- Improves action verb strength
- Maintains factual accuracy
What the pipeline does not do:
- Fabricates skills or experience
- Inflates job titles or responsibilities
- Creates false achievements
- Modifies dates or company names
- All changes must be defensible in interviews
- No misrepresentation of qualifications
- Transparency in automated modifications
- Human review is mandatory
- LinkedIn Dependency: Requires active LinkedIn access
- LaTeX Requirement: Resume must be in LaTeX format
- Manual Review Needed: Automated changes require human verification
- API Costs: OpenAI usage incurs charges
- Rate Limits: LinkedIn may throttle scraping requests
- Private Use: Currently designed for individual use
- Test with one job first to understand the output format
- Review all changes carefully before using tailored resumes
- Keep your base resume updated as the source of truth
- Use meaningful job_ids for easy organization
- Check LinkedIn rate limits if scraping many jobs
Note: This is a private tool for personal resume optimization. All resume content should remain truthful and defensible in interviews.