Occupational Exposure to Disruption from AI Products


This project aims to measure the degree to which each occupation is exposed to disruption by Artificial Intelligence. To do so, I use public press release data to identify new AI product launches or adoptions. Next, I use various NLP techniques and a Large Language Model (GPT-4) to extract the concrete capabilities of each AI product. I compare these capabilities to a list of job-relevant skills compiled by ESCO, and calculate an occupation-level exposure score by averaging the similarity between job skills and AI capabilities over all job skills related to each occupation.
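As a rough illustration of the averaging step, the sketch below scores each skill by its similarity to a set of AI capabilities and then averages the skill scores per occupation. It is not the code in py/match_capabilities_to_skills.py or R/aggregate_skills_to_occupations.R. The toy vectors, the skill and occupation names, and the choice to take the maximum cosine similarity across capabilities as the skill-level score are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def skill_scores(skill_vecs, capability_vecs):
    """Score each skill by its best match among the AI capabilities.

    Using the maximum over capabilities is an assumption of this sketch.
    """
    return {
        skill: max(cosine(vec, cap) for cap in capability_vecs)
        for skill, vec in skill_vecs.items()
    }


def occupation_exposure(occupation_skills, scores):
    """Average the skill-level scores over the skills linked to each occupation."""
    exposure = {}
    for occupation, skills in occupation_skills.items():
        known = [scores[s] for s in skills if s in scores]
        if known:  # skip occupations with no scored skills
            exposure[occupation] = sum(known) / len(known)
    return exposure


# Hypothetical toy data: 2-dimensional "embeddings" for two skills and one capability.
skill_vecs = {"write reports": [1.0, 0.0], "lift loads": [0.0, 1.0]}
capability_vecs = [[1.0, 0.0]]
occupation_skills = {
    "analyst": ["write reports"],
    "mover": ["lift loads"],
    "clerk": ["write reports", "lift loads"],
}

scores = skill_scores(skill_vecs, capability_vecs)
exposure = occupation_exposure(occupation_skills, scores)
```

In this toy setup an occupation that relies only on the matched skill gets the highest score, and one that mixes matched and unmatched skills lands in between.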

The main results are in results/occupational_exposure_to_ai_products/. The file scored_esco_occupations.csv contains occupational exposure scores for all individual occupations in the ESCO database. I also provide results aggregated to the ESCO 4-digit, 3-digit, 2-digit, or 1-digit level (which are compatible with the corresponding levels of the ISCO classification). Finally, the file scored_soc_equivalent_occupations.csv contains occupational exposure scores for occupations in O*NET, obtained by converting the ESCO scores using the ESCO-O*NET crosswalk provided by ESCO.
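The aggregation to broader levels can be pictured as grouping occupations by the leading digits of their code. The sketch below assumes that the part of an ESCO occupation code before the first dot is the 4-digit ISCO unit group, and that the aggregate is an unweighted mean of the occupation scores. The codes and scores are made up, and the actual aggregation in the R scripts may differ (for example by weighting).

```python
from collections import defaultdict


def aggregate_by_isco_level(scores_by_code, digits):
    """Average occupation scores within each ISCO group of the given digit length.

    scores_by_code maps ESCO occupation codes (e.g. "2511.1") to exposure scores.
    digits is 1, 2, 3, or 4.
    """
    groups = defaultdict(list)
    for code, score in scores_by_code.items():
        isco_unit_group = code.split(".")[0]  # "2511.1" -> "2511"
        groups[isco_unit_group[:digits]].append(score)
    return {prefix: sum(vals) / len(vals) for prefix, vals in groups.items()}


# Hypothetical scores, for illustration only.
toy_scores = {"2511.1": 0.8, "2511.2": 0.6, "2512.1": 0.4, "9333.1": 0.1}

by_4_digit = aggregate_by_isco_level(toy_scores, 4)
by_3_digit = aggregate_by_isco_level(toy_scores, 3)
by_1_digit = aggregate_by_isco_level(toy_scores, 1)
```

Note that each level here is averaged directly over individual occupations rather than over the groups one level below, so larger groups carry more weight.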

The plot below summarizes the main findings, which put knowledge-heavy and highly skilled occupations at the top in terms of their exposure to AI, while occupations requiring physical labor are towards the bottom.

Occupations with high exposure scores tend to be highly paid and to have low unemployment rates, suggesting that AI product innovation may be targeted towards tasks for which the required labor is relatively scarce and expensive.

The full details of the method and the results, as well as connections to the existing literature, are described in the file main_text.pdf. Please keep in mind that this project is a work in progress and may undergo slight or even significant changes prior to publication.

Reproducibility

Run py/scrape_releases.py - populates folders data/links and data/articles
Run py/filter_press_release_by_keyword.py - creates data/filtered_press_releases.csv, which keeps only the press releases that mention certain AI-related keywords (overwrites)
Run py/find_relevant_releases_finetune.py - creates results/relevant_press_releases.csv by first fine-tuning a DistilBERT model on a small labeled dataset of relevant and non-relevant press releases, and then using the model to score the press releases in data/filtered_press_releases.csv (appends if it exists)
Run py/extract_product_capabilities.py - creates results/processed_press_releases.csv (appends if it exists). Executes GPT queries.
Run py/match_capabilities_to_skills.py - creates results/scored_esco_skills.csv (overwrites)
Run py/scrape_cedefop.py - creates data/skills_intelligence_data.csv (overwrites)
Run R/aggregate_skills_to_occupations.R - creates results/scored_esco_occuations.csv
Run R/match_to_prior_work.R - creates results/scored_esco_occupations_matched.csv
(Optional) Run R/clusterize_capability_vectors.R
Run R/join_to_occupation_statistics.R - creates results/scored_esco_occupations_matched.csv
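
The steps above can be chained with a small driver script. The sketch below is not part of the repository: it assumes the scripts are run from the repository root, that Rscript is on the PATH, and that the current Python interpreter has the packages from requirements.txt installed. It defaults to a dry run that only prints the commands, since some steps scrape the web or execute GPT queries.

```python
import subprocess
import sys

# (script, is_optional) in the order listed above.
STEPS = [
    ("py/scrape_releases.py", False),
    ("py/filter_press_release_by_keyword.py", False),
    ("py/find_relevant_releases_finetune.py", False),
    ("py/extract_product_capabilities.py", False),  # executes GPT queries
    ("py/match_capabilities_to_skills.py", False),
    ("py/scrape_cedefop.py", False),
    ("R/aggregate_skills_to_occupations.R", False),
    ("R/match_to_prior_work.R", False),
    ("R/clusterize_capability_vectors.R", True),
    ("R/join_to_occupation_statistics.R", False),
]


def build_command(script):
    """Pick an interpreter from the file extension (Rscript for .R, Python otherwise)."""
    if script.endswith(".R"):
        return ["Rscript", script]
    return [sys.executable, script]


def plan(include_optional=False):
    """Return the list of commands to run, in order."""
    return [
        build_command(script)
        for script, optional in STEPS
        if include_optional or not optional
    ]


def run_pipeline(include_optional=False, dry_run=True):
    """Print each command and, unless dry_run is set, execute it; stop on the first failure."""
    for command in plan(include_optional):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Calling run_pipeline(dry_run=False) would execute every step in sequence, and include_optional=True adds the clustering step.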
