To reproduce the figures and analysis in this paper:

- Run `query_semanticscholar.py` on the domain {CS, Chemistry, Economics, Medicine, Physics}, or on your own list of scientists
- Run `filter_by_year` using the birth/death dates file, with output to `abstracts_filtered_year`
- Clean the abstracts with `get_vectors.py`, with output to the `abstracts-cleaned` directory
- Encode the abstracts with SBERT via `sbert.py`, with output to `sbert-abstracts`
- Order the abstracts and convert their dates to timestamps via `emergence_order.py`, with output to the `abstracts-ordered` directory
- Run the models on the `abstracts-ordered` directory (see the end-to-end sketch after this list)
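As an illustration, the whole pipeline might be chained as below. This is a sketch under assumptions: the flag names, the `.py` extension on `filter_by_year`, and the dates filename are guesses, not the scripts' documented interfaces.

```bash
# Hypothetical end-to-end run for one field. All flag names, the dates
# filename, and any path not named in this README are assumptions;
# check each script's own argument parsing for the real interface.
python query_semanticscholar.py --field Physics --out abstracts/
python filter_by_year.py --dates birth_death_dates.csv \
    --input abstracts/ --output abstracts_filtered_year/
python get_vectors.py --input abstracts_filtered_year/ --output abstracts-cleaned/
python sbert.py --input abstracts-cleaned/ --output sbert-abstracts/
python emergence_order.py --input sbert-abstracts/ --output abstracts-ordered/
```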
Running models:

- Hyperparameters must be tuned for each model on each scientist: run `opt_hyperparam_exemplar.py` for each model/field combination, which outputs the individual parameter values to `/individual-s-vals/` (see the sketch after this list)
- Run the comparison between models: `src/models/predict.py --type <nobel/turing> --field <field> --measure ll -i`
- Run the shuffle tests between models: `src/models/predict.py --type <nobel/turing> --field <field> --measure ll -i -s --sy`
- Run the authorship analysis: `src/models/predict_k_author_papers.py --type <nobel/turing> --field <field> -k <max authors, or -1 for first author>`
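The tuning step can be scripted over every model/field combination. A minimal sketch, assuming hypothetical `--model`/`--field`/`--out` flags; only the `exemplar` model name is implied by the script's filename, so extend `MODELS` with the paper's other model names:

```bash
# Hypothetical sweep over model/field combinations; the flag names are
# assumptions, and MODELS should list the paper's actual model names.
MODELS="exemplar"
FIELDS="cs chemistry economics medicine physics"
for model in $MODELS; do
  for field in $FIELDS; do
    python opt_hyperparam_exemplar.py --model "$model" --field "$field" \
        --out individual-s-vals/
  done
done
```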
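For a concrete run, substitute values into the placeholders above (the capitalization of field names is an assumption):

```bash
# Compare models on Nobel-laureate physicists, scored by log-likelihood
python src/models/predict.py --type nobel --field physics --measure ll -i

# The same comparison with the shuffle tests enabled
python src/models/predict.py --type nobel --field physics --measure ll -i -s --sy
```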
Most figures are generated through functions in `rain_plots.py`, based on the simulation outputs produced by the "Running models" steps above.
The stacked authorship charts can be generated with `stacked_bar_authorship.py`.
t-SNE visualizations can be generated with `make_tsne_figure.py` (example invocations below).
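A sketch of the figure-generation calls, assuming the two scripts run standalone and read the simulation outputs from their default locations:

```bash
# Assumed standalone invocations; both scripts may instead take arguments
# pointing at the simulation outputs.
python stacked_bar_authorship.py   # stacked authorship charts
python make_tsne_figure.py         # t-SNE visualizations
```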