nih_reporter

Using NIH RePORTER data as a machine learning playground for Databricks, NLP, Azure tools, and collaborative development

Stream Labels

This repo is intended to contain multiple streams (sub-projects or research ideas). Each stream has a unique label, which is used as a directory name so that a stream's files can be organised and matched across directories. The label "shared" is reserved for code and features common to all streams.

Stream labels should also be used as branch names to aid code management.

Key directories

  doc/                          - documentation
  src/                          - source code
    |_ pipelines/[stream]/      - data / ML pipelines
    |_ notesbooks/[stream]/     - exploratory / experimental notebooks
    |_ utils                    - utility scripts
  test/                         - code for unit or regression testing
    |_ [stream]/                - organised by stream
  out/[stream]/                 - small output files (e.g. plots) generated by code
  data/[stream]/                - small resources or files used by your program
  models/[stream]/              - saved models for deployment
  README.md
  requirements.txt              - use if applicable
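
As an illustration of the layout above, the per-stream directories for a new stream could be created with a short script. This is a minimal sketch, not part of the repo: the function name scaffold_stream and the STREAM_DIRS list are hypothetical, and the paths are simply read off the tree above (including the "notesbooks" spelling used by the repo).

```python
from pathlib import Path

# Per-stream directory templates, read off the layout above.
# "notesbooks" is spelled as it appears in the repo.
STREAM_DIRS = [
    "src/pipelines/{stream}",
    "src/notesbooks/{stream}",
    "test/{stream}",
    "out/{stream}",
    "data/{stream}",
    "models/{stream}",
]


def scaffold_stream(repo_root, stream):
    """Create the per-stream directories (hypothetical helper).

    Existing directories are left untouched. Returns the list of paths.
    """
    root = Path(repo_root)
    paths = []
    for template in STREAM_DIRS:
        path = root / template.format(stream=stream)
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths
```

For example, scaffold_stream(".", "DC") run from the repo root would create the six directories for a stream labelled DC.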

Note: Large files (say, > 1 MB) should reside in an external file system such as Databricks DBFS or OneDrive.
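
One way to check the size guideline before committing is to scan the working tree for oversized files. The sketch below is an assumption-laden illustration rather than a repo tool: find_large_files is a hypothetical name, "1 MB" is read as 1,000,000 bytes, and anything under a .git directory is skipped.

```python
from pathlib import Path

# Assumption: the "> 1 MB" guideline is interpreted as 10**6 bytes.
SIZE_LIMIT_BYTES = 1_000_000


def find_large_files(repo_root, limit=SIZE_LIMIT_BYTES):
    """Return repo-relative paths of files larger than `limit` bytes."""
    root = Path(repo_root)
    large = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        # Skip Git's internal object store.
        if ".git" in relative.parts:
            continue
        if path.is_file() and path.stat().st_size > limit:
            large.append(relative)
    return sorted(large)
```

Any path returned by find_large_files(".") is a candidate for moving to external storage.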

Notes for contributors

FORK: Create a fork from the main repo [jtjli/nih_reporter] unless you want to develop on top of an existing fork.
BRANCH: Use a branch name that is representative of your development, such as a Stream Label. Avoid developing on the main branch.
Create a Pull Request when your code is ready for merging into the main repo.
Wherever appropriate, use Stream Labels as section headings in files such as .gitignore, the global requirements.txt, and README.

sdchandra/nih_reporter_DC
