Arxiv ML project

Creating the data

Abstracts

We plan to build a dictionary of all the words used in the abstracts of the papers in our collection and then remove the uninformative ones (stopwords, etc.). We can then turn each abstract into a vector in which each element is the normalized frequency of the corresponding word in that abstract.
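The plan above could be sketched roughly as follows. This is only an illustration under several assumptions: the helper names (`tokenize`, `build_vocabulary`, `vectorize`) and the tiny `STOPWORDS` set are hypothetical, tokenization is a simple lowercase split on letters, and "normalized" is taken to mean dividing each word count by the total number of kept words in the abstract, so that each vector sums to 1.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "we", "is", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(abstracts):
    """Collect every non-stopword token across all abstracts, in sorted order."""
    vocab = set()
    for abstract in abstracts:
        vocab.update(t for t in tokenize(abstract) if t not in STOPWORDS)
    return sorted(vocab)


def vectorize(abstract, vocabulary):
    """Return the normalized word frequencies of one abstract.

    Element i is the count of vocabulary[i] in the abstract divided by the
    total count of vocabulary words in it (an assumed normalization).
    """
    counts = Counter(t for t in tokenize(abstract) if t not in STOPWORDS)
    total = sum(counts[w] for w in vocabulary)
    if total == 0:
        # No known words: return an all-zero vector instead of dividing by zero.
        return [0.0] * len(vocabulary)
    return [counts[w] / total for w in vocabulary]


if __name__ == "__main__":
    abstracts = [
        "We study the training of neural networks",
        "Neural networks and graph networks",
    ]
    vocabulary = build_vocabulary(abstracts)
    for abstract in abstracts:
        print(vectorize(abstract, vocabulary))
```

Sorting the vocabulary keeps the position of each word stable across runs, so vectors built at different times remain comparable. For a larger collection, a library vectorizer would likely be a better fit than this hand-rolled version.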