Releases · TimSchopf/KeyphraseVectorizers

Added the options to use a custom POS-tagger, define custom stop words, and exclude certain spaCy pipeline components. This release solves issues #2 and #7.
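
As an illustration of what a custom POS-tagger could look like, here is a minimal sketch. It assumes the tagger is a callable that takes a list of raw document strings and returns a flat list of (token, POS-tag) tuples. That contract, the `toy_pos_tagger` name, and the tiny lexicon are assumptions made for this example and are not taken from the release notes.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy lexicon with Penn-Treebank-style tags.
# A real tagger would come from a trained model, not a lookup table.
TOY_LEXICON: Dict[str, str] = {
    "the": "DT",
    "a": "DT",
    "supervised": "JJ",
    "learning": "NN",
    "task": "NN",
    "is": "VBZ",
}


def toy_pos_tagger(raw_documents: List[str]) -> List[Tuple[str, str]]:
    """Return one flat list of (token, tag) tuples covering all documents.

    Assumed contract: input is a list of strings, output is a list of
    (word token, POS tag) tuples. Unknown words get the placeholder tag "X".
    """
    tagged: List[Tuple[str, str]] = []
    for doc in raw_documents:
        # Lowercase and split on word characters; punctuation is dropped.
        for token in re.findall(r"\w+", doc.lower()):
            tagged.append((token, TOY_LEXICON.get(token, "X")))
    return tagged
```

If the assumed contract holds, such a function would presumably be handed to the vectorizer through its custom POS-tagger option; check the project README for the exact parameter name and expected tag set.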

18 Jun 19:23

TimSchopf

v0.0.9

bf8a697

Higher compatibility with available spaCy pipelines

Fixed issues #10 and #11 by removing the default exclusion of certain spaCy pipeline components. This slightly slows down keyphrase extraction, but it improves compatibility with all available spaCy pipelines, including those that use transformers.

16 May 15:14

TimSchopf

v0.0.8

2069eb1

Added 'stop_words'=None option

Fixed #8

14 Feb 16:11

TimSchopf

v0.0.7

6289343

Add stopwords download automation

v0.0.7

Signed-off-by: Tim Schopf <tim.schopf@t-online.de>

12 Feb 14:47

TimSchopf

v0.0.6

48d7b68

Change "multiprocessing" parameter to "workers" parameter

Signed-off-by: Tim Schopf <tim.schopf@t-online.de>

06 Feb 10:03

TimSchopf

v0.0.5

2f45652

Added min_df and max_df parameters, added support for documents longer than 1,000,000 characters, and limited the maximum keyphrase length to 8 words to prevent memory issues
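
To make the effect of these options concrete, the following sketch filters candidate keyphrases by document frequency and by length. It is not the library's implementation: the function name, the use of absolute counts for `min_df` and `max_df`, and the input shape (one set of keyphrases per document) are assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Set

MAX_WORDS = 8  # length cap mentioned in the release note


def filter_keyphrases(
    doc_keyphrases: List[Set[str]],
    min_df: Optional[int] = None,
    max_df: Optional[int] = None,
) -> List[str]:
    """Keep keyphrases whose document frequency lies in [min_df, max_df].

    Hypothetical helper: min_df and max_df are treated as absolute
    document counts, and None disables the respective bound.
    """
    # Count in how many documents each keyphrase occurs.
    df: Dict[str, int] = {}
    for phrases in doc_keyphrases:
        for phrase in phrases:
            df[phrase] = df.get(phrase, 0) + 1

    kept = []
    for phrase, count in df.items():
        if len(phrase.split()) > MAX_WORDS:
            continue  # drop overly long keyphrases
        if min_df is not None and count < min_df:
            continue  # too rare
        if max_df is not None and count > max_df:
            continue  # too common
        kept.append(phrase)
    return sorted(kept)
```

With three documents in which "machine learning" appears three times, "data" twice, and "model" once, `min_df=2` would keep "data" and "machine learning", while `max_df=2` would keep "data" and "model".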

update scipy requirements

Signed-off-by: Tim Schopf <tim.schopf@t-online.de>

03 Feb 16:25

TimSchopf

v0.0.4

591d71d

Increased efficiency of spaCy pipeline for POS tagging

v0.0.4

v0.0.4, increased efficiency of spaCy pipeline for POS tagging + adde…

v0.0.13

v0.0.12

Add spacy.Language as valid argument for 'spacy_pipeline'

Custom POS-tagger feature added
