Commit ac14957

Merge pull request #314 from stephenhky/develop
Release 2.1.0
2 parents 427b88d + 9b50b62 commit ac14957

59 files changed: 310 additions, 37,999 deletions

.circleci/config.yml

Lines changed: 4 additions & 3 deletions
@@ -13,19 +13,20 @@ shared: &shared
 sudo apt-get update
 sudo apt-get install libc6
 sudo apt-get install python3-dev
+sudo apt-get install -y g++

 - run:
 name: Installing Miniconda and Packages
 command: |
 pip install --upgrade --user pip
-pip install --upgrade --user .
 pip install --upgrade --user google-compute-engine
+pip install --user .

 - run:
 name: Run Unit Tests
 command: |
-pip install -r test_requirements.txt
-python setup.py test
+pip install --user .[test]
+pytest


 jobs:

.readthedocs.yml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ sphinx:
 build:
 os: ubuntu-22.04
 tools:
-python: "3.9"
+python: "3.12"

 # Build documentation with MkDocs
 #mkdocs:

MANIFEST.in

Lines changed: 1 addition & 7 deletions
@@ -1,12 +1,6 @@
 include README.md
-include requirements.txt
-include setup_requirements.txt
-include test_requirements.txt
+include LICENSE
 include pyproject.toml
 include shorttext/data/shorttext_exampledata.csv
 include shorttext/utils/stopwords.txt
 include shorttext/utils/nonneg_stopwords.txt
-include shorttext/metrics/dynprog/dldist.pyx
-include shorttext/metrics/dynprog/dldist.c
-include shorttext/metrics/dynprog/lcp.pyx
-include shorttext/metrics/dynprog/lcp.c

README.md

Lines changed: 3 additions & 8 deletions
@@ -18,7 +18,7 @@ representation of the texts and documents are needed before they are put into
 any classification algorithm. In this package, it facilitates various types
 of these representations, including topic modeling and word-embedding algorithms.

-The package `shorttext` runs on Python 3.8, 3.9, 3.10, and 3.11.
+The package `shorttext` runs on Python 3.9, 3.10, 3.11, and 3.12.
 Characteristics:

 - example data provided (including subject keywords and NIH RePORT);
@@ -31,8 +31,7 @@ Characteristics:
 - maximum entropy classification;
 - metrics of phrases differences, including soft Jaccard score (using Damerau-Levenshtein distance), and Word Mover's distance (WMD);
 - character-level sequence-to-sequence (seq2seq) learning;
-- spell correction;
-- API for word-embedding algorithm for one-time loading; and
+- spell correction; and
 - Sentence encodings and similarities based on BERT.

 ## Documentation
@@ -84,6 +83,7 @@ If you would like to contribute, feel free to submit the pull requests. You can

 ## News

+* 12/14/2024: `shorttext` 2.1.0 released.
 * 07/12/2024: `shorttext` 2.0.0 released.
 * 12/21/2023: `shorttext` 1.6.1 released.
 * 08/26/2023: `shorttext` 1.6.0 released.
@@ -159,8 +159,3 @@ If you would like to contribute, feel free to submit the pull requests. You can
 * 12/21/2016: `shorttext` 0.2.0 released.
 * 11/25/2016: `shorttext` 0.1.2 released.
 * 11/21/2016: `shorttext` 0.1.1 released.
-
-## Possible Future Updates
-
-- [ ] Dividing components to other packages;
-- [ ] More available corpus.

docs/codes.rst

Lines changed: 7 additions & 1 deletion
@@ -65,7 +65,13 @@ Module `shorttext.metrics.dynprog`
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 .. automodule:: shorttext.metrics.dynprog.jaccard
-:members: soft_intersection_list
+:members:
+
+.. automodule:: shorttext.metrics.dynprog.dldist
+:members:
+
+.. automodule:: shorttext.metrics.dynprog.lcp
+:members:

 Module `shorttext.metrics.wassersterin`
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
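
Reviewer note: the modules `dldist` and `lcp` that this hunk adds to the API docs are named for the Damerau-Levenshtein distance and the longest common prefix. The sketch below shows in plain Python what those two quantities are. It is an illustration only, not the package's code: the function names `damerau_levenshtein` and `longest_common_prefix` are hypothetical, and the distance shown is the optimal-string-alignment variant, which may differ from whatever variant `shorttext` implements.

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Edit distance with adjacent transpositions (optimal-string-alignment variant).

    Hypothetical helper for illustration; not the shorttext API.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def longest_common_prefix(a: str, b: str) -> int:
    """Length of the longest prefix shared by a and b (hypothetical helper)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n
```

A quadratic dynamic-programming table like this is the kind of loop that is typically compiled for speed, which may be why `numba` stays in the requirements after the Cython sources are dropped; that reading is an inference from the diff, not something the commit states.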

docs/conf.py

Lines changed: 2 additions & 2 deletions
@@ -56,9 +56,9 @@
 # built documents.
 #
 # The short X.Y version.
-version = u'2.0'
+version = u'2.1'
 # The full version, including alpha/beta/rc tags.
-release = u'2.0.0'
+release = u'2.1.0'

 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.

docs/install.rst

Lines changed: 2 additions & 3 deletions
@@ -27,7 +27,7 @@ Backend for Keras
 -----------------

 The package keras_ (version >= 2.0.0) uses Tensorflow_ as the backend. Refer to
-:doc:`faq` for how to switch the backend. It is also desirable if the package Cython_ has been previously installed.
+:doc:`faq` for how to switch the backend.


 Possible Solutions for Installation Failures
@@ -41,7 +41,7 @@ you may try one (or more) of the following:

 ::

-pip install -U python3-dev
+pip install python3-dev



@@ -70,7 +70,6 @@ Required Packages

 Home: :doc:`index`

-.. _Cython: http://cython.org/
 .. _Numpy: http://www.numpy.org/
 .. _SciPy: https://www.scipy.org/
 .. _Scikit-Learn: http://scikit-learn.org/stable/

docs/intro.rst

Lines changed: 0 additions & 7 deletions
@@ -23,19 +23,12 @@ Characteristics:
 - metrics of phrases differences, including soft Jaccard score (using Damerau-Levenshtein distance), and Word Mover's distance (WMD); (see :doc:`tutorial_metrics`)
 - character-level sequence-to-sequence (seq2seq) learning; (see :doc:`tutorial_charbaseseq2seq`)
 - spell correction; (see :doc:`tutorial_spell`)
-- API for word-embedding algorithm for one-time loading; (see :doc:`tutorial_wordembedAPI`) and
 - Sentence encodings and similarities based on BERT (see :doc:`tutorial_wordembed` and :doc:`tutorial_metrics`).

-Before release 0.7.2, part of the package was implemented using C, and it is interfaced to
-Python using SWIG_ (Simplified Wrapper and Interface Generator). Since 1.0.0, these implementations
-were replaced with Cython_.
-
 Author: Kwan Yuet Stephen Ho (LinkedIn_, ResearchGate_, Twitter_)

 Home: :doc:`index`

 .. _LinkedIn: https://www.linkedin.com/in/kwan-yuet-ho-19882530
 .. _ResearchGate: https://www.researchgate.net/profile/Kwan-yuet_Ho
 .. _Twitter: https://twitter.com/stephenhky
-.. _SWIG: http://www.swig.org/
-.. _Cython: http://cython.org/

docs/news.rst

Lines changed: 8 additions & 0 deletions
@@ -1,6 +1,7 @@
 News
 ====

+* 12/14/2024: `shorttext` 2.1.0 released.
 * 07/12/2024: `shorttext` 2.0.0 released.
 * 12/21/2023: `shorttext` 1.6.1 released.
 * 08/26/2023: `shorttext` 1.6.0 released.
@@ -81,6 +82,13 @@ News
 What's New
 ----------

+Released 2.1.0 (December 14, 2024)
+------------------------------
+
+* Use of `pyproject.toml` for package distribution.
+* Removed Cython components.
+* Huge relative import refactoring.
+
 Released 2.0.0 (July 13, 2024)
 ------------------------------

docs/requirements.txt

Lines changed: 9 additions & 11 deletions
@@ -1,14 +1,12 @@
-Cython==3.0.11
-numpy==2.0.1
-scipy==1.14.0
+numpy==2.2.0
+scipy==1.14.1
 joblib==1.4.2
-scikit-learn==1.5.1
-tensorflow==2.17.0
-keras==3.4.1
-gensim==4.3.3
-pandas==2.2.2
+scikit-learn==1.5.2
+tensorflow==2.18.0
+keras==3.7.0
+gensim==4.0.0
+pandas==2.2.3
 snowballstemmer==2.1.0
-transformers==4.43.4
-torch==2.4.0
-python-Levenshtein==0.25.1
+transformers==4.47.0
+torch==2.5.1
 numba==0.60.0
