Library implementing surprising frequent phrase detection, as defined in "Characterising Semantically Coherent Classes of Text Through Feature Discovery".
If you use this tool, please include the following citation:
Robertson, Andrew David, 2019. Characterising semantically coherent classes of text through feature discovery (Doctoral thesis, University of Sussex).
BibTeX entry:
@phdthesis{robertson2019characterising,
title={Characterising semantically coherent classes of text through feature discovery},
author={Robertson, Andrew David},
year={2019},
school={University of Sussex}
}

In development. For the complete Python version, see: https://github.com/andehr/sfpd/