Whatever is a toy search engine based on Python 2.7 and Django 1.8. It supports:
- Boolean Search
- Vector space model
- Top K approximate search
- Wildcard search
- Phrase search
- Spell check
- Generating abstracts
- Synonym search

Whatever requires:
- Python 2.7
- Django 1.8

Please visit www.python.org to install Python 2.7, and then install Django with:
pip install django==1.8
If you use Anaconda and the current environment runs a different Python version, create a virtual environment for Python 2.7. The details can be found at https://conda.io/docs/using/envs.html. Here we briefly show how to do this on Linux or macOS:
conda create --name py27 python=2.7
source activate py27
python --version
You will see that the current Python version has changed to 2.7. Whenever you need the Python 2.7 environment, you can enter it with:
source activate py27
Now install Django with:
conda install django==1.8

A corpus is the collection of documents you want to search.
We provide a small corpus in the corpus folder.
It contains several *.txt files and an index.txt.
index.txt lists the file names of all the documents you want to search.
You can use your own corpus, but an index.txt must be provided.
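
If you bring your own corpus, the index.txt can be generated rather than written by hand. The following is a minimal sketch, not part of the project: build_index is a hypothetical helper, and it assumes that index.txt holds one file name per line, which you should confirm against the bundled corpus/index.txt first.

```python
import os


def build_index(corpus_dir):
    """Write index.txt listing every *.txt document in corpus_dir.

    Assumption: index.txt holds one file name per line. Check the
    bundled corpus/index.txt for the exact format before relying on this.
    """
    names = sorted(
        name for name in os.listdir(corpus_dir)
        if name.endswith(".txt") and name != "index.txt"
    )
    index_path = os.path.join(corpus_dir, "index.txt")
    with open(index_path, "w") as handle:
        for name in names:
            handle.write(name + "\n")
    return names
```

Calling build_index on a folder overwrites any index.txt already there, so run it on your own corpus folder rather than on the bundled one unless you intend to regenerate it.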
Before you can search, Whatever needs to preprocess the corpus. Type the following commands:
cd corpus
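python -m SimpleHTTPServer 9000
This serves the corpus folder over HTTP on port 9000. (SimpleHTTPServer is the Python 2.7 module name; under Python 3 the same server is started with python -m http.server 9000.) Then open another terminal and, in the root directory of the project, run:
python whatever/crawl.py
Wait for the crawl to finish; how long it takes depends on how big the corpus is. Then start the server:
python manage.py runserver 8000
Just enter your query and push the search button.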
Boolean search supports parentheses and the operators NOT, AND, and OR, listed here in decreasing order of precedence.
The AND between two words can be omitted. An example query is:
automobile AND (tesla OR benz) AND NOT ferrari
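
The precedence rules above can be illustrated with a small recursive-descent evaluator. This is a sketch under stated assumptions, not the project's actual code: the names tokenize and evaluate are hypothetical, the inverted index is assumed to be a dict mapping lowercased words to sets of document ids, and operators are assumed to be written in upper case.

```python
import re


def tokenize(query):
    """Split a query into parentheses, operators, and words."""
    return re.findall(r"\(|\)|[^\s()]+", query)


def evaluate(query, index, all_docs):
    """Return the set of document ids matching a Boolean query.

    index    -- assumed layout: lowercased word -> set of document ids
    all_docs -- set of every document id, needed to evaluate NOT
    """
    tokens = tokenize(query)
    pos = [0]  # a list so the nested functions can update it on Python 2.7

    def peek():
        return tokens[pos[0]] if pos[0] < len(tokens) else None

    def advance():
        token = peek()
        pos[0] += 1
        return token

    def parse_or():  # lowest precedence
        result = parse_and()
        while peek() == "OR":
            advance()
            result = result | parse_and()
        return result

    def parse_and():
        result = parse_not()
        # Anything other than OR or ')' continues the conjunction,
        # which is how an omitted AND between two operands is handled.
        while peek() is not None and peek() not in ("OR", ")"):
            if peek() == "AND":
                advance()
            result = result & parse_not()
        return result

    def parse_not():  # NOT and parentheses bind tightest
        if peek() == "NOT":
            advance()
            return all_docs - parse_not()
        if peek() == "(":
            advance()
            result = parse_or()
            if advance() != ")":
                raise ValueError("missing closing parenthesis")
            return result
        word = advance()
        if word is None or word in ("AND", "OR", ")"):
            raise ValueError("expected a word")
        return set(index.get(word.lower(), ()))

    result = parse_or()
    if peek() is not None:
        raise ValueError("unexpected token: %s" % peek())
    return result
```

With this sketch, the query tesla OR benz automobile is read as tesla OR (benz AND automobile), because the implicit AND binds more tightly than OR.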
Wildcard search supports '*' at the beginning, in the middle, and at the end of a word.
app*, tr*e, and *cate are all legal queries.
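
One simple way to picture wildcard expansion is to translate the pattern into a regular expression and scan the vocabulary. The sketch below is an illustration only: wildcard_to_regex and expand_wildcard are hypothetical names, '*' is assumed to match zero or more characters, and a real engine may well use a permuterm or k-gram index instead of this linear scan.

```python
import re


def wildcard_to_regex(pattern):
    """Compile a pattern in which '*' stands for any run of characters."""
    # Escape every literal piece so that only '*' acts as a wildcard.
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + ".*".join(parts) + r"\Z")


def expand_wildcard(pattern, vocabulary):
    """Return the sorted vocabulary words that match the pattern."""
    regex = wildcard_to_regex(pattern)
    return sorted(word for word in vocabulary if regex.match(word))
```

The matching words can then be looked up in the index like ordinary query terms, for example by taking the union of their document sets.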