This project is a backend service built with Flask that provides endpoints for searching and retrieving citation data from a citation graph. It uses algorithms such as PageRank and HITS to rank papers by their citations.
- Python 3.6 or higher
- Flask
- Flask-CORS
- NetworkX
- NumPy
- pickle (included in the Python standard library)
- Clone the repository:
    git clone <repository-url>
    cd <repository-directory>
- Create a virtual environment (optional but recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
- Install the required packages:
    pip install Flask Flask-CORS networkx numpy
- Download the dataset:
  - Download the citation dataset from AMiner.
  - After downloading, ensure that the dataset is processed and saved as `citation_graph.pkl` in the `outputs` directory.
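Before starting the server, it can help to confirm that the processed graph is where the application expects it. The following is a minimal sketch and not part of the project: the helper name `load_citation_graph` is hypothetical, and it assumes only that the file is a standard pickle at `outputs/citation_graph.pkl` (presumably a NetworkX graph, although the exact format is not stated here). The demo pickles a small stand-in dictionary into a temporary directory so that it runs without the real dataset.

```python
import pickle
import tempfile
from pathlib import Path

# Path taken from the setup step above, relative to the repository root.
GRAPH_PATH = Path("outputs") / "citation_graph.pkl"


def load_citation_graph(path=GRAPH_PATH):
    """Hypothetical helper: unpickle the processed citation graph.

    Raises FileNotFoundError with a hint if the file is missing.
    Only unpickle files you created yourself; pickle can execute code.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; process the AMiner dataset and save it there first"
        )
    with path.open("rb") as f:
        return pickle.load(f)


# Demo with a stand-in object (an assumption, NOT the real dataset format).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "outputs" / "citation_graph.pkl"
    demo_path.parent.mkdir(parents=True)
    stand_in = {"paper_a": ["paper_b"], "paper_b": []}  # paper -> cited papers
    with demo_path.open("wb") as f:
        pickle.dump(stand_in, f)
    loaded = load_citation_graph(demo_path)
    print(loaded == stand_in)
```

Loading the real file this way requires NetworkX to be installed if the pickle contains a NetworkX graph, since pickle needs the original classes to be importable.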
To run the application, execute the following command:
    python app.py
The server will start on http://127.0.0.1:5000/ by default.
- `GET /`: Returns a simple message indicating that the home page is working.
- `GET /search`: This endpoint allows you to search for papers based on a query. You can also specify various weights for the ranking algorithms.
Query Parameters:
- `query`: The search term to look for in paper titles.
- `number_of_results`: The number of results to return (default is 10).
- `salsa`: Weight for the SALSA algorithm (default is 0). (TODO)
- `hits`: Weight for the HITS algorithm (default is 1).
- `hits_hub`: Weight for HITS hub score (default is 1).
- `hits_authority`: Weight for HITS authority score (default is 1).
- `pagerank`: Weight for the PageRank algorithm (default is 1).
- `semantic_similarity`: Weight for semantic similarity (default is 0). (TODO)
- `publish_date`: Weight for publish date (default is 0). (TODO)
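How the service combines these weights is not described here. One plausible reading, shown purely as an illustration, is a weighted sum of per-algorithm scores followed by a cut to `number_of_results`. The function names `combine_scores` and `top_results`, the paper IDs, and the scores below are all made up for this sketch and should not be taken as the project's actual implementation.

```python
def combine_scores(scores, weights):
    """Weighted sum of per-algorithm scores for each paper (assumed scheme).

    scores:  {algorithm_name: {paper_id: score}}
    weights: {algorithm_name: weight}; algorithms without a weight count as 0.
    """
    combined = {}
    for name, per_paper in scores.items():
        weight = weights.get(name, 0)
        if weight == 0:
            continue  # a zero weight switches the algorithm off
        for paper, score in per_paper.items():
            combined[paper] = combined.get(paper, 0.0) + weight * score
    return combined


def top_results(combined, number_of_results=10):
    """Return paper IDs ordered by combined score, highest first."""
    return sorted(combined, key=combined.get, reverse=True)[:number_of_results]


# Made-up scores for three hypothetical papers.
scores = {
    "pagerank": {"p1": 0.5, "p2": 0.3, "p3": 0.2},
    "hits_authority": {"p1": 0.1, "p2": 0.6, "p3": 0.3},
}
weights = {"pagerank": 1, "hits_authority": 1}

combined = combine_scores(scores, weights)
ranking = top_results(combined, number_of_results=2)
print(ranking)
```

Under this scheme, setting a weight to 0 removes that algorithm's influence entirely, which matches the defaults listed for the parameters marked TODO.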
Response: Returns a JSON object containing the search results and the subgraph of citations.
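As a rough illustration of calling this endpoint, the sketch below builds a `/search` URL from the documented query parameters using only the standard library. The helper name `build_search_url` and the example query are assumptions made for this sketch; the base URL is the default address mentioned above.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:5000"  # default address of the local server


def build_search_url(query, number_of_results=10, **weights):
    """Hypothetical helper: build a GET /search URL.

    Extra keyword arguments (e.g. pagerank=1, hits=0) are passed through
    as weight parameters without validation.
    """
    params = {"query": query, "number_of_results": number_of_results}
    params.update(weights)
    return f"{BASE_URL}/search?{urlencode(params)}"


url = build_search_url(
    "graph neural networks", number_of_results=5, pagerank=1, hits=0
)
print(url)
```

With the server running, the URL can be fetched with `urllib.request.urlopen(url)` and decoded with `json.load`. The field names of the returned JSON object are not documented here, so the sketch does not access them.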
This project is licensed under the MIT License - see the LICENSE file for details.