GraphIndex is an open-source project that maps table columns to ontology terms, using schema.org as the default ontology. It builds a vector index with LlamaIndex, which an LLM (Llama Mapping) then uses to produce the mappings.
- Clone the repository:
git clone https://github.com/yourusername/graphindex.git
cd graphindex
- Create a virtual environment and activate it:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install the required dependencies:
pip install -r requirements.txt

Make sure you have activated the virtual environment:
source venv/bin/activate
# On Windows: venv\Scripts\activate
Before starting the Flask app, define the following environment variables:
OPENAI_API_KEY='<your_api_key>'
SOURCE_DIR='<path_to_store_or_read_local_ontology>'
OUTPUT_DIR='<path_to_store_or_read_local_index>'
LOGS_DIR='<path_to_store_app_logs>'
OPENAI_MODEL='<openai_model_to_generate_mapping>'
VALIDATION_MODEL='<openai_model_for_two_step_validation>'
Run the Flask app:
python app.py
Once the app is running, you can call the API endpoint to get the semantic mapping. Use a tool like curl or Postman to send a POST request to http://localhost:5000/api/v1/mapping with a CSV file representing the table you want to map, along with the project ID.
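The same POST request can also be sent from Python. The following is a minimal sketch using only the standard library, not the project's own client. The form-field names `project_id` and `file` are assumptions, as are the helper names `build_multipart` and `request_mapping`; check app.py for the field names the endpoint actually expects.

```python
import io
import os
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one CSV file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    # Plain text fields (for example the project id).
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(f"{value}\r\n".encode())
    # The CSV file part.
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    buf.write(b"Content-Type: text/csv\r\n\r\n")
    buf.write(file_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def request_mapping(csv_path, project_id,
                    url="http://localhost:5000/api/v1/mapping"):
    """POST a CSV table to the mapping endpoint and return the response text."""
    with open(csv_path, "rb") as fh:
        file_bytes = fh.read()
    # "project_id" and "file" are assumed field names, not confirmed by the README.
    body, content_type = build_multipart(
        {"project_id": project_id}, "file", os.path.basename(csv_path), file_bytes
    )
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

With the Flask app running, calling `request_mapping('<path_to_csv>', '<project_id>')` would return the raw response body; the response format is not described here, so the sketch leaves it undecoded.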
Optionally, you can provide your own ontology stored in a local folder. If no custom ontology is specified, the app uses the default schema.org ontology.
To use a custom ontology, specify its local path when constructing the SemanticMapper object:
from src.graphindex.mapping import SemanticMapper
from src.graphindex.common.enumerations import IndexType

mapper = SemanticMapper(
    ontology_source_dir='<path_to_local_ontology>',
    index_output_dir='<path_to_store_index>',
    openai_model='gpt-3.5-turbo-16k',
    index_type=IndexType.VECTOR,
)

src/graphindex: Contains the logic and implementation of the semantic mapping.
app.py: Flask app file with the API endpoint for mapping table columns to ontology terms.
examples: Contains example CSV files for testing the API.
requirements.txt: Lists the required dependencies for the project.
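
Since the Flask app depends on the environment variables listed earlier, a small pre-flight check can catch a missing value before running app.py. The sketch below is not part of the repository. It assumes all six variables are required, and the names `REQUIRED_VARS` and `missing_settings` are hypothetical.

```python
import os

# Variable names taken from the configuration step; treating every one of
# them as required is an assumption.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "SOURCE_DIR",
    "OUTPUT_DIR",
    "LOGS_DIR",
    "OPENAI_MODEL",
    "VALIDATION_MODEL",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_settings(env=None):
    """Raise a RuntimeError naming every missing variable."""
    missing = missing_settings(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_settings()` in the same shell session just before `python app.py` reports every missing variable at once rather than failing on the first one the app happens to read.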