DCAT-DRY, the DCAT-AP Dataset Relationship Indexer, indexes linked data and the relationships between datasets.
Features:
- index a distribution or a SPARQL endpoint
- extract and index distributions from a DCAT catalog
- extract a DCAT catalog from a SPARQL endpoint and index its distributions
- generate a dataset profile
- show related datasets, based mainly on the DataCube and SKOS vocabularies
- index sameAs identities and related concepts
For the DCAT-DRY service only:

docker build . -t dcat-dry
docker run -p 80:8000 --name dcat-dry dcat-dry

For the full environment use docker-compose:

docker-compose up --build

CPython 3.8+ is supported.
Install a Redis server first. In the following example we will assume it runs on localhost, port 6379, and that DB 0 is used.
Set up a PostgreSQL server as well. In the following example we will assume it runs on localhost, port 5432, the database is postgres, and the user/password is postgres:example.
You will need some libraries installed: libxml2-dev, libxslt-dev, libleveldb-dev, libsqlite3-dev and sqlite3.
Run the following commands to bootstrap your environment:

git clone https://github.com/eghuro/dcat-dry
cd dcat-dry
poetry install --with robots,gevent --without dev

# Start redis and postgres servers

# Export environment variables
export REDIS_CELERY=redis://localhost:6379/1
export REDIS=redis://localhost:6379/0
export DB=postgresql+psycopg2://postgres:example@localhost:5432/postgres

# Setup the database
alembic upgrade head

# Run concurrently
celery -A tsa.celery worker -l debug -Q high_priority,default,query,low_priority -c 4
gunicorn -w 4 -b 0.0.0.0:8000 --log-level debug app:app
nice -n 10 celery -l info -A tsa.celery beat
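Before starting the workers, it can help to verify that the servers referenced by the exported variables are actually reachable. The snippet below is only a sketch: it reads the REDIS and DB values exported above and talks to the servers directly with the redis and psycopg2 client libraries.

```python
# Sanity check for the Redis and PostgreSQL connections configured above.
# Purely illustrative; it only confirms that both servers accept connections.
import os

import psycopg2
import redis

r = redis.Redis.from_url(os.environ["REDIS"])
r.ping()  # raises an exception if the Redis server is unreachable

# psycopg2 expects a plain postgresql:// DSN, so strip the SQLAlchemy driver suffix
dsn = os.environ["DB"].replace("postgresql+psycopg2://", "postgresql://")
conn = psycopg2.connect(dsn)
conn.close()
print("Redis and PostgreSQL are reachable")
```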
In general, before running shell commands, set the FLASK_APP and
FLASK_DEBUG environment variables
export FLASK_APP=autoapp.py
export FLASK_DEBUG=1
To deploy:
export FLASK_DEBUG=0
# Follow commands above to bootstrap the environment
In your production environment, make sure the FLASK_DEBUG environment
variable is unset or is set to 0, so that ProdConfig is used.
To open the interactive shell, run
flask shell
By default, you will have access to the Flask app object.
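As a quick illustration, the session below uses only the standard Flask API on the app object exposed by the shell:

```python
# Inside `flask shell`; `app` is the application instance Flask exposes by default.
>>> app.config["DEBUG"]
>>> sorted(rule.rule for rule in app.url_map.iter_rules())
```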
To run all tests, run
flask test
# Prepare couchdb
curl -X PUT http://admin:password@127.0.0.1:5984/_users
curl -X PUT http://admin:password@127.0.0.1:5984/_replicator
curl -X PUT http://admin:password@127.0.0.1:5984/_global_changes
# Migrate database
alembic upgrade head
To start a batch scan, run
flask batch -g /tmp/graphs.txt -s http://10.114.0.2:8890/sparql
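For illustration only, the sketch below prepares a graph list and launches the scan from Python. The assumption that the file passed via -g contains one named-graph IRI per line is ours, not documented behaviour, and the IRIs are placeholders.

```python
# Hypothetical helper: write a graph list and start the batch scan.
# The one-IRI-per-line format of graphs.txt is an assumption.
import subprocess
from pathlib import Path

graphs = ["http://example.com/graph/1", "http://example.com/graph/2"]  # placeholder IRIs
Path("/tmp/graphs.txt").write_text("\n".join(graphs) + "\n")

subprocess.run(
    ["flask", "batch", "-g", "/tmp/graphs.txt", "-s", "http://10.114.0.2:8890/sparql"],
    check=True,
)
```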
Get the full analysis result:

/api/v1/query/analysis

Query a single dataset by IRI:

/api/v1/query/dataset?iri=http://abc
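The sketch below shows how these endpoints might be called from Python. It assumes the service is reachable at http://localhost:8000 (as started with gunicorn above); the response format is whatever the service returns and is not specified here.

```python
# Illustrative client for the two query endpoints.
# The base URL is an assumption based on the gunicorn bind address above.
import requests

BASE = "http://localhost:8000"

# Full analysis result
analysis = requests.get(f"{BASE}/api/v1/query/analysis", timeout=60)
print(analysis.status_code)

# Single dataset, identified by its IRI
dataset = requests.get(
    f"{BASE}/api/v1/query/dataset",
    params={"iri": "http://abc"},
    timeout=60,
)
print(dataset.status_code)
```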