Home
The latest documentation is available at https://Prometheus-X-association.github.io/docs/
The Prometheus-X project revolves around the concept of a human-centric data ecosystem, where individuals have full control over their data and their destiny. The goal is to provide tools that empower people and organizations, enabling them to securely share and utilize data in a trustworthy manner. Prometheus-X aims to establish an open and decentralized data sharing infrastructure at scale, serving as a digital commons for this purpose.
The project focuses on creating an open ecosystem that facilitates data and service sharing, starting with use cases in the education and skills sector. Prometheus-X aims to provide a marketplace of interoperable technologies, offering data empowerment, intermediation, storage, and processing capabilities. Participants in the ecosystem will have the ability to share data in a trustworthy manner, independent of any single entity's control.
All building blocks within the Prometheus-X ecosystem adhere to the same trust and interoperability specifications, following the guidelines set out by the International Data Spaces Association (IDSA) and GAIA-X. The project provides open-source building blocks that enable anyone to deploy and operate these services. Prometheus-X does not operate these services directly; it only ensures fair, common governance and funding of the necessary building blocks. This approach prevents any single player from becoming an entrenched dominant actor, fostering the creation of open ecosystems while maintaining trust and transparency.
The primary objective of this technical specifications document is to provide a comprehensive framework and guidelines for the creation of Prometheus-X's building blocks. By documenting the necessary technical details, functionalities, and requirements, this document serves as a resource for the development, implementation, and maintenance of the catalog, contract and consent management building blocks needed to deploy human-centric data ecosystems.
The technical specifications outlined in this document aim to facilitate the establishment of human-centric data spaces thanks to Prometheus-X's building blocks. These data spaces act as a secure and controlled environment where individuals can manage and control their own data.
One of the key objectives is to enable the creation of data ecosystems within the personal data space, following the best practices and guidelines established by IDSA and GAIA-X. These ecosystems foster the secure sharing and utilization of data among individuals and trusted organizations. The technical specifications document outlines the necessary components, interfaces, and standards to ensure the interoperability and smooth functioning of data ecosystems within the data space.
To facilitate efficient data management in line with IDSA and GAIA-X principles, the technical specifications document provides guidelines for the implementation of catalogs within the data space. Catalogs serve as repositories of data ecosystems, datasets and services, enabling organizations to organize, categorize, and search for specific data and service resources as well as join and browse data ecosystems. The document specifies the required functionalities for the implementation of a catalog.
Prometheus-X emphasizes the importance of data sharing contractualization to ensure trust and compliance in data exchanges. This document outlines the necessary components and functionalities to support data sharing contractualization within the data spaces. It provides guidelines for the implementation of contract registries, contract generation, policy enforcement along with standards to follow.
Respecting individual privacy and consent is a core principle of Prometheus-X. This document guides the implementation of consent management functionalities within the personal data space. It defines the necessary components, consent verification processes, and revocation services to ensure that data exchanges occur with explicit end-user consent.
The scope of the project outlines the boundaries and extent of the work involved in creating the data space. It encompasses the specific areas and functionalities that will be addressed in the creation of a Data Ecosystem.
The scope covers the development of catalog functionality within the data space. The catalog will provide features for the management of data ecosystems, datasets & services. These include resource registration, formats and standards to follow for resources, resource discovery and ways to communicate with the catalog components.
The project includes the implementation of contract functionalities to support data sharing contractualization within the data space. This involves defining the necessary components, interfaces and mechanisms for creating, managing and enforcing data sharing agreements within the data space.
The scope encompasses the incorporation of end-user consent management capabilities within the data space. This includes designing and implementing features for obtaining, verifying, revoking and logging consent for data exchanges.
While developing the data space, certain constraints, limitations and assumptions need to be considered.
The project must comply with applicable data protection regulations, such as the General Data Protection Regulation (GDPR) and the Data Governance Act (DGA) in the European Union. The technical specifications and implementation should align with these regulations to ensure privacy and data protection.
To promote seamless data sharing, the personal data space should be designed with interoperability in mind. It should support standard data formats, protocols, and interfaces to facilitate compatibility with external systems and enable smooth integration with other data ecosystems.
The personal data space should be designed to handle varying scales of data and user demands. Considerations should be made to ensure the system's scalability, allowing it to accommodate increasing volumes of data and user interactions without compromising performance or functionality.
The project must prioritize robust security measures to protect the personal data stored within the personal data space. This includes implementing encryption, access controls, authentication mechanisms, and auditing functionalities to ensure data confidentiality, integrity, and availability.
The personal data space should be designed with a user-centric approach, focusing on providing a seamless and intuitive user experience. Considerations should be made to ensure usability, accessibility, and user satisfaction throughout the design and implementation process.
The Technical Specifications document primarily focuses on the Catalog, Contract, and Consent aspects of the system. While aspects related to identity, such as Decentralized Identifiers (DIDs), Verifiable Credentials (VCs), and other protocols, are mentioned in the technical details, this document does not delve into the comprehensive details of these identity-related components. The emphasis remains on providing detailed information about the Catalog, Contract, and Consent functionalities within the system.
The data space forms the foundation for secure and interoperable data sharing among participants. It enables individuals and organizations to connect, collaborate, and exchange data in a trusted and efficient manner. By leveraging the functionalities of the data ecosystem catalog, contract, and consent, participants can unlock the full potential of data-driven collaborations while adhering to governance rules and obtaining explicit end-user consents.
An ecosystem administrator, in accordance with the Sitra Rulebook, is required to properly describe a data ecosystem. The Sitra Rulebook serves as a comprehensive governance framework that describes an organization's legal, business, technical and governance principles within a data ecosystem. By incorporating the guidelines outlined in the Sitra Rulebook, the ecosystem administrator can provide detailed documentation that describes the governance, business and value of the data ecosystem, along with its purposes, goals and requirements.
Following the Sitra Rulebook guidelines ensures that organizations have a standardized framework in place for data governance, making it easier to establish and maintain effective data ecosystems. Additionally, it emphasizes the importance of end-user consent in data sharing within the data ecosystem, as well as the need for business or client-related justifications when sharing data.
The system allows the data ecosystem orchestrator to define the specific needs and requirements of the data ecosystem. This functionality involves identifying the types of data, services and participants that are essential for the functioning of the ecosystem.
The system provides the data ecosystem orchestrator with the ability to define and assign roles and responsibilities within the ecosystem. This ensures clarity regarding the obligations, privileges and scope of each participant’s role. Additionally, the data ecosystem orchestrator can access a database of standardized roles & obligations that are pre-set, enabling them to easily configure and assign these roles to participants. This database serves as a valuable resource, offering a collection of established roles and corresponding obligations that can be readily utilized, reducing the complexity and effort required for role assignment and configuration.
The data ecosystem orchestrator should be able to manage requests from entities seeking to join the data ecosystem. This functionality includes implementing a review and approval process based on predefined criteria and compatibility with the ecosystem’s objectives.
Entities, such as service providers and data providers, have the capability to register their services and data offerings within the Catalog. This functionality enables these entities to describe and provide detailed information about their services, including terms and conditions, technical specifications, and other relevant details. By registering their offerings in the catalog, entities contribute to the ecosystem by making their services and data resources discoverable to other participants. This enhances collaboration and enables users to find and access the relevant services and data offerings that align with their needs and objectives.
The Catalog provides users with the capability to discover and browse resources available within the data space. Through intuitive search and navigation features, users can explore the catalog to find relevant services, data offerings, and data ecosystems. They can utilize search filters, keywords, and categorization to narrow down their search and discover resources that align with their specific requirements. This functionality enhances the user experience by facilitating the efficient discovery and exploration of available resources within the catalog. Users can easily access comprehensive information about each resource, including descriptions, terms and conditions, and technical specifications, enabling them to make informed decisions and effectively leverage the diverse range of services and data offerings provided by participants in the data ecosystem.
Entities that have registered resources or described an ecosystem in the catalog should receive recommendations for matching services, data offerings, or data ecosystems. The system enables custom matching logic to provide such recommendations to participants through the catalog user interface.
The system allows participants to access a set of standard descriptions, including business models, data types, roles, obligations and purposes. This functionality assists participants in understanding and aligning with the guidelines and requirements of the catalog.
The Data Ecosystem Contract functionality ensures the enforcement of data governance rules within the ecosystem. A data ecosystem orchestrator can implement mechanisms to uphold the agreed-upon governance principles and regulations. This includes defining and enforcing policies, access controls, and data usage guidelines to maintain compliance and data integrity.
The Data Ecosystem Contract provides the capability for the data ecosystem orchestrator to manage requests for accession to the data ecosystem. It can review and process access requests from organizations or individuals seeking to join the ecosystem. This includes verifying the eligibility of participants, assessing their compliance with contractual obligations, and granting access based on predefined criteria.
The Data Ecosystem Contract provides participants with access to a set of standard clauses that describe commonly used terms and conditions. These standard clauses cover various aspects, including security, confidentiality, privacy, data ownership, and data sharing rights. Participants can utilize these standard clauses as a starting point for creating their own contractual agreements, ensuring consistency and legal compliance across the ecosystem.
The Contract Generator is a vital functional requirement of the Data Ecosystem Contract within Prometheus-X. This functionality empowers the data ecosystem orchestrator to generate standardized contract templates based on predefined rules and parameters. By leveraging the Contract Generator, the orchestrator can streamline and automate the process of creating contractual agreements within the ecosystem. The Contract Generator should have the capability to dynamically populate contract templates with relevant information, such as participant details, data usage rights, and contractual obligations. This functionality ensures consistency and efficiency in generating contracts, saving time and effort for the orchestrator and participants. Moreover, the Contract Generator should support customization options, allowing the orchestrator to tailor the generated contracts to specific requirements or specific use cases within the data ecosystem.
Policy Enforcement and Decision Points
This functionality enables the implementation of policy enforcement mechanisms and decision points to govern data sharing and access within the ecosystem. The Policy Enforcement Point (PEP) acts as the enforcement component, responsible for intercepting and evaluating requests for data access or usage. The Policy Decision Point (PDP) serves as the central authority for making access control decisions based on predefined policies and rules. The Policy Execution Point (PXP) executes the decisions made by the PDP, allowing or denying access to data based on the evaluated policies. Together, these components ensure that data sharing activities within the ecosystem adhere to the established contractual agreements and regulatory requirements. The PEP, PDP, and PXP functionalities provide a robust framework for policy enforcement, access control, and decision-making, enhancing the security, compliance, and trustworthiness of data exchanges within the data ecosystem.
Consent functionalities play a critical role in ensuring transparent and ethical data exchange within the data ecosystem.
Individual end users have the ability to manage their consents within the data ecosystem. This functionality allows users to give their explicit consent for data sharing between participants, review their existing consents at any time, and revoke consent if desired. By centralizing the consent management process, users can make informed decisions, stay informed about their data sharing agreements, and exercise control over the usage of their personal data within the ecosystem.
Personal Data Intermediaries (PDIs) are consent management components that facilitate consent management for end-users and the verification of consent during data exchanges within the ecosystem. PDIs provide user interfaces (UIs) or APIs through which end-users can manage their consents, including giving, reviewing, and revoking consent for data sharing. These PDIs also play a crucial role in verifying the validity of consent during data exchanges between participants. By leveraging PDIs, participants can ensure that the necessary consent has been obtained from the individual end-user before accessing or utilizing their personal data. This functionality promotes transparency, accountability, and compliance with data protection regulations, enhancing the trust and integrity of data exchanges within the data ecosystem.
The data ecosystem orchestrator can visualize statistics on generated consents across the ecosystem. This functionality provides insights into consent patterns, trends, and the overall consent landscape within the ecosystem. By accessing consent statistics, the orchestrator can gain a comprehensive understanding of consent-related activities, monitor compliance, and identify areas for improvement.
The system captures and logs consent-related activities, including consent given, revoked, and modified. This functionality ensures a comprehensive audit trail of consent activities within the ecosystem. Consent logging helps in maintaining transparency, accountability, and compliance with data protection regulations.
The user interface (UI) of the system should be intuitive, well-designed, and user-friendly. It should provide a seamless and efficient user experience, enabling users to easily navigate and interact with the system's functionalities.
The system should be accessible to a diverse range of users, including those with disabilities. It should comply with accessibility standards and guidelines, ensuring equal access and usability for all individuals.
The system should be compatible with various platforms and devices, including desktop computers, mobile devices, and tablets. It should be responsive and capable of delivering a consistent user experience across different platforms and screen sizes.
The system should support interoperability with external systems and technologies. It should adhere to industry standards and protocols, allowing for seamless integration and data exchange with other systems or data ecosystems.
The system should comply with relevant data protection regulations, such as the General Data Protection Regulation (GDPR) or other applicable privacy laws. It should ensure the secure handling, storage, and processing of personal data, and provide mechanisms to obtain and manage user consents appropriately.
The system should implement robust security measures to protect data and ensure the confidentiality, integrity, and availability of information. This includes encryption, access controls, authentication mechanisms, and regular security audits to identify and mitigate potential vulnerabilities.
The system should provide capabilities to generate compliance reports and logs. This functionality assists in demonstrating regulatory compliance and supporting auditing processes.
The system should support effective data governance practices. It should include mechanisms for defining and enforcing data governance policies, access controls, and data lifecycle management, ensuring compliance with regulatory requirements and internal policies.
The system should maintain comprehensive audit trails of data transactions, consent management activities, and system events. These audit logs help in tracing data flows, detecting unauthorized access, and supporting regulatory compliance audits.
The catalog is responsible for storing information about data ecosystems, datasets and service offerings, allowing users to discover, access and utilize resources. To ensure interoperability and compliance with the IDS specification, the catalog must adhere to the Catalog Protocol described in the IDSA documentation.
In line with the protocol, resources within the catalog are described using the W3C DCAT v3 (Data Catalog Vocabulary) standard and serialized into JSON-LD format. This choice of standards ensures compatibility and consistency across different systems and allows for easy integration with other IDS components and systems. By utilizing DCAT v3 and JSON-LD, the catalog enables comprehensive and machine-readable descriptions of data ecosystems, data offerings and service offerings, facilitating effective resource discovery and access.
The Data Ecosystem Self Description is described using the W3C DCAT v3 (Data Catalog Vocabulary) standard. It can be serialized into JSON-LD format, allowing for easy interchange and integration with other systems. It aligns with the Gaia-X participant ontology and the Gaia-X description of resources, which also use the JSON-LD format. The serialization from W3C DCAT v3 to JSON-LD compact form should be done as specified in the JSON-LD 1.1 Processing Algorithms and API.
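For illustration, a compacted JSON-LD serialization of a minimal self-description might look as follows; the identifiers, titles, and publisher DID are placeholders, not normative values:

```json
{
  "@context": {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/"
  },
  "@id": "https://example.com/ecosystems/urn:uuid:0000-example",
  "@type": "dcat:Catalog",
  "dct:title": "Example Skills Data Ecosystem",
  "dct:description": "A data ecosystem for sharing skills and education data.",
  "dct:publisher": { "@id": "did:web:orchestrator.example.com" },
  "dcat:dataset": [],
  "dcat:service": []
}
```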
The Rolebook takes the form of a public GitHub / GitLab repository, maintained by Prometheus-X, serving as a centralized hub for standardized interfaces on roles and obligations for a data ecosystem. This repository acts as a comprehensive resource accessible to data ecosystem orchestrators, providing them with a basis for defining and assigning roles and responsibilities to the participants of their ecosystems. With the Rolebook as a public resource, data ecosystem orchestrators benefit from a shared knowledge base and best practices. They can easily explore existing role interfaces, customize and adapt them to their specific ecosystem requirements, and contribute back to the repository with their own insights and enhancements. This collaborative approach promotes consistency, interoperability and efficiency across data ecosystems.
The Role Registry is a database issued by the Catalog Service. It implements the interfaces and models described in the Roles & Obligations Prometheus-X Repository and makes the standard roles and obligations available to the Catalog users. The interaction with the Role Registry is handled through an HTTP REST API. A GET request on /roles for example would trigger the Role Registry to fetch the standard roles in the database and respond with the roles in a JSON-LD format.
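As a sketch, a response to GET /roles could look like the following; this document does not prescribe an exact schema here, so the context URL and field names (title, obligations) are hypothetical:

```json
[
  {
    "@context": "https://example.com/rolebook/context.json",
    "@id": "https://example.com/rolebook/roles/data-provider",
    "@type": "Role",
    "title": "Data Provider",
    "obligations": [
      "Describe offered datasets in the catalog",
      "Attach usage policies to every data offering"
    ]
  }
]
```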
The Standard Descriptions component serves as a repository for predefined descriptions of business models, purposes, and datasets within the data ecosystem. These descriptions are represented using JSON-LD, ensuring consistency and interoperability across the ecosystem. The Standard Descriptions repository is housed within a public GitHub / GitLab repository maintained by Prometheus-X, serving as a central hub for pre-set interfaces and descriptions. By maintaining the repository, Prometheus-X ensures that participants have a reliable and up-to-date source of information. The repository serves as a reference point for establishing best practices, ensuring consistency, and promoting interoperability across data ecosystems.
The Data Ecosystem Manager is an API that allows for the generation, request, and modification of data ecosystem self-descriptions. It provides a REST API interface, enabling participants to programmatically interact with the manager for self-description management tasks. The Data Ecosystem Manager empowers participants to dynamically create, update, and retrieve self-descriptions, ensuring the agility and adaptability of the data ecosystem.
fig.1 - Registration and update of a data ecosystem self description
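As a hypothetical illustration of the flow in fig.1, registering a self-description could be a single REST call; the host and endpoint path are assumptions, not part of the specification, and the body would be a self-description such as the earlier example:

```http
POST /data-ecosystems HTTP/1.1
Host: manager.example.com
Content-Type: application/ld+json

{
  "@type": "dcat:Catalog",
  "dct:title": "Example Skills Data Ecosystem",
  "dct:publisher": { "@id": "did:web:orchestrator.example.com" }
}
```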
The Data Ecosystem Admin Interface is a user interface (UI) built on top of the REST API of the Data Ecosystem Manager. Utilizing web technologies, it offers a user-friendly interface for data ecosystem administrators to manage their respective ecosystems. The Admin Interface provides functionalities for creating, editing, and viewing self-descriptions, ensuring seamless administration of the data ecosystem.
The Data Ecosystem Catalog API is a REST API that enables communication, discovery, and browsing of resources within the catalog. Participants can use the API to search, access, and interact with resources available in the catalog, leveraging standard HTTP methods and response formats. The Catalog API provides endpoints for querying and retrieving metadata about data offerings, services, and other relevant resources.
Following the IDSA Catalog Protocol guidelines, the Catalog API supports the CatalogRequestMessage, which is sent by the consumer to the catalog service. The CatalogRequestMessage may include a filter property carrying implementation-specific query or filter expressions supported by the catalog service. The service must respond to the CatalogRequestMessage with a Catalog containing all the offered assets that the requester is allowed to see; the response must conform to the Catalog schema of the defined message format.
In addition, the catalog API also supports the CatalogError message, which is used when an error occurs after a CatalogRequestMessage and the provider is unable to provide the catalog to the requester. The CatalogError message enables appropriate handling of errors and responses in case of failures and exceptions.
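Based on the IDSA Dataspace Protocol message examples, a CatalogError might be serialized as follows; the error code, reason text, and context version are illustrative:

```json
{
  "@context": "https://w3id.org/dspace/v0.8/context.json",
  "@type": "dspace:CatalogError",
  "dspace:code": "CAT-404",
  "dspace:reason": [
    { "@value": "Catalog not found", "@language": "en" }
  ]
}
```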
fig.2 - Interaction between Participant & the Catalog API
Catalog Request Message example:
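The following is a sketch reconstructed from the IDSA Catalog Protocol message examples referenced at the end of this document; the filter expression is implementation-specific and purely illustrative, and the context URI may differ depending on the protocol version:

```json
{
  "@context": "https://w3id.org/dspace/v0.8/context.json",
  "@type": "dspace:CatalogRequestMessage",
  "dspace:filter": {
    "dct:type": "Dataset"
  }
}
```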
Catalog Response example:
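A corresponding response sketch, again following the shape of the IDSA message examples, with placeholder identifiers, titles, and endpoint URLs:

```json
{
  "@context": "https://w3id.org/dspace/v0.8/context.json",
  "@id": "urn:uuid:3afeadd8-ed2d-569e-d634-8394a8836d57",
  "@type": "dcat:Catalog",
  "dct:title": "Example Skills Data Ecosystem Catalog",
  "dcat:dataset": [
    {
      "@id": "urn:uuid:3dd1add4-4d2d-569e-d634-8394a8834d77",
      "@type": "dcat:Dataset",
      "dct:title": "Training Records Dataset",
      "odrl:hasPolicy": [],
      "dcat:distribution": []
    }
  ],
  "dcat:service": [
    {
      "@id": "urn:uuid:4aa2aaa4-2d2d-569e-d634-8394a8834d77",
      "@type": "dcat:DataService",
      "dcat:endpointURL": "https://provider.example.com/api"
    }
  ]
}
```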
The Data Ecosystem Catalog API adheres to the DCAT vocabulary mapping, which describes how the International Data Spaces (IDS) Information Model maps to DCAT resources. It defines the structure and attributes of the Asset Entry, Distributions, and DataService within the catalog. The API ensures compatibility and interoperability between IDS and DCAT resources, enabling effective integration and data exchange. On the technical side, the Data Ecosystem Catalog API supports queries and filter expressions as an implementation-specific feature. Query capabilities are expected to be implemented by the consumer against the results of a CatalogRequestMessage, allowing client-side querying and periodic crawling of provider catalog services.
Security is an important consideration for catalog services, and while not mandatory, it is expected that catalog services implement access control mechanisms. Catalog services may require consumers to include a security token along with the CatalogRequestMessage, and the specifics of token inclusion and verification can be found in the relevant catalog binding specification.
The Data Ecosystem Catalog Interface is a user interface (UI) built on top of the Catalog API. It serves as the front-end component that presents the resources available in the catalog to users, facilitating resource discovery and exploration. The Catalog Interface utilizes web technologies to deliver an intuitive and user-friendly interface for browsing and interacting with catalog resources.
The Data Ecosystem Matcher is a component that runs a matching logic and provides matching capabilities between data ecosystems and relevant data offerings or service offerings. It exposes its functionality through a REST API, allowing participants to access the matching service programmatically. The Matcher employs advanced algorithms and criteria to analyze the characteristics and requirements of data ecosystems, providing suitable matches with relevant offerings.
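As a purely hypothetical sketch (the matching criteria and endpoints are not specified in this document), a participant might request matches for an ecosystem like this, with the Matcher responding with a ranked list of catalog resource identifiers and match scores:

```http
GET /matches?ecosystem=urn:uuid:0000-example HTTP/1.1
Host: matcher.example.com
Accept: application/ld+json
```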
Catalog Protocol
This section focuses on the components and functionalities related to contract management within a data ecosystem. In this environment, participants sign an accession agreement that serves both as proof of participation in an ecosystem and as the contract for data exchange between two participants within the same data ecosystem. The goal is to establish a standardized and efficient approach to contract management without the need for bilateral contracts.
The Contract Registry serves as a centralized repository for contracts within the data ecosystem. Contracts, which are stored using the Open Digital Rights Language (ODRL) format, can be accessed and managed through this registry. It provides a comprehensive view of all contracts, enabling participants to reference and retrieve contract information as needed.
The Contract Generator is responsible for augmenting the accession agreement contracts for data exchange between participants. Using the ODRL format, this component enhances the accession agreement, ensuring that it contains all the necessary clauses and terms required for a specific data exchange. It automates the contract generation process, reducing manual effort and potential errors.
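As a minimal sketch, an augmented data-exchange agreement in ODRL JSON-LD could look like this; the parties, target asset, and constraint are placeholders:

```json
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "@type": "Agreement",
  "uid": "https://example.com/contracts/agreement-001",
  "permission": [
    {
      "target": "https://provider.example.com/datasets/training-records",
      "action": "use",
      "assigner": "https://example.com/participants/data-provider",
      "assignee": "https://example.com/participants/service-provider",
      "constraint": [
        {
          "leftOperand": "dateTime",
          "operator": "lt",
          "rightOperand": { "@value": "2024-12-31T23:59:59Z", "@type": "xsd:dateTime" }
        }
      ]
    }
  ]
}
```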
For participants to take part in a data ecosystem, they need to sign the accession agreement. In order to achieve this, a combination of Decentralized Identifiers (DIDs), Verifiable Credentials (VCs) and the eIDAS 2.0 framework is used. This approach ensures secure authenticated signatures for the contract, enhancing trust and data exchange integrity within the ecosystem.
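For illustration, a Verifiable Credential attesting that a participant has signed the accession agreement might look as follows; the credential type AccessionAgreementCredential, the DIDs, and the proof values are hypothetical, while the overall structure follows the W3C VC data model:

```json
{
  "@context": ["https://www.w3.org/2018/credentials/v1"],
  "type": ["VerifiableCredential", "AccessionAgreementCredential"],
  "issuer": "did:web:orchestrator.example.com",
  "issuanceDate": "2023-07-01T12:00:00Z",
  "credentialSubject": {
    "id": "did:web:participant.example.com",
    "agreement": "https://example.com/contracts/agreement-001",
    "role": "Data Provider"
  },
  "proof": {
    "type": "Ed25519Signature2020",
    "created": "2023-07-01T12:00:00Z",
    "verificationMethod": "did:web:orchestrator.example.com#key-1",
    "proofPurpose": "assertionMethod",
    "proofValue": "z58DAdFfa9..."
  }
}
```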
The Eclipse Data Connector (EDC) is a component aligned with the International Data Spaces Association (IDSA) and GAIA-X’s specifications. It prioritizes data sovereignty and secure data exchange while ensuring compatibility with emerging standards and frameworks for data sharing and interoperability. Leveraging the use of DIDs and VCs, which are fundamental components of the IDS architecture, the EDC authenticates and authorizes data transfers within the ecosystem. Additionally, the EDC can act as a Policy Enforcement Point (PEP) by evaluating access requests against policies, formulating eXtensible Access Control Markup Language (XACML) requests for validation by the Policy Decision Point (PDP).
These components are responsible for enforcing and making decisions regarding policies during a data exchange within the ecosystem. Policies are described by service and data providers when registering their resources to the catalog. These policies are described using the XACML format which can then be processed by the EDC when formulating XACML requests for validation by the Policy Decision Point to ensure compliance with established policies and rules.
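Although XACML is natively XML-based, the OASIS JSON Profile of XACML 3.0 offers a JSON serialization; a request formulated by the EDC for the PDP might look like the following sketch, with placeholder subject, resource, and action values:

```json
{
  "Request": {
    "AccessSubject": {
      "Attribute": [
        {
          "AttributeId": "urn:oasis:names:tc:xacml:1.0:subject:subject-id",
          "Value": "did:web:participant.example.com"
        }
      ]
    },
    "Resource": {
      "Attribute": [
        {
          "AttributeId": "urn:oasis:names:tc:xacml:1.0:resource:resource-id",
          "Value": "https://provider.example.com/datasets/training-records"
        }
      ]
    },
    "Action": {
      "Attribute": [
        {
          "AttributeId": "urn:oasis:names:tc:xacml:1.0:action:action-id",
          "Value": "read"
        }
      ]
    }
  }
}
```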
The Standard Clauses database serves as a centralized repository of standard clauses accessible to the data ecosystem. These clauses establish common terms, obligations, and conditions that can be incorporated into contracts as needed. It promotes consistency and simplifies the contract creation process.
The Contract Protocol outlines the procedures and interactions involved in managing contracts within the data ecosystem. In this context, participants sign an accession agreement that acts as the authorized contract for data exchange between ecosystem participants. The protocol ensures a standardized approach to contract management, facilitating secure data sharing.
This section outlines the functionalities and processes related to managing consent within the data ecosystem. Consent plays a crucial role in ensuring that end-users have control over the sharing and usage of their data. The protocols and components within this section facilitate the collection, verification, storage and revocation of consents in a secure and compliant manner.
The Consent Manager component enables end-users to give their consent to share their data amongst ecosystem participants. It provides a user-friendly interface or API that allows individuals to provide explicit consent for data exchanges. The Consent Manager also ensures the privacy and security of consent-related information, including encryption and storage measures. This component should be able to communicate with the catalog and contract components in order to verify which datasets are available to share and which services may be used, so that accurate consent information can be presented to the end-user. Consents should be generated and stored under the Kantara Consent Receipt format, an open standard designed to provide individuals with a record of their consent and the information necessary to manage it. The Kantara Consent Receipt standard offers several benefits for the stakeholders of a data space: it is compatible with the GDPR, designed to improve digital privacy for both consumers and businesses, and built to be interoperable, allowing different systems to exchange and understand consent-related information under a common format.
Kantara Consent Receipt JSON-LD example
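A minimal receipt following the field structure of the Kantara Consent Receipt v1.1 specification is sketched below; all values are placeholders (the v1.1 specification itself defines a plain JSON structure, onto which a JSON-LD context can be layered):

```json
{
  "version": "KI-CR-v1.1.0",
  "jurisdiction": "EU",
  "consentTimestamp": 1688212800,
  "collectionMethod": "Consent Manager UI",
  "consentReceiptID": "urn:uuid:f0a9f0ea-ff2a-4b52-9b43-EXAMPLE",
  "language": "en",
  "piiPrincipalId": "user@example.com",
  "piiControllers": [
    {
      "piiController": "Example Skills Service Provider",
      "contact": "Data Protection Officer",
      "address": "1 Example Street, Brussels",
      "email": "dpo@provider.example.com",
      "phone": "+32 000 000 000"
    }
  ],
  "policyUrl": "https://provider.example.com/privacy-policy",
  "services": [
    {
      "service": "Skills Matching",
      "purposes": [
        {
          "purpose": "Recommend relevant trainings",
          "purposeCategory": ["Personalization"],
          "consentType": "EXPLICIT",
          "piiCategory": ["Training records"],
          "primaryPurpose": true,
          "termination": "On revocation of consent",
          "thirdPartyDisclosure": false
        }
      ]
    }
  ],
  "sensitive": false,
  "spiCat": []
}
```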
To enhance integrity and authenticity of the consent, its verification is a crucial functionality of the data space. The verification component ensures that the consent provided by participants is valid and up-to-date. It validates the authenticity and integrity of the consent data, allowing ecosystem participants to rely on the verified consent for data exchanges.
To comply with GDPR, end-users need to have the right to revoke their consent at any time. The consent revocation service enables individuals to revoke their previously given consent. It ensures that the revocation is recorded and propagated throughout the ecosystem, so that data exchanges based on the revoked consent cease.
In addition to the consent-related functionalities, the Consent Logging component records and maintains a log of all consent-related activities within the data ecosystem. This log serves as an audit trail, capturing consent-related events, including consent granting, revocation, and verification activities. It provides transparency and accountability, ensuring compliance with data protection regulations.
The Consent Protocol outlines the processes and interactions involved in managing consent. It provides a framework for capturing, verifying and facilitating data exchanges based on explicit user consent. It ensures that data exchanges within the data ecosystem provide end-users with transparency, accountability, and control over their data.
The Consent Revocation process ensures that individuals have the ability to revoke their previously given consent for data usage within the data ecosystem. It demonstrates the importance of providing individuals with the ability to control and manage their consents within the data ecosystem.
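As a purely hypothetical sketch (no revocation endpoint is standardized in this document), revoking a consent through a PDI's REST API could look like the following; on success, the PDI would log the revocation (see Consent Logging) and notify affected participants so that ongoing exchanges stop:

```http
DELETE /consents/urn:uuid:f0a9f0ea-ff2a-4b52-9b43-EXAMPLE HTTP/1.1
Host: pdi.example.com
Authorization: Bearer <end-user access token>
```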
References:
Gaia-X participant ontology: https://gaia-x.gitlab.io/gaia-x-community/gaia-x-self-descriptions/participant/participant.html#
GXFS Catalog Features: https://www.gxfs.eu/core-catalogue-features/
GXFS Federated Services: https://www.gxfs.eu/specifications/
IDSA Catalog Protocol: https://docs.internationaldataspaces.org/dataspace-protocol/catalog/catalog.protocol
JSON-LD Processing Algorithms and API: https://www.w3.org/TR/json-ld11-api/#compaction-algorithms
Kantara Consent Receipt Specification: https://kantarainitiative.org/file-downloads/consent-receipt-specification-v1-1-0/
IDSA Catalog Message types examples: https://github.com/International-Data-Spaces-Association/ids-specification/tree/main/catalog/message
Sitra Rulebook: https://www.sitra.fi/en/publications/rulebook-for-a-fair-data-economy/#download-the-rulebook