@EttoreM EttoreM commented Dec 5, 2025

This PR addresses #45

Implemented rules

| # | Rule | sh:severity |
|---|------|-------------|
| 1 | RootDataEntity --> mentions SHOULD reference the CheckValue entity | sh:Warning |
| 2 | The CheckValue entity MUST have additionalType with @id of https://w3id.org/shp#CheckValue (i) | sh:Violation |
| 3 | The CheckValue entity MUST be of type AssessAction | sh:Violation |
| 4 | The name of the CheckValue entity MUST provide a human-readable summary of the review and result (ii) | sh:Violation |
| 5 | CheckValue --> object SHOULD point to the root of the RO-Crate | sh:Warning |
| 6 | CheckValue --> instrument SHOULD point to an entity typed schema:DefinedTerm | sh:Warning |
| 7 | The CheckValue entity MAY have a startTime property if the action began | sh:Info |
| 8 | CheckValue --> startTime MUST follow the RFC 3339 standard | sh:Violation |
| 9 | The CheckValue entity SHOULD have an endTime property if the action ended | sh:Warning |
| 10 | CheckValue --> endTime MUST follow the RFC 3339 standard | sh:Violation |
| 11 | The CheckValue entity SHOULD have an actionStatus property | sh:Warning |
| 12 | CheckValue --> actionStatus MUST have one of the allowed values | sh:Violation |
| 13 | CheckValue --> agent SHOULD reference the agent who initiated the check (iii) | sh:Warning |

(i) Rule 2 is not an actual rule but a definition of what constitutes a CheckValue entity. Given its axiomatic nature, it cannot be coded as a check.
(ii) This cannot be fully verified without a manual check. Here we only check that the name property of the CheckValue entity is a string of at least 20 characters.
(iii) All we enforce and test here is that the CheckValue entity has an agent property that points to an IRI.
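
For rules 8 and 10, here is a rough Python sketch of what "follows RFC 3339" amounts to for a single timestamp string. It is only an illustration of the format, not how the SHACL shapes in this PR implement the check; `is_rfc3339` is a hypothetical helper name, and leap seconds get no special treatment.

```python
import re
from datetime import datetime

# RFC 3339 "date-time": full-date "T" full-time, where full-time must carry
# an offset ("Z" or +hh:mm / -hh:mm). Group 2 captures the offset.
_RFC3339 = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}"               # full-date
    r"[Tt]"
    r"[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?"    # time with optional fraction
    r"([Zz]|[+-][0-9]{2}:[0-9]{2})"             # time-offset
)


def is_rfc3339(value: str) -> bool:
    """Hypothetical helper: True if value is shaped like an RFC 3339 date-time."""
    match = _RFC3339.fullmatch(value)
    if match is None:
        return False
    # Reject impossible calendar or clock values such as month 13 or hour 25.
    try:
        datetime.strptime(value[:19].upper(), "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return False
    # A numeric offset must itself be a plausible hh:mm.
    offset = match.group(2)
    if offset.upper() != "Z":
        hours, minutes = int(offset[1:3]), int(offset[4:6])
        if hours > 23 or minutes > 59:
            return False
    return True
```

A value such as `2025-12-05T10:30:00Z` passes, while a timestamp with no offset or with a space instead of `T` is rejected by this sketch.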


Rules not implemented (they need some clarification)

| # | Rule | sh:severity |
|---|------|-------------|
| 1 | Submitted crate SHOULD be checked for integrity and completeness against the BagIt payload manifest and tag manifest, considering at least the SHA-512 algorithm. | sh:Warning |
| 2 | This phase MAY also check any cryptographic signatures. | sh:Info |
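
To help the clarification along, this is a minimal sketch of what the first rule could mean in practice, assuming the crate is packaged as a BagIt bag with a `manifest-sha512.txt` whose lines are `<checksum> <path>`. The function names are hypothetical, nothing here is part of this PR, and BagIt's percent-encoding of unusual characters in paths is not handled.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path) -> str:
    """Hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(bag_dir: Path, manifest_name: str = "manifest-sha512.txt") -> list[str]:
    """Hypothetical helper: return a list of problems (empty means the manifest verified)."""
    manifest = bag_dir / manifest_name
    if not manifest.is_file():
        return [f"missing {manifest_name}"]

    problems = []
    listed = set()
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            problems.append(f"malformed manifest line: {line!r}")
            continue
        expected, rel_path = parts[0].lower(), parts[1].strip()
        listed.add(rel_path)
        target = bag_dir / rel_path
        if not target.is_file():
            problems.append(f"missing file: {rel_path}")
        elif sha512_of(target) != expected:
            problems.append(f"checksum mismatch: {rel_path}")

    # Completeness: for the payload manifest, every file under data/ must be listed.
    data_dir = bag_dir / "data"
    if manifest_name.startswith("manifest-") and data_dir.is_dir():
        for path in sorted(data_dir.rglob("*")):
            rel_path = path.relative_to(bag_dir).as_posix()
            if path.is_file() and rel_path not in listed:
                problems.append(f"unlisted payload file: {rel_path}")
    return problems
```

The same function could be pointed at `tagmanifest-sha512.txt` to cover the tag manifest; the completeness pass is skipped in that case because tag files do not live under `data/`.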

@EttoreM EttoreM linked an issue Dec 5, 2025 that may be closed by this pull request
@EttoreM EttoreM self-assigned this Dec 5, 2025
@douglowe
Here as well, I think we should tackle the outstanding rules in another PR.

@douglowe douglowe left a comment

It's looking good - just a couple of changes to make I think.

sh:description "CheckValue MUST have a human readable name string of at least 20 characters." ;
sh:path schema:name ;
sh:datatype xsd:string ;
sh:minLength 20 ;

I don't like the requirement for a minimum length for the name string. I think we should only automate checking the existence of such a string, and leave the user to determine if it is human readable or not.
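
As a rough illustration of the existence-only check, here is a plain-Python sketch over the crate's JSON-LD metadata. It is not the SHACL shape itself; the helper names are hypothetical, and it assumes `additionalType` is given as an `@id` reference.

```python
CHECK_VALUE = "https://w3id.org/shp#CheckValue"


def as_ids(value) -> list[str]:
    """Normalise a JSON-LD property value to a list of @id strings."""
    items = value if isinstance(value, list) else [value]
    return [item["@id"] for item in items if isinstance(item, dict) and "@id" in item]


def check_values_without_name(metadata: dict) -> list[str]:
    """Hypothetical helper: @ids of CheckValue entities whose name is not a string.

    Only the presence of a string is tested; no minimum length is imposed.
    """
    missing = []
    for entity in metadata.get("@graph", []):
        if CHECK_VALUE not in as_ids(entity.get("additionalType")):
            continue
        if not isinstance(entity.get("name"), str):
            missing.append(entity.get("@id", "<no @id>"))
    return missing
```

With this rule a name such as "Short" is accepted, and judging whether it is a useful summary stays with the reviewer.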

Comment on lines +86 to +111
def test_5src_check_value_name_not_long_enough():
    sparql = """
        PREFIX schema: <http://schema.org/>
        PREFIX shp: <https://w3id.org/shp#>
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

        DELETE {
            ?this schema:name ?name .
        }
        INSERT {
            ?this schema:name "Short" .
        }
        WHERE {
            ?this schema:additionalType shp:CheckValue .
        }
    """

    do_entity_test(
        rocrate_path=ValidROC().five_safes_crate_result,
        requirement_severity=Severity.REQUIRED,
        expected_validation_result=False,
        expected_triggered_requirements=["CheckValue"],
        expected_triggered_issues=["CheckValue MUST have a human readable name string of at least 20 characters."],
        profile_identifier="five-safes-crate",
        rocrate_entity_mod_sparql=sparql,
    )

I don't think we should have this test, as we shouldn't be defining how long the name string should be.

?this schema:name ?name .
}
INSERT {
?this schema:name 123 .

This test is ambiguous, as we are not explicitly setting the data type here. Can we set the type of this object as well as providing a non-string value?

Alternatively, we could replace it with a test that just checks for the existence of a name.
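
Something along these lines might work for the first option: a sketch of the update string with the datatype spelled out. In SPARQL a bare `123` is already shorthand for an `xsd:integer` literal, so the explicit form only makes the intent visible. The `UPDATE_TEMPLATE` and `name_update` names are hypothetical; the `OPTIONAL` block is there because a variable in the `DELETE` template only matches if the `WHERE` clause binds it.

```python
# Hypothetical template: swaps the CheckValue name for an arbitrary literal,
# so a test can state the literal's datatype outright.
UPDATE_TEMPLATE = """
PREFIX schema: <http://schema.org/>
PREFIX shp: <https://w3id.org/shp#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

DELETE {
    ?this schema:name ?name .
}
INSERT {
    ?this schema:name __LITERAL__ .
}
WHERE {
    ?this schema:additionalType shp:CheckValue .
    # Bind ?name when a name exists so the DELETE template can remove it.
    OPTIONAL { ?this schema:name ?name . }
}
"""


def name_update(literal: str) -> str:
    """Return the update with the given SPARQL literal as the new name."""
    return UPDATE_TEMPLATE.replace("__LITERAL__", literal)


# A name that is explicitly typed as an integer rather than a string.
non_string_name = name_update('"123"^^xsd:integer')
```

The resulting string could then be passed as `rocrate_entity_mod_sparql` in the same way as the other tests in this PR.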

Comment on lines +52 to +76
five-safes-crate:CheckValueObjectShouldPointToRootDataEntity
    a sh:NodeShape ;
    sh:name "CheckValue" ;
    sh:description "" ;
    sh:target [
        a sh:SPARQLTarget ;
        sh:select """
            PREFIX schema: <http://schema.org/>
            PREFIX shp: <https://w3id.org/shp#>
            SELECT ?this
            WHERE {
                ?this schema:additionalType shp:CheckValue .
            }
        """ ;
    ] ;

    sh:property [
        a sh:PropertyShape ;
        sh:name "object" ;
        sh:path schema:object ;
        sh:minCount 1 ;
        sh:hasValue <./> ;
        sh:severity sh:Warning ;
        sh:message "`CheckValue` --> `object` SHOULD point to the root of the RO-Crate" ;
    ] .

I think we should point to an rocrate:RootDataEntity here, rather than checking for an object that has a <./> value. Just in case the definition of the root data entity changes later.

See this code for how I did this for the Sign-Off phase:

sh:sparql [
    a sh:SPARQLConstraint ;
    sh:description "Check if the Sign Off phase lists the workflow as an object" ;
    sh:select """
        PREFIX schema: <http://schema.org/>
        PREFIX rocrate: <https://github.com/crs4/rocrate-validator/profiles/ro-crate/>
        SELECT $this
        WHERE {
            ?root a schema:Dataset ;
                schema:mainEntity ?mainEntity ;
                rdf:type rocrate:RootDataEntity .
            FILTER NOT EXISTS {
                $this schema:object ?mainEntity .
            }
        }
    """ ;
    sh:severity sh:Warning ;
    sh:message "The Sign-Off Phase SHOULD list the workflow (mainEntity) as an object" ;
];
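
To show the idea outside SHACL, here is a small Python sketch that resolves the root data entity indirectly instead of hard-coding `./`. It is not the validator's mechanism (that is the `rocrate:RootDataEntity` class used above); it leans on the RO-Crate convention that the metadata descriptor `ro-crate-metadata.json` points at the root via `about`. All function names are hypothetical.

```python
CHECK_VALUE = "https://w3id.org/shp#CheckValue"
DESCRIPTOR_ID = "ro-crate-metadata.json"  # metadata descriptor @id in RO-Crate 1.1


def as_ids(value) -> list[str]:
    """Normalise a JSON-LD property value to a list of @id strings."""
    items = value if isinstance(value, list) else [value]
    return [item["@id"] for item in items if isinstance(item, dict) and "@id" in item]


def root_entity_id(metadata: dict):
    """Return the @id the metadata descriptor is `about`, or None if absent."""
    for entity in metadata.get("@graph", []):
        if entity.get("@id") == DESCRIPTOR_ID:
            ids = as_ids(entity.get("about"))
            if ids:
                return ids[0]
    return None


def check_values_not_targeting_root(metadata: dict) -> list[str]:
    """Hypothetical helper: @ids of CheckValue entities whose object is not the root.

    If no root can be resolved, every CheckValue entity is reported.
    """
    root_id = root_entity_id(metadata)
    return [
        entity.get("@id", "<no @id>")
        for entity in metadata.get("@graph", [])
        if CHECK_VALUE in as_ids(entity.get("additionalType"))
        and root_id not in as_ids(entity.get("object"))
    ]
```

Because the root is looked up through the descriptor, the check keeps working if a crate uses a root identifier other than `./`.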

Development

Successfully merging this pull request may close these issues.

Implement Check Phase Ruleset
