Make the plugin ready for the new data model #3
Hey @notactuallyfinn, here is another case where @SKernchen and I refactored a HERMES plugin to use the new data model. Would you be willing to take a look at the code to see if this makes sense? Thanks! 😄
notactuallyfinn left a comment
Looks good to me, only a few very minor things.
    ctx = HermesCacheManager()
    validation_file = ctx.cache_dir / "curate" / "validation.json"
    validation_file.parent.mkdir(exist_ok=True, parents=True)
    self._validation_graph.serialize(validation_file, format="json-ld")
Why not use the HermesCacheManager the way it was intended to be used?
(validation_file would have to be moved, but is there another reason?)
What would be the intended way? Do you mean with a SoftwareMetadata object?
We tried that but failed because the validation graph is a list and SoftwareMetadata can only be initialized with a dict. It looks like this:
The interesting bit in this case is the validation report in lines 443-454. The rest is just SHACL copied into the graph.
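To illustrate the list-versus-dict mismatch described above, here is a minimal sketch using only the standard library. It assumes that the serialized JSON-LD is a top-level list and wraps it in a single object under `@graph`, which is still valid JSON-LD. The helper name `wrap_as_graph` and the sample input are hypothetical, and whether SoftwareMetadata would accept such a wrapped document has not been checked here.

```python
import json


def wrap_as_graph(jsonld_text: str) -> dict:
    """Parse serialized JSON-LD and always return a dict.

    A top-level list of node objects is wrapped as {"@graph": [...]}.
    A top-level object is returned unchanged.
    """
    data = json.loads(jsonld_text)
    if isinstance(data, list):
        return {"@graph": data}
    return data


# Hypothetical input, shaped like the expanded JSON-LD a serializer may emit.
serialized = (
    '[{"@id": "_:b0", '
    '"@type": ["http://www.w3.org/ns/shacl#ValidationReport"]}]'
)
doc = wrap_as_graph(serialized)
print(type(doc).__name__, list(doc))
```

This keeps every node of the graph, so it sidesteps the question of which list element to pick, at the cost of handing the consumer a graph rather than a single node.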
I tried working around it, taking a concise bounded description:
    report_refs = list(
        self._validation_graph.subjects(
            predicate=RDF.type, object=SH.ValidationReport
        )
    )
    assert len(report_refs) == 1
    report_ref = report_refs[0]
    self._validation_graph.cbd(report_ref).serialize(
        validation_file, format="json-ld"
    )

But this also creates a list:
    [
      {
        "@id": "_:Nd5d1172d2d2d46829819e6fdce9722a9",
        "@type": [
          "http://www.w3.org/ns/shacl#ValidationReport"
        ],
        "http://www.w3.org/ns/shacl#conforms": [
          {
            "@type": "http://www.w3.org/2001/XMLSchema#boolean",
            "@value": true
          }
        ]
      }
    ]

Maybe the JSON-LD serializer in RDFlib always does this.
I could return the single element from that list. But that won't work for failed validations because in those cases, the report is split into multiple objects rather than nested.
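One possible way around the split report is to re-nest the blank nodes by hand after serialization. The following is only a sketch under assumptions: it expects expanded JSON-LD as a list of node objects, the helper name `nest_report` is hypothetical, and the sample nodes are made-up illustrative data, not output from an actual validation run.

```python
SH = "http://www.w3.org/ns/shacl#"


def nest_report(nodes: list) -> dict:
    """Return the sh:ValidationReport node with blank-node references inlined."""
    by_id = {node["@id"]: node for node in nodes if "@id" in node}

    def embed(value, seen):
        if isinstance(value, list):
            return [embed(item, seen) for item in value]
        if isinstance(value, dict):
            ref = value.get("@id")
            # A pure reference looks like {"@id": "_:b1"}. Inline it unless it
            # is already on the current path, which would mean a cycle.
            if (
                set(value) == {"@id"}
                and ref in by_id
                and ref.startswith("_:")
                and ref not in seen
            ):
                return embed(by_id[ref], seen | {ref})
            return {key: embed(item, seen) for key, item in value.items()}
        return value

    reports = [n for n in nodes if SH + "ValidationReport" in n.get("@type", [])]
    if len(reports) != 1:
        raise ValueError(f"expected exactly one report, found {len(reports)}")
    report = reports[0]
    return embed(report, frozenset({report.get("@id")}))


# Made-up example of a failed validation split across two node objects.
nodes = [
    {
        "@id": "_:report",
        "@type": [SH + "ValidationReport"],
        SH + "conforms": [{"@value": False}],
        SH + "result": [{"@id": "_:r1"}],
    },
    {
        "@id": "_:r1",
        "@type": [SH + "ValidationResult"],
        SH + "resultMessage": [{"@value": "illustrative message"}],
    },
]
nested = nest_report(nodes)
```

The result is a single dict for both passing and failing validations, so the "return the only element" shortcut is no longer needed. Named (non-blank) nodes are left as references on purpose, since inlining them could pull in large parts of the SHACL shapes graph.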
Known bug: The new data model uses schema: <http://schema.org/>, while the rdflib used in the validation code uses schema: <https://schema.org/>. This mismatch causes validation to be skipped for everything that does not use the https namespace.
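A possible workaround, sketched here with the standard library only, is to rewrite one namespace to the other before validating. This is an assumption about how the mismatch could be bridged, not the fix used in this PR. The function name `normalize_schema_ns` and the sample document are hypothetical, and the rewrite direction is a parameter because it is not settled here which namespace should win.

```python
def normalize_schema_ns(value, old="http://schema.org/", new="https://schema.org/"):
    """Recursively rewrite IRIs starting with `old` so they start with `new`.

    Works on parsed, expanded JSON-LD (lists, dicts, strings). Literal values
    stored under "@value" are left untouched.
    """
    if isinstance(value, str):
        return new + value[len(old):] if value.startswith(old) else value
    if isinstance(value, list):
        return [normalize_schema_ns(item, old, new) for item in value]
    if isinstance(value, dict):
        return {
            normalize_schema_ns(key, old, new): (
                item if key == "@value" else normalize_schema_ns(item, old, new)
            )
            for key, item in value.items()
        }
    return value


# Made-up expanded JSON-LD using the http namespace.
doc = [
    {
        "@id": "_:b0",
        "@type": ["http://schema.org/SoftwareSourceCode"],
        "http://schema.org/name": [{"@value": "http://schema.org/ as a literal"}],
    }
]
fixed = normalize_schema_ns(doc)
```

Only property keys, `@id`, and `@type` values are rewritten, so a string literal that happens to start with the old namespace keeps its original text.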