-
Notifications
You must be signed in to change notification settings - Fork 3
/
syntax.text
731 lines (607 loc) · 31.8 KB
/
syntax.text
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
= A modest proposal for a refactored SHACL =
== Introduction ==
=== Major Changes ===
==== Collapsing Shapes and Constraints ====
The central change in this proposal is to collapse constraints and shapes into one
construct. Instead of, for example, having sh:or and similar component properties
take shapes as arguments and produce constraint components they take shapes and
produce shapes. This results in smaller shapes and a more uniform syntax, replacing
ex:s1 a sh:Shape ;
sh:constraint [
a sh:NodeConstraint ;
sh:or
( [ a sh:Shape ;
sh:constraint [
a sh:NodeConstraint ;
sh:class ex:Person
]
]
[ a sh:Shape ;
sh:constraint [
a sh:NodeConstraint ;
sh:in ( ex:elvis )
]
] )
] .
with
ex:s1 a sh:Shape ;
sh:or
( [ a sh:Shape ;
sh:class ex:Person
]
[ a sh:Shape ;
sh:in ( ex:elvis )
] ) .
==== No Property or Inverse Property Constraints ====
There is also no longer any division between node constraints/shapes, property
constraints/shapes, and inverse property constraints/shapes. The difference in
behaviour between node, property, and inverse property constraints/shapes is instead
controlled by the link to the constraint/shape from its enclosing shape. When
processing a shape the objects of sh:property links in the shape are treated as
shapes on the values of the sh:predicate argument and the objects of
sh:inverseProperty links are treated as shapes on the values of the inverse of the
sh:predicate arguments. This change replaces
ex:s1 a sh:Shape ;
sh:constraint [
a sh:PropertyConstraint ;
sh:predicate ex:parent ;
sh:shape [
a sh:Shape ;
sh:constraint [
a sh:PropertyConstraint ;
sh:predicate ex:parent ;
sh:shape [
a sh:Shape ;
sh:constraint [
a sh:PropertyConstraint ;
sh:predicate ex:age ;
sh:minExclusive 50
]
]
]
]
] .
with
ex:s1 a sh:Shape ;
sh:property [
a sh:Shape ;
sh:predicate ex:parent ;
sh:property [
a sh:Shape ;
sh:predicate ex:parent ;
sh:property [
a sh:Shape ;
sh:predicate ex:age ;
sh:minExclusive 50
]
]
] .
A previous version of this proposal pulled the sh:predicate constructs out of the
shapes and had sh:property take a two-element list, resulting in
ex:s1 a sh:Shape ;
sh:property ( ex:parent [
a sh:Shape ;
sh:property ( ex:parent [
a sh:Shape ;
sh:property ( ex:age [
a sh:Shape ;
sh:minExclusive 50
] )
] )
] ) .
The result of these two changes is that there is only one major combining construct
in the syntax - shapes. There are no longer any constraints, or any of the various
subcategories of constraints.
=== Other Changes ===
There are several other changes made in this proposal that are somewhat
orthogonal to the central changes.
==== Single-Argument Components ====
Each component of a shape consists of a single triple attached to the shape. If a
component needs more than a single parameter then its parameters are bundled into a
single node. So a pattern component would not have an optional flag parameter but
would use a single composite parameter when a flag is present, either a list as in
ex:s2 a sh:Shape ;
sh:pattern ( "^John" "i" ) .
or a node with its own properties as in
ex:s2 a sh:Shape ;
sh:pattern [ sh:regex "^John" ; sh:flags "i" ] .
instead of
ex:s2 a sh:Shape ;
sh:constraint [
a sh:NodeConstraint ;
sh:pattern "^John" ;
sh:flags "i"
] .
==== Repeated Components ====
Constraint components can be repeated, replacing
ex:s3 a sh:Shape ;
sh:constraint [
sh:and
( [ a sh:Shape ;
sh:constraint [
a sh:NodeConstraint ;
sh:class ex:Person
] ]
[ a sh:Shape ;
sh:constraint [
a sh:NodeConstraint ;
sh:class ex:Alien
] ] ) .
with
ex:s3 a sh:Shape ;
sh:class ex:Person ;
sh:class ex:Alien .
==== Property Paths ====
Paths of properties and their inverses are allowed instead of just single properties.
A new construct, with property sh:propValues, is used for this, replacing sh:property
and sh:inverseProperty. '''The name is not great and probably should be changed.'''
Property paths permit the shape at the beginning of this document to be written as
ex:s1 a sh:Shape ;
sh:propValues [
a sh:Shape ;
sh:path ( ex:parent ex:parent ex:age ) ;
sh:minExclusive 50
] .
or using the older list syntax as
ex:s1 a sh:Shape ;
sh:propValues ( ( ex:parent ex:parent ex:age ) [
a sh:Shape ;
sh:minExclusive 50
] .
==== RDFS Typing in Shape Graphs ====
This proposal removes the need for most shapes include an explicit rdf:type link to
sh:Shape. It does this by using rdfs:domain and rdfs:range for various properties,
which are given in the metamodel for SHACL, and then using RDFS typing for the shapes
graph. If all shapes have explicit typing then there is no need for this mechanism.
==== Readable Strings are rdfs:langString ====
All strings that contain language to be presented to users are of type
rdf:langString.
=== What Follows ===
The details of the proposal follow. To set the context of the proposal there are
preamble and conformance sections. Then there are sections on the core syntax, an
informal semantics, and an extension mechanism. Each of these three sections builds
on the previous, but the syntax changes are independent of the semantics which is
independent of the extension mechanism.
== Preamble ==
SHACL is designed to determine whether some information satisfies a set of
conditions. This process is called validation in SHACL. The main functionality
provided by a SHACL engine is to determine whether one RDF graph, called the
''data graph'' and containing the information to be validated, validates against another RDF
graph, containing the conditions and called the ''shapes graph'' because the main
syntactic construct in SHACL is the shape. Data graphs and nodes in the data graph
are said to validate against shapes and shapes graphs.
SHACL is divided into a core portion and an extension portion. The extension portion
of SHACL uses SPARQL 1.1 Query Language queries from the shapes graph in validation.
This document uses Turtle syntax for graphs and nodes, with the following prefix
bindings:
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
xsd: http://www.w3.org/2001/XMLSchema#
sh: http://www.w3.org/ns/shacl#
SHACL depends on a notion of instance that uses vocabulary from RDF (rdf:type) and
RDFS (rdfs:subClassOf) but in a way different from both. A node is a ''SHACL instance''
in the data graph of another node precisely when there is a triple in the data graph
with the instance node as subject, rdf:type as predicate, and object that is either
the other node itself or is linked to that node via a chain of triples in the data
graph each with rdfs:subClassOf as predicate.
All SHACL constructs described here are composed only of triples in the shapes graph.
SHACL uses lists in the shapes graph in its syntax. The ''list nodes'' of a node in the
shapes graph are defined as the nodes in transitive-reflexive closure of rdf:rest
triples in the shapes graph considered as a directed relationship between nodes. A
node in the shapes graph is a SHACL ''list'' precisely when
#rdf:nil is one of its list nodes,
#all its list nodes have a single rdf:rest, except that rdf:nil has none, and
#all its list nodes have a single rdf:first, except that rdf:nil has none.
The rdf:first values for the list nodes are the ''elements'' of the list, in order.
A list where all the elements have some characteristic is called a
''list of that characterstic'', e.g., a list where all the elements are IRIs
is called a list of IRIs.
The objects of triples with a given subject and predicate are ''values'' of the predicate
for the subject. That subject is said to have these objects for the predicate.
== Conformance ==
A SHACL validation engine MUST provide an interface that validates a data graph
against a shapes graph containing core SHACL constructs, as described herein, and
returns all the validation results for each scoped shape in the shapes graph. A SHACL
validation engine MAY provide other interfaces that return only some of the
validation results. These other interfaces MUST also return a boolean result that is
true only if this set is the entire set of validation results. A SHACL validation
engine MAY provide an interface that returns true if there is at least one validation
result with severity sh:Violation, false otherwise.
If the shapes graph provided to it is syntactically illegal a SHACL
validation engine MUST so signal. The engine MAY try to use the illegal shapes
graph for validation or MAY terminate processing without performing validation.
A SHACL validation engine MAY provide interfaces that construct the shapes graph and
data graph from other inputs and perform validation on these constructed graphs. If
so, the engine must provide an interface that takes an RDF dataset and optionally the
IRI of a graph in that dataset. The engine constructs the data graph starting with
the graph with that name, or the default graph of the dataset if not provided with a
graph name, and incorporating other graphs from the datatset using owl:imports. The
engine constructs the shapes graph by starting with all graphs in the dataset with
names taken from objects of sh:shapesGraph triples in the data graph and then using
owl:imports. '''Does this need to be more precise?'''
A SHACL validation engine MUST implement all the constructs of core SHACL. If a
SHACL validation engine implements any of the extended constructs of SHACL then it
MUST implement them all.
A SHACL validation engine MAY produce warnings if nodes that are expected to be in
the data graph are not present there (e.g., for sh:scopeNode) or are not a SHACL
instance there of an expected type (e.g., sc:scopeClass values and rdfs:Class).
A SHACL validation engine MUST provide an interface that signals an error if the
shapes graph contains SHACL instances of sh:Shape that do not conform to the syntax
herein or that have themselves as a shape in the transitive closure of their
components and filters. '''Does this need to be more precise?''' If the SHACL
validation engine implements any of the extended constructs of SHACL then this MUST
also consider the syntax of template components and template scopes.
== Core SHACL Syntax ==
The main SHACL construct is a shape. A ''shape'' is a node in the shapes graph that is
an RDFS instance of sh:Shape in the shapes graph. (See below for a
different definition of shapes in a shape graph.)
Shapes in core SHACL have
#zero or more ''scopes'' (triples with the shape as subject and sh:scopeNode, sh:scopeClass, sh:scopePropertyObject, sh:scopePropertySubject, sh:scopeAllObjects, or sh:scopeAllSubjects as the predicate),
#zero or more ''filters'' (triples with sh:filter as the predicate), whose objects are themselves shapes, and
#zero or more ''components'' (triples with the shape as subject and one of the component properties below as property).
The scopes of a shape select nodes for validation. Filters further cut down
on the nodes for validation. Components state conditions that nodes need to
satisfy to validate against the shape.
The syntax of the various core SHACL components are:
PROPERTY VALUE
sh:in list of IRIs and literals
sh:class IRI (which is used as a class)
sh:classIn list of IRIs (which are used as classes)
sh:datatype IRI (which is used as a datatype)
sh:datatypeIn list of IRIs (which are used as datatypes)
sh:directType IRI (which is used as a class)
sh:minLength non-negative integer
sh:maxLength non-negative integer
sh:minExclusive literal
sh:minInclusive literal
sh:maxExclusive literal
sh:maxInclusive literal
sh:nodeKind sh:BlankNode or sh:IRI or sh:Literal or
BlankNodeOrIRI or BlankNodeOrLiteral or IRIOrLiteral
sh:pattern string used as a regular expression pattern or
node with sh:regex value as pattern and sh:flags value as flags
sh:equals two paths
sh:disjoint two paths
sh:lessThan two paths
sh:lessThanOrEqual two paths
sh:propValues shape with single sh:path value which is a path
sh:list shape
sh:shape shape
sh:and list of shapes
sh:or list of shapes
sh:not shape
sh:hasValue IRI or literal
sh:minCount non-negative integer
sh:maxCount non-negative integer
sh:uniqueLang boolean
sh:partition list of shapes
sh:closed list of path parts (which are used as (inverse) properties)
sh:constraint shape (for compatability)
sh:property shape with sh:predicate (for compatability)
sh:inverseProperty shape with sh:predicate (for compatability)
A SHACL ''path'' is a path part or a blank node that is a list of SHACL path
parts. A core SHACL ''path part'' is either an IRI or a non-list blank node with a
single value for sh:inverse.
Several other properties and IRIs are used in SHACL
PROPERTY Value
sh:severity sh:Info or sh:Warning or sh:Violation
sh:message language tagged-string
sh:name language-tagged string
sh:description language-tagged string
sh:order decimal literal
sh:group property group
sh:defaultValue IRI or literal
Note: Qualified cardinalities are replaced by an embedded shape where the embedded
shape's filter has the same role as the value for sh:qualifiedValueShape.
=== Examples ===
exs:personShape a sh:Shape;
sh:scopeNode ex:John ;
sh:scopePropertyObject ex:parent ;
sh:propValues [ sh:path ex:child ; sh:class ex:Person ] ;
sh:propValues [ sh:path ex:age ;
sh:datatype xs:integer;
sh:minCount 1 ; sh:maxCount 1 ] .
A graph validates against exs:personShape if all SHACL instances of
ex:Person in that graph have all their stated names be strings, all their
stated children belonging to ex:Person, and have exactly one stated age,
which is an integer.
exs:SJG a sh:Shape;
sh:scopeClass ex:Person ;
sh:filter [ sh:propValues [ sh:path ex:gender ; sh:hasValue ex:female ] ] ;
sh:filter [ sh:propValues [ sh:path ( ex:child ex:child ) ; sh:minCount 5 ] ] ;
sh:propValues ( ex:child
[ sh:filter [ sh:propValues [ sh:path ex:gender ; sh:hasValue ex:male ] ] ;
sh:class ex:Professional ] ) .
A graph validates against exs:SJG if all SHACL instances of ex:Person (the scope)
that have ex:female as gender (the first filter) and have at least five grandchildren
(the second filter) have all their male children be SHACL instances of
ex:Professional.
ex:IssueShape a sh:Shape ;
sh:scopeClass ex:Issue;
sh:propValues [ sh:path ex:state ;
sh:in (ex:unassigned ex:assigned) ;
sh:minCount 1 ; sh:maxCount 1 ] ;
sh:propValues [ sh:path ex:reportedBy
; sh:shape ex:UserShape ;
sh:minCount 1 ; sh:maxCount 1 ] .
ex:UserShape a sh:Shape ;
sh:propValues [ sh:path foaf:name ;
sh:datatype xsd:string ;
sh:minCount 1 ; sh:maxCount 1 ] ;
sh:propValues [ sh:path foaf:mbox ;
sh:nodeKind sh:IRI ;
sh:minCount 1 ] .
Example 1 from SHACL document
exs:sample a sh:Shape ;
sh:scopeNode ex:John ;
sh:scopePropertyObject ex:parent ;
sh:propValues [ sh:path ex:dependent ;
sh:class ex:Person ; sh:class ex:Child ;
sh:shape ex:sample2 ] ;
sh:propValues [ sh:path ex:age ;
sh:datatype xs:integer ;
sh:minCount 1 ; sh:maxCount 1 ;
sh:minInclusive 12 ] ;
exs:sample2 a sh:Shape ;
sh:propValues [ sh:path ex:age ;
sh:datatype xs:integer ;
sh:minCount 1 ; sh:maxCount 1 ;
sh:minInclusive 25 ] .
Just showing some other syntax.
== Validation results ==
Validation results are essentially the same as in the current design.
sh:subject, sh:predicate, and sh:object are set as appropriate to each component
here. sh:sourceConstraint is not used as there are no longer any constraints.
sh:sourceTemplate is the renamed as sh:Component and is the component property.
== Semantics for Core SHACL ==
A data graph ''validates'' against a shapes graph if the data graph
validates against every node in the shapes graph that is the subject of at
least one triple whose predicate is one of the scope properties. These
nodes are called the ''scoped shapes'' of the shapes graph. Any node in the
shapes graph that is used as a shape during validation is then a shape
whether it has SHACL type sh:shape or not. (This is a different definition
of shapes but it does not require any use of typing to determine what is a
shape.)
A data graph ''validates'' against a scoped shape if the set of nodes of the graph
selected by any scope of the scoped shape validates against the scoped shape.
Scopes ''select'' certain RDF terms (or nodes, but RDF only uses node with
respect to the nodes of an RDF graph).
A sh:scopeNode value selects that term.
A sh:scopeClass value selects each term that is a SHACL instance of the
class in the data graph.
A sh:scopePropertyObject value selects each term that is the object of some
triple in the data graph with that property as predicate.
A sh:scopePropertySubject value selects each term that is the subject of a
triple in the data graph with that property as predicate.
A sh:scopeAllObjects selects all terms that are the object of some triple in
the data graph.
A sh:scopeAllSubjects selects all terms that are the subject of some triple in
the data graph.
A set of RDF terms (the ''focus nodes'') validates against a shape as
follows. The ''in-filter nodes'' are those focus nodes which validate
against each of the filters of the shape. Focus nodes that are not
in-filter nodes are called ''out-of-filter nodes''.
To uniformly handle all components the semantics for components is defined in terms
of a set of of nodes (the in-filter nodes) and not in terms of a single node.
Some components (sh:shape, sh:and, sh:or, sh:not, sh:hasValue, sh:minCount,
sh:maxCount, sh:uniqueLang, and sh:partition) work on this set as a whole. There is
one or more elements in the ''validation result'' precisely when one or more in-filter
node fails to validate against the component. The component separately determines
which in-filter nodes validate or fail to validate against the component.
The other components (sh:in, ..., sh:pattern, sh:propValues) work on each
in-filter node independently. The ''validation results'' for the component
contains an element for each in-filter node that failed to validate against
that component. in-filter nodes for which there is no validation result are
considered to ''validate'' against the component.
A set of focus nodes ''validates'' against a shape if the set of in-filter
nodes from this set of focus nodes validates against each of the shape's
components.
The core components work as follows, where all typing determinations are made in the
data graph, all triples come from the data graph, and paths are processed as in the
SPARQL 1.1 Query Language. For components that are defined using SPARQL operators,
an error result is treated as if the operator had returned false.
* sh:in list of node
:: Each in-filter node is a member of the list.
* sh:class class
:: Each in-filter node has class as one of its SHACL types.
* sh:classIn ( class ... )
:: Each in-filter node has one of the classes as one of its SHACL types.
* sh:datatype datatype
:: Each in-filter node has datatype as its datatype.
* sh:datatypeIn ( datatypes ... )
:: Each in-filter node has one of the datatypes as its datatype.
* sh:directType class
:: Each in-filter node has an rdf:type link to the class.
* sh:minLength non-negative integer
:: The SPARQL STR representation of each in-filter node is at least this long.
* sh:maxLength non-negative integer
:: The SPARQL STR representation of each in-filter node is at most this long.
* sh:minExclusive literal
:: Each in-filter node is greater than this literal, using SPARQL > test.
* sh:minInclusive literal
:: Each in-filter node is greater than or equal to this literal, using SPARQL >= test.
* sh:maxExclusive literal
:: Each in-filter node is less than this literal, using SPARQL < test.
* sh:maxInclusive literal
:: Each in-filter node is less than or equal to this literal, using SPARQL <= test.
* sh:nodeKind sh:BlankNode or sh:IRI or sh:Literal or BlankNodeOrIRI or BlankNodeOrLiteral or IRIOrLiteral
:: Each in-filter node is a blank node, an IRI, a literal, a blank node or IRI, a blank node or literal, or an IRI or literal, respectively
* sh:pattern regex or [ sh:regex regex; sh:flags flags ]
:: The SPARQL STR representation each in-filter node matches regex using SPARQL REGEX, with flags if present.
* sh:equals ( path1 path2 )
:: Each in-filter node has the same values for the paths starting at the node.
* sh:disjoint ( path1 path2 )
:: Each in-filter node has disjoint sets of values for the paths starting at the node.
* sh:lessThan ( path1 path2 )
:: For each in-filter node every value for the first path is SPARQL < every value for the second path.
* sh:lessThanOrEqual ( path1 path2 )
:: For each in-filter node each value for the first path is SPARQL <= every value for the second path.
* sh:propValues shape with sh:path
:: For each in-filter node the values of path validate against shape.
* sh:list shape
:: Each in-filter node is a SHACL list and its list elements validate against shape.
* sh:closed ( property ... )
:: Each in-filter node has no values for any property not in the list or that is not the property of a sh:propValues component of the shape.
* sh:shape shape
:: The nodes that validate are those that validate against the shape.
* sh:and ( shape ... shape )
:: The nodes that validate are those that validate against each shape.
* sh:or ( shape ... shape )
:: The nodes that validate are those that validate against some shape.
* sh:not shape
:: The nodes that validate are those that do not validate against the shape.
* sh:hasValue node
:: Some in-filter node is the same as the node.
* sh:minCount non-negative integer
:: There are at least integer in-filter nodes.
* sh:maxCount non-negative integer
:: There are at most integer in-filter nodes.
* sh:uniqueLang true
:: An in-filter node fails to validate if it has the same language tag as another in-filter node.
* sh:partition ( shape_1, ..., shape_n )
:: Let input_1 be the set of in-filter nodes.
:: Let input_i+1 be the out-of-filter nodes of shape_i for input_i.
:: Nodes that fail to validate for shape_i on input_i fail for the component, for 1<=i<=n.
:: Nodes in input_n+1 fail to validate for the component.
* sh:constraint shape
:: Same as sh:shape shape.
* sh:property shape with single value for sh:predicate.
:: Same as sh:propValues shape with sh:predicate property replaced by sh:path property.
* sh:inverseProperty shape with single value for sh:predicate.
:: Same as sh:propValues shape with sh:predicate property replaced by sh:path [sh:inverse property].
== SHACL Extension ==
This is a description of a SPARQL extension for SHACL based on substitution. This
extension mechanism for SHACL does not depend on any extensions to SPARQL and thus
can be used with any SPARQL implementation or service, including SPARQL endpoints
=== Using SPARQL Queries in SHACL ===
The extension uses solutions in SPARQL result sets as validation results. These
solutions have bindings for at least ?this and ?severity. A set of nodes validates
against a component in the SPARQL extension if evaluating the SPARQL query for the
component over the data graph would produce no query results with ?severity bound to
sh:Violation when joined to a result set with solutions having ?this bound to
elements of the set. Validation fails on nodes that have a solution in the joined
result set with ?this bound to the node and ?severity bound to sh:Violation.
The extension uses solutions in SPARQL result sets as scopes and as synthetic
properties. For scopes, the set of nodes in a scope are the set of bindings for
?scope in the solutions of a result set. For synthetic properties, the set of
subject, object pairs in a synthetic property are the set of pairs of bindings for
?subject and ?object in the solutions of a result set.
The SPARQL queries in this extension are always augmented with namespace prefix
bindings for rdf:, rdfs:, xsd:, and sh: as above. '''A mechanism for augmenting these
bindings might be useful.'''
The SPARQL queries in this extension are always run over the data graph.
(The shapes in templates validate arguments to components and scopes of
shapes, so the data graph for them is the shapes graph.)
=== Component Templates ===
Component templates provide components based on parameterized SPARQL code.
A ''component template'' is an IRI that is an RDFS instance of
sh:ComponentTemplate in the shapes graph. Each triple in the shapes graph
whose subject is a shape and whose predicate is a component template is a
''template component'' of the shape.
A component template is used as a shape to check the syntactic validity of
template components. The object of a template component triple is validated
against the component template considered as a shape. If the validation
fails the template component is syntactically illegal.
The sh:propValues components in component templates shapes can also have
sh:argumentName and sh:defaultValue triples that are used in the process of
constructing the query for template components.
Component templates provide SPARQL queries (actually strings that are used
to produce SPARQL code via the substitution process described below) to define
the query via their value for
sh:templateQuery parameterized SPARQL query
Alternatively the query can be constructed from the values of other properties
sh:templateFilter parameterized SPARQL expression
sh:templatePattern parameterized SPARQL pattern
sh:templateHaving parameterized SPARQL expression
sh:templateMessage parameterized language-tagged string
The sh:templateFilter evaluates to true for ?this bindings that would
validate against the component. The sh:templatePattern eliminates bindings
for ?this that would validate against the component. The sh:templateHaving
eliminates groups of solutions for ?this bindings that would validate
against the component. The values for sh:templateMessage are used to
construct the message part of validation results. These four property
values are all used to construct a parameterized query for the template
that looks like
SELECT [projection] ?this ?message ?severity (?this AS ?object)
WHERE { [outer] [inner] <pattern> FILTER ( ! <filter> ) }
[group] HAVING <having>
VALUES (?message ?severity) { ( <message> [severity] ) }
where the FILTER ( ! <filter> ) part is only present if there is a value for
sh:templateFilter and the [group] HAVING <having> part is only present if
there is a value forsh:templateHaving.
The actual query for the template component is produced by substitution into
the parameterized query. Values for all five of the above properties can
have substrings of the form
[expression]
that are treated as substitution expressions subject to replacement.
Substition expressions can be
# a quoted string, where the string is substituted
# a simple name, where the name's mapping in the context is substituted
# p(name), where the name's mapping is processed as a path
# s(name), where the name's mapping is processed as a shape
# c(path name), where the name's mapping is processed as a shape to produce a shape for path values
# l(expression string), where the list values for the value of identifier in expression are processed instead of its values and the results joined with string
'''This syntax needs to be defined more precisely.''' All processing
happens in the shapes graph.
The values for names come from two sources. Values for several standard
names are set up automatically:
# argument is the object of the template component triple
# projection is a SPARQL variable that serves as a context, but can also be the empty string
# outer is a SPARQL pattern that produces bindings for the projection variable, but can also be the empty string
# inner is a SPARQL pattern that produces bindings for ?this
# group is a SPARQL GROUP BY fragment that groups by the projection variable, but can also be the empty string
# severity is the severity to be used in the component
Other values come from sh:propValues components in the component template that
have sh:argumentName triples in their shape argument. Each such sh:propValues
component produces a value for the object of its sh:argumentName triple that
is either one of the values for its path starting from the its argument or
the value of its sh:defaultValue triple if there are no values for the path.
Note: Having multiple sh:propValues with the same sh:argumentName leads to
indeterminate results. Having multiple values for an argument path leads to
indeterminate results. '''It might be better to instead collect all the values
somehow, but there would need to be some analysis to determine how to do this best.'''
=== Scope Templates ===
Scope templates provide scopes based on parameterized SPARQL queries. A
''scope template'' is an IRI that is an RDFS instance of sh:ScopeTemplate in
the shapes graph. Each triple in the shapes graph whose subject is a shape
and whose predicate is a scope template is a template scope of the shape. A
scope template is also used as a shape in the same way that a component
template is.
Scope templates provide SPARQL queries to define the query via
sh:templateQuery parameterized SPARQL query
The query for a template scope is constructed in the same way as for a template
component except that the alternative query consruction method is not used. As well,
the only standard name is argument.
The nodes ''selected'' by a template scope are the bindings for ?scope in
solutions in the result set of the SPARQL query run over the data graph.
=== Direct Use of SPARQL Queries ===
SPARQL queries are directly allowed in shape components via
sh:query SPARQL query
The SPARQL query for a sh:sparql component is the provided query with no
substitutions.
SPARQL queries are directly allowed in scopes via
sh:scopeQuery SPARQL query
The SPARQL query for a sh:scopeQuery scope is the provided query with no substitution.
SPARQL queries are also allowed as path parts. The bindings of ?subject and
?object in solutions of its result set are considered as subjects and
objects of a synthetic property. '''This produces truly awful syntax, and
should be fixed somehow.''' Note: The derived values equality constraint is
replaced by an sh:equals with one of the paths using SPARQL code.
=== Examples ===
ex:example a sh:ComponentTemplate ;
sh:propValues [ sh:path ex:predicate ;
sh:class rdf:Property;
sh:minCount 1 ; sh:maxCount 1 ] ;
sh:propValues [ sh:path ex:lang ;
sh:datatype xs:string ;
sh:minCount 1 ; sh:maxCount 1 ] ;
sh:sparqlTemplate """... ?this ...""" .
ex:exampleShape a sh:Shape ;
ex:example [ ex:predicate ex:p ; ex:lang "de" ] .
sh:class a sh:ComponentTemplate ; rdfs:domain sh:Shape ; rdfs:range rdfs:Class ;
sh:nodeKind sh:IRI ;
sh:templateMessage "Does not have required class [argument]" ;
sh:messsage "Classes need to be IRIs" ;
sh:templateFilter "EXISTS { ?this rdf:type/rdfs:subClassOf* [argument] }" .
ex:listexShape a sh:Shape ;
ex:listex ( ex:person ex:p2 ) .