[ENH] Allow plus signs in labels #1926

Open
wants to merge 10 commits into
base: master
from
10 changes: 5 additions & 5 deletions src/schema/objects/columns.yaml
Original file line number Diff line number Diff line change
@@ -110,7 +110,7 @@ derived_from:
`sample-<label>` entity from which a sample is derived,
for example a slice of tissue (`sample-02`) derived from a block of tissue (`sample-01`).
type: string
- pattern: ^sample-[0-9a-zA-Z]+$
+ pattern: ^sample-[0-9a-zA-Z+]+$
desc_id:
name: desc_id
display_name: Description Label
@@ -125,7 +125,7 @@ desc_id:
its `desc_id` column SHOULD contain all labels of the `desc` entity)
used across the entire derivative dataset.
type: string
- pattern: ^desc-[0-9a-zA-Z]+$
+ pattern: ^desc-[0-9a-zA-Z+]+$
description:
name: description
display_name: Description
@@ -369,7 +369,7 @@ participant_id:
A participant identifier of the form `sub-<label>`,
matching a participant entity found in the dataset.
type: string
- pattern: ^sub-[0-9a-zA-Z]+$
+ pattern: ^sub-[0-9a-zA-Z+]+$
placement__motion:
name: placement
display_name: Placement
@@ -434,7 +434,7 @@ sample_id:
A sample identifier of the form `sample-<label>`,
matching a sample entity found in the dataset.
type: string
- pattern: ^sample-[0-9a-zA-Z]+$
+ pattern: ^sample-[0-9a-zA-Z+]+$
sample_type:
name: sample_type
display_name: Sample type
@@ -466,7 +466,7 @@ session_id:
A session identifier of the form `ses-<label>`,
matching a session found in the dataset.
type: string
- pattern: ^ses-[0-9a-zA-Z]+$
+ pattern: ^ses-[0-9a-zA-Z+]+$
sex:
name: sex
display_name: Sex
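A minimal sketch of what the pattern change in the hunks above means for column values, using Python's `re` module. The two patterns are copied from the `participant_id` hunk; the helper name `is_valid_id` and the sample values are illustrative assumptions, not part of the schema tooling.

```python
import re

# Patterns copied from the participant_id hunk: before and after allowing "+".
OLD_SUB = re.compile(r"^sub-[0-9a-zA-Z]+$")
NEW_SUB = re.compile(r"^sub-[0-9a-zA-Z+]+$")


def is_valid_id(value: str, pattern: re.Pattern) -> bool:
    """Hypothetical helper: True if the value matches the column pattern."""
    return pattern.match(value) is not None


# Print each value with its verdict under the old and the new pattern.
for value in ["sub-01", "sub-01+02", "sub-01_02", "sub-"]:
    print(value, is_valid_id(value, OLD_SUB), is_valid_id(value, NEW_SUB))
```

Because `+` is simply added to the character class, the new pattern also admits labels made only of plus signs, such as `sub-+`.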
12 changes: 10 additions & 2 deletions src/schema/objects/formats.yaml
@@ -9,8 +9,16 @@ index:
label:
display_name: Label
description: |
- Freeform labels without special characters.
- pattern: '[0-9a-zA-Z]+'
+ Free-form labels with alphanumeric and plus (+) characters.
+
+ Plus signs MAY be used to concatenate multiple applicable labels,
+ but no relationship is established by a partial match.
+ In particular, the inheritance principle does not connect files
+ containing entities such as `<name>-x+y` with either `<name>-x` or `<name>-y`.
+ For example, metadata stored in a file at the root of the dataset with name `/acq-6p_T2w.json`
+ does not apply to files with partially matching "acquisition" entity values
+ such as `/sub-1/anat/sub-1_acq-6p+s2_T2w.nii`.
+ pattern: '[0-9a-zA-Z+]+'
# Metadata types
boolean:
display_name: Boolean
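The "no partial match" rule in the new `label` description can be sketched as an exact comparison of entity values. This is a simplified illustration only: `parse_entities` and `metadata_applies` are hypothetical helpers, they assume every chunk before the suffix has the form `key-value`, and they ignore the suffix, extension, and directory rules that the full inheritance principle also involves.

```python
def parse_entities(filename: str) -> dict:
    """Hypothetical helper: split a BIDS-style name into key-value entities."""
    stem = filename.rsplit("/", 1)[-1].split(".", 1)[0]
    parts = stem.split("_")
    # The last chunk is the suffix (for example "T2w"); the rest are entities.
    return dict(part.split("-", 1) for part in parts[:-1])


def metadata_applies(metadata_file: str, data_file: str) -> bool:
    """True only if each entity of the metadata file has the identical value in the data file."""
    meta = parse_entities(metadata_file)
    data = parse_entities(data_file)
    return all(data.get(key) == value for key, value in meta.items())


# "6p" is compared with "6p+s2" as a whole string, so there is no partial match.
print(metadata_applies("/acq-6p_T2w.json", "/sub-1/anat/sub-1_acq-6p+s2_T2w.nii"))
print(metadata_applies("/acq-6p_T2w.json", "/sub-1/anat/sub-1_acq-6p_T2w.nii"))
```

Treating `6p+s2` as one opaque value keeps the comparison a plain string equality, so no code has to interpret the plus sign.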
2 changes: 1 addition & 1 deletion src/schema/objects/metadata.yaml
@@ -3676,7 +3676,7 @@ TaskName:
Name of the task.
No two tasks should have the same name.
The task label included in the filename is derived from this `"TaskName"` field
- by removing all non-alphanumeric characters (that is, all except those matching `[0-9a-zA-Z]`).
+ by removing all non-alphanumeric characters (that is, all except those matching `[0-9a-zA-Z+]`).
For example `"TaskName"` `"faces n-back"` or `"head nodding"` will correspond to task labels
`facesnback` and `headnodding`, respectively.
type: string
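As a rough sketch of the derivation described in the `TaskName` text, the task label can be obtained by deleting every character outside the permitted class. The function name `task_label` and the `"rest+movie"` input are assumptions made for illustration; the two other inputs are the examples quoted in the description.

```python
import re


def task_label(task_name: str, allowed: str = "0-9a-zA-Z+") -> str:
    """Hypothetical helper: drop every character outside the allowed class."""
    return re.sub(f"[^{allowed}]", "", task_name)


print(task_label("faces n-back"))   # facesnback
print(task_label("head nodding"))   # headnodding
# With "+" in the class, a plus sign in the name survives into the label.
print(task_label("rest+movie"))
# Under the previous class the same name would lose its plus sign.
print(task_label("rest+movie", allowed="0-9a-zA-Z"))
```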
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@

SUMMARY:
0 out of 1 files were successfully validated, using the following regular expressions:
- - `.*?/sub-(?P<subject>[0-9a-zA-Z]+)/(|ses-(?P<session>[0-9a-zA-Z]+)/)anat/sub-(?P=subject)(|_ses-(?P=session))(|_acq-(?P<acquisition>[0-9a-zA-Z]+))(|_ce-(?P<ceagent>[0-9a-zA-Z]+))(|_rec-(?P<reconstruction>[0-9a-zA-Z]+))(|_run-(?P<run>[0-9a-zA-Z]+))(|_part-(?P<part>(mag|phase|real|imag)))_(T1w|T2w|PDw|T2starw|FLAIR|inplaneT1|inplaneT2|PDT2|angio|T2star)\.(nii.gz|nii|json)$`
+ - `.*?/sub-(?P<subject>[0-9a-zA-Z+]+)/(|ses-(?P<session>[0-9a-zA-Z+]+)/)anat/sub-(?P=subject)(|_ses-(?P=session))(|_acq-(?P<acquisition>[0-9a-zA-Z+]+))(|_ce-(?P<ceagent>[0-9a-zA-Z+]+))(|_rec-(?P<reconstruction>[0-9a-zA-Z+]+))(|_run-(?P<run>[0-9a-zA-Z+]+))(|_part-(?P<part>(mag|phase|real|imag)))_(T1w|T2w|PDw|T2starw|FLAIR|inplaneT1|inplaneT2|PDT2|angio|T2star)\.(nii.gz|nii|json)$`
The following files were not matched by any regex schema entry:
* `/home/chymera/.data2/datalad/000026/noncompliant/sub-EXC022/anat/sub-EXC022_ses-MRI_flip-1_VFA.nii.gz
The following mandatory regex schema entries did not match any files:
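To see how the updated regular expression from the summary behaves, the sketch below compiles it with Python's `re` module and tries it on the unmatched path from the report and on a made-up path with plus signs in its labels. The `/data/...` path is a hypothetical example, and deriving the old expression via `str.replace` is a shortcut for this illustration only.

```python
import re

# The updated expression from the report, split across lines for readability.
NEW_REGEX = (
    r".*?/sub-(?P<subject>[0-9a-zA-Z+]+)/"
    r"(|ses-(?P<session>[0-9a-zA-Z+]+)/)anat/sub-(?P=subject)"
    r"(|_ses-(?P=session))(|_acq-(?P<acquisition>[0-9a-zA-Z+]+))"
    r"(|_ce-(?P<ceagent>[0-9a-zA-Z+]+))"
    r"(|_rec-(?P<reconstruction>[0-9a-zA-Z+]+))"
    r"(|_run-(?P<run>[0-9a-zA-Z+]+))"
    r"(|_part-(?P<part>(mag|phase|real|imag)))"
    r"_(T1w|T2w|PDw|T2starw|FLAIR|inplaneT1|inplaneT2|PDT2|angio|T2star)"
    r"\.(nii.gz|nii|json)$"
)
# The previous expression differs only in the label character class.
OLD_REGEX = NEW_REGEX.replace("a-zA-Z+]", "a-zA-Z]")

reported = (
    "/home/chymera/.data2/datalad/000026/noncompliant/"
    "sub-EXC022/anat/sub-EXC022_ses-MRI_flip-1_VFA.nii.gz"
)
# Hypothetical path whose labels use plus signs.
plus_path = "/data/sub-01+02/anat/sub-01+02_acq-6p+s2_T2w.nii.gz"

new_match = re.match(NEW_REGEX, plus_path)
print(new_match.group("subject"), new_match.group("acquisition"))
print(re.match(OLD_REGEX, plus_path))  # the old class rejects "+"
print(re.match(NEW_REGEX, reported))   # still unmatched after the change
```

The reported file stays unmatched under either expression: it carries a `ses-MRI` entity without a matching session directory, and neither a `flip` entity nor a `VFA` suffix appears in this regex.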
8 changes: 4 additions & 4 deletions tools/schemacode/bidsschematools/tests/test_rules.py
@@ -18,8 +18,8 @@ def test_entity_rule(schema_obj):
nii_rule = rules._entity_rule(rule, schema_obj)
assert nii_rule == {
"regex": (
- r"sub-(?P<subject>[0-9a-zA-Z]+)/"
- r"(?:ses-(?P<session>[0-9a-zA-Z]+)/)?"
+ r"sub-(?P<subject>[0-9a-zA-Z+]+)/"
+ r"(?:ses-(?P<session>[0-9a-zA-Z+]+)/)?"
r"(?P<datatype>anat)/"
r"(?(subject)sub-(?P=subject)_)"
r"(?(session)ses-(?P=session)_)"
@@ -50,8 +50,8 @@ def test_entity_rule(schema_obj):
json_rule = rules._entity_rule(rule, schema_obj)
assert json_rule == {
"regex": (
- r"(?:sub-(?P<subject>[0-9a-zA-Z]+)/)?"
- r"(?:ses-(?P<session>[0-9a-zA-Z]+)/)?"
+ r"(?:sub-(?P<subject>[0-9a-zA-Z+]+)/)?"
+ r"(?:ses-(?P<session>[0-9a-zA-Z+]+)/)?"
Collaborator

I am afraid that is not all ... I will push a few commits on that end in a few minutes (I hope you don't mind).
Some other places might need adjustment, and I am even starting to feel that we might need to come up with a term (like "literal", though there might be a better one) to encompass "alphanumeric" and `+`.

Collaborator

@yarikoptic Note this is a regression test that shows the specific output of a specific synthetic rule.

Collaborator

My comment is not really about this test. I meant that the changes in this PR (just this test) aren't sufficient. Pushed now.

r"(?:(?P<datatype>anat)/)?"
r"(?(subject)sub-(?P=subject)_)"
r"(?(session)ses-(?P=session)_)"
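The expected regex in this test relies on conditional groups: `(?(subject)sub-(?P=subject)_)` requires the filename to repeat the subject label only if a subject directory was matched. The sketch below reuses the five fragments shown in the hunk and appends a `T1w\.json` tail, which is an assumption made so the example is complete; the real rule continues beyond what the diff shows.

```python
import re

# The first five fragments mirror the json_rule hunk; the tail is assumed.
RULE = re.compile(
    r"(?:sub-(?P<subject>[0-9a-zA-Z+]+)/)?"
    r"(?:ses-(?P<session>[0-9a-zA-Z+]+)/)?"
    r"(?:(?P<datatype>anat)/)?"
    r"(?(subject)sub-(?P=subject)_)"
    r"(?(session)ses-(?P=session)_)"
    r"T1w\.json"  # assumed tail, for illustration only
)

candidates = [
    "T1w.json",                            # dataset root, no entities
    "sub-01+02/anat/sub-01+02_T1w.json",   # label with a plus sign
    "sub-01/anat/T1w.json",                # directory without the filename entity
    "sub-01/anat/sub-02_T1w.json",         # directory and filename disagree
]
for path in candidates:
    print(path, RULE.fullmatch(path) is not None)
```

The backreference compares the full label, so a directory `sub-01+02` only pairs with a filename entity `sub-01+02`, never with `sub-01` or `sub-02` alone.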
24 changes: 12 additions & 12 deletions tools/schemacode/bidsschematools/tests/test_validator.py
@@ -72,12 +72,12 @@ def test_write_report(tmp_path):

validation_result["schema_tracking"] = [
{
- "regex": ".*?/sub-(?P<subject>[0-9a-zA-Z]+)/"
- "(|ses-(?P<session>[0-9a-zA-Z]+)/)anat/sub-(?P=subject)"
- "(|_ses-(?P=session))(|_acq-(?P<acquisition>[0-9a-zA-Z]+))"
- "(|_ce-(?P<ceagent>[0-9a-zA-Z]+))"
- "(|_rec-(?P<reconstruction>[0-9a-zA-Z]+))"
- "(|_run-(?P<run>[0-9a-zA-Z]+))"
+ "regex": ".*?/sub-(?P<subject>[0-9a-zA-Z+]+)/"
+ "(|ses-(?P<session>[0-9a-zA-Z+]+)/)anat/sub-(?P=subject)"
+ "(|_ses-(?P=session))(|_acq-(?P<acquisition>[0-9a-zA-Z+]+))"
+ "(|_ce-(?P<ceagent>[0-9a-zA-Z+]+))"
+ "(|_rec-(?P<reconstruction>[0-9a-zA-Z+]+))"
+ "(|_run-(?P<run>[0-9a-zA-Z+]+))"
"(|_part-(?P<part>(mag|phase|real|imag)))"
"_(T1w|T2w|PDw|T2starw|FLAIR|inplaneT1|inplaneT2|PDT2|angio|T2star)"
"\\.(nii.gz|nii|json)$",
@@ -86,12 +86,12 @@ def test_write_report(tmp_path):
]
validation_result["schema_listing"] = [
{
- "regex": ".*?/sub-(?P<subject>[0-9a-zA-Z]+)/"
- "(|ses-(?P<session>[0-9a-zA-Z]+)/)anat/sub-(?P=subject)"
- "(|_ses-(?P=session))(|_acq-(?P<acquisition>[0-9a-zA-Z]+))"
- "(|_ce-(?P<ceagent>[0-9a-zA-Z]+))"
- "(|_rec-(?P<reconstruction>[0-9a-zA-Z]+))"
- "(|_run-(?P<run>[0-9a-zA-Z]+))"
+ "regex": ".*?/sub-(?P<subject>[0-9a-zA-Z+]+)/"
+ "(|ses-(?P<session>[0-9a-zA-Z+]+)/)anat/sub-(?P=subject)"
+ "(|_ses-(?P=session))(|_acq-(?P<acquisition>[0-9a-zA-Z+]+))"
+ "(|_ce-(?P<ceagent>[0-9a-zA-Z+]+))"
+ "(|_rec-(?P<reconstruction>[0-9a-zA-Z+]+))"
+ "(|_run-(?P<run>[0-9a-zA-Z+]+))"
yarikoptic marked this conversation as resolved.
"(|_part-(?P<part>(mag|phase|real|imag)))"
"_(T1w|T2w|PDw|T2starw|FLAIR|inplaneT1|inplaneT2|PDT2|angio|T2star)"
"\\.(nii.gz|nii|json)$",