-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Allow plus signs in labels #1926
base: master
Are you sure you want to change the base?
Conversation
In this PR, plus signs are allowed for any label-format value, but I believe some folks (@effigies?) were thinking of limiting them to special multi-label cases. If we want to go down that route, then we'll need to create a new "format" (e.g., "multilabel") that applies to specific entities. I'm not 100% which entities we would want the multilabel format for... perhaps |
Reading #1165 (comment) and the following discussion, I think the overall consensus was toward permissiveness; i.e., allow in any label instead of splitting out a new multi-label concept. I feel like the discussion about "semantics" was not entirely clear. I personally feel we should refrain from implying a relation between
|
I like that! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a couple of nit picks
Co-authored-by: Chris Markiewicz <effigies@gmail.com>
…ts to gain + === Do not change lines below === { "chain": [], "cmd": "git-sedi '\\[0-9a-zA-Z\\]' '[0-9a-zA-Z+]'", "exit": 0, "extra_inputs": [], "inputs": [], "outputs": [], "pwd": "." } ^^^ Do not change lines above ^^^
r"(?:sub-(?P<subject>[0-9a-zA-Z]+)/)?" | ||
r"(?:ses-(?P<session>[0-9a-zA-Z]+)/)?" | ||
r"(?:sub-(?P<subject>[0-9a-zA-Z+]+)/)?" | ||
r"(?:ses-(?P<session>[0-9a-zA-Z+]+)/)?" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am afraid that is not all ... I will push a few commits on that end in a few minutes (I hope you don't mind).
Some other might need adjustment and I even start feeling that we might need to come up with some term (like "literal" but there might be better) to encompass "alphanumeric" and +
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yarikoptic Note this is a regression test that shows the specific output of a specific synthetic rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my comment is not really about this test -- I meant that changes in this PR (just this test) aren't sufficient. pushed now
Just a sidenote FTR: There is also a question of either to allow it in suffixes (e.g. would converge DANDI layout closer (dandi/dandi-cli#1498) but since we do not have any "incident" of that in BIDS ATM, it is not appropriate to change ATM even though in general I see that suffixes also should be of the same nature as "labels", in particular that IIRC their values could "migrate" into |
I pushed some changes which I think are due as well, although might need tune ups -- individual commits might have more information/reasoning in the commit messages. But there is also IMHO outstanding review and possibly tune up needed to src/metaschema.json it does have plenty of "alphanumeric" and _ allowances and I am not yet 100% sure none of them somehow overlaps with "alphanumeric" possible in filename etc... most likely none, but still worth looking at IMHO❯ git grep 'a-zA-Z0-9[^+]' -- src/metaschema.json
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9][a-zA-Z0-9_-]*$": {
src/metaschema.json: "^_[a-zA-Z0-9_-]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": { "$ref": "#/definitions/suffixRule" }
src/metaschema.json: "^[a-zA-Z0-9_]+$": { "$ref": "#/definitions/suffixRule" }
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "^[a-zA-Z0-9_]+$": {
src/metaschema.json: "items": { "type": "string", "pattern": "^[a-zA-Z0-9]+$" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For something like this I would much prefer that one bids example is modified to showcase this so that downstream validator and parsers have got some food to sink their digital teeth into.
@Remi-Gau that's a great point. Do you think it makes more sense to create a new example dataset or to modify an existing one? I took a quick look at the BIDS examples and https://github.com/bids-standard/bids-examples/tree/master/ds004332 looks promising. There are complex EDIT: If you think creating a new dataset makes more sense, then I think it would be good to use something like CUBIDS since that will use the |
I am afraid that if we start adding a new dataset for every new aspect of the spec we need to validate we will end up with too many examples. How about adding this to the synthetic example? |
That works! I'll look into modifying https://github.com/bids-standard/bids-examples/tree/master/synthetic. |
@@ -12,7 +12,7 @@ For example, if a file has an acquisition and reconstruction label, the | |||
acquisition entity must precede the reconstruction entity. | |||
REQUIRED and OPTIONAL entities for a given file type are denoted; | |||
empty cells imply that entities MUST NOT be specified. | |||
Entity formats indicate whether the value is alphanumeric | |||
Entity formats indicate whether the value is alphanumeric (potentially including `+` character(s)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just for uniformity
Entity formats indicate whether the value is alphanumeric (potentially including `+` character(s)) | |
Entity formats indicate whether the value is alphanumeric (and possibly including `+` character(s)) |
or should we use some other similar phrase?
Closes #1165. Adds
+
to the valid characters for the "label" format, which is currently only used for a subset of entities.