Add dynamic run index <<>> option #195

dzemanov · 2023-08-09T20:00:41Z

Feature request: dynamic run index naming

Hello,

I would like to suggest a small extension to the way how dynamic run indexes are handled. Right now, with option run <<1>>, each filename will get run keyval, even when it doesn’t need it when no more runs are present. It would be amazing if there was an option to automatically increment run index while not to include it in the filename if it is not needed, as that pollutes a bit the filenames.

I suggest a solution which preserves the original behavior for dynamic <<1>> run index, but for <<>> empty run index files are named without run keyval. Only when file with the same name is found, bidscoin adds run-2 index to its filename and retrospectively adds run-1 to the corresponding other file with which it has matched. Next found run gets run-3, run-4 etc.

What do you think about the solution?
I might have missed some errors this could introduce, although I tested the results.

As an alternative approach, I've been thinking about making a script that would run after bidscoin, renaming the files. But that would also mean all references to the files would need to be renamed.

marcelzwiers · 2023-08-14T08:32:57Z

Hi, thanks for your suggestion. I considered this in the past but, without making things much more complicated, I figured you may end up with folders such as:

bids/sub-001/func/                                                                                                                            
|-- sub-001_task-rest_bold.json                                                                                         
|-- sub-001_task-rest.nii.gz                                                                                       
|-- sub-001_task-rest_run-2_bold.json                                                                                       
`-- sub-001_task-rest_run-2_bold.nii.gz

I expected that may be confusing for bids apps, when they encounter a run-2 image, but cannot find a run-1 image (you'd have to rename the run-less image into a run-1 image). I'm not even sure if that is BIDS compliant (haven't tested it with bids-validator). I therefore decided it was better to be always explicit about the run-index, unless you are sure there is always only 1 run (in which case you can just delete <<1>> run-index to have it removed from the filename)

marcelzwiers · 2023-08-14T08:36:09Z

I see you wrote code to retrospectively rename the run-1 images, I'll have a look and see if it doesn't complicate things too much

marcelzwiers · 2023-08-14T10:16:08Z

bidscoin/bids.py

@@ -6,6 +6,7 @@

 @author: Marcel Zwiers
 """
+import os


Please replace the obsolete os.path usage with the (much better) pathlib usage (Path is already imported, i.e. this is an unnecesary import)

marcelzwiers · 2023-08-14T14:00:39Z

bidscoin/bids.py

            bidsvalue = run['datasource'].dynamicvalue(bidsvalue, cleanup=True, runtime=runtime)
        if bidsvalue:
            if cleanup:
                bidsvalue = cleanup_value(bidsvalue)
+                if not bidsvalue:
+                    continue


I believe there are a few too many nested if statements now, i.e. the first if bidsvalue: can be discarded and then unindent and this and change it to if bidsvalue: bidsname = .. (i.e. reverse the order of the two original if statements)

marcelzwiers · 2023-08-14T14:10:33Z

bidscoin/bids.py

+    old_bidsname = remove_run_keyval(bidsname)
+    new_bidsname = insert_bidskeyval(bidsname, 'run', '1', False)
+    scanpath = outfolder.relative_to(bidsses)
+    for ext in ('.nii.gz', '.nii', '.json', '.tsv', '.tsv.gz', '.bval', '.bvec'):


It's better to not hardcode BIDS specifications but instead to use something generic like outfolder.glob(bidsname + '.*') and then determine the extension from the result

marcelzwiers · 2023-08-14T14:44:20Z

bidscoin/plugins/nibabel2bids.py

@@ -14,6 +14,7 @@
 from typing import Union
 from pathlib import Path
 from bidscoin import bids
+from bidscoin.bids import add_run1_keyval


Discarding this import statement and using bids.add_run1_keyval() below is more clean / readable. Idem for other plugins

marcelzwiers · 2023-08-14T14:55:59Z

bidscoin/bids.py

+    new_bidsname = insert_bidskeyval(bidsname, 'run', '1', False)
+    scanpath = outfolder.relative_to(bidsses)
+    for ext in ('.nii.gz', '.nii', '.json', '.tsv', '.tsv.gz', '.bval', '.bvec'):
+        if os.path.exists((outfolder / old_bidsname).with_suffix(ext)):


Use pathlib's .is_file() method

marcelzwiers · 2023-08-14T14:56:36Z

bidscoin/bids.py

+    scanpath = outfolder.relative_to(bidsses)
+    for ext in ('.nii.gz', '.nii', '.json', '.tsv', '.tsv.gz', '.bval', '.bvec'):
+        if os.path.exists((outfolder / old_bidsname).with_suffix(ext)):
+            os.rename((outfolder / old_bidsname).with_suffix(ext), (outfolder / new_bidsname).with_suffix(ext))


Use pathlib's .rename() method

marcelzwiers · 2023-08-14T15:07:46Z

bidscoin/bids.py

@@ -1869,25 +1873,64 @@ def insert_bidskeyval(bidsfile: Union[str, Path], bidskey: str, newvalue: str, v
    return newbidsfile


+def remove_run_keyval(bidsname: str) -> str:


I think this function is not needed and can be achieved by calling insert_keyval(bidsname, 'run', '', False)

marcelzwiers · 2023-08-15T08:32:42Z

If you have some subjects with one run and others with multiple runs, you do get inconsistent naming of the first run when using <<>>. It can be desirable, just saying:

bids/sub-001/func/                                                                                                                            
   |-- sub-001_task-rest_bold.json                                                                                       
   `-- sub-001_task-rest_bold.nii.gz                                                                                     
bids/sub-002/func/                                                                                                                            
   |-- sub-002_task-rest_run-1_bold.json                                                                                       
   |-- sub-002_task-rest_run-1_bold.nii.gz                                                                                       
   |-- sub-002_task-rest_run-2_bold.json                                                                                       
   `-- sub-002_task-rest_run-2_bold.nii.gz

marcelzwiers · 2023-08-17T10:55:34Z

Your solution looks good to me, I might even make the new <<>> run-index the default. Would you happen to have time to implement my suggestions?

marcelzwiers · 2023-08-18T06:02:43Z

Perfect, thanks a lot! I like the new <<>> option more than I expected and will make it the new default :-)

- Move core dynamic run value code from the plugins to the bids module - Move `add_run1_keyval()` code to `increment_runindex` (as they should always be executed in conjunction) Bugfix dcm2niix (add_run1_keyval: bidsname -> newbidsname)

marcelzwiers · 2023-08-18T12:38:29Z

I refactored (3c9b896) your PR a bit, i.e. I moved code that is central to bidscoin out of the plugins. But now I messed up the tests a bit... I'm not really familiar with unittest mocking, and I'm struggling to get it all right. It would be really nice if you can fix your tests and help me out :-)

…tem pathlib library

marcelzwiers · 2023-08-21T16:33:28Z

Ok, I rewrote your mock tests because I think that mocking system libraries is fragile and a bad idea

dzemanov · 2023-08-21T18:01:58Z

Ok, I rewrote your mock tests because I think that mocking system libraries is fragile and a bad idea

Hello, sorry for replying so late.
I have time tomorrow evening to rewrite the tests but if you are content with the ones you wrote that is fine by me :).

Mocking is good for test isolation, when a test fails we know that this concrete tested function failed and not other external factor e.g function that it used. Furthermore, one doesn't need to actually create the resources.

I can see that you want to use code that you fully understand so it is easier to maintain :).

marcelzwiers · 2023-08-21T21:09:59Z

You don't need to rewrite your tests, but if you like you can review the new tests to see if it misses anything (you'll see I need less and easier code to generate some temporary files than you did to mock pathlib.glob).

I think mocking can be suitable for testing code in cases where you don't have control over certain functions (such as functions that read out sensors in an industrial process, or functions that communicate with web servers). Otherwise, I think you should stay away from mocking if you can -- especially from mocking common system libraries. The reason is that mocking is very fragile: you are mocking the functions in the code you like to test, but in fact, mocking can have side effects outside the scope of the code you like to test. In other words, if your mocking works today, it may be broken tomorrow if a dependency of the code you want to test gets updated and also starts using the function that you mocked. For instance, you are mocking pathlib and then one of your dependencies switches from using os.path to the more modern pathlib functions. Perfectly fine and sensible update, except that your test may now fail (while it shouldn't)...

dzemanov · 2023-08-22T18:58:47Z

Yeah, you are right that my tests were a bit tightly coupled to the implementation and the new tests look much cleaner and cover everything needed :).

marcelzwiers · 2023-08-22T19:37:19Z

Thanks again for your contribution, I like it so much that I made it the new default :-)

dzemanov · 2023-08-26T15:08:29Z

I am happy I could contribute a bit :).

marcelzwiers self-requested a review August 14, 2023 10:05

marcelzwiers requested changes Aug 14, 2023

View reviewed changes

Dominika Zemanovicova added 7 commits August 18, 2023 00:17

Add dynamic run index <<>> option

88aed23

Use pathlib instead of os

a50d969

Simplify get_bidsname

761709c

Use generic extensions

076cf43

Delete separate add_run1_keyval import

9918d74

Delete remove_run_keyval

4852b3e

Fix format "" vs ''

33f78a1

dzemanov force-pushed the run-naming branch from d58fab6 to 33f78a1 Compare August 17, 2023 22:19

marcelzwiers merged commit ed0deef into Donders-Institute:master Aug 18, 2023
6 checks passed

marcelzwiers added a commit that referenced this pull request Aug 18, 2023

Make the new <<>> run-index the default (github PR #195)

baad2f8

marcelzwiers added a commit that referenced this pull request Aug 21, 2023

Debugging of PR #195

a49f5de

marcelzwiers added a commit that referenced this pull request Aug 21, 2023

Rewrite pytests of PR #195, i.e. avoid the fragile mocking of the sys…

2ca8af8

…tem pathlib library

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add dynamic run index <<>> option #195

Add dynamic run index <<>> option #195

dzemanov commented Aug 9, 2023

marcelzwiers commented Aug 14, 2023 •

edited

Loading

marcelzwiers commented Aug 14, 2023

marcelzwiers Aug 14, 2023

marcelzwiers Aug 14, 2023

marcelzwiers Aug 14, 2023

marcelzwiers Aug 14, 2023

marcelzwiers Aug 14, 2023

marcelzwiers Aug 14, 2023

marcelzwiers Aug 14, 2023

marcelzwiers commented Aug 15, 2023

marcelzwiers commented Aug 17, 2023

marcelzwiers commented Aug 18, 2023

marcelzwiers commented Aug 18, 2023

marcelzwiers commented Aug 21, 2023

dzemanov commented Aug 21, 2023 •

edited

Loading

marcelzwiers commented Aug 21, 2023 •

edited

Loading

dzemanov commented Aug 22, 2023

marcelzwiers commented Aug 22, 2023

dzemanov commented Aug 26, 2023

		@@ -1869,25 +1873,64 @@ def insert_bidskeyval(bidsfile: Union[str, Path], bidskey: str, newvalue: str, v
		return newbidsfile


		def remove_run_keyval(bidsname: str) -> str:

Add dynamic run index <<>> option #195

Add dynamic run index <<>> option #195

Conversation

dzemanov commented Aug 9, 2023

Feature request: dynamic run index naming

marcelzwiers commented Aug 14, 2023 • edited Loading

marcelzwiers commented Aug 14, 2023

marcelzwiers Aug 14, 2023

Choose a reason for hiding this comment

marcelzwiers Aug 14, 2023

Choose a reason for hiding this comment

marcelzwiers Aug 14, 2023

Choose a reason for hiding this comment

marcelzwiers Aug 14, 2023

Choose a reason for hiding this comment

marcelzwiers Aug 14, 2023

Choose a reason for hiding this comment

marcelzwiers Aug 14, 2023

Choose a reason for hiding this comment

marcelzwiers Aug 14, 2023

Choose a reason for hiding this comment

marcelzwiers commented Aug 15, 2023

marcelzwiers commented Aug 17, 2023

marcelzwiers commented Aug 18, 2023

marcelzwiers commented Aug 18, 2023

marcelzwiers commented Aug 21, 2023

dzemanov commented Aug 21, 2023 • edited Loading

marcelzwiers commented Aug 21, 2023 • edited Loading

dzemanov commented Aug 22, 2023

marcelzwiers commented Aug 22, 2023

dzemanov commented Aug 26, 2023

marcelzwiers commented Aug 14, 2023 •

edited

Loading

dzemanov commented Aug 21, 2023 •

edited

Loading

marcelzwiers commented Aug 21, 2023 •

edited

Loading