-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Caps generator #9
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR and the initial work @thibaultdvx !
I haven't looked into the details but on the module architecture, I would avoid having utils
everywhere since this is not very informative.
Looking at this PR, it seems like we need a clinicaio.generators
module in which both BIDS and CAPS generators will be defined (each in a sub-module I think).
I think we also need a module in which we define the models we have (the enumerations, dataclasses, and so on which are modeling the BIDS and caps specifications). Something like clinicaio.models
(or something equivalent).
It would look more or less like that (I haven't written all files for clarity):
├── clinicaio
│ ├── models
│ │ ├── bids
│ │ │ ├── path.py
│ │ ├── caps
│ │ ├── pet
│ └── generators
│ │ ├── bids
│ │ ├── caps
│ │ │ ├── t1_linear.py
│ │ │ ├── pet_linear.py
│ │ │ ├── generator.py
Also, we need to think about what is private and what is public in each module. Looks like everything is public for the moment. For example, it feels like your CAPSGenerator
class would be public but all the build_xxx
functions would be private as users would go through the CAPSGenerator
to generate, right ?
Hi @NicolasGensollen, I do agree with all your remarks. I changed the architecture, tell me what do you think. I also think that it is more user-friendly if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @thibaultdvx !
I think the new architecture looks good.
I had a very quick look with some comments. I'm mostly wondering about the best way to model BIDS entities. I didn't have the time to finish my thinking but gave a couple ideas (to be discussed...)
|
||
|
||
def _build_flair_linear( | ||
root: Union[str, Path], subject: int, session: int, crop: bool = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a private function called by the public CAPSGenerator.build_pipeline()
method. You data should be validated already and you can use the rich data type directly here:
root: Union[str, Path], subject: int, session: int, crop: bool = True | |
root: Path, subject: SubjectEntity, session: SessionEntity, crop: bool = True |
The validation would be done in CAPSGenerator.build_pipeline
. WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed, at first those functions were public so the validation was made in there, but if they are private, I do agree that it should be done in CAPSGenerator
.
from clinicaio.models.bids_entities import SessionEntity, SubjectEntity | ||
from clinicaio.models.caps import Extension, Resolution, Space, Suffix | ||
|
||
from .filename import _get_caps_filename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably means that this function should be public
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it that uncommon to import private functions that are located in another file, but very closely (i.e. in the same module)?
What I mean is that I split caps/generator
into several .py
files only not to have one huge script, but I would like CAPSGenerator
to be the only public object so I turned all the rest into private objects.
key = "sub" | ||
|
||
@classmethod | ||
def _process_value(cls, value: int) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this, I'm wondering how we could model BIDS entities properly.
Here is the definition of an Entity from the BIDS specifications: https://bids-specification.readthedocs.io/en/stable/common-principles.html#entities
And here is the full list of BIDS entities: https://bids-specification.readthedocs.io/en/stable/appendices/entities.html
First of all, they are composed of a key and a value, which can be a label, i.e. an alphanumeric string, or an index, i.e a positive integer.
Furthermore, you can see in the list that subjects and sessions are "label" entities, not "index" entities, so the value doesn't have to be an integer.
I think we need to define proper types for Label
and Index
, something like this maybe:
class Label(UserString):
def __init__(self, value: str):
super().__init__(self.validate(value))
@classmethod
def validate(cls, value: str) -> str:
if value.isalnum():
return value
raise ValueError(
f"Label '{value}' is not a valid BIDS label: it must be composed only by letters and/or numbers."
)
class Index:
value: PositiveInt
length_as_string: PositiveInt # for padding with zeros
# Implement the constraint and the string representation taking into account the padding...
Then you can define:
class Entity:
key: Label
value : Label | Index
def __str__(self) -> str:
return f"{self.key}-{self.value}"
class SubjectEntity(Entity):
key = Label("sub")
def __init__(self, value: str):
self.value = Label(value)
...
Still need to think a bit about that.
V1 of
CAPSGenerator
to easily generate fake caps (useful for testing). Since our last discussion, I have removed the random part. So,CAPSGenerator
is now deterministic, with two methods:build_pipeline
: to simulate a pipeline for a list of subjects and a list of sessions (e.g.generator.build_pipeline("pet-linear", subjects=[1, 2], sessions=[0, 1], tracer="18FFDG", suvr_ref_region="pons")
),remove_pipeline
: to remove a pipeline for a specific (subject, session) (e.g.generator.remove_pipeline("pet-linear", subject=1, session=0)
).For example, the combination of the two previous commands leads to the following CAPS:
For the moment, the supported pipelines are
t1-linear
,flair-linear
andpet-linear
.I also had to add
nibabel
andscipy
as dependencies (I used the same poetry specifications as inclinica
).