I'm using the anomalib library for anomaly detection on a custom dataset, and I'm running into an issue where the library is unable to find the mask files even though they are present in the directory structure. I'm getting an AssertionError with the following message: "missing mask files, mask_dir=/workspace/mydata/mydata/ground_truth".
I have checked that the mask files are correctly labeled and present in the ground_truth folder. The error occurs in the anomalib/data/folder.py file. Here is the relevant part of my config YAML:
```yaml
dataset:
  name: mvtec
  format: folder
  path: mydata/mydata
  normal_dir: train/good
  abnormal_dir: test
  normal_test_dir: test/good
  mask: ground_truth/broken
  extensions: ['.png', '.PNG']
  task: segmentation
  train_batch_size: 32
  eval_batch_size: 32
  num_workers: 8
  image_size: 256 # dimensions to which images are resized (mandatory)
  center_crop: null # dimensions to which images are center-cropped after resizing (optional)
  normalization: imagenet # data distribution to which the images will be normalized: [none, imagenet]
  transform_config:
    train: null
    eval: null
  test_split_mode: from_dir # options: [from_dir, synthetic]
  test_split_ratio: 0.2 # fraction of train images held out for testing (usage depends on test_split_mode)
  val_split_mode: same_as_test # options: [same_as_test, from_test, synthetic]
  val_split_ratio: 0.5 # fraction of train/test images held out for validation (usage depends on val_split_mode)
  create_validation_set: true
  tiling:
    apply: false
    tile_size: null
    stride: null
    remove_border_count: 0
    use_random_tiling: False
    random_tile_count: 16

model:
  name: padim
  backbone: resnet18
  pre_trained: true
  layers:
    - layer1
    - layer2
    - layer3
  normalization_method: min_max # options: [none, min_max, cdf]

metrics:
  image:
    - F1Score
    - AUROC
  threshold:
    method: adaptive # options: [adaptive, manual]
    manual_image: null
    manual_pixel: null

visualization:
  show_images: False # show images on the screen
  save_images: True # save images to the file system
  log_images: True # log images to the available loggers (if any)
  image_save_path: null # path to which images will be saved
  mode: full # options: ["full", "simple"]

project:
  seed: 42
  path: ./results

logging:
  logger: [] # options: [comet, tensorboard, wandb, csv] or combinations.
  log_graph: false # Logs the model graph to respective logger.

optimization:
  export_mode: null # options: torch, onnx, openvino

# PL Trainer Args. Don't add extra parameter here.
trainer:
  enable_checkpointing: true
  default_root_dir: null
  gradient_clip_val: 0
  gradient_clip_algorithm: norm
  num_nodes: 1
  devices: 1
  enable_progress_bar: true
  overfit_batches: 0.0
  track_grad_norm: -1
  check_val_every_n_epoch: 1 # Don't validate before extracting features.
  fast_dev_run: false
  accumulate_grad_batches: 1
  max_epochs: 1
  min_epochs: null
  max_steps: -1
  min_steps: null
  max_time: null
  limit_train_batches: 1
  limit_val_batches: 1
  limit_test_batches: 1
  limit_predict_batches: 1
  val_check_interval: 1.0 # Don't validate before extracting features.
  log_every_n_steps: 50
  accelerator: gpu # <"cpu", "gpu", "tpu", "ipu", "hpu", "auto">
  strategy: null
  sync_batchnorm: false
  precision: 32
  enable_model_summary: true
  num_sanity_val_steps: 0
  profiler: null
  benchmark: false
  deterministic: false
  reload_dataloaders_every_n_epochs: 0
  auto_lr_find: false
  replace_sampler_ddp: true
  detect_anomaly: false
  auto_scale_batch_size: false
  plugins: null
  move_metrics_to_cpu: false
  multiple_trainloader_mode: max_size_cycle
```
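For reference, the failing setup can be reproduced outside the Trainer by instantiating the Folder datamodule directly. This is a sketch under my assumptions: the paths correspond to the mask_dir reported in the error message, and the exact Folder constructor signature may differ between anomalib versions:

```python
from anomalib.data.folder import Folder

# Sketch only: paths match the mask_dir from the error message; the
# constructor signature may differ between anomalib versions.
datamodule = Folder(
    root="/workspace/mydata/mydata",
    normal_dir="train/good",
    abnormal_dir="test/anomaly1",
    normal_test_dir="test/good",
    mask_dir="ground_truth/anomaly1",
    image_size=256,
    task="segmentation",
)
datamodule.setup()  # this is where the AssertionError is raised
```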
I have also checked the naming convention of the images in the test set against the images in the ground_truth folder, and they follow what the make_folder_dataset function in the folder.py script expects. My understanding is that make_folder_dataset expects each mask file to have the same filename as its corresponding image file, with a _mask suffix appended: for example, if the image file is img001.png, the corresponding mask file should be named img001_mask.png. The file names in the anomaly1 directory under ground_truth and in the anomaly1 directory under test follow this convention.
Steps taken:
I have checked that the mask files are correctly labeled and present in the ground_truth folder. I have also tried changing the path and using both relative and absolute paths, but the error persists.
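To narrow things down, I also reproduced the existence check from make_folder_dataset in a standalone snippet. The path logic mirrors lines 158 and 162 of folder.py as shown in the traceback below (mask_path = mask_dir / rel_image_path); the concrete paths, the *.png pattern, and the assumption that rel_image_path is relative to abnormal_dir are mine:

```python
from pathlib import Path

# Assumed paths for one anomaly category; adjust to your layout.
root = Path("/workspace/mydata/mydata")
abnormal_dir = root / "test" / "anomaly1"
mask_dir = root / "ground_truth" / "anomaly1"

# make_folder_dataset builds each mask path as mask_dir / rel_image_path
# (my reading: rel_image_path is the image path relative to abnormal_dir)
# and then asserts that every such file exists.
missing = []
for image_path in sorted(abnormal_dir.glob("*.png")):
    rel_image_path = image_path.relative_to(abnormal_dir)
    mask_path = mask_dir / rel_image_path
    if not mask_path.exists():
        missing.append(mask_path)

print(f"{len(missing)} missing mask files")
for path in missing[:10]:
    print(path)
```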
Dataset: Other
Model: PaDiM
I am using the latest anomalib version, installed with `!pip install git+https://github.com/openvinotoolkit/anomalib.git`. My Python version is 3.10.
The complete traceback is as follows:
```
AssertionError Traceback (most recent call last)
Cell In[83], line 1
----> 1 trainer.fit(model=model, datamodule=datamodule)
File /opt/conda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:608, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
606 model = self._maybe_unwrap_optimized(model)
607 self.strategy._lightning_module = model
--> 608 call._call_and_handle_interrupt(
609 self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
610 )
File /opt/conda/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py:38, in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
36 return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
37 else:
---> 38 return trainer_fn(*args, **kwargs)
40 except _TunerExitException:
41 trainer._call_teardown_hook()
File /opt/conda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:650, in Trainer._fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
643 ckpt_path = ckpt_path or self.resume_from_checkpoint
644 self._ckpt_path = self._checkpoint_connector._set_ckpt_path(
645 self.state.fn,
646 ckpt_path, # type: ignore[arg-type]
647 model_provided=True,
648 model_connected=self.lightning_module is not None,
649 )
--> 650 self._run(model, ckpt_path=self.ckpt_path)
652 assert self.state.stopped
653 self.training = False
File /opt/conda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:1051, in Trainer._run(self, model, ckpt_path)
1048 self.strategy.setup_environment()
1049 self.__setup_profiler()
-> 1051 self._call_setup_hook() # allow user to setup lightning_module in accelerator environment
1053 # check if we should delay restoring checkpoint till later
1054 if not self.strategy.restore_checkpoint_after_setup:
File /opt/conda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:1298, in Trainer._call_setup_hook(self)
1295 self.strategy.barrier("pre_setup")
1297 if self.datamodule is not None:
-> 1298 self._call_lightning_datamodule_hook("setup", stage=fn)
1299 self._call_callback_hooks("setup", stage=fn)
1300 self._call_lightning_module_hook("setup", stage=fn)
File /opt/conda/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:1375, in Trainer._call_lightning_datamodule_hook(self, hook_name, *args, **kwargs)
1373 if callable(fn):
1374 with self.profiler.profile(f"[LightningDataModule]{self.datamodule.__class__.__name__}.{hook_name}"):
-> 1375 return fn(*args, **kwargs)
File /opt/conda/lib/python3.10/site-packages/anomalib/data/base/datamodule.py:102, in AnomalibDataModule.setup(self, stage)
96 """Setup train, validation and test data.
97
98 Args:
99 stage: str | None: Train/Val/Test stages. (Default value = None)
100 """
101 if not self.is_setup:
--> 102 self._setup(stage)
103 assert self.is_setup
File /opt/conda/lib/python3.10/site-packages/anomalib/data/base/datamodule.py:118, in AnomalibDataModule._setup(self, _stage)
115 assert self.train_data is not None
116 assert self.test_data is not None
--> 118 self.train_data.setup()
119 self.test_data.setup()
121 self._create_test_split()
File /opt/conda/lib/python3.10/site-packages/anomalib/data/base/dataset.py:161, in AnomalibDataset.setup(self)
159 """Load data/metadata into memory."""
160 if not self.is_setup:
--> 161 self._setup()
162 assert self.is_setup, "setup() should set self._samples"
File /opt/conda/lib/python3.10/site-packages/anomalib/data/folder.py:233, in FolderDataset._setup(self)
231 def _setup(self) -> None:
232 """Assign samples."""
--> 233 self.samples = make_folder_dataset(
234 root=self.root,
235 normal_dir=self.normal_dir,
236 abnormal_dir=self.abnormal_dir,
237 normal_test_dir=self.normal_test_dir,
238 mask_dir=self.mask_dir,
239 split=self.split,
240 extensions=self.extensions,
241 )
File /opt/conda/lib/python3.10/site-packages/anomalib/data/folder.py:162, in make_folder_dataset(normal_dir, root, abnormal_dir, normal_test_dir, mask_dir, split, extensions)
158 samples.loc[index, "mask_path"] = str(mask_dir / rel_image_path)
160 # make sure all the files exist
161 # samples.image_path does NOT need to be checked because we build the df based on that
--> 162 assert samples.mask_path.apply(
163 lambda x: Path(x).exists() if x != "" else True
164 ).all(), f"missing mask files, mask_dir={mask_dir}"
166 # Ensure the pathlib objects are converted to str.
167 # This is because torch dataloader doesn't like pathlib.
168 samples = samples.astype({"image_path": "str"})
AssertionError: missing mask files, mask_dir=/workspace/mydata/mydata/ground_truth/anomaly1
```
What could be the root cause of this error? Does the folder.py script support a custom dataset where there is more than one anomaly category?
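For what it's worth, my assumption (based on how mask_path is constructed in the traceback above) is that with abnormal_dir pointing at a folder containing several anomaly subfolders, the mask directory would have to mirror those subfolders, with each mask keeping the same relative name as its test image. A hypothetical layout, not my exact files:

```
mydata/mydata/
├── train/
│   └── good/            # normal training images
├── test/
│   ├── good/            # normal test images
│   ├── anomaly1/        # img001.png, ...
│   └── anomaly2/
└── ground_truth/
    ├── anomaly1/        # img001.png (mask, same relative name as the test image)
    └── anomaly2/
```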
I would be grateful for any help.