train Panoptic Segmentation model on custom dataset #1691
At the moment we don't have instructions for training panoptic segmentation models on custom datasets. Our code expects panoptic segmentation data in COCO format, which is unfortunately not well documented.
Here is the link to the COCO format ... then go to point 4: Panoptic Segmentation. I would also recommend downloading the json annotations and the mask labels in png format, having a look at them, and creating yours in the same format. I am still working on it myself. After converting the annotations into the right format, how do we register panoptic annotations for training? Is it the same as with instance segmentation and object detection? Since we now have an extra directory with the masks, I wonder whether it uses the same code to register the annotations. I guess the answer is no, since you created a specific script to do that:
@JavierClearImageAI how do you create your custom panoptic segmentation annotations?
I ended up using detectron2.data.datasets.register_coco_panoptic_separated. I realized that detectron2.data.datasets.register_coco_panoptic wasn't working for a custom dataset with new categories, since it registers a "standard" version of COCO panoptic. To be honest, I don't really like that you have to provide the images, the masks, a json for the panoptic annotations, and a json for instance segmentation, since the instance segmentation and semantic segmentation info could be derived automatically by FAIR from the png masks... but you will have to do it if you want to register data from the jsons. Otherwise you would need to dive into the code and create your own dataloader, which would probably take longer to write. So just follow the "coco standard panoptic format" for the panoptic coco-json and the "coco standard instance segmentation format" for the instances. In my case "sem_seg_root" = "panoptic_root", since I use the same masks for both. Summarizing:
PS: here is my dataset registration code. Method:
My_code
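For reference, a minimal sketch of what this registration can look like with detectron2's built-in helper (the dataset name and paths below are placeholders, not my actual pipeline):

```python
# Sketch of registering a custom panoptic dataset in the "separated" format.
# All names and paths are placeholders.
from detectron2.data.datasets import register_coco_panoptic_separated

register_coco_panoptic_separated(
    name="my_dataset",                            # registered under "my_dataset_separated"
    metadata={},                                  # can be extended later via MetadataCatalog
    image_root="images/",                         # RGB training images
    panoptic_root="panoptic_masks/",              # COCO panoptic PNGs
    panoptic_json="annotations/panoptic.json",    # COCO panoptic-format json
    sem_seg_root="sem_seg_masks/",                # grayscale per-pixel category-id PNGs
    instances_json="annotations/instances.json",  # COCO instances-format json (things)
)
```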
Appendix 1: Panoptic annotations format
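An illustrative skeleton of the COCO panoptic json (field names follow the COCO panoptic spec; the values here are made up):

```python
# Illustrative skeleton of a COCO panoptic-format json (values are made up).
panoptic_json = {
    "images": [
        {"id": 1, "file_name": "0001.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {
            "image_id": 1,
            "file_name": "0001.png",  # RGB panoptic mask; segment id = R + 256*G + 256*256*B
            "segments_info": [
                {"id": 42, "category_id": 1, "area": 1234, "bbox": [10, 20, 50, 60], "iscrowd": 0},
            ],
        },
    ],
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person", "isthing": 1, "color": [220, 20, 60]},
        {"id": 2, "name": "corridor", "supercategory": "indoor", "isthing": 0, "color": [119, 11, 32]},
    ],
}
```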
@JavierClearImageAI Thank you for your detailed answer. So, are you using Panoptic FPN (not Panoptic-DeepLab)?
I don't think detectron2 has the Panoptic-DeepLab backbone implemented; it seems to be a backbone from a Google research group. At least I couldn't see it among the configuration options. They have 4 FPN backbones for panoptic only. Please let me know if I am wrong.
The Panoptic-DeepLab project in Detectron2 might be a good reference.
@JavierClearImageAI When you say it registers a "standard" version of COCO panoptic: standard means that it follows the same format as the COCO dataset and uses the same category ids that are already defined, while custom means that it follows the same format as the COCO dataset but uses new category ids. Therefore, if I want to train an already pre-trained PanopticFPN model from the model zoo with new categories, I would have to use a custom dataset in COCO format with the new category ids. It is confusing to me because the "standard" version would seem to refer just to the format, but it seems to also refer to the already defined categories. Therefore, to train a model with new categories, we need to use register_coco_panoptic_separated. Lastly, when you say to follow the "coco standard panoptic format" and the "coco standard instance segmentation format":
I understand the coco standard panoptic format, but is the coco standard instance segmentation format the object detection format from https://cocodataset.org/#format-data (i.e., #1 on the list)? Thanks!
@JavierClearImageAI And do you know how to evaluate panoptic results? I just found that the PQ metric calculation needs the json file and the PNG files; can't we use the panoptic results directly? Thank you.
@zhangliyun9120, I am sorry, my code is property of my company and part of our pipeline. Nevertheless, I can do something better: I will create a Pull Request to automate the process of generating the json annotations and registering the data, so you only need to pass a directory with the png masks. I will integrate it with detectron2 and share it here as well. I will try to do it this weekend. Regards
Thank you. https://www.celantur.com/blog/panoptic-segmentation-in-detectron2/ Looking forward to hearing from you soon.
Can you do that? Thanks
I just said I will.
I don't have time to write a post for you, I am very sorry. Regards
I am not sure what "sem_seg_root" should contain. I do not think "sem_seg_root" should be the same as "panoptic_root", which contains the panoptic masks. Should the "sem_seg_root" directory contain grayscale masks for the stuff categories? And if so, what values should these masks have? The category ids of the stuff classes as they appear in the instances JSON?
You are right. Let me conclude: the sem_seg masks are grayscale PNGs whose pixel values are the stuff category ids (see the sketch below). Good luck!
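A sketch of deriving those grayscale stuff masks from the panoptic PNGs, in the spirit of detectron2's datasets/prepare_panoptic_fpn.py (paths are placeholders):

```python
# Sketch: derive grayscale semantic-segmentation PNGs (per-pixel category ids)
# from COCO panoptic PNGs. Paths are placeholders.
import json

import numpy as np
from panopticapi.utils import rgb2id
from PIL import Image

with open("annotations/panoptic.json") as f:
    panoptic = json.load(f)

for ann in panoptic["annotations"]:
    rgb = np.asarray(Image.open(f"panoptic_masks/{ann['file_name']}"), dtype=np.uint32)
    seg_ids = rgb2id(rgb)                          # per-pixel segment id
    sem = np.zeros(seg_ids.shape, dtype=np.uint8)  # 0 = unlabeled
    for seg in ann["segments_info"]:
        sem[seg_ids == seg["id"]] = seg["category_id"]
    Image.fromarray(sem).save(f"sem_seg_masks/{ann['file_name']}")
```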
Ok, but since register_coco_panoptic_separated takes as input both the instances JSON for the thing categories and the semantic segmentation grayscale masks for the stuff categories, why does it need the panoptic masks and the panoptic JSON as well?
If you look at the formats of instance, semantic, and panoptic annotations, you will understand: for the separated version the sequence is different. I have already finished my Panoptic FPN training code. Considering that I also spent much time on PanopticFPN training and no one provided real code help, I will paste my working code here for the convenience of people who come after me and are confused. Please refer to it. (Note that things and stuffs are just for visualization, so in Visualizer.py, draw_panoptic_seg(), you need to align the indices: e.g., things are actually 0-79 and stuff is 80-133, but I set up stuffs as a 54-element list, so we have to do the re-indexing there.) Train:
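A minimal sketch of such a Panoptic FPN training setup (dataset name, class counts, and solver values are placeholders, not necessarily the ones used above):

```python
# Sketch of training Panoptic FPN on a dataset registered with
# register_coco_panoptic_separated. Placeholders throughout.
import os

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml")
cfg.DATASETS.TRAIN = ("my_dataset_separated",)  # name produced by the registration helper
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 3000
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 80     # thing classes
cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES = 54  # stuff classes
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```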
Inference:
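And a matching sketch for inference and visualization, continuing from the cfg above (placeholder paths):

```python
# Sketch of running inference with the trained Panoptic FPN model and
# visualizing the result. Continues from the training cfg above.
import os

import cv2
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer

cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
predictor = DefaultPredictor(cfg)

im = cv2.imread("test.jpg")
panoptic_seg, segments_info = predictor(im)["panoptic_seg"]

v = Visualizer(im[:, :, ::-1], MetadataCatalog.get("my_dataset_separated"))
out = v.draw_panoptic_seg(panoptic_seg.to("cpu"), segments_info)
cv2.imwrite("output.jpg", out.get_image()[:, :, ::-1])
```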
Thanks for your engagement! I have managed to train the panoptic model using almost identical code to what you provided. However, why do you set cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES to 134 and cfg.MODEL.ROI_HEADS.NUM_CLASSES to 134? From what I understand, ROI_HEADS.NUM_CLASSES is used for the thing classes and SEM_SEG_HEAD.NUM_CLASSES for the stuff classes, so in your case ROI_HEADS.NUM_CLASSES should be 80 and SEM_SEG_HEAD.NUM_CLASSES should be 54. Setting both to 134 won't throw an error, but you create redundant heads for both the roi head and the seg head, making the network harder to train. Furthermore, your code does not show how the network uses the panoptic masks and the panoptic json. I tried training with both the panoptic_json and the panoptic_root set to empty strings "" in register_coco_panoptic_separated and it works fine, so I still do not understand the use of these arguments.
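In config terms, the split I mean is the following (COCO counts shown for concreteness):

```python
# Thing classes go to the ROI heads, stuff classes to the semantic head.
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 80     # things
cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES = 54  # stuff
```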
When I say standard I mean the coco-json format (check the COCO dataset webpage or other related sources). When I say custom dataset I mean your own dataset with different classes. Your custom dataset also needs to follow the standard format.
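For the instances file specifically, an illustrative skeleton of the coco-json detection/segmentation format (values are made up):

```python
# Illustrative skeleton of a COCO instances-format json (values are made up).
instances_json = {
    "images": [
        {"id": 1, "file_name": "0001.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {
            "id": 7,
            "image_id": 1,
            "category_id": 1,
            "segmentation": [[10, 20, 60, 20, 60, 80, 10, 80]],  # polygon [x0, y0, x1, y1, ...]
            "bbox": [10, 20, 50, 60],                             # [x, y, width, height]
            "area": 3000,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person"},
    ],
}
```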
I think you are right, but when I changed it, it didn't work, as follows: How did you set it up?
What error does it throw? This is how I made it work:

```python
import json
import os

from detectron2.config import get_cfg
from detectron2.data import DatasetCatalog, MetadataCatalog

with open("categories.json") as json_file:  # a json file describing all the categories (including the background)
    categories = json.load(json_file)

images_dir = "images"
register_name = "myDataset"

dataset_dicts = DatasetCatalog.get(register_name)
metadata = MetadataCatalog.get(register_name)

stuff_names = [f["name"] for f in categories if f["isthing"] == 0]
metadata.__dict__["stuff_classes"] = stuff_names

cfg = get_cfg()
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
```
I think your code has a big problem: you didn't even define thing_dataset_id_to_contiguous_id, did you?
cfg.MODEL.SEM_SEG_HEAD.NUM_CLASSES = 15

This is as it should be, as I have 15 thing and 15 stuff classes. I did not have to define thing_dataset_id_to_contiguous_id, as it was already in the metadata after I ran:

metadata = MetadataCatalog.get(register_name)

metadata.__dict__["stuff_classes"] and metadata.__dict__["stuff_dataset_id_to_contiguous_id"] were missing from the metadata, so I put them in manually.
Can you show me your categories.json file? In my work, I use the pretrained panoptic model to generate my annotation data, so I put thing_classes and stuff_classes together in a single annotation file without overlapping; my categories.json has things 0-79 and stuffs 80-133. So I think this may be my error condition. Am I wrong? And if this is wrong for my case, how should I set it up?
This is my categories.json file. The categories have a hierarchy field that I wrote and use to handle overlapping polygons when I produce the semantic segmentation grayscale masks: if a polygon with a lower hierarchy overlaps with one with a higher hierarchy, the latter is ignored in the overlapping region. I had to do that to produce the panoptic masks from the coco detection json using detection2panoptic_coco_format.py from the COCO panoptic API; otherwise it throws errors for overlapping polygons. I am not sure I understand the rest of what you say. Does your model produce overlaps for stuff and things?
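A sketch of that overlap handling (the hierarchy field and its ordering are my own convention, so treat the details as assumptions):

```python
# Sketch (hypothetical field names): rasterize COCO polygons into a grayscale
# mask, letting lower-hierarchy categories win in overlapping regions by
# drawing them last.
import numpy as np
from PIL import Image, ImageDraw


def rasterize(annotations, categories, height, width):
    hierarchy = {c["id"]: c["hierarchy"] for c in categories}
    canvas = Image.new("L", (width, height), 0)  # 0 = unlabeled
    draw = ImageDraw.Draw(canvas)
    # Highest hierarchy first, so lower-hierarchy polygons overwrite overlaps.
    for ann in sorted(annotations, key=lambda a: hierarchy[a["category_id"]], reverse=True):
        for poly in ann["segmentation"]:  # COCO polygon: [x0, y0, x1, y1, ...]
            draw.polygon(list(zip(poly[0::2], poly[1::2])), fill=ann["category_id"])
    return np.asarray(canvas)
```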
I checked yours and mine, and I think categories.json should be no problem; I got the things after that. But when I set it up I always got this error. Do you know why? And a small detail I want to ask about: you use the instance converter command with --things_only, right? Also, for the semantic converter, how did you set this value? Its default is OTHER_CLASS_ID = 183. These are my panoptic.json, categories.json, instance.json: I'm really confused about:
I did not use the --things_only argument in converters/panoptic2detection_coco_format.py. That is not so important, however, because as I mentioned before, I do not think register_coco_panoptic_separated actually uses the panoptic masks and panoptic jsons, and I did not use them either to produce the grayscale semantic segmentation masks. As for your error, I would suggest debugging with cfg.MODEL.DEVICE = 'cpu'. This runs the network on the CPU instead of the GPU and will hopefully produce a better error message. You should also set the number of dataloader workers to zero (cfg.DATALOADER.NUM_WORKERS = 0); see the snippet below. I suspect the error has to do with a wrong input shape. If you step into the code and reach lib/python3.8/site-packages/detectron2/modeling/meta_arch/panoptic_fpn.py, check this line: sem_seg_results, sem_seg_losses = self.sem_seg_head(features, gt_sem_seg) Check the gt_sem_seg dimensions and see whether they are as they should be. Then step from this line into semantic_seg.py and reach the point where the loss is calculated; check whether the target and prediction dimensions match.
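The debug settings mentioned above, for copy-pasting:

```python
# Run on the CPU with a single-process dataloader for clearer stack traces.
cfg.MODEL.DEVICE = "cpu"
cfg.DATALOADER.NUM_WORKERS = 0
```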
This is not a dimension problem, for sure. Did you check this issue: I'm very confused about this; that person set up both thing and stuff as n and n+1 and it worked well. So what's the correct way? @ppwwyyxx
I think you are wrong. I checked the code, and actually both
I've read the source code. So indeed, you can ignore them (the panoptic_json and panoptic_root arguments) for the training part.
I am trying to implement this approach for a custom dataset. I was able to register my dataset with the built-in function detectron2.data.datasets.register_coco_panoptic_separated. I included the category IDs in my sem_seg png files in the corresponding pixels for all the Corridors (my only stuff class). Any leads on what I could have done wrong?
Hello, I am trying to train a panoptic segmentation model on Detectron2 with a custom dataset.
Hello @zhangliyun9120, can you share these files with me? I can't download them from https://cocodataset.org/#format-data. I need an example of the json files and masks. Best regards,
❓ How to train Panoptic Segmentation on a custom dataset?
Hello everyone,
My question is two-fold:
Thanks,
Cyril