Code Update and Dataset Details #3

Open
lofrienger opened this issue Dec 12, 2023 · 8 comments

@lofrienger

Hi Yuying, really nice work!

  1. Could you please update the code, especially the dataset definition part for EndoSLAM and UCL, for reference?
    Even just some code snippets would be very useful if you are busy at the moment.

  2. In your paper, only 4,500 frames are selected from the EndoSLAM dataset. Could you explain how you chose them?

Thanks in advance!

@yuyingliu-1
Owner

Thank you for your interest in this work.
Regarding the UCL dataset, we downloaded it from http://cmic.cs.ucl.ac.uk/ColonoscopyDepth/ and split it ourselves. The UCL dataset contains three sub-datasets with different textures, T1, T2, and T3, and each is divided according to a 6:1:3 ratio. Since the number of images differs between sub-datasets, you can divide each one separately (a sketch of such a split is given below).
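For what it's worth, here is a minimal sketch of such a per-sub-dataset split. It assumes the 6:1:3 ratio means train/validation/test and that a random shuffle is acceptable; `split_ucl_subset` and the directory layout are hypothetical, not from the released code:

```python
import os
import random

def split_ucl_subset(image_dir, ratios=(0.6, 0.1, 0.3), seed=0):
    """Split one UCL texture sub-dataset into train/val/test by a 6:1:3 ratio.

    NOTE: train/val/test semantics and the random shuffle are assumptions;
    the thread only states the 6:1:3 ratio.
    """
    files = sorted(os.listdir(image_dir))
    random.Random(seed).shuffle(files)
    n_train = round(ratios[0] * len(files))
    n_val = round(ratios[1] * len(files))
    train = files[:n_train]
    val = files[n_train:n_train + n_val]
    test = files[n_train + n_val:]
    return train, val, test

# e.g. for the T1 sub-dataset:
# train, val, test = split_ucl_subset("UCL/T1")
```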
Regarding the EndoSLAM dataset, we chose the UnityCam part with pixel-wise depths; Colon, Small Intestine, and Stomach are all included in UnityCam. UnityCam contains more than 30,000 frames in total, which is a heavy burden on computer hardware. We followed the training-set size used in the reference "Endoslam dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos" and chose 4,500 frames. In theory, training the model on more frames should give better results.
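As a sketch only (the comment above does not say how the 4,500 frames were picked), one common way to thin a time-ordered sequence is uniform temporal subsampling; `subsample_frames` is a hypothetical helper, not the authors' method:

```python
import numpy as np

def subsample_frames(frame_paths, target=4500):
    """Uniformly subsample `target` frames from a time-ordered sequence.

    NOTE: the selection criterion is not specified in the thread; uniform
    subsampling is just one plausible way to reduce ~30,000 frames to 4,500.
    """
    idx = np.linspace(0, len(frame_paths) - 1, num=target)
    idx = np.unique(idx.round().astype(int))
    return [frame_paths[i] for i in idx]
```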

@lofrienger
Author

@yuyingliu-1
Thanks for your prompt reply!

  1. Do you have a plan to update the code? Currently, the code is missing some important files such as dataset.py, network.py, etc.
  • For the UCL dataset, may I know where I can find the camera intrinsics K? It does not seem to be available at http://cmic.cs.ucl.ac.uk/ColonoscopyDepth/.
  • For the EndoSLAM dataset, I am still not sure which subset of the dataset is used for the experiments. In the EndoSLAM paper and their GitHub repo, I cannot find such a description or a number like 4500. There is a pending issue regarding the data used in the paper: Data used for training and validation in the paper CapsuleEndoscope/EndoSLAM#19
    Instead, I could only find the numbers below:
    [two screenshots of frame counts from the EndoSLAM repository]
    Did you adopt any criteria to manually select the data for the experiments? Could you share a copy of the data you used, if possible?
  2. There is another pending issue about depth GT misalignment in the UnityCam data: Unity RGB & Depthmaps misalignment CapsuleEndoscope/EndoSLAM#13.
    Did you deal with this problem?

Appreciate your continued support!

@yuyingliu-1
Owner

Good question!

  1. We will continue to update the code. We are sorting out all the open-source material for the model, so the progress of related projects may be slightly delayed. (We may release two versions: this version focuses on how to better obtain depth information, and subsequent work will cover how to reconstruct it.)
  2. The camera intrinsics K for the relevant UCL dataset (see the sketch after this list):
    float f_x = 591.5604f;
    float f_y = 623.043f;
    float c_x = 640.2557f;
    float c_y = 503.4438f;
    On the other hand, the use and composition of the EndoSLAM dataset can be tailored to the needs of the relevant doctors or projects. For example, if the project targets the entire digestive tract, the data should cover all of it; if it targets the stomach, the data should be drawn only from the stomach; and so on.
  3. Questions related to the EndoSLAM dataset should wait for the original authors' reply. We used the original data for training, and we did see this problem during visualization.
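For convenience, here are the four values above assembled into the standard 3x3 pinhole intrinsics matrix. This is only a sketch; whether these values refer to the original or the resized images is discussed further down the thread:

```python
import numpy as np

# Intrinsics quoted above for the UCL dataset, assembled into the
# usual 3x3 pinhole camera matrix.
f_x, f_y = 591.5604, 623.043
c_x, c_y = 640.2557, 503.4438

K = np.array([[f_x, 0.0, c_x],
              [0.0, f_y, c_y],
              [0.0, 0.0, 1.0]], dtype=np.float32)
```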

@lofrienger
Author

@yuyingliu-1
Thank you for providing such thorough explanations. They have been extremely beneficial and enlightening!
Best wishes with your ongoing projects, and I hope to see your work published in the near future!

@lofrienger
Author

Hi @yuyingliu-1
Sorry to disturb you again.
But I realize that the images in the UCL dataset are 256x256, while the values in your K matrix are much larger than 256.
This looks strange for the camera's intrinsic parameters.
May I know where you got the K values?

@yuyingliu-1
Owner

> Hi @yuyingliu-1 Sorry to disturb you again. But I realize that the images in the UCL dataset are 256x256, while the values in your K matrix are much larger than 256. This looks strange for the camera's intrinsic parameters. May I know where you got the K values?
You can find detailed instructions on the first author's (Anitarau) GitHub.

@lofrienger
Author

@yuyingliu-1
Thank you for your continuous support.
I think those values are for the original images, which have a different size from 256x256.
The dataset website states: "The images were resized to 256 x 256 pixels."
Anyway, I have raised an issue at https://github.com/anitarau/DepthFromColonoscopy/issues/1 to ask about the original image size so that I can scale the camera intrinsic parameters accordingly.
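For reference, pinhole intrinsics scale linearly with image resizing, so once the original size is known, the K above can be rescaled. A minimal sketch (`scale_intrinsics` is a hypothetical helper; the original dimensions are still unknown here, though c_x ≈ 640 would be consistent with an original width around 1280, which is only a guess):

```python
import numpy as np

def scale_intrinsics(K, orig_w, orig_h, new_w=256, new_h=256):
    """Rescale a 3x3 pinhole intrinsics matrix when images are resized.

    orig_w / orig_h are the original UCL image dimensions, which are
    still unknown here (hence the issue raised upstream).
    """
    K = K.astype(np.float64).copy()
    K[0, 0] *= new_w / orig_w  # f_x scales with width
    K[0, 2] *= new_w / orig_w  # c_x scales with width
    K[1, 1] *= new_h / orig_h  # f_y scales with height
    K[1, 2] *= new_h / orig_h  # c_y scales with height
    return K
```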

@yuyingliu-1
Owner

yuyingliu-1 commented Jan 22, 2024

Yes, the images were resized to 256 x 256 pixels.
