Code Update and Dataset Details #3

Open
lofrienger opened this issue Dec 12, 2023 · 8 comments

@lofrienger

Hi Yuying, really nice work!

  1. Could you please update the code, especially the dataset definition part for EndoSLAM and UCL, for reference?
    Even just some code snippets would be very useful if you are busy at the moment.

  2. In your paper, only 4,500 frames are selected from the EndoSLAM dataset. Could you explain how you chose them?

Thanks in advance!

@yuyingliu-1
Owner

Thank you for your interest in this work.
Regarding the UCL dataset, we downloaded it from http://cmic.cs.ucl.ac.uk/ColonoscopyDepth/ and split it ourselves. The UCL dataset contains three sub-datasets with different textures, T1, T2, and T3, and each is divided according to a 6:1:3 ratio. Since the number of images differs between sub-datasets, you can divide each one separately (a sketch of such a split is given below).
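For what it's worth, here is a minimal sketch of such a per-sub-dataset split. It assumes the 6:1:3 ratio means train/validation/test and that a random shuffle is acceptable; `split_ucl_subset` and the directory layout are hypothetical, not from the released code:

```python
import os
import random

def split_ucl_subset(image_dir, ratios=(0.6, 0.1, 0.3), seed=0):
    """Split one UCL texture sub-dataset into train/val/test by a 6:1:3 ratio.

    NOTE: train/val/test semantics and the random shuffle are assumptions;
    the thread only states the 6:1:3 ratio.
    """
    files = sorted(os.listdir(image_dir))
    random.Random(seed).shuffle(files)
    n_train = round(ratios[0] * len(files))
    n_val = round(ratios[1] * len(files))
    train = files[:n_train]
    val = files[n_train:n_train + n_val]
    test = files[n_train + n_val:]
    return train, val, test

# e.g. for the T1 sub-dataset:
# train, val, test = split_ucl_subset("UCL/T1")
```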
Regarding the EndoSLAM dataset, we chose the UnityCam part with pixel-wise depths; Colon, Small Intestine, and Stomach are all included in UnityCam. UnityCam contains more than 30,000 frames in total, which is a heavy burden on computer hardware. We followed the training-set size used in the reference "Endoslam dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos" and chose 4,500 frames. In theory, training the model on more frames should give better results.
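As a sketch only (the comment above does not say how the 4,500 frames were picked), one common way to thin a time-ordered sequence is uniform temporal subsampling; `subsample_frames` is a hypothetical helper, not the authors' method:

```python
import numpy as np

def subsample_frames(frame_paths, target=4500):
    """Uniformly subsample `target` frames from a time-ordered sequence.

    NOTE: the selection criterion is not specified in the thread; uniform
    subsampling is just one plausible way to reduce ~30,000 frames to 4,500.
    """
    idx = np.linspace(0, len(frame_paths) - 1, num=target)
    idx = np.unique(idx.round().astype(int))
    return [frame_paths[i] for i in idx]
```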

@lofrienger
Author

@yuyingliu-1
Thanks for your prompt reply!

  1. Do you have a plan to update the code? Currently, the code is missing some important files such as dataset.py, network.py, etc.
  • For the UCL dataset, may I know where I can find the camera intrinsics K? It does not seem to be available at http://cmic.cs.ucl.ac.uk/ColonoscopyDepth/.
  • For the EndoSLAM dataset, I am still not sure which subset of the dataset is used for the experiments. In the EndoSLAM paper and their GitHub repo, I cannot find such a description or a number like 4500. There is a pending issue regarding the data used in the paper: Data used for training and validation in the paper CapsuleEndoscope/EndoSLAM#19
    Instead, I could only find the numbers below:
    [two screenshots of frame counts from the EndoSLAM repository]
    Did you adopt any criteria to manually select the data for the experiments? Could you share a copy of the data you used, if possible?
  2. There is another pending issue about depth GT misalignment in the UnityCam data: Unity RGB & Depthmaps misalignment CapsuleEndoscope/EndoSLAM#13.
    Did you deal with this problem?

Appreciate your continued support!

@yuyingliu-1
Owner

Good question!

  1. We will continue to update the code. We are sorting out all the open-source material for the model, so the progress of related projects may be slightly delayed. (We may release two versions: this version focuses on how to better obtain depth information, and subsequent work will cover how to reconstruct it.)
  2. The camera intrinsics K for the relevant UCL dataset (see the sketch after this list):
    float f_x = 591.5604f;
    float f_y = 623.043f;
    float c_x = 640.2557f;
    float c_y = 503.4438f;
    On the other hand, the use and composition of the EndoSLAM dataset can be tailored to the needs of the relevant doctors or projects. For example, if the project targets the entire digestive tract, the data should cover all of it; if it targets the stomach, the data should be drawn only from the stomach; and so on.
  3. Questions related to the EndoSLAM dataset should wait for the original authors' reply. We used the original data for training, and we did see this problem during visualization.
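For convenience, here are the four values above assembled into the standard 3x3 pinhole intrinsics matrix. This is only a sketch; whether these values refer to the original or the resized images is discussed further down the thread:

```python
import numpy as np

# Intrinsics quoted above for the UCL dataset, assembled into the
# usual 3x3 pinhole camera matrix.
f_x, f_y = 591.5604, 623.043
c_x, c_y = 640.2557, 503.4438

K = np.array([[f_x, 0.0, c_x],
              [0.0, f_y, c_y],
              [0.0, 0.0, 1.0]], dtype=np.float32)
```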

@lofrienger
Author

@yuyingliu-1
Thank you for providing such thorough explanations. They have been extremely beneficial and enlightening!
Best wishes with your ongoing projects, and I hope to see your work published in the near future!

@lofrienger
Author

Hi @yuyingliu-1
Sorry to disturb you again.
But I realize that the images in the UCL dataset are 256x256, while the values in your K matrix are much larger than 256.
This looks strange for the camera's intrinsic parameters.
May I know where you got the K values?

@yuyingliu-1
Owner

> Hi @yuyingliu-1 Sorry to disturb you again. But I realize that the images in the UCL dataset are 256x256, while the values in your K matrix are much larger than 256. This looks strange for the camera's intrinsic parameters. May I know where you got the K values?
You can find detailed instructions on the first author's (Anitarau) GitHub.

@lofrienger
Author

@yuyingliu-1
Thank you for your continuous support.
I think those values are for the original images, which have a different size from 256x256.
The dataset website states: "The images were resized to 256 x 256 pixels."
Anyway, I have raised an issue at https://github.com/anitarau/DepthFromColonoscopy/issues/1 to ask about the original image size so that I can scale the camera intrinsic parameters accordingly.
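For reference, pinhole intrinsics scale linearly with image resizing, so once the original size is known, the K above can be rescaled. A minimal sketch (`scale_intrinsics` is a hypothetical helper; the original dimensions are still unknown here, though c_x ≈ 640 would be consistent with an original width around 1280, which is only a guess):

```python
import numpy as np

def scale_intrinsics(K, orig_w, orig_h, new_w=256, new_h=256):
    """Rescale a 3x3 pinhole intrinsics matrix when images are resized.

    orig_w / orig_h are the original UCL image dimensions, which are
    still unknown here (hence the issue raised upstream).
    """
    K = K.astype(np.float64).copy()
    K[0, 0] *= new_w / orig_w  # f_x scales with width
    K[0, 2] *= new_w / orig_w  # c_x scales with width
    K[1, 1] *= new_h / orig_h  # f_y scales with height
    K[1, 2] *= new_h / orig_h  # c_y scales with height
    return K
```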

@yuyingliu-1
Owner

yuyingliu-1 commented Jan 22, 2024

Yes, the images were resized to 256 x 256 pixels.
