Uncertainty regarding the optical center and depth width / height in metadata
#94
Comments
Hello, the problem with your code is twofold.

Problem 1: You should downscale not only the optical center, but also the focal length:

```
float scale = (float) depthImgSize.width / rgbImgSize.width;
float fx_depth = K[0] * scale;
float fy_depth = K[4] * scale;
float cx_depth = K[6] * scale;
float cy_depth = K[7] * scale;
```

Record3D videos can have varying resolution of RGB images, so it is advised to always compute the scale from the actual image dimensions rather than hardcoding it.

Problem 2: As you correctly said, subtracting half of the image's resolution in addition to subtracting the optical center's coordinates is wrong; the optical center already accounts for the principal point, so the reprojection should be just:

```
float f_x = ((float)(x) - cx) * depth / fx;
float f_y = ((float)(y) - cy) * depth / fy;
```

I believe the above did answer your 3 questions, but in short: scale all four intrinsics (focal lengths and optical center) to the depth-map resolution, and do not subtract half of the resolution on top of that.
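Putting both fixes together, a minimal sketch of the unprojection (the `Point3` struct and function name are illustrative; the `K` layout follows the snippet above, i.e. fx = K[0], fy = K[4], cx = K[6], cy = K[7]):

```cpp
#include <cassert>

struct Point3 { float x, y, z; };

// Unproject one depth pixel using intrinsics that have already been
// scaled to the depth-map resolution. Note that no extra "half the
// resolution" term is subtracted; cx/cy already encode the principal point.
Point3 unprojectDepthPixel(int x, int y, float depth,
                           float fx, float fy, float cx, float cy)
{
    Point3 p;
    p.x = ((float)x - cx) * depth / fx;
    p.y = ((float)y - cy) * depth / fy;
    p.z = depth;
    return p;
}
```

A pixel at the principal point maps to (0, 0, depth), which is a quick sanity check for the intrinsics.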
Off-topic performance-related tip: assuming that the depth map is stored row-major, iterate with y in the outer loop and x in the inner loop so the buffer is read in memory order:

```
for (int y = 0; y < depthMapHeight; y++)
    for (int x = 0; x < depthMapWidth; x++)
```
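Under that assumption, the whole per-frame conversion could be sketched as follows (the function and type names are hypothetical; intrinsics are expected to be pre-scaled to the depth-map resolution):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Convert a row-major depth buffer into 3D points, iterating y in the
// outer loop and x in the inner loop so the buffer is read sequentially.
std::vector<Vec3> depthMapToPoints(const float* depthMap,
                                   int depthMapWidth, int depthMapHeight,
                                   float fx, float fy, float cx, float cy)
{
    std::vector<Vec3> points;
    points.reserve((std::size_t)depthMapWidth * depthMapHeight);
    for (int y = 0; y < depthMapHeight; y++) {
        for (int x = 0; x < depthMapWidth; x++) {
            float depth = depthMap[y * depthMapWidth + x];
            points.push_back({((float)x - cx) * depth / fx,
                              ((float)y - cy) * depth / fy,
                              depth});
        }
    }
    return points;
}
```

Reserving the output vector up front avoids repeated reallocations, which matters when this runs per frame.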
Thank you so much for the timely response! I just tried your math with various recordings and it does seem to work :) One more question, and this one is not directly related to Record3D, but I'd still appreciate your input: what is the "industry standard" when it comes to parsing and using the confidence levels? I was thinking about culling any pixel with confidence < 2 (like the linked example from above). Is this what is usually done? Do you use this data somehow for the built-in visualiser in Record3D? Thanks again!
Apologies for the delay. I'm not sure what the industry standard for processing the confidence values is. They are not used in the in-app viewer of Record3D, but if I were to use them, I would consider doing simple thresholding as per your suggestion.
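Such thresholding could be sketched as below (the helper name is hypothetical; the 0/1/2 levels correspond to ARKit's low/medium/high depth confidence):

```cpp
#include <cassert>

// Keep a depth pixel only if its ARKit confidence level (0 = low,
// 1 = medium, 2 = high) meets the chosen minimum. Culling with
// minConfidence = 2 keeps only high-confidence depth samples.
inline bool keepDepthPixel(unsigned char confidence,
                           unsigned char minConfidence)
{
    return confidence >= minConfidence;
}
```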
Thank you for all the input. My questions are answered, closing this issue.
Hey, I am attempting to write a Unity importer for the r3d format. I have been following this example as the basis for my code.
In this example the depth float32 is resized via numpy to match the RGB image resolution (720x960). All of the 2D-to-3D reprojection math that follows, involving the focal lengths and optical center extracted from `K` in `metadata`, relies on the fact that the depth matches the RGB image.

In my implementation I would rather not upscale the depth to match the RGB. This way I can deal with 49,152 points per frame (192 * 256) instead of more than half a million points per frame (720 * 960).

So I ported all of the math from the linked example, however I do downscale the optical center of the camera by a factor of 3.75 to bring it into the depth-map resolution range.
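Since the RGB resolution can vary between recordings, that factor might better be computed from the actual image sizes than hardcoded as 3.75; a hypothetical sketch (for 720x960 RGB and 192x256 depth, 720 / 192 == 3.75, so this returns its reciprocal, i.e. the value to multiply the intrinsics by):

```cpp
#include <cassert>
#include <cmath>

// Derive the intrinsics scale factor from the actual image widths
// instead of hardcoding it.
inline float depthScale(int depthWidth, int rgbWidth)
{
    return (float)depthWidth / (float)rgbWidth;
}
```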
Here is my code:
Now this kind of works. Here it is rendered in Unity:

However, other captures break when I do:

If I remove the division, the point cloud renders, albeit wrongly offset.
My three questions are:

- Shouldn't subtracting `cx` and `cy` take care of this (I tried it and the points are offset to only one quadrant, i.e. the center of the point cloud is shifted and not in the center of the viewport)?

Thanks in advance!