
KeyError: 'background' #48

Open
karthik101200 opened this issue Jun 20, 2024 · 21 comments

@karthik101200

karthik101200 commented Jun 20, 2024

When I train on a custom dataset and keep auto-scale-poses False, I get the following error. I used LiDAR odometry instead of COLMAP and created a transforms.json; I have therefore kept load-pcd-points False as well.

/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfstudio/models/splatfacto.py", line 861, in get_loss_dict
    gt_img = self.composite_with_background(self.get_gt_img(batch["image"]), outputs["background"])
KeyError: 'background'
ns-train dn-splatter --max-num-iterations 100000 normal-nerfstudio --data DATA_DIR --load-pcd-normals False --load-3D-points False --auto-scale-poses False

@maturk
Owner

maturk commented Jun 20, 2024

Without access to your data, I have a few guesses: 1) the poses may be wrong, so the camera sees no Gaussians and training crashes. COLMAP uses the OpenCV coordinate system, while nerfstudio/dn-splatter uses OpenGL. The nerfstudio-data parser assumes OpenGL coordinates, but the colmap/coolermap dataparsers convert from OpenCV to OpenGL. 2) Gaussians are not being randomly initialized (since you don't have SfM points). Can you try adding this flag:

--pipeline.model.random_init

e.g.
ns-train dn-splatter --max-num-iterations 100000 --pipeline.model.random_init normal-nerfstudio --data DATA_DIR --load-pcd-normals False --load-3D-points False --auto-scale-poses False

Btw, 100k iterations is probably too many ... 30k is usually enough to reach peak performance, but it depends on your scene scale.

@karthik101200
Author

Nope, same issue. I set this to True.

@maturk
Owner

maturk commented Jun 20, 2024

> Nope, same issue. I set this to True.

Okay, it could be the camera pose conventions then. If the crash comes at the first training iteration, that would explain it, i.e. the cameras do not see any Gaussians.

@maturk
Owner

maturk commented Jun 20, 2024

Does the normal splatfacto model work with your dataset?

ns-train splatfacto [some dataparser like colmap or leave blank for default nerfstudio-data] --data [path to data]

@karthik101200
Author

I used splatfacto with the normal-nerfstudio dataparser since I don't want to load the SfM points, but it shows the following (grads are either not computed or not found):
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

@karthik101200
Author

karthik101200 commented Jun 20, 2024

My best guess is the OpenCV conventions, but this isn't an issue when I keep auto scale ON. There, though, the depth and normal predictions are bad for any NeRF or splat method (the rendered images are right, but still).

@maturk
Owner

maturk commented Jun 20, 2024

> I used splatfacto with the normal-nerfstudio dataparser since I don't want to load the SfM points, but it shows the following (grads are either not computed or not found): RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Yes, this error occurs at the first training iteration when no Gaussians are projected into the camera frame, resulting in zero grads. This suggests to me that something is off in the poses. How large do you think the scale of your poses is? I think the random initialization is hardcoded to a 10-unit cube, so if your poses span much more than this, a camera could sit outside the cube and see none of the randomly initialized Gaussians.

The exact line of code where the random init occurs is here
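If it helps, here is a quick sanity check you could run against your transforms.json (a minimal sketch, not code from this repo; the frames/transform_matrix layout follows the nerfstudio format, and the 10-unit extent is just my recollection from above):

```python
# Sanity check: do the camera centers from transforms.json fit inside the
# assumed 10-unit random-init cube? (File name and cube extent are assumptions;
# whether the cube spans [-5, 5] or [-10, 10] depends on the implementation,
# so this checks the looser bound.)
import json
import numpy as np

with open("transforms.json") as fp:
    meta = json.load(fp)

# Each frame stores a 4x4 camera-to-world matrix; the last column of the top
# 3 rows holds the camera center in world coordinates.
centers = np.array(
    [np.array(frame["transform_matrix"])[:3, 3] for frame in meta["frames"]]
)
print("max |translation| per axis:", np.abs(centers).max(axis=0))
print("all cameras inside 10-unit cube:", bool((np.abs(centers) <= 10.0).all()))
```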

@karthik101200
Author

I'll just list my experiments for better clarity:
1. I tried a phone camera with LiDAR (which gives a transforms.json via ns-process-data record3d) on a custom dataset, without loading the pcd points and with auto scale both on and off. It works fine and the predictions are okay.
2. I tried an external LiDAR and camera setup. With auto-scale-poses on, I get it to train but the predictions are bad, so I tried without it, and these issues persist.

@maturk
Owner

maturk commented Jun 20, 2024

How are you generating the transforms.json for option 2)? Basically, the above errors about "background" in dn-splatter and the zero grad in splatfacto are a bit misleading: I believe the root cause is that the camera sees no valid Gaussians, so the crash occurs at iter = 0 when computing the loss.

@karthik101200
Author

> > I used splatfacto with the normal-nerfstudio dataparser since I don't want to load the SfM points, but it shows the following (grads are either not computed or not found): RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
>
> Yes, this error occurs at the first training iteration when no Gaussians are projected into the camera frame, resulting in zero grads. This suggests to me that something is off in the poses. How large do you think the scale of your poses is? I think the random initialization is hardcoded to a 10-unit cube, so if your poses span much more than this, a camera could sit outside the cube and see none of the randomly initialized Gaussians.
>
> The exact line of code where the random init occurs is here

My poses were initially larger than 10 since the transforms were w.r.t. a world frame; now I calculated the transforms w.r.t. the first camera frame, so the positions should be closer.
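The recentering step would look roughly like this (an illustrative sketch, not my exact code; `poses` is assumed to be an (N, 4, 4) array of camera-to-world matrices read from transforms.json):

```python
import numpy as np

def recenter_to_first_frame(poses: np.ndarray) -> np.ndarray:
    """Re-express camera-to-world poses relative to the first camera,
    so the first pose becomes identity and translations stay small."""
    first_inv = np.linalg.inv(poses[0])
    # Left-multiply every pose by the inverse of the first one.
    return np.einsum("ij,njk->nik", first_inv, poses)
```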

@karthik101200
Author

karthik101200 commented Jun 20, 2024

> How are you generating the transforms.json for option 2)?

I get the camera-to-world transformation, images, and camera info from ROS. This is in the OpenCV convention, right? So do the transforms from ns-process-data get converted to the OpenGL convention?

@maturk
Owner

maturk commented Jun 20, 2024

> > How are you generating the transforms.json for option 2)?
>
> I get the camera-to-world transformation, images, and camera info from ROS

Yeah, I would carefully check what camera pose convention your ROS application uses. There is some info here that could help you. Since it's ROS... it could very well be OpenCV (my bias is just that this is the more conventional coordinate system for robotics people and ROS), and thus a transform like `pose @ torch.diag(torch.tensor([1.0, -1.0, -1.0, 1.0], dtype=pose.dtype, device=pose.device))` is required.
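Applied to a whole transforms.json, that flip would look roughly like this (a hedged sketch, not code from this repo; the output name transforms_opengl.json is just an example):

```python
import json
import torch

# Negate the y and z camera axes to go from the OpenCV to the OpenGL convention.
flip = torch.diag(torch.tensor([1.0, -1.0, -1.0, 1.0]))

with open("transforms.json") as fp:
    meta = json.load(fp)

for frame in meta["frames"]:
    pose = torch.tensor(frame["transform_matrix"], dtype=torch.float64)
    frame["transform_matrix"] = (pose @ flip.to(pose.dtype)).tolist()

with open("transforms_opengl.json", "w") as fp:
    json.dump(meta, fp, indent=2)
```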

@maturk
Owner

maturk commented Jun 20, 2024

ns-process-data and the output transforms.json are all in the OpenGL coordinate system. The colmap and coolermap (subclassed from colmap with the additional normal handling needed in this project) dataparsers convert from the OpenCV COLMAP output to OpenGL.

@karthik101200
Author

One more doubt regarding auto-scale-poses: does it play a role in better/more correct rendering? For the phone data I don't see much improvement in the end, except that there may be scale ambiguity in the final mesh rendering.

@karthik101200
Author

> One more doubt regarding auto-scale-poses: does it play a role in better/more correct rendering? For the phone data I don't see much improvement in the end, except that there may be scale ambiguity in the final mesh rendering.

Sure, let me try this. I'll get back. Thank you.

@maturk
Owner

maturk commented Jun 20, 2024

> One more doubt regarding auto-scale-poses: does it play a role in better/more correct rendering? For the phone data I don't see much improvement in the end, except that there may be scale ambiguity in the final mesh rendering.

Anecdotally, I think I might have observed similar behaviour. But I think the scale of the reconstruction should not have an effect on the final quality... so it is indeed strange but I have no good answer for this right now.
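For context, my understanding is that --auto-scale-poses only rescales the translations uniformly so all camera centers fit in a unit box, roughly like this sketch (based on how I understand the nerfstudio dataparser, not verbatim code), which is why the global scale should in principle not change the geometry:

```python
import torch

def auto_scale_poses(poses: torch.Tensor) -> torch.Tensor:
    """Uniformly scale camera-to-world translations so the largest
    |coordinate| of any camera center becomes 1. poses: (N, 4, 4)."""
    scale = 1.0 / float(torch.max(torch.abs(poses[:, :3, 3])))
    poses = poses.clone()
    poses[:, :3, 3] *= scale  # rotations are untouched; only positions shrink
    return poses
```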

@karthik101200
Author

Okay got it.

@karthik101200
Author

karthik101200 commented Jun 20, 2024

The initial render in the viewer looks much better. I still have to check the depth predictions though. Do you have an idea of the params for large-scale data, like a street-view dataset, that could give better rendering?

@maturk
Owner

maturk commented Jun 20, 2024

Street view could be quite challenging for dn-splatter, which is mainly meant for indoor scenes of finite scale. In outdoor scenes, background elements like the sky are hard to model with just Gaussians. I have not personally tested street-view scenes, so it would be interesting to see what happens.

@karthik101200
Author

karthik101200 commented Jun 20, 2024

It is rather a large indoor scene. To clarify, I'm collecting data from a robot in a hospital environment in simulation. But yes, I agree.

@maturk
Owner

maturk commented Jun 20, 2024

Cool! @karthik101200, maybe you can post some screenshots if you get dn-splatter to work on your data, I would be interested to see.
