Large gradient at mesh vertices #144

Open

Ra1nbowChan opened this issue Sep 22, 2023 · 1 comment

Comments

@Ra1nbowChan
Hi! Thanks for your great work!
I found that during training, the mesh vertex tensor sometimes receives a very large gradient, and the gradient value usually has the form 2^n (maybe 4^n, I'm not sure) for some integer n.
To reproduce, just add

        # geometry/dmtet.py L210
        self.mesh_verts = opt_mesh.v_pos
        if self.mesh_verts.requires_grad:
            self.mesh_verts.retain_grad()

        # train.py L443
            if geometry.mesh_verts.grad is not None and geometry.mesh_verts.grad.max() > 100.:
                import ipdb; ipdb.set_trace()

and run the example command:

python train.py --config configs/bob.json

Then the program will stop at the breakpoint when a large gradient occurs.
Such a large gradient is harmful when the SDF is parametrized by an MLP, since the MLP collapses after the optimizer step.
I've tested on Windows 10, MSVC 14.35.32215, and torch 2.0+cu11.8 / torch 1.13.0+cu11.6. I didn't test on CUDA 11.3, since I haven't found a way to install the corresponding version of tinycudann on Windows.
Any advice? Thanks!
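
For reference, a tensor hook can catch the spike during the backward pass itself, without pausing in ipdb. A minimal, self-contained sketch (the random verts tensor here is just a stand-in for opt_mesh.v_pos, not nvdiffrec code):

import torch

# Stand-in vertex tensor; in nvdiffrec this would be opt_mesh.v_pos.
verts = torch.randn(100, 3, requires_grad=True)

def report_large_grad(grad, threshold=100.0):
    # Called by autograd with the gradient w.r.t. verts during backward().
    peak = grad.abs().max()
    if peak > threshold:
        print(f"large vertex gradient: max |grad| = {peak.item():.3e}")
    return grad  # return the gradient unchanged so training is unaffected

verts.register_hook(report_large_grad)

# Dummy loss, scaled up so the hook actually fires in this toy example.
loss = (verts ** 2).sum() * 1e4
loss.backward()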

@YuxuanSnow

That's a very interesting observation! I also often run into the issue that no mesh can be extracted. You can try adding clip_grad_norm_, which rescales the gradients whenever their norm exceeds a threshold:

optimizer.zero_grad()
loss, hidden = model(data, hidden, targets)
loss.backward()

# clip_grad_norm_ is the current (in-place) API; plain clip_grad_norm is deprecated.
# It rescales all gradients so that their global norm is at most args.clip.
torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip)
optimizer.step()

For more discussion, see pytorch/pytorch#309.
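
For a runnable version of that pattern, here is a self-contained sketch with a toy model standing in for the SDF MLP (model, opt, and the inflated loss are placeholders, not nvdiffrec code):

import torch
import torch.nn as nn

# Toy stand-ins; in nvdiffrec the parameters would be the SDF MLP's weights.
model = nn.Linear(3, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

x = torch.randn(8, 3)
target = torch.zeros(8, 1)

opt.zero_grad()
loss = nn.functional.mse_loss(model(x) * 1e6, target)  # inflated to force huge grads
loss.backward()

# Rescale the global gradient norm to at most 1.0 before stepping.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()

torch.nn.utils.clip_grad_value_ is the element-wise alternative, which may fit better if the spikes are isolated to a few vertices.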
