Thank you for the excellent work and for sharing this framework!
I'm encountering the following error during camera optimization while using the tutorial code:
Training Progress ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% ? 0/ 20000 0:00:00 --:-- ╭
│ /pointrix/examples/gaussian_splatting/launch.py:25 in │
│ main │
│ │
│ 22 │ │ │ │ │ │ │ cfg.name │
│ 23 │ │ │ │ │ │ │ ) │
│ 24 │ │ if cfg.trainer.training: │
│ ❱ 25 │ │ │ gaussian_trainer.train_loop() │
│ 26 │ │ │ model_path = os.path.join( │
│ 27 │ │ │ │ cfg.exp_dir, │
│ 28 │ │ │ │ "chkpnt" + str(gaussian_trainer.global_step) + ".pth" │
│ │
│ /pointrix/pointrix/engine/default_trainer.py:48 in │
│ train_loop │
│ │
│ 45 │ │ │ # update learning rate │
│ 46 │ │ │ self.schedulers.step(self.global_step, self.optimizer) │
│ 47 │ │ │ # model forward step │
│ ❱ 48 │ │ │ self.train_step(batch) │
│ 49 │ │ │ # update optimizer and densify point cloud │
│ 50 │ │ │ with torch.no_grad(): │
│ 51 │ │ │ │ self.controller.f_step(**self.optimizer_dict) │
│ │
│ /pointrix/pointrix/engine/default_trainer.py:81 in │
│ train_step │
│ │
│ 78 │ │ # } │
│ 79 │ │ │
│ 80 │ │ self.loss_dict = self.model.get_loss_dict(render_results, batch, step=self.globa │
│ ❱ 81 │ │ self.loss_dict['loss'].backward() │
│ 82 │ │ # structure of optimizer_dict: {} │
│ 83 │ │ # example of optimizer_dict = { │
│ 84 │ │ # "loss": loss, │
│ │
│ /opt/anaconda3/envs/pointrix/lib/python3.9/site-packages/torch/_tensor.py:525 in backward │
│ │
│ 522 │ │ │ │ create_graph=create_graph, │
│ 523 │ │ │ │ inputs=inputs, │
│ 524 │ │ │ ) │
│ ❱ 525 │ │ torch.autograd.backward( │
│ 526 │ │ │ self, gradient, retain_graph, create_graph, inputs=inputs │
│ 527 │ │ ) │
│ 528 │
│ │
│ /opt/anaconda3/envs/pointrix/lib/python3.9/site-packages/torch/autograd/__init__.py:267 in │
│ backward │
│ │
│ 264 │ # The reason we repeat the same comment below is that │
│ 265 │ # some Python versions print out the first line of a multi-line function │
│ 266 │ # calls in the traceback and some print out the last line │
│ ❱ 267 │ _engine_run_backward( │
│ 268 │ │ tensors, │
│ 269 │ │ grad_tensors_, │
│ 270 │ │ retain_graph, │
│ │
│ /opt/anaconda3/envs/pointrix/lib/python3.9/site-packages/torch/autograd/graph.py:744 in │
│ _engine_run_backward │
│ │
│ 741 │ if attach_logging_hooks: │
│ 742 │ │ unregister_hooks = _register_logging_hooks_on_whole_graph(t_outputs) │
│ 743 │ try: │
│ ❱ 744 │ │ return Variable._execution_engine.run_backward( # Calls into the C++ engine to │
│ 745 │ │ │ t_outputs, *args, **kwargs │
│ 746 │ │ ) # Calls into the C++ engine to run the backward pass │
│ 747 │ finally: │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Function _EWAProjectBackward returned an invalid gradient at index 2 - got [4] but expected shape
compatible with [1, 4]
The command I used:
scene_path="./datasets/Tanks/Family"
output_path="./outputs/Family"
config_file="./pointrix/examples/gaussian_splatting/configs/colmap.yaml"
python launch.py --config $config_file \
trainer.enable_gui=False \
trainer.datapipeline.dataset.data_path=$scene_path \
trainer.output_path=$output_path \
trainer.datapipeline.batch_size=1 \
trainer.max_steps=20000 \
trainer.val_interval=500 \
trainer.datapipeline.dataset.scale=1 \
trainer.model.renderer.name=MsplatRender \
trainer.model.camera_model.enable_training=True \
trainer.optimizer.optimizer_1.camera_params.lr=1e-3
This occurs while optimizing the camera model. Could you please advise on how to debug or resolve this issue?
Any suggestions would be greatly appreciated. Thanks again!
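For reference, the same class of error can be reproduced outside Pointrix with a toy custom autograd Function whose `backward` returns a gradient shaped `[4]` for an input shaped `[1, 4]`. This is only a minimal sketch to illustrate the error message: `EWAProjectLike` is a hypothetical stand-in, not msplat's actual `_EWAProject` kernel, and the real fix likely belongs in the renderer's backward (e.g. returning a gradient with the same shape as the camera-parameter input).

```python
import torch

class EWAProjectLike(torch.autograd.Function):
    """Hypothetical stand-in for a projection kernel, to show the shape check."""

    @staticmethod
    def forward(ctx, points, extrinsic):  # extrinsic has shape [1, 4]
        return points * extrinsic.sum()

    @staticmethod
    def backward(ctx, grad_out):
        # Bug: gradient shape [4] does not match the input shape [1, 4].
        grad_extrinsic = torch.ones(4)
        # Fix would be to match the input, e.g.:
        # grad_extrinsic = torch.ones(1, 4)
        return grad_out.clone(), grad_extrinsic

points = torch.randn(3, requires_grad=True)
extrinsic = torch.randn(1, 4, requires_grad=True)

err = None
try:
    EWAProjectLike.apply(points, extrinsic).sum().backward()
except RuntimeError as e:
    err = str(e)
print(err)  # "... returned an invalid gradient ... got [4] but expected shape compatible with [1, 4]"
```

So the autograd engine itself is rejecting a gradient whose shape doesn't match the camera-parameter tensor (likely one that lost its leading batch dimension when `batch_size=1`), which suggests inspecting the shapes the renderer's backward returns for the camera inputs.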