
Error During Camera Pose Optimization #10

@angchen-dev

Thank you for the excellent work and for sharing this framework!

I'm encountering the following error during camera optimization while using the tutorial code:

Training Progress ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% ?      0/ 20000 0:00:00 --:--  ╭
│ /pointrix/examples/gaussian_splatting/launch.py:25 in     │
│ main                                                                                             │
│                                                                                                  │
│   22 │   │   │   │   │   │   │   cfg.name                                                        │
│   23 │   │   │   │   │   │   │   )                                                               │
│   24 │   │   if cfg.trainer.training:                                                            │
│ ❱ 25 │   │   │   gaussian_trainer.train_loop()                                                   │
│   26 │   │   │   model_path = os.path.join(                                                      │
│   27 │   │   │   │   cfg.exp_dir,                                                                │
│   28 │   │   │   │   "chkpnt" + str(gaussian_trainer.global_step) + ".pth"                       │
│                                                                                                  │
│ /pointrix/pointrix/engine/default_trainer.py:48 in        │
│ train_loop                                                                                       │
│                                                                                                  │
│    45 │   │   │   # update learning rate                                                         │
│    46 │   │   │   self.schedulers.step(self.global_step, self.optimizer)                         │
│    47 │   │   │   # model forward step                                                           │
│ ❱  48 │   │   │   self.train_step(batch)                                                         │
│    49 │   │   │   # update optimizer and densify point cloud                                     │
│    50 │   │   │   with torch.no_grad():                                                          │
│    51 │   │   │   │   self.controller.f_step(**self.optimizer_dict)                              │
│                                                                                                  │
│/pointrix/pointrix/engine/default_trainer.py:81 in        │
│ train_step                                                                                       │
│                                                                                                  │
│    78 │   │   # }                                                                                │
│    79 │   │                                                                                      │
│    80 │   │   self.loss_dict = self.model.get_loss_dict(render_results, batch, step=self.globa   │
│ ❱  81 │   │   self.loss_dict['loss'].backward()                                                  │
│    82 │   │   # structure of optimizer_dict: {}                                                  │
│    83 │   │   # example of optimizer_dict = {                                                    │
│    84 │   │   #   "loss": loss,                                                                  │
│                                                                                                  │
│ /opt/anaconda3/envs/pointrix/lib/python3.9/site-packages/torch/_tensor.py:525 in backward        │
│                                                                                                  │
│    522 │   │   │   │   create_graph=create_graph,                                                │
│    523 │   │   │   │   inputs=inputs,                                                            │
│    524 │   │   │   )                                                                             │
│ ❱  525 │   │   torch.autograd.backward(                                                          │
│    526 │   │   │   self, gradient, retain_graph, create_graph, inputs=inputs                     │
│    527 │   │   )                                                                                 │
│    528                                                                                           │
│                                                                                                  │
│ /opt/anaconda3/envs/pointrix/lib/python3.9/site-packages/torch/autograd/__init__.py:267 in       │
│ backward                                                                                         │
│                                                                                                  │
│   264 │   # The reason we repeat the same comment below is that                                  │
│   265 │   # some Python versions print out the first line of a multi-line function               │
│   266 │   # calls in the traceback and some print out the last line                              │
│ ❱ 267 │   _engine_run_backward(                                                                  │
│   268 │   │   tensors,                                                                           │
│   269 │   │   grad_tensors_,                                                                     │
│   270 │   │   retain_graph,                                                                      │
│                                                                                                  │
│ /opt/anaconda3/envs/pointrix/lib/python3.9/site-packages/torch/autograd/graph.py:744 in          │
│ _engine_run_backward                                                                             │
│                                                                                                  │
│   741 │   if attach_logging_hooks:                                                               │
│   742 │   │   unregister_hooks = _register_logging_hooks_on_whole_graph(t_outputs)               │
│   743 │   try:                                                                                   │
│ ❱ 744 │   │   return Variable._execution_engine.run_backward(  # Calls into the C++ engine to    │
│   745 │   │   │   t_outputs, *args, **kwargs                                                     │
│   746 │   │   )  # Calls into the C++ engine to run the backward pass                            │
│   747 │   finally:                                                                               │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Function _EWAProjectBackward returned an invalid gradient at index 2 - got [4] but expected shape compatible with [1, 4]
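
For context, my understanding is that this class of PyTorch error means a custom autograd Function's backward() returned a gradient whose shape cannot be matched to the corresponding forward input (index 2 presumably being the third forward input of _EWAProject). A standalone toy sketch, purely for illustration and not Pointrix/msplat code, that reproduces the same kind of message:

import torch

# Toy illustration: backward() drops the batch dimension, returning a [4]
# gradient for a [1, 4] input (e.g. a batched camera quaternion).
class ToyProject(torch.autograd.Function):
    @staticmethod
    def forward(ctx, quat):  # quat: shape [1, 4]
        return quat.sum()

    @staticmethod
    def backward(ctx, grad_out):
        # Bug: shape [4] instead of [1, 4]
        return grad_out * torch.ones(4)

quat = torch.randn(1, 4, requires_grad=True)
ToyProject.apply(quat).backward()
# Raises something like:
# RuntimeError: Function ToyProjectBackward returned an invalid gradient at
# index 0 - got [4] but expected shape compatible with [1, 4]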

The command I used:

scene_path="./datasets/Tanks/Family"
output_path="./outputs/Family"
config_file="./pointrix/examples/gaussian_splatting/configs/colmap.yaml"

python launch.py --config $config_file \
    trainer.enable_gui=False \
    trainer.datapipeline.dataset.data_path=$scene_path \
    trainer.output_path=$output_path \
    trainer.datapipeline.batch_size=1 \
    trainer.max_steps=20000 \
    trainer.val_interval=500 \
    trainer.datapipeline.dataset.scale=1 \
    trainer.model.renderer.name=MsplatRender \
    trainer.model.camera_model.enable_training=True \
    trainer.optimizer.optimizer_1.camera_params.lr=1e-3

The error occurs while optimizing the camera model (with trainer.model.camera_model.enable_training=True). Could you please advise on how to debug or resolve this issue?
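
A debugging sketch I could try, if it helps narrow things down (the attribute names and the "camera" filter below are guesses at the Pointrix objects, not actual API):

import torch

# Anomaly detection localizes the failing backward node; parameter hooks
# print the gradient shapes actually produced, to compare against the [4]
# reported by _EWAProjectBackward.
torch.autograd.set_detect_anomaly(True)

for name, p in gaussian_trainer.model.named_parameters():
    if "camera" in name and p.requires_grad:
        print(name, tuple(p.shape))
        p.register_hook(lambda g, n=name: print(n, "grad shape", tuple(g.shape)))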

Any suggestions would be greatly appreciated. Thanks again!
