Training not converging well, Dataset available #22

@samhodge-aiml

Here are my modifications to the source code:

diff --git a/configs/Test/images.yaml b/configs/Test/images.yaml
index 81a5824..4435fb2 100644
--- a/configs/Test/images.yaml
+++ b/configs/Test/images.yaml
@@ -12,4 +12,5 @@ training:
   auto_scheduler: True
   eval_pose_every: -1
 extract_images:
-  resolution: [540, 960]
\ No newline at end of file
+  resolution: [3024, 4032]
+with_depth: False
diff --git a/configs/preprocess.yaml b/configs/preprocess.yaml
index c56b1fd..d3ec72c 100644
--- a/configs/preprocess.yaml
+++ b/configs/preprocess.yaml
@@ -1,9 +1,9 @@
 depth:
   type: DPT
 dataloading:
-  path: data/nerf_llff_data
-  scene: ['fern']
+  path: data/Test
+  scene: ['images']
   resize_factor: 
   load_colmap_poses: False
 training:
-  mode: 'all'
\ No newline at end of file
+  mode: 'all'
diff --git a/dataloading/dataset.py b/dataloading/dataset.py
index d40af73..846273d 100644
--- a/dataloading/dataset.py
+++ b/dataloading/dataset.py
@@ -82,11 +82,16 @@ class DataField(object):
         _, _, h, w = imgs.shape
 
         if customized_focal:
-            focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+            #focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+            FX_ = 13/35.0
+            CX_ = 4032
+            CY_ = 3024
+            FY_= FX_*(CY_/CX_)
+            focal_gt = [[FX_, 0, CX_], [0, FY_, CY_], [0, 0, 1]]
             if resize_factor is None:
                 resize_factor = 1
-            fx = focal_gt[0, 0] / resize_factor
-            fy = focal_gt[1, 1] / resize_factor
+            fx = focal_gt[0][0] / resize_factor
+            fy = focal_gt[1][1] / resize_factor
         else:
             if load_colmap_poses:
                 fx, fy = focal, focal
diff --git a/environment.yaml b/environment.yaml
index dfde749..a81a313 100644
--- a/environment.yaml
+++ b/environment.yaml
@@ -4,12 +4,13 @@ channels:
   - conda-forge
   - anaconda
   - defaults
+  - nvidia
 dependencies:
-  - python=3.9
-  - pytorch=1.7
-  - torchvision=0.8.2 
-  - torchaudio 
-  - cudatoolkit=10.1
+  - python
+  - pytorch=2.0.0
+  - torchvision=0.15.0
+  - torchaudio=2.0.0
+  - pytorch-cuda=11.8
   - cffi
   - cython
   - imageio
@@ -39,4 +40,4 @@ dependencies:
     - lpips
     - setuptools
     - kornia==0.5.0
-    - imageio-ffmpeg
\ No newline at end of file
+    - imageio-ffmpeg

And here is my dataset:

https://drive.google.com/drive/folders/1ZZgZUrFrnP47rx8bN5K6yvYnSC50a-9G?usp=sharing

To start training, I put the images in

data/Test/images/images

and then ran the preprocess and train commands.

The resulting TensorBoard logs are attached here:

log.zip

Screenshot from 2023-09-03 13-07-27

Is this OK, or did I muck up the intrinsics?

I have attached one of the JPGs so that its EXIF information can be inspected:

6063

Screenshot from 2023-09-03 13-09-33

I think the focal value may be 14 rather than 13, so I will try again.
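
For comparison with the hard-coded values in the diff above, here is a minimal sketch of how a pinhole intrinsics matrix is commonly built from a 35 mm-equivalent focal length. It rests on several assumptions that this issue does not confirm: that the loader expects fx and fy in pixels, that the principal point sits at the image centre, that pixels are square, and that the EXIF value is a 35 mm-equivalent focal length measured against the full-frame diagonal. The function name `intrinsics_from_35mm_equiv` is hypothetical and not part of the repository.

```python
import math


def intrinsics_from_35mm_equiv(f35_mm, width_px, height_px):
    """Return a 3x3 pinhole K matrix as nested lists.

    Assumptions (not confirmed by the repository): square pixels,
    principal point at the image centre, and a 35 mm-equivalent focal
    length defined against the 36 mm x 24 mm full-frame diagonal.
    """
    full_frame_diag_mm = math.hypot(36.0, 24.0)      # about 43.27 mm
    image_diag_px = math.hypot(width_px, height_px)  # image diagonal in pixels
    f_px = f35_mm * image_diag_px / full_frame_diag_mm
    cx = width_px / 2.0   # principal point is half the width, not the width
    cy = height_px / 2.0
    return [[f_px, 0.0, cx],
            [0.0, f_px, cy],
            [0.0, 0.0, 1.0]]


# Values taken from the issue: a 4032 x 3024 image and a focal value of 13.
K = intrinsics_from_35mm_equiv(13.0, 4032, 3024)
print(K)
```

Under these assumptions a value of 13 gives a focal length of roughly 1514 pixels and a principal point of (2016, 1512), whereas the diff sets FX_ to 13/35.0 (about 0.37) and uses the full image size for CX_ and CY_. If the loader does expect pixel units, that difference may be worth checking before comparing 13 against 14.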
