Scaling Down NeRF with NeRF2D

For more information, see the Blog Post

Code for NeRF2D, a 2D analogue of NeRF designed to facilitate experimentation with Neural Radiance Fields and novel view synthesis algorithms.

NeRF2D is trained on 1D views of a 2D scene and learns to reconstruct a 2D radiance field. This is the direct analogue of 3D novel view synthesis, but it requires far less compute and is much easier to understand and visualize:

[Figure: 1D views of a 2D scene, the 2D analogue of multi-view images of a 3D scene]

We show that NeRF can be reformulated in 2D by reconstructing a 2D shape from 1D views of it. Fitting a 2D NeRF is very fast, and we propose this as a viable toy setting for quick experimentation on NeRF.
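To make the reformulation concrete, here is a minimal sketch of volume rendering for a single pixel of a 1D view, adapting NeRF's quadrature rule to 2D. The `field` callable and all names are illustrative assumptions, not this repository's API:

```python
# Minimal sketch of 2D volume rendering; names and shapes are illustrative,
# not the exact API of this repository.
import torch

def render_ray_2d(field, origin, direction, near=0.1, far=4.0, n_samples=64):
    """Render one pixel of a 1D view by marching a ray through 2D space.

    field: callable mapping (N, 2) xy-coordinates to (density (N,), rgb (N, 3)).
    """
    # Sample points along the ray, as in NeRF's quadrature approximation
    t = torch.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction          # (n_samples, 2)
    sigma, rgb = field(pts)

    # Convert densities to alpha values over each ray segment
    delta = t[1:] - t[:-1]
    delta = torch.cat([delta, delta[-1:]])
    alpha = 1.0 - torch.exp(-sigma * delta)

    # Transmittance: probability the ray reaches each sample unoccluded
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10]), dim=0)[:-1]
    weights = alpha * trans                        # (n_samples,)

    return (weights[:, None] * rgb).sum(dim=0)     # composited pixel color
```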

Generating 2D Novel View Synthesis Datasets

To train a 2D NeRF, we need a 1D multi-view dataset. Since these are not readily available, we include a Blender add-on for rendering 1D images of an object:

[Animation: rendering 1D views of an object in Blender]
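As a rough illustration of the idea (a sketch, not the add-on's actual code), a script along these lines, run inside Blender, renders 1-pixel-tall images from cameras placed on a circle around the object:

```python
# Hypothetical sketch: render 1-pixel-tall images from cameras on a circle
# around the object. Run inside Blender; paths and counts are assumptions.
import math
import bpy
from mathutils import Vector

scene = bpy.context.scene
scene.render.resolution_x = 100   # width of each 1D view
scene.render.resolution_y = 1     # a single row of pixels

cam = scene.camera
target = Vector((0.0, 0.0, 0.0))  # assume the object sits at the origin
n_views, radius = 50, 3.0

for i in range(n_views):
    angle = 2 * math.pi * i / n_views
    cam.location = Vector((radius * math.cos(angle),
                           radius * math.sin(angle), 0.0))
    # Point the camera at the object
    cam.rotation_euler = (target - cam.location).to_track_quat('-Z', 'Y').to_euler()
    scene.render.filepath = f"//views/train_{i:03d}.png"
    bpy.ops.render.render(write_still=True)
```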

With the add-on, we can generate training, validation, and testing datasets for any object, with different distributions of camera poses.

[Figure: an example scene with its generated camera poses]

Since each training view is just a single row of pixels, we can visualize all of the views together by stacking them, one view per row, into a single image. For the example scene above, we get the following:

[Figure: all training views stacked into a single image]
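A visualization like this can be produced with a few lines of NumPy and matplotlib; the file names below follow the hypothetical script above:

```python
# Stack the 1D views as rows of a single image. The file layout is an
# assumption, not necessarily the repository's exact format.
import numpy as np
import imageio.v3 as iio
import matplotlib.pyplot as plt

views = [iio.imread(f"views/train_{i:03d}.png") for i in range(50)]
stacked = np.concatenate(views, axis=0)  # each view is (1, W, 3) -> (50, W, 3)

plt.imshow(stacked, aspect="auto")
plt.xlabel("pixel")
plt.ylabel("view index")
plt.show()
```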

Experiments

We perform experiments on four testing scenes:

[Figure: the four testing scenes]

2D NeRF

We fit a 2D NeRF on 50 views, each 100 pixels wide, in under a minute. Below we show the reconstructed testing views after training on each scene:

[Figure: reconstructed testing views for each scene]
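As a rough sketch of what a training iteration looks like (reusing the hypothetical `field` and `render_ray_2d` from the rendering sketch above; the batching here is simplified, not the repository's actual loop):

```python
# Illustrative training loop; names and data layout are assumptions.
import torch

optimizer = torch.optim.Adam(field.parameters(), lr=5e-4)

for step in range(2000):
    # sample_ray_batch is a hypothetical loader yielding ray origins,
    # directions, and the ground-truth pixel colors from the 1D views
    origins, directions, target_pixels = sample_ray_batch()
    pred = torch.stack([render_ray_2d(field, o, d)
                        for o, d in zip(origins, directions)])
    loss = torch.nn.functional.mse_loss(pred, target_pixels)  # photometric loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```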

Additionally, since we are working in 2D space, we can visualize the density field by uniformly sampling $x,y$ coordinates and querying the density field over space, enabling us to visualize the reconstructed geometry:

[Figure: reconstructed density fields for each scene]
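The visualization itself only requires querying the field on a uniform grid; a sketch, reusing the hypothetical `field` from above, with arbitrary bounds and resolution:

```python
# Query the density field on a uniform xy grid and plot it as an image.
import torch
import matplotlib.pyplot as plt

res = 256
xs = torch.linspace(-1.5, 1.5, res)
xy = torch.stack(torch.meshgrid(xs, xs, indexing="xy"), dim=-1).reshape(-1, 2)

with torch.no_grad():
    sigma, _ = field(xy)          # same hypothetical field as above

plt.imshow(sigma.reshape(res, res), origin="lower", cmap="gray")
plt.show()
```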

Positional Encoding

In NeRF, a critical component of its success was the use of positional encoding. The spectral bias of neural networks makes it difficult for them to express high-frequency spatial functions. The NeRF authors found that a simple solution is to pass the coordinates through a positional encoding $\gamma$ before feeding them to the MLP:

$$\gamma(p) = \left(\sin(2^0 \pi p), \cos(2^0 \pi p), \ldots, \sin(2^{L-1} \pi p), \cos(2^{L-1} \pi p)\right)$$

where $\gamma(p)$ is a series of $L$ alternating sine and cosine terms of $p$ with exponentially increasing frequencies, and $L$ is a hyperparameter. In NeRF this proved critical: without it, the model only fits a blurry version of the scene:

[Figure: NeRF reconstruction with and without positional encoding]
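For reference, here is a minimal implementation of the encoding above; it is a sketch of the standard formulation, not necessarily the repository's exact code:

```python
# Encode each coordinate as L sine/cosine pairs with frequencies
# pi * 2^0 .. pi * 2^(L-1), matching the formula above.
import math
import torch

def positional_encoding(p, L):
    freqs = (2.0 ** torch.arange(L)) * math.pi        # (L,)
    angles = p[..., None] * freqs                     # (..., dim, L)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1).flatten(-2)
```

With $L=1$, this reduces to a single $(\sin(\pi p), \cos(\pi p))$ pair per coordinate.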

We validated this in NeRF2D on the "Bunny" scene and, unsurprisingly, found that without positional encoding ($L=1$) the learned density field is very low-frequency. As we increase $L$, we can fit higher-frequency signals, which additionally leads to increased PSNR:

[Figure: reconstructed density fields and PSNR for increasing values of $L$]
