3d-reconstruction

Currently working on migrating to Python!

How to run

Run

builder.m

on matlab, no parameter required.

There are 2 script files:

builder.m : Calls image2points and plots the result
image2points.m : script that does the actual reconstruction

And input files:

habitat-img/* : Habitat output depth image files, generated using examples.py from Habitat Sim (only 50 entries)
parsedOutput.csv : Contains (x, y, z, qa, qb, qc, qd) data from habitat output (only 50 entries)

Data files not used in code, but is uploaded to repo:

testOutput.txt : Contains habitat output

3d-reconstruction

Samples -> NDC coord

Samples from habitat are obtained as images: (row j, col i, depth)

For any pixel in image, we can get coordinates

$$NDC \ Coords: \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix} \ , \ \ x_n, y_n \in [-1, 1] \ , \ \ z_n \in [0, 1]$$

Where x: right, y: up, z: towards screen and camera located at (0, 0, 0).

Code:

% Sampling pixel (i, j) from image im
% 1 <= i <= w, 1 <= j <= h, 0 <= im(j, i) <= 1.0
x_n(count) = (i - w/2) / w;
y_n(count) = ((-1)*j + h/2)/h;
z_n(count) = im(j, i);

Downsampling & Stepsize

For fast computation the images are downsampled:

% Downsampling using stepSize parameter
count = 0;
x_n = [];
y_n = [];
z_n = [];
for i=1:stepSize:w
	for j=1:stepSize:h
		if (im(j, i) < 0.15 || im(j, i) > 0.85), continue; end
		count = count + 1;
		x_n(count) = (i - w/2)/w;
		y_n(count) = ((-1)*j + h/2)/h;
		z_n(count) = im(j, i);
	end
end
NDC = [x_n; y_n; z_n];

Intrinsics

Input from Habitat: width, height, near, far, hfov

Near: 0.01, Far: 1000.0

Near, Far: float
- CameraSensor Class doc
- Hard coded in src/esp/sensor/VisualSensor.h : here
  - near: 0.01, far: 1000.0

hfov: 90 (degrees)

hfov: degree
- from settings.py

width, height: 640, 480 - from settings.py

Near plane size (screen x, y-length)

screen x-length : $sx = 2 \cdot near \cdot tan(\large\frac{hfov}{2})$

screen y-length: $sy = sx \cdot \large\frac{resolution_x}{resolution_y}$ where resolution: width, height.

width = 640; height = 480;
near = 0.01; far = 1000.0; hfov = 90*(2*PI/180);
s_x = 2 * near * tan(hfov/2);
s_y = s_x * width/height;

Near plane top/bottom/right/left

top: sy/2, bottom: -sy/2, right: sx/2, left: -sx/2

Code:

near = 0.01; far = 1000.0;
[h, w] = size(im);
hfov = 90*(2*pi/360);
s_x = 2 * near * tan(hfov/2);
s_y = s_x * h/w;
r = s_x/2;
t = s_y/2;

Extrinsics

Code to parse habitat output log to csv: [Parse output to csv]('Parse output to csv.md')

(nothing special with this code, just reading habitat output and parsing into csv file)

Outputs 3 values for location and 4 values for rotation quaternion.

Modifying camera location

Our code:

x: right, y: up, z: towards screen

Habitat output:

x: towards screen, y: up, z: right

Modifying quaternion

Hence

modifiedCameraPos = [extrinsics(i,3); extrinsics(i,2); extrinsics(i,1)]';
modifiedQuaternion = [extrinsics(i,4); (-1)*extrinsics(i,7); extrinsics(i,6); (-1)*extrinsics(i,5)]';

Unprojection

Unprojection input

near, far, top, right samples

x: right, y: up, z: towards screen, camera placed at (0,0,0)

Unprojection output (goal)

x, y, z in camera coord.

x: Camera right, y: Camera up, z: Camera front direction, camera placed at (0,0,0)

Formulation

Define coordinates in Camera and NDC:

$$Output = Camera: \begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} , \ \ Input = NDC: \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix}$$

Here Camera is located at the origin for both coordinates, z_c > 0. For NDC, $x_n, y_n \in [-1, 1] , z_n \in [0, 1]$ where x points right, y points up, and z points toward the screen. (Left-handed coord system)

Camera -> NDC direction

Let's define a homogeneous coord version of NDC:

$$\begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix} \equiv \begin{bmatrix} x_n' \\ y_n' \\ z_n' \\ w_n' \end{bmatrix} = ProjMat \cdot \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} \ , \ \ w_n' = z_c$$

where $\equiv$ represents equivalence in homogeneous coordinates and $=$ represents element-wise equality of vectors. Reason for choice of $w_n'$ is in the construction of ProjMat.

Perspective Projection on near Plane

As a step for constructing ProjMat Let's define Proj on near plane: homog coord equivalent to NDC where z_p is equal to n (z_p = n > 0).

By drawing triangles

we get the relation

$$\begin{bmatrix} x_p \\ y_p \\ z_p \\ w_p \end{bmatrix} = \begin{bmatrix} x_c \cdot (\frac{n}{z_c}) \\ y_c \cdot (\frac{n}{z_c}) \\ n \\ (\frac{n}{z_c}) \end{bmatrix} = \begin{bmatrix} x_c \\ y_c \\ z_c \ 1 \end{bmatrix} \cdot (\frac{n}{z_c}) \ \ \ \ \ \ \ \ \ \ \ \ (1)$$

(Note that z_c varies for each points)

Since we chose $x_n, y_n \in [-1, 1] , z_n \in [0, 1]$, we need a proper projection matrix.

we get the relation

$$ x_n = \frac{1}{r} \cdot x_p \ , \ \ \ y_n = \frac{1}{t} \cdot y_p \ \ \ \ \ \ \ \ \ \ \ \ (2) $$

Substituting (1) for (2), we get

$$ x_n = \frac{1}{r} \cdot (\frac{n}{z_c}) \cdot x_c \ , \ \ \\ y_n = \frac{1}{r} \cdot (\frac{n}{z_c}) \cdot y_c \ \ \ \ \ \ \ \ \ \ \ \ (3) $$

We need a way to divide x_n and y_n by z_c, so we decide to use homogeneous coordinates and set the w_n' term equal to z_c. The resulting projection matrix would be:

$$ \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix} \equiv \begin{bmatrix} x_n' \\ y_n' \\ z_n' \\ w_n' \end{bmatrix} = \begin{bmatrix} \frac{n}{r} & 0 & 0 & 0 \\ 0 & \frac{n}{t} & 0 & 0 \\ ? & ? & ? & ? \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} \ , \ \ w_n' = z_c $$

z_n' term: We know that z_n' doesn't depend on x_c or y_c, so we can set the corresponding elements in the matrix to 0.

$$ \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix} \equiv \begin{bmatrix} x_n' \\ y_n' \\ z_n' \\ w_n' \end{bmatrix} = \begin{bmatrix} \frac{n}{r} & 0 & 0 & 0 \\ 0 & \frac{n}{t} & 0 & 0 \\ 0 & 0 & A & B \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} $$

Solving for A and B: By projection matrix,

$$ z_n = \frac{z_n'}{w_n'} = \frac{A \cdot z_c + B}{z_c} $$

As we decide to map $z_c \in [n, f]$ to $z_n \in [0, 1]$ linearly, we can obtain the relation as

$$ 0 = \frac{A \cdot n + B}{n} \ , \ \ 1 = \frac{A \cdot f + B}{f} $$

Gives us the result

$$ A = \frac{f}{f-n} \ , \ \ B = -\frac{nf}{f-n} $$

So we can complete the projection matrix

$$ \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix} \equiv \begin{bmatrix} x_n' \\ y_n' \\ z_n' \\ w_n' \end{bmatrix} = \begin{bmatrix} \frac{n}{r} & 0 & 0 & 0 \\ 0 & \frac{n}{t} & 0 & 0 \\ 0 & 0 & \frac{f}{f-n} & -\frac{nf}{f-n} \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} $$

NDC to Camera reconstruction

Since our goal is to obtain the camera coordinates from NDC, we have to go the opposite direction.

However

$$ \begin{bmatrix} x_n' \\ y_n' \\ z_n' \\ w_n' \end{bmatrix} = \begin{bmatrix} x_n w_n \\ y_n w_n \\ z_n w_n \\ w_n \end{bmatrix} \ , \ \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} = inv(\begin{bmatrix} \frac{n}{r} & 0 & 0 & 0 \\ 0 & \frac{n}{t} & 0 & 0 \\ 0 & 0 & \frac{f}{f-n} & -\frac{nf}{f-n} \\ 0 & 0 & 1 & 0 \end{bmatrix}) \begin{bmatrix} x_n' \\ y_n' \\ z_n' \\ w_n' \end{bmatrix} $$

is impossible from the first step because we don't know w_n'. In order to get z_c we need w_n', but we need z_c to get w_n'.

This can be solved by using the relationship

$$ z_n = \frac{A \cdot z_c + B}{z_c} $$

and bringing z_c to LHS gives

$$ z_c = \frac{B}{z_n + A} = \frac{-\frac{nf}{f-n}}{z_n + \frac{f}{f-n}} $$

which the RHS is fully known.

Conclusion

Using

$$ Output = Camera: \begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} , \ \ Input = NDC: \begin{bmatrix} x_n \\ y_n \\ z_n \end{bmatrix} $$

Calculate in order:

$$ w_n' = \frac{-\frac{nf}{f-n}}{z_n + \frac{f}{f-n}} \ , \ \\ \begin{bmatrix} x_n' \\ y_n' \\ z_n' \\ w_n' \end{bmatrix} = \begin{bmatrix} x_n w_n \\ y_n w_n \\ z_n w_n \\ w_n \end{bmatrix} \ , \ \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} = inv(\begin{bmatrix} \frac{n}{r} & 0 & 0 & 0 \\ 0 & \frac{n}{t} & 0 & 0 \\ 0 & 0 & \frac{f}{f-n} & -\frac{nf}{f-n} \\ 0 & 0 & 1 & 0 \end{bmatrix}) \begin{bmatrix} x_n' \\ y_n' \\ z_n' \\ w_n' \end{bmatrix} $$

Code:

% A and B components used in Projection Matrix
A_comp = (far)/(far-near);
B_comp = near*far/(far-near)*(-1);

% NDC mapped to frustum - near plane
w_n_prime = z_n + A_comp;
w_n_prime = 1./w_n_prime;
w_n_prime = w_n_prime * B_comp;

% 2. Plane coordinates
planeProj = [x_n.*w_n_prime; y_n.*w_n_prime; z_n.*w_n_prime; w_n_prime];

% Inverse projection
projMat = [(near/r), 0,0,0; ...
	0, (near/t), 0,0; ...
	0,0, A_comp, B_comp; ...
	0,0,1,0];

cameraSpaceCoords = projMat\planeProj;

Reverting rotation & translation

Building rotation matrix

Rotation matrix can be obtained by matlab's built in rotmat function Code:

% Revert rotation
% building rotation matrix
quatVector = quatnormalize(quat);
imquat = quaternion(quatVector);
rotationMat = rotmat(imquat, 'point'); % 3x3
rotationMat = [rotationMat, zeros(3,1); zeros(1,3), 1]; % 4x4

% get coord before rotation
beforeRotCoords = rotationMat\cameraSpaceCoords;

Building translation matrix

Code:

% Revert translation
moveMat = eye(4);
moveMat(1, 4) = cameraPos(1);
moveMat(2, 4) = cameraPos(2);
moveMat(3, 4) = cameraPos(3);

% get coord before translation
beforeMove = moveMat\beforeRotCoords;

And finally converting homogeneous coord to cartesian coord

result = hom2cart(beforeMove')';

We can complete the image -> points function.

Plotting

Combing all resulting points into 1 array:

<loop>
	results = % Call image2points function
		
	% Concatenate results into 1 big 3*n array
	all_points = [all_points points];

Plotting coords in World coord: Matlab uses y direction up:

scatter3(all_points(1,:), all_points(3,:), all_points(2,:), '.b');
xlabel('x'); ylabel('z'); zlabel('y');

In builder.m, we produce 4 plots

Plot 1) All sample points
Plot 2) Depth image of choice
Plot 3) Corresponding samples in world coord
Plot 4) Corresponding samples in camera coord

We expect that:

Plot 1 will be point cloud of samples in world coord, reconstructing the whole sample scene.
Plot 3 will be the reconstruction of Plot 2
Plot 4 will be the reconstruction of Plot 2, but in camera space.

Result

???

Where did it go wrong?

The depth value is scaled here

https://github.com/facebookresearch/habitat-sim/blob/main/examples/demo_runner.py#L73

Maybe I need some extra steps?
Projection matrix may be wrong?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

How to run

3d-reconstruction

Samples -> NDC coord

Downsampling & Stepsize

Intrinsics

Input from Habitat: width, height, near, far, hfov

Near plane size (screen x, y-length)

Near plane top/bottom/right/left

Extrinsics

Modifying camera location

Modifying quaternion

Unprojection

Unprojection input

Unprojection output (goal)

Formulation

Camera -> NDC direction

Perspective Projection on near Plane

NDC to Camera reconstruction

Conclusion

Reverting rotation & translation

Building rotation matrix

Building translation matrix

Plotting

Result

Where did it go wrong?

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
habitat-img		habitat-img
img		img
Parse output to csv.md		Parse output to csv.md
README.md		README.md
builder.m		builder.m
image2points.m		image2points.m
parsedOutput.csv		parsedOutput.csv
testOutput.txt		testOutput.txt

Folders and files

Latest commit

History

Repository files navigation

How to run

3d-reconstruction

Samples -> NDC coord

Downsampling & Stepsize

Intrinsics

Input from Habitat: width, height, near, far, hfov

Near plane size (screen x, y-length)

Near plane top/bottom/right/left

Extrinsics

Modifying camera location

Modifying quaternion

Unprojection

Unprojection input

Unprojection output (goal)

Formulation

Camera -> NDC direction

Perspective Projection on near Plane

NDC to Camera reconstruction

Conclusion

Reverting rotation & translation

Building rotation matrix

Building translation matrix

Plotting

Result

Where did it go wrong?

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages