40 changes: 34 additions & 6 deletions README.md
@@ -1,18 +1,46 @@
CUDA Rasterizer
===============

[CLICK ME FOR INSTRUCTION OF THIS PROJECT](./INSTRUCTION.md)

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Wenli Zhao
* Tested on: Windows 10 Pro, Intel Xeon CPU E5-1630 v4 @ 3.70GHz 32GB, NVIDIA GeForce GTX 24465MB (Sig Lab)

### (TODO: Your README)
### README

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
![](renders/Capture2.PNG)

In this project, I implemented a simplified rasterization pipeline that includes vertex shading, primitive assembly, rasterization, fragment shading, and a framebuffer.

The core features I implemented include:
* Vertex shading.
* Primitive assembly with support for triangles read from buffers of index and
vertex data.
* Rasterization.
* Fragment shading.
* A depth buffer for storing and depth testing fragments.
* Fragment-to-depth-buffer writing with atomics.
* A fragment shader with a simple Blinn-Phong lighting scheme.
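
The atomic depth write listed above can be illustrated with a small CPU-side sketch. This is not the project's kernel code: `DepthBuffer`, `testAndSet`, `quantizeDepth`, and `kDepthScale` are hypothetical names, and `std::atomic` with a compare-and-swap loop stands in for what a CUDA kernel would typically do with `atomicMin` on an integer depth buffer.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical scale used to turn a normalized depth into an integer key.
constexpr float kDepthScale = 1000000.0f;

// Map a depth in [0, 1] to an integer so it can be compared atomically.
inline int quantizeDepth(float z) {
    return static_cast<int>(z * kDepthScale);
}

// Hypothetical per-pixel integer depth buffer (row-major).
struct DepthBuffer {
    int width;
    int height;
    std::vector<std::atomic<int>> depth;

    DepthBuffer(int w, int h)
        : width(w), height(h), depth(static_cast<std::size_t>(w) * h) {
        // Start every pixel at the farthest representable depth.
        for (auto& d : depth) d.store(INT_MAX);
    }

    // Atomically lower the stored depth to z if z is nearer.
    // Returns true when this fragment became the nearest one so far.
    bool testAndSet(int x, int y, int z) {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        std::atomic<int>& slot = depth[static_cast<std::size_t>(y) * width + x];
        int current = slot.load();
        while (z < current) {
            // On failure, current is refreshed with the latest stored value.
            if (slot.compare_exchange_weak(current, z)) return true;
        }
        return false;
    }
};
```

A fragment would call `testAndSet(x, y, quantizeDepth(z))` and only be kept if the call returns `true`. Note that winning the depth test and writing the fragment's color are two separate steps, so a real pipeline still needs some way to keep them consistent (for example a per-pixel lock or a second pass).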

In addition to the basic rasterizer, I implemented UV texture mapping and support for rasterization with points and lines.

#### Texture Mapping
![](renders/Capture3.PNG)
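
As a rough illustration of UV texture mapping, the sketch below interpolates per-vertex UVs with barycentric weights and converts the result to a nearest-texel index. All names (`Vec2`, `edge`, `barycentric`, `interpolateUV`, `texelIndex`) are hypothetical, and the interpolation is done in screen space without perspective correction, which may differ from what the project does.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Twice the signed area of triangle (a, b, c).
inline float edge(const Vec2& a, const Vec2& b, const Vec2& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Barycentric weights of point p with respect to triangle (a, b, c).
inline std::array<float, 3> barycentric(const Vec2& a, const Vec2& b,
                                        const Vec2& c, const Vec2& p) {
    float area = edge(a, b, c);
    assert(area != 0.0f);  // degenerate triangles are not handled here
    return { edge(b, c, p) / area, edge(c, a, p) / area, edge(a, b, p) / area };
}

// Weighted sum of the three vertex UVs (affine, not perspective-correct).
inline Vec2 interpolateUV(const std::array<float, 3>& w,
                          const Vec2& uv0, const Vec2& uv1, const Vec2& uv2) {
    return { w[0] * uv0.x + w[1] * uv1.x + w[2] * uv2.x,
             w[0] * uv0.y + w[1] * uv1.y + w[2] * uv2.y };
}

// Nearest-texel index into a row-major texture, clamped to the texture edges.
inline int texelIndex(const Vec2& uv, int texWidth, int texHeight) {
    int tx = std::max(0, std::min(static_cast<int>(uv.x * texWidth), texWidth - 1));
    int ty = std::max(0, std::min(static_cast<int>(uv.y * texHeight), texHeight - 1));
    return ty * texWidth + tx;
}
```

The fragment shader would then read the texture color at `texelIndex(uv, texWidth, texHeight)`. Bilinear filtering and perspective-correct interpolation are common refinements that this sketch leaves out.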

#### Points
![](renders/points.PNG)

#### Lines
![](renders/lines.PNG)
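
One common way to rasterize a line segment is a DDA walk between its two endpoints. The sketch below shows that technique in isolation; `rasterizeLine` is a hypothetical helper, and the README does not say which line algorithm the project actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <utility>
#include <vector>

// DDA line rasterization: step along the longer axis one pixel at a time
// and round the other coordinate to the nearest pixel.
inline std::vector<std::pair<int, int>> rasterizeLine(int x0, int y0, int x1, int y1) {
    std::vector<std::pair<int, int>> pixels;
    int dx = x1 - x0;
    int dy = y1 - y0;
    int steps = std::max(std::abs(dx), std::abs(dy));
    if (steps == 0) {
        // Both endpoints coincide, so the line degenerates to a single point.
        pixels.emplace_back(x0, y0);
        return pixels;
    }
    float xInc = static_cast<float>(dx) / static_cast<float>(steps);
    float yInc = static_cast<float>(dy) / static_cast<float>(steps);
    float x = static_cast<float>(x0);
    float y = static_cast<float>(y0);
    for (int i = 0; i <= steps; ++i) {
        pixels.emplace_back(static_cast<int>(std::lround(x)),
                            static_cast<int>(std::lround(y)));
        x += xInc;
        y += yInc;
    }
    return pixels;
}
```

Each returned pixel would then go through the same depth test and fragment shading as a triangle fragment. Point rasterization is the degenerate case in which both endpoints are the same pixel.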

### Analysis

![](renders/chart.png)

![](renders/image.png)

The features I implemented did not have much of a performance impact on the models I tested. For example, the first three bars of the chart have a similar distribution. For rasterization of points and lines, I changed very little in vertex assembly, so the bottleneck remained there: vertex assembly performs a lot of global memory accesses, which slow down the pipeline. In general, fragment shading and rasterization were fairly quick. The cow model gave a very different distribution. I am still not sure why, but I may have corrupted the model. There is a lot more I could do to accelerate various parts of the rasterization pipeline. For example, I could potentially use shared memory for texture sampling and use tile-based rendering.
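
To make the tile-based idea concrete, here is a minimal sketch of the binning step, which decides which screen tiles a primitive's bounding box overlaps. It is not part of this project: `TileRange`, `pixelToTile`, `tilesForBounds`, and `tileCount` are hypothetical names, and the tile size is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Inclusive range of tile indices covered by a primitive.
struct TileRange { int minTx, minTy, maxTx, maxTy; };

// Convert a pixel coordinate to a tile index, clamping to the screen first.
inline int pixelToTile(float coord, int extent, int tileSize) {
    int pixel = static_cast<int>(std::floor(coord));
    pixel = std::max(0, std::min(pixel, extent - 1));
    return pixel / tileSize;
}

// Tiles overlapped by a screen-space bounding box.
// Primitives that lie entirely off screen should be culled before this call,
// because the clamping would otherwise assign them to border tiles.
inline TileRange tilesForBounds(float minX, float minY, float maxX, float maxY,
                                int width, int height, int tileSize) {
    assert(width > 0 && height > 0 && tileSize > 0);
    return { pixelToTile(minX, width, tileSize),
             pixelToTile(minY, height, tileSize),
             pixelToTile(maxX, width, tileSize),
             pixelToTile(maxY, height, tileSize) };
}

// Number of tiles in an inclusive tile range.
inline int tileCount(const TileRange& r) {
    return (r.maxTx - r.minTx + 1) * (r.maxTy - r.minTy + 1);
}
```

In a tile-based rasterizer, each tile would keep a list of the primitives binned to it, and one thread block could then process a tile with its depth and color data held in shared memory instead of global memory.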

### Credits

Binary file added renders/Capture.PNG
Binary file added renders/Capture2.PNG
Binary file added renders/Capture3.PNG
Binary file added renders/chart.png
Binary file added renders/image.png
Binary file added renders/lines.PNG
Binary file added renders/points.PNG
2 changes: 2 additions & 0 deletions src/main.cpp
@@ -108,6 +108,8 @@ void runCuda() {
scale * ((float)width / (float)height),
-scale, scale, 1.0, 1000.0);

P = glm::perspective(45.0f, scale*(float)width / (float)height, 1.0f, 1000.0f);

glm::mat4 V = glm::mat4(1.0f);

glm::mat4 M =