12 changes: 6 additions & 6 deletions CMakeLists.txt
@@ -2,6 +2,12 @@ cmake_minimum_required(VERSION 3.0)

project(cis565_rasterizer)

+# CUDA linker options
+find_package(Threads REQUIRED)
+find_package(CUDA 8.0 REQUIRED)
+set(CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE ON)
+set(CUDA_SEPARABLE_COMPILATION ON)
+
set(CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake" ${CMAKE_MODULE_PATH})

# Set up include and lib paths
@@ -75,12 +81,6 @@ if (WIN32)
list(APPEND CORELIBS legacy_stdio_definitions.lib)
endif()

-# CUDA linker options
-find_package(Threads REQUIRED)
-find_package(CUDA 8.0 REQUIRED)
-set(CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE ON)
-set(CUDA_SEPARABLE_COMPILATION ON)
-
#add_subdirectory(stream_compaction) # TODO: uncomment if using your own stream compaction
add_subdirectory(src)
add_subdirectory(util)
47 changes: 42 additions & 5 deletions README.md
@@ -5,14 +5,51 @@ CUDA Rasterizer

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Joseph Klinger
* Tested on: Windows 10, i5-7300HQ (4 CPUs) @ ~2.50GHz, GTX 1050 6030MB (Personal Machine)

### (TODO: Your README)
### README

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
This week, I took on the task of implementing a rasterizer in CUDA. I have already written a CPU rasterizer (almost 2 years ago, in the introductory
graphics course CIS 460), but implementing a basic graphics pipeline on the GPU was a different beast.

The features included in this rasterizer are:
- Texture mapping
- Supersampling antialiasing (SSAA)
- Color interpolation across triangles

[Demo video here.](https://vimeo.com/238849683)

Rasterization, in very brief summary, means taking a 3D shape and deciding how to color the pixels that the shape overlaps. In this project, that involves transforming
the input glTF models' vertex data, assembling triangles from that data, projecting the triangles through view -> clip -> NDC -> viewport (screen) space, computing line
intersections with the edges of each triangle, and shading the overlapping fragments.
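
As a rough illustration of the clip -> NDC -> viewport step described above, here is a minimal C++ sketch. The names (`Vec3`, `Vec4`, `clipToNdc`, `ndcToViewport`) are hypothetical and not taken from this project's source, and the y flip (NDC +y mapped to the top row of the image) is an assumed convention.

```cpp
#include <cassert>

// Hypothetical minimal vector types, used for illustration only.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Perspective divide: clip space -> normalized device coordinates (NDC).
Vec3 clipToNdc(const Vec4& c) {
    assert(c.w != 0.0f);  // a real pipeline would clip before dividing
    return { c.x / c.w, c.y / c.w, c.z / c.w };
}

// NDC in [-1, 1] -> pixel coordinates.
// Assumption: NDC +y maps to the top of the image, so y is flipped.
Vec3 ndcToViewport(const Vec3& n, int width, int height) {
    return { (n.x + 1.0f) * 0.5f * static_cast<float>(width),
             (1.0f - n.y) * 0.5f * static_cast<float>(height),
             n.z };  // depth is passed through for the depth test
}
```

With an 800x600 viewport, the NDC origin lands at the center pixel (400, 300), and NDC (-1, 1) lands at the top-left corner (0, 0).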

Here is an image of the provided Duck glTF model rasterized with texture mapping:

![](/renders/duck_noaa.PNG)

For comparison, here is the same Duck rendered with SSAA (supersampling antialiasing). This process involves simply rendering to an image of higher resolution than the
screen, then downsampling that image into the final one:

![](/renders/duck_ssaa.PNG)
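
The downsampling step can be sketched on the CPU as a simple box filter. This is only an illustration under stated assumptions: the `Color` type, the `downsample` function, and the row-major buffer layout are hypothetical, and the box filter is one possible choice rather than necessarily the filter used in this project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical RGB color type, used for illustration only.
struct Color { float r, g, b; };

// Box-filter downsample: each output pixel is the average of the
// factor x factor block of samples it covers in the supersampled image.
// `hi` is assumed to be row-major with dimensions (width*factor) x (height*factor).
std::vector<Color> downsample(const std::vector<Color>& hi,
                              int width, int height, int factor) {
    assert(width > 0 && height > 0 && factor > 0);
    const int hiWidth = width * factor;
    assert(hi.size() == static_cast<std::size_t>(hiWidth) * height * factor);

    std::vector<Color> out(static_cast<std::size_t>(width) * height);
    const float inv = 1.0f / static_cast<float>(factor * factor);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            Color sum{0.0f, 0.0f, 0.0f};
            for (int sy = 0; sy < factor; ++sy) {
                for (int sx = 0; sx < factor; ++sx) {
                    const Color& s = hi[static_cast<std::size_t>(y * factor + sy) * hiWidth
                                        + (x * factor + sx)];
                    sum.r += s.r; sum.g += s.g; sum.b += s.b;
                }
            }
            out[static_cast<std::size_t>(y) * width + x] = { sum.r * inv, sum.g * inv, sum.b * inv };
        }
    }
    return out;
}
```

On the GPU, the two outer loops would typically become one thread per output pixel, which is why this stage tends to be cheap compared with rasterizing the larger image.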

### Performance Analysis

I benchmarked my rasterizer's performance using the Duck glTF model, which has ~4000 triangles, at close-up and far zoom levels. Here are the results:

![](/renders/graph1.png)

![](/renders/graph2.png)

As we can see, rasterization is by far the most expensive stage, well ahead of vertex transform, primitive assembly, fragment shading, and downsampling.
Additionally, SSAA, as expected, makes rasterization much more costly because we have to render to an image twice the size of the final one,
so more fragments must be computed and run through the depth test. Lastly, the far zoom level makes rasterization more costly, as there are
simply more fragments overlapping each triangle.

One experiment I tried was comparing rasterization performance when computing line intersections with the triangle edges, as opposed to simply checking all
fragments within the bounding box of the primitive. As expected, the line intersection approach improved performance: it avoids computing barycentric weights for every
potential fragment and replaces that work with a few lines of line intersection code, where the most expensive operation is a divide (as opposed to a
cross product).
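
To make the line intersection idea concrete, here is a minimal C++ sketch that finds the horizontal span a triangle covers on one scanline. It is an illustration, not the code from this project: `Vec2` and `scanlineSpan` are hypothetical names, and the handling of horizontal edges is an assumption of this sketch.

```cpp
#include <algorithm>
#include <limits>

// Hypothetical 2D point type in viewport (pixel) space.
struct Vec2 { float x, y; };

// Intersects the horizontal line at height y with the three triangle edges.
// On success, [xMin, xMax] is the span of the scanline inside the triangle,
// so only fragments in that span need to be shaded and depth tested.
bool scanlineSpan(const Vec2 tri[3], float y, float& xMin, float& xMax) {
    xMin = std::numeric_limits<float>::infinity();
    xMax = -std::numeric_limits<float>::infinity();
    bool hit = false;
    for (int i = 0; i < 3; ++i) {
        const Vec2& a = tri[i];
        const Vec2& b = tri[(i + 1) % 3];
        // Horizontal edge: skipped, since its endpoints also belong to the
        // other two edges, which produce the same span.
        if (a.y == b.y) continue;
        const float t = (y - a.y) / (b.y - a.y);  // the one divide per edge
        if (t < 0.0f || t > 1.0f) continue;       // scanline misses this edge
        const float x = a.x + t * (b.x - a.x);
        xMin = std::min(xMin, x);
        xMax = std::max(xMax, x);
        hit = true;
    }
    return hit;
}
```

For the triangle (0,0), (4,0), (0,4), the scanline at y = 2 yields the span [0, 2], while a scanline at y = 5 misses every edge and the function returns false.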

### Credits

Binary file added renders/duck_noaa.PNG
Binary file added renders/duck_ssaa.PNG
Binary file added renders/graph1.png
Binary file added renders/graph2.png
2 changes: 1 addition & 1 deletion src/CMakeLists.txt
@@ -6,5 +6,5 @@ set(SOURCE_FILES

cuda_add_library(src
${SOURCE_FILES}
-OPTIONS -arch=sm_20
+OPTIONS -arch=sm_61
)
4 changes: 1 addition & 3 deletions src/main.cpp
@@ -104,9 +104,7 @@ void runCuda() {
// No data is moved (Win & Linux). When mapped to CUDA, OpenGL should not use this buffer
dptr = NULL;

-glm::mat4 P = glm::frustum<float>(-scale * ((float)width) / ((float)height),
-scale * ((float)width / (float)height),
--scale, scale, 1.0, 1000.0);
+glm::mat4 P = glm::perspective(glm::radians(45.0f), ((float)width) / ((float)height), 0.1f, 10.0f);

glm::mat4 V = glm::mat4(1.0f);
