CIS565-Fall-2017 · wpchop · Oct 16, 2017 · Oct 17, 2017 · Oct 18, 2017 · Oct 19, 2017
diff --git a/README.md b/README.md
@@ -1,18 +1,46 @@
 CUDA Rasterizer
 ===============
 
-[CLICK ME FOR INSTRUCTION OF THIS PROJECT](./INSTRUCTION.md)
 
 **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**
 
-* (TODO) YOUR NAME HERE
-* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
+* Wenli Zhao
+* Tested on: Windows 10 Pro, Intel Xeon CPU E5-1630 v4 @ 3.70GHz 32GB, NVIDIA GeForce GTX 24465MB (Sig Lab)
 
-### (TODO: Your README)
+### README
 
-*DO NOT* leave the README to the last minute! It is a crucial part of the
-project, and we will not be able to grade you without a good README.
+![](renders/Capture2.PNG)
 
+In this project, I implemented a simplified rasterization pipeline that includes vertex shading, primitive assembly, rasterization, fragment shading, and a framebuffer.
+
+The core features I implemented include:
+* Vertex shading.
+* Primitive assembly with support for triangles read from buffers of index and
+  vertex data.
+* Rasterization.
+* Fragment shading.
+* A depth buffer for storing and depth testing fragments.
+* Fragment-to-depth-buffer writing with atomics.
+* A fragment shader with a simple Blinn-Phong lighting scheme.
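The atomic depth write in the list above can be illustrated on the CPU. This is a minimal sketch, not the project's kernel: it assumes depths in [0, 1] quantized to integers, and it uses a `std::atomic` compare-exchange loop as a stand-in for CUDA's `atomicMin`. The names `Candidate`, `atomicMinInt`, `toFixedDepth`, and `resolve` are hypothetical and do not come from this repository.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical candidate fragment: target pixel, depth in [0, 1], packed color.
struct Candidate { int pixel; float z; int color; };

// Quantize depth so that integer comparison matches depth order.
// The 1e6 scale is an assumption chosen to stay well inside int range.
int toFixedDepth(float z) {
    assert(z >= 0.0f && z <= 1.0f);
    return static_cast<int>(z * 1000000.0f);
}

// CPU stand-in for CUDA's atomicMin: lowers `slot` to `candidate` if it is
// smaller, and returns the value observed before the update.
int atomicMinInt(std::atomic<int>& slot, int candidate) {
    int seen = slot.load();
    while (candidate < seen && !slot.compare_exchange_weak(seen, candidate)) {
        // On failure compare_exchange_weak refreshes `seen`; the loop re-checks.
    }
    return seen;
}

// Two-pass resolve: first settle the nearest depth per pixel, then let only
// the fragment that matches that depth write its color.
std::vector<int> resolve(int pixelCount, const std::vector<Candidate>& cands) {
    std::vector<std::atomic<int>> depth(pixelCount);
    for (auto& d : depth) d.store(INT_MAX);

    // Pass 1: every candidate tries to lower the depth of its pixel.
    for (const auto& c : cands) atomicMinInt(depth[c.pixel], toFixedDepth(c.z));

    // Pass 2: the winning depth decides which color is kept.
    // If two candidates tie on depth, the later one in the list wins here.
    std::vector<int> color(pixelCount, 0);
    for (const auto& c : cands) {
        if (toFixedDepth(c.z) == depth[c.pixel].load()) color[c.pixel] = c.color;
    }
    return color;
}
```

The loops run sequentially here. On the GPU each candidate would be handled by its own thread, which is why the minimum has to be taken atomically.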
+
+In addition to the basic rasterizer, I implemented UV texture mapping and support for rasterizing points and lines.
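UV texture mapping of this kind usually interpolates the per-vertex UVs with barycentric weights and then looks up a texel. The sketch below shows that idea under stated assumptions: affine interpolation with no perspective correction, nearest-texel lookup, and clamped coordinates. The README does not say which variant the project uses, and `Vec2`, `edge`, `barycentric`, `interpolateUV`, and `texelIndex` are hypothetical names.

```cpp
#include <algorithm>
#include <array>

struct Vec2 { float x, y; };

// Twice the signed area of triangle (a, b, c).
float edge(const Vec2& a, const Vec2& b, const Vec2& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Barycentric weights of point p with respect to a non-degenerate triangle.
std::array<float, 3> barycentric(const std::array<Vec2, 3>& tri, const Vec2& p) {
    float area = edge(tri[0], tri[1], tri[2]);
    float w0 = edge(tri[1], tri[2], p) / area;  // weight of vertex 0
    float w1 = edge(tri[2], tri[0], p) / area;  // weight of vertex 1
    return {w0, w1, 1.0f - w0 - w1};
}

// Weighted sum of the three vertex UVs (affine, not perspective-correct).
Vec2 interpolateUV(const std::array<Vec2, 3>& uvs, const std::array<float, 3>& w) {
    return {w[0] * uvs[0].x + w[1] * uvs[1].x + w[2] * uvs[2].x,
            w[0] * uvs[0].y + w[1] * uvs[1].y + w[2] * uvs[2].y};
}

// Nearest-texel index in a row-major texture, clamped to the texture bounds.
int texelIndex(const Vec2& uv, int width, int height) {
    int x = std::min(std::max(static_cast<int>(uv.x * width), 0), width - 1);
    int y = std::min(std::max(static_cast<int>(uv.y * height), 0), height - 1);
    return y * width + x;
}
```

For large triangles seen at an angle, perspective-correct interpolation (dividing the attributes by depth before interpolating) would avoid the warping that the affine version produces.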
+
+#### Texture Mapping
+![](renders/Capture3.PNG)
+
+#### Points
+![](renders/points.PNG)
+
+#### Lines
+![](renders/lines.PNG)
+
+### Analysis
+
+![](renders/chart.png)
+
+![](renders/image.png)
+
+The features I implemented had little performance impact on the models I tested: the first three bars of the chart show a similar distribution. For point and line rasterization I barely changed vertex assembly, so the bottleneck remained there. Vertex assembly performs many global memory accesses, which slow down the pipeline. Fragment shading and rasterization were comparatively quick. The cow model gave a very different distribution; I may have corrupted the model, but I'm still not sure why. There is a lot more I could do to accelerate the pipeline, for example using shared memory for texture sampling or adopting tile-based rendering.
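The tile-based rendering mentioned above would start with a binning step that assigns each primitive to the screen tiles it may touch. This is a rough sketch of that step only, assuming square tiles and a screen-space bounding box per primitive. `Box` and `overlappedTiles` are hypothetical names, and nothing like this exists in the project yet.

```cpp
#include <algorithm>
#include <vector>

// Screen-space bounding box of a primitive, in pixels.
struct Box { float minX, minY, maxX, maxY; };

// Returns row-major indices of the tiles a box overlaps, for a screen of
// width x height pixels split into tileSize x tileSize tiles.
std::vector<int> overlappedTiles(const Box& b, int width, int height, int tileSize) {
    // A box that lies entirely off-screen touches no tile.
    if (b.maxX < 0.0f || b.maxY < 0.0f || b.minX >= width || b.minY >= height) {
        return {};
    }
    int tilesX = (width + tileSize - 1) / tileSize;   // round up
    int tilesY = (height + tileSize - 1) / tileSize;

    // Clamp the covered tile range to the screen.
    int x0 = std::max(0, static_cast<int>(b.minX) / tileSize);
    int y0 = std::max(0, static_cast<int>(b.minY) / tileSize);
    int x1 = std::min(tilesX - 1, static_cast<int>(b.maxX) / tileSize);
    int y1 = std::min(tilesY - 1, static_cast<int>(b.maxY) / tileSize);

    std::vector<int> tiles;
    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            tiles.push_back(y * tilesX + x);
        }
    }
    return tiles;
}
```

Once primitives are binned, each tile can be rasterized by one thread block that keeps its depth and color in shared memory, which is where the expected speedup would come from.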
 
 ### Credits
 

diff --git a/renders/Capture.PNG b/renders/Capture.PNG
diff --git a/renders/Capture2.PNG b/renders/Capture2.PNG
diff --git a/renders/Capture3.PNG b/renders/Capture3.PNG
diff --git a/renders/chart.png b/renders/chart.png
diff --git a/renders/image.png b/renders/image.png
diff --git a/renders/lines.PNG b/renders/lines.PNG
diff --git a/renders/points.PNG b/renders/points.PNG
diff --git a/src/main.cpp b/src/main.cpp
@@ -108,6 +108,8 @@ void runCuda() {
 		scale * ((float)width / (float)height),
 		-scale, scale, 1.0, 1000.0);
 
+	P = glm::perspective(45.0f, scale*(float)width / (float)height, 1.0f, 1000.0f);
+
 	glm::mat4 V = glm::mat4(1.0f);
 
 	glm::mat4 M =