Completed Spring of 2022 as a part of Lehigh University's CSE 202: Computer Organization and Architecture.
All programs were written and optimized for X86-64 machines with gcc module 7.10.
A programming exercise to apply different optimization techniques to improve the performance of the given FFT function
- Measure the performance of C programs
- Apply different optimization techniques to improve the performance of a given program
- Verify the correctness of the optimized versions of the original program
- Evaluate the benefits and limitations of optimization techniques
The project consists in completing the following tasks:
- Measure the performance of the FFT function without any optimization for different values of height and width (N from 10 to 50)
- Optimize the performance of the function using the different optimization techniques covered in chapter 5
- Verify the correctness of the results of the optimized functions against of the results of the original function
- You must use at least four of the following optimizations techniques:
- Code motion
- Eliminating memory references when possible
- Loop unrolling Kx1 (for K: 2, 4, or 8)
- Loop unrolling Kx1a
- Loop unrolling Kx2
- Loop unrolling Kx2a
The program prog2.c is provided with the implementation of the unoptimized FFT function. You need to write a function definition for the different versions of the optimized function FFT. Add the type of the optimization you use to the name of the function. For example, name the FFT with loop unrolling 2x2 as fft_loop_unrolling_2_2. You need to modify the main method to call the optimized functions you define and measure the execution time of each function.
Document each optimization technique you implemented by describing the modifications you made to the FFT code, measuring the performance of the optimized function for different values of N, and discussing the effect of the optimization on the performance of the FFT function.
Check the correctness of the optimized functions. It is important to verify that the optimized functions generate exactly the same results as the original function. Add a test function in prog2.c that compares the output 2D arrays generated by the original function to the arrays generated by the optimized functions. One way to do this is to initialize the array inputData to the same constant values in all the functions and compare the output arrays realOut, imagOut, and amplitudeOut to the reference output arrays generated by the original function.
-
To measure the execution time of a function, use the C library function clock() before and after the call to the function. The main method in prog2.c shows an example on how to use clock() to measure the execution time of the function fft_orig in clock cycles. clock() returns the number of clock cycles since the beginning of the program.
-
Use the provided script runTests.sh to compile and run your program prog2.c. The script compiles prog2.c with no optimization and runs your program for different values of N (10, 20, 30, 40, and 50)..
-
Your submission, for this assignment, should include the source file prog2.c and one-page maximum write up to explain the optimizations you made and discuss the results you obtained.
-
Refer to the Assignment Grading Rubric on Coursesite to maximize your score.
-
To submit your code for grading, simply perform a "git push"; which will upload your changes to GitHub Classroom. There is no auto-grader for this assignment.
-
Post any questions on Piazza.