94 changes: 94 additions & 0 deletions docs/cheaha/tutorial/profiling.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
# Profiling

Profiling measures how the different parts of a program execute and gathers performance data. It plays a vital role in debugging, pinpointing bottlenecks, optimizing code, and scaling application performance. This analysis helps identify common issues such as out-of-bounds memory access, segmentation faults, bus errors, and runtime overhead, enabling effective troubleshooting and improvement.
Contributor

Please define some of these terms.

  • out-of-bounds memory
  • segmentation fault
  • bus error
  • runtime
  • bottleneck


In research contexts, where code often performs complex or resource-intensive tasks, profiling helps identify which parts of the code consume significant compute and memory resources. This insight guides optimizations aimed at reducing execution times and enhancing the overall efficiency of the program.

## Who Benefits from Profiling?

## Profiling Python Code

This section briefly discusses three common techniques for profiling Python code.
Contributor

In a short paragraph, give the purpose for each. Why would I want to use time vs CPU vs memory profiling?


1. Time Profiling
2. Memory Profiling
3. CPU Profiling
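
As a rough illustration of how the three techniques differ, the sketch below measures one workload three ways using only the standard library. The `build_squares` function and its input size are hypothetical and exist only for this illustration; dedicated profilers report far more detail than these summary figures.

```python
import time
import tracemalloc


def build_squares(n):
    # Hypothetical workload: build a list of n squared integers
    return [i * i for i in range(n)]


# Time profiling: elapsed wall-clock time, including any waiting
wall_start = time.perf_counter()
# CPU profiling: processor time consumed by this process only
cpu_start = time.process_time()
# Memory profiling: track memory blocks allocated by Python
tracemalloc.start()

squares = build_squares(100_000)

current_bytes, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()
cpu_seconds = time.process_time() - cpu_start
wall_seconds = time.perf_counter() - wall_start

print(f"Wall time: {wall_seconds:.6f} s")
print(f"CPU time:  {cpu_seconds:.6f} s")
print(f"Peak traced memory: {peak_bytes / 1024:.1f} KiB")
```

Wall time that is much larger than CPU time usually points to waiting (for example on disk or network) rather than computation. Note that `tracemalloc` adds overhead, so timings taken while it is active are inflated; in practice, measure time and memory in separate runs.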

!!! note
Contributor

Make sure to fix this admonition, it doesn't appear to be formatted correctly.

The examples are tested on Cheaha.

### Time Profiling

Time profiling measures how long code takes to run. For researchers, knowing the runtime of their code is essential for optimizing it and for accurately estimating the computational resources, such as CPU and memory, that a simulation needs to run successfully.

1. Line Profiling
Line profiling reports the execution time of each line of code. Profiling a whole program can miss small but significant performance issues, whereas line profiling shows exactly which lines or operations slow the program down. The `line_profiler` package provides this analysis. Using it normally involves modifying the existing code. To keep manual changes to a minimum, use the `kernprof` command-line utility that ships with `line_profiler`: mark each function of interest with the `@profile` decorator, and `kernprof` supplies the decorator at run time, so no imports or other profiling code need to be added to the script.

Example 1:

The following example profiles a Python script that creates a NumPy array from a range of integers and computes its sum.

Contributor

Change to python

```bash
import sys
import numpy as np

# Verify if correct number of command-line arguments is provided
if len(sys.argv) != 3:
    print("Usage: python python_script.py <start> <end>")
    sys.exit(1)

# Passing start and end values from command-line arguments
start = int(sys.argv[1])
end = int(sys.argv[2])

@profile
def create_array(start, end):
    # Create an array from start to end using numpy
    return np.arange(start, end)

@profile
def compute_sum(input_array):
    # Perform addition on the array elements using numpy's sum function
    return np.sum(input_array)

# Main code execution
input_array = create_array(start, end)
sum_result = compute_sum(input_array)

# Print Input Range and Sum
print("Input Range: {} to {}, Sum: {}".format(start, end, sum_result))
```

```bash
$ kernprof -l -v python_script.py 1 1000

Input Range: 1 to 1000, Sum: 499500
Wrote profile results to python_script.py.lprof
Timer unit: 1e-06 s

Total time: 2.106e-05 s
File: python_script.py
Function: create_array at line 13

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    13                                           @profile
    14                                           def create_array(start, end):
    15                                               # Create an array from start to end using numpy
    16         1         21.1     21.1    100.0      return np.arange(start, end)

Total time: 5.1727e-05 s
File: python_script.py
Function: compute_sum at line 18

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    18                                           @profile
    19                                           def compute_sum(input_array):
    20                                               # Perform addition on the array elements using numpy's sum function
    21         1         51.7     51.7    100.0      return np.sum(input_array)
```
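
Because `kernprof -l` supplies `profile` at run time, a decorated script raises a `NameError` when started with plain `python`. One common workaround, sketched here under the assumption that the script should run both ways, is to define a do-nothing fallback decorator. The list-based functions below are simplified stand-ins for the NumPy versions above, so the sketch needs only the standard library.

```python
import builtins

# Assumption: `profile` exists in builtins only when kernprof -l runs the
# script. Otherwise, fall back to a decorator that returns the function as-is.
if not hasattr(builtins, "profile"):
    def profile(func):
        return func


@profile
def create_array(start, end):
    # Plain-Python stand-in for np.arange(start, end)
    return list(range(start, end))


@profile
def compute_sum(input_array):
    # Plain-Python stand-in for np.sum(input_array)
    return sum(input_array)


print(compute_sum(create_array(1, 1000)))
```

With this guard in place, the same file can be run normally during development and under `kernprof -l -v` when line timings are needed.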

### Function Profiling

Function profiling measures how much time each function in your code takes to run. `cProfile` is a profiler in the Python standard library that performs this analysis for an entire program. It records every function call and reports how many times each function is called, along with the total and per-call time.
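
The following is a minimal sketch of function profiling with `cProfile` and `pstats`, both part of the standard library. The functions are simplified, list-based stand-ins for the earlier example, and the choice to sort by cumulative time and show five entries is an assumption made for illustration.

```python
import cProfile
import io
import pstats


def create_array(start, end):
    # Plain-Python stand-in for np.arange(start, end)
    return list(range(start, end))


def compute_sum(input_array):
    return sum(input_array)


def main():
    return compute_sum(create_array(1, 1000))


# Collect statistics only while main() runs
profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Sort by cumulative time and print at most five entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
print(report)
```

To profile a whole script without editing it, the same profiler can be run from the command line, for example `python -m cProfile -s cumulative python_script.py 1 1000`.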
1 change: 1 addition & 0 deletions mkdocs.yml
Original file line number Diff line number Diff line change
@@ -116,6 +116,7 @@ nav:
- Tutorials:
- cheaha/tutorial/index.md
- Anaconda Environment Tutorial: cheaha/tutorial/pytorch_tensorflow.md
- Profiling: cheaha/tutorial/profiling.md
- Cheaha Web Portal:
- cheaha/open_ondemand/index.md
- Using the Web Portal: cheaha/open_ondemand/ood_layout.md