Import HTA to Chakra to extract synchronization dependency#1
JoongunPark wants to merge 1 commit into TaekyungHeo:refactor
I am not sure this is the best name for your method. Could you please justify the name or rename it?
I tried to give it a name similar to 'enforce_inter_thread_order'.
I want to know whether we can assume that the annotation is always 'ProfilerStep'.
I think we can. The HTA testing example always assumes it, and HTA describes the parameter the same way:
annotation (str): a trace annotation to limit the analysis to,
for example "ProfilerStep" would match all annotations that
match this string (ProfilerStep#100, ProfilerStep#101 etc)
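To make the matching rule above concrete, here is a minimal sketch. It assumes the match is a plain substring test, which is my reading of the description rather than a copy of HTA's code; `matching_annotations` is a hypothetical helper, not an HTA function.

```python
def matching_annotations(names, annotation="ProfilerStep"):
    """Return the annotation names that contain `annotation`.

    Assumption: a substring test stands in for HTA's own matching.
    """
    return [name for name in names if annotation in name]


# Hypothetical event names taken from a trace.
names = ["ProfilerStep#100", "ProfilerStep#101", "aten::add"]
selected = matching_annotations(names)
print(selected)
```

With this rule, every numbered ProfilerStep instance is kept and unrelated operator names are dropped.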
What is the meaning of instance_id? Why is it set to zero?
According to HTA, instance_id selects which instance of the annotation to consider:
instance_id: can be either of the following
(int) - specify which instance of the annotation to consider.
Defaults to the first instance.
(Tuple(int, int)) - considers a range of annotation instances start to end,
inclusive of both start and end instance.
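The two accepted forms of instance_id can be sketched as follows. This only illustrates the semantics quoted above (an int picks one instance, a tuple picks an inclusive range); `select_instances` is a hypothetical helper, not part of HTA or Chakra.

```python
from typing import List, Tuple, Union


def select_instances(
    instances: List[str],
    instance_id: Union[int, Tuple[int, int]] = 0,
) -> List[str]:
    """Pick annotation instances by index or by inclusive (start, end) range."""
    if isinstance(instance_id, tuple):
        start, end = instance_id
        # Both ends are inclusive, hence end + 1 in the slice.
        return instances[start:end + 1]
    # A single int selects exactly one instance; 0 is the first one.
    return [instances[instance_id]]


steps = ["ProfilerStep#100", "ProfilerStep#101", "ProfilerStep#102"]
first = select_instances(steps)
middle_to_last = select_instances(steps, (1, 2))
print(first, middle_to_last)
```

Under this reading, instance_id=0 simply means "use the first matching ProfilerStep".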
The tool fails with the following command.
This failure occurs when HTA cannot find the files in the directory. Could you check whether the files are really there?
Summary
This PR processes synchronization dependencies between Chakra nodes.
To do that, we use CriticalPathAnalyzer in Holistic Trace Analysis (https://github.com/facebookresearch/HolisticTraceAnalysis/blob/main/hta/analyzers/critical_path_analysis.py).
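As a rough illustration of the idea, and not the code in this PR, the sketch below attaches synchronization edges to nodes. It assumes the edges are already available as (source id, destination id) pairs; the `Node` class, the `sync_deps` field, and `add_sync_dependencies` are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a Chakra node."""
    id: int
    name: str
    sync_deps: List[int] = field(default_factory=list)


def add_sync_dependencies(
    nodes: Dict[int, Node],
    sync_edges: Iterable[Tuple[int, int]],
) -> None:
    """Record each (src, dst) sync edge as a dependency of dst on src."""
    for src, dst in sync_edges:
        # Skip edges that point at unknown nodes, and skip duplicates.
        if src in nodes and dst in nodes and src not in nodes[dst].sync_deps:
            nodes[dst].sync_deps.append(src)


nodes = {
    1: Node(1, "gpu_kernel"),
    2: Node(2, "cudaStreamSynchronize"),
}
# The duplicate edge and the edge from an unknown node 3 are ignored.
add_sync_dependencies(nodes, [(1, 2), (1, 2), (3, 2)])
print(nodes[2].sync_deps)
```

Here the synchronization call ends up depending on the kernel it waits for, which is the kind of ordering the critical path analysis is meant to expose.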
Please note that,
Test Plan
Download and Install HTA
Run Chakra et_converter
Test input with Resnet-50 with 2 GTX1070 (rank 0)
eg.rank_0.pt.trace.json
kineto.rank_0_step_5.1708449344148840892.pt.trace.json
Test result with Resnet-50 with 2 GTX1070 (rank 0)
rank_0.json
Test Result with Megatron (No Sync dependency)
I've observed that this update does not change the results for a trace that has no synchronization dependencies.
Original
sys[4] finished, 607252677 cycles
sys[5] finished, 607253196 cycles
sys[6] finished, 607253715 cycles
sys[7] finished, 607254234 cycles
sys[0] finished, 607254753 cycles
sys[1] finished, 607255272 cycles
sys[2] finished, 607255791 cycles
sys[3] finished, 607256310 cycles
New
sys[4] finished, 607252677 cycles
sys[5] finished, 607253196 cycles
sys[6] finished, 607253715 cycles
sys[7] finished, 607254234 cycles
sys[0] finished, 607254753 cycles
sys[1] finished, 607255272 cycles
sys[2] finished, 607255791 cycles
sys[3] finished, 607256310 cycles
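The comparison above can also be checked mechanically. The following sketch parses the two logs as pasted here and reports any system whose finish cycle differs; `parse_finish_cycles` is a helper written for this illustration, not part of any simulator.

```python
import re

FINISH_RE = re.compile(r"sys\[(\d+)\] finished, (\d+) cycles")


def parse_finish_cycles(log_text):
    """Map each system id to its finish cycle count."""
    return {int(m.group(1)): int(m.group(2)) for m in FINISH_RE.finditer(log_text)}


original_log = """
sys[4] finished, 607252677 cycles
sys[5] finished, 607253196 cycles
sys[6] finished, 607253715 cycles
sys[7] finished, 607254234 cycles
sys[0] finished, 607254753 cycles
sys[1] finished, 607255272 cycles
sys[2] finished, 607255791 cycles
sys[3] finished, 607256310 cycles
"""
new_log = original_log  # The "New" output above is line-for-line identical.

original = parse_finish_cycles(original_log)
new = parse_finish_cycles(new_log)
# Collect systems whose cycle counts differ between the two runs.
diffs = {s: (original[s], new.get(s)) for s in original if original[s] != new.get(s)}
print(diffs)
```

An empty result means all eight systems finish at the same cycle in both runs.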