Homework for Lecture 7 of 小彭老师's "High-Performance Parallel Programming and Optimization in C++" by Paul-Laifan · Pull Request #12 · parallel101/hw07

Paul-Laifan · 2026-03-11T10:28:05Z

matrix_randomize: swap loop order (y outer, x inner) for sequential write
matrix_transpose: use 32x32 tiling to improve cache locality
matrix_multiply: reorder loops to (y,t,x), hoist rhs scalar, 44x speedup
matrix_RtAR: use static temp matrices to avoid repeated malloc/free
overall: 5.36s -> 0.15s, ~35.7x speedup
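
The 32x32 tiling used for matrix_transpose can be sketched roughly as below. This is a minimal illustration and not the code in this PR: it assumes a flat `std::vector<float>` in which element (x, y) sits at `data[y * nx + x]` (x contiguous), and the name `transpose_tiled` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumed layout (not taken from the PR): element (x, y) of an nx-by-ny
// matrix is stored at data[y * nx + x], so x is the contiguous dimension.
constexpr std::size_t TILE = 32;

// Returns the transpose: out(y, x) = in(x, y), stored at out[x * ny + y].
std::vector<float> transpose_tiled(const std::vector<float>& in,
                                   std::size_t nx, std::size_t ny) {
    std::vector<float> out(nx * ny);
    // Walk the matrix tile by tile so that both the rows being read and
    // the rows being written stay within a small, cache-friendly window.
    for (std::size_t yb = 0; yb < ny; yb += TILE) {
        for (std::size_t xb = 0; xb < nx; xb += TILE) {
            // Clamp the tile at the matrix edge when n is not a multiple of TILE.
            const std::size_t ye = std::min(yb + TILE, ny);
            const std::size_t xe = std::min(xb + TILE, nx);
            for (std::size_t y = yb; y < ye; ++y) {
                for (std::size_t x = xb; x < xe; ++x) {
                    out[x * ny + y] = in[y * nx + x];
                }
            }
        }
    }
    return out;
}
```

Without tiling, either the read or the write side strides through memory by a full row per element; restricting the work to a 32x32 block keeps both sides inside a few cache lines at a time. The best tile size depends on the cache and element size, so 32 is a starting point to measure rather than a fixed rule.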

Timings measured with n=1120 (before / after):

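| Function | Before | After | Speedup | Optimization |
| --- | --- | --- | --- | --- |
| matrix_randomize | 0.000928s | 0.000305s | ~3x | Swapped loop order so x is the inner loop and writes are contiguous |
| matrix_transpose | 0.002528s | 0.000579s | ~4.4x | Tiling (TILE=32) |
| matrix_multiply | 0.904947s | 0.020365s | ~44x | Loops reordered to (y,t,x): contiguous inner loop plus hoisted scalar |
| matrix_RtAR | 1.80908s | 0.044072s | ~41x | Combination of the optimizations above plus static temporaries |
| overall | 5.357s | 0.150s | ~35.7x | - |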
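
The (y,t,x) loop order with a hoisted rhs scalar, listed for matrix_multiply, might look like the following. It is a sketch under assumptions rather than the PR's implementation: the flat layout with x contiguous and the name `multiply_ytx` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Assumed flat layout with x contiguous (not taken from the PR):
//   lhs(x, t) -> lhs[t * nx + x]
//   rhs(t, y) -> rhs[y * nt + t]
//   out(x, y) -> out[y * nx + x]
// Computes out(x, y) = sum over t of lhs(x, t) * rhs(t, y).
std::vector<float> multiply_ytx(const std::vector<float>& lhs,
                                const std::vector<float>& rhs,
                                std::size_t nx, std::size_t nt,
                                std::size_t ny) {
    std::vector<float> out(nx * ny, 0.0f);
    for (std::size_t y = 0; y < ny; ++y) {
        for (std::size_t t = 0; t < nt; ++t) {
            // rhs(t, y) does not depend on x, so load it once per (y, t).
            const float r = rhs[y * nt + t];
            // The inner loop reads lhs and updates out with unit stride,
            // which is friendly to both the cache and auto-vectorization.
            for (std::size_t x = 0; x < nx; ++x) {
                out[y * nx + x] += lhs[t * nx + x] * r;
            }
        }
    }
    return out;
}
```

Compared with an (y,x,t) order, where the innermost loop steps through lhs one row at a time, this ordering touches only contiguous memory in the hot loop. The accumulation order over t for each output element is unchanged, so the result matches the naive version.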

Paul-Laifan wants to merge 1 commit into parallel101:main from Paul-Laifan:main

Paul-Laifan commented Mar 11, 2026
