2D Extension of 1D Cooley-Tukey FFT. Refactoring original not necesss…#1
bertiethorpe wants to merge 1 commit into markp-gc:main
…ary as extension makes use of spatial variable separability. Included also is bash script for performing sweeps for data analysis
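For readers unfamiliar with the separability argument in the description, the idea can be sketched as follows: a 2D DFT is obtained by running a 1D transform over every row and then over every column, so the 1D routine itself needs no changes. This is only an illustrative sketch in plain C++, not the PR's Poplar code. The names `dft1d` and `dft2d` are hypothetical, and a naive O(n^2) DFT stands in for the 1D Cooley-Tukey FFT.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;
using Matrix = std::vector<std::vector<Complex>>;

// Naive O(n^2) 1D DFT. Assumption: this stands in for the existing
// 1D Cooley-Tukey FFT, which the 2D extension reuses unchanged.
std::vector<Complex> dft1d(const std::vector<Complex>& x) {
  const std::size_t n = x.size();
  const double pi = std::acos(-1.0);
  std::vector<Complex> out(n);
  for (std::size_t k = 0; k < n; ++k) {
    Complex sum(0.0, 0.0);
    for (std::size_t t = 0; t < n; ++t) {
      // Reduce k*t modulo n so the angle stays small.
      const double angle = -2.0 * pi * static_cast<double>((k * t) % n) /
                           static_cast<double>(n);
      sum += x[t] * Complex(std::cos(angle), std::sin(angle));
    }
    out[k] = sum;
  }
  return out;
}

// 2D DFT by separability: transform every row, then every column.
Matrix dft2d(const Matrix& input) {
  assert(!input.empty());
  const std::size_t rows = input.size();
  const std::size_t cols = input[0].size();

  // Pass 1: 1D transform along each row.
  Matrix result(rows);
  for (std::size_t r = 0; r < rows; ++r) {
    assert(input[r].size() == cols);
    result[r] = dft1d(input[r]);
  }

  // Pass 2: 1D transform along each column of the row-transformed data.
  std::vector<Complex> column(rows);
  for (std::size_t c = 0; c < cols; ++c) {
    for (std::size_t r = 0; r < rows; ++r) column[r] = result[r][c];
    const std::vector<Complex> transformed = dft1d(column);
    for (std::size_t r = 0; r < rows; ++r) result[r][c] = transformed[r];
  }
  return result;
}
```

The order of the two passes does not matter mathematically; rows-then-columns is chosen here only for readability.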
markp-gc left a comment
Just need to avoid losing some of my recent changes: if you like, I can merge just the new files (but then it won't credit you as the author).
| // Can only do the second expression in-place: | ||
| auto tmpReal = popops::map(graph, complexMulExprRe, {real, v.real, imag, v.imag}, | ||
| prog, debugPrefix + "/complex_mul_re"); | ||
| popops::mapInPlace(graph, complexMulExprRe, {real, v.real, imag, v.imag}, |
Need to keep my change here. Only the second operation can be done in place because it needs the input from the first.
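To illustrate the ordering constraint, here is a minimal sketch using plain `std::vector` rather than Poplar tensors. The `ComplexArrays` struct and `multiplyInPlace` function are hypothetical stand-ins, not the repository's API. The real part of the product goes to a temporary because the imaginary-part expression still reads the original `real` values; once those have been consumed, the imaginary part can safely overwrite its own input.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a complex tensor stored as split real/imag arrays.
struct ComplexArrays {
  std::vector<float> real;
  std::vector<float> imag;
};

// Element-wise a *= v for split-storage complex arrays.
void multiplyInPlace(ComplexArrays& a, const ComplexArrays& v) {
  const std::size_t n = a.real.size();
  assert(a.imag.size() == n && v.real.size() == n && v.imag.size() == n);

  // First expression: the new real part must go to a temporary, because
  // the second expression still needs the original a.real values.
  std::vector<float> tmpReal(n);
  for (std::size_t i = 0; i < n; ++i) {
    tmpReal[i] = a.real[i] * v.real[i] - a.imag[i] * v.imag[i];
  }

  // Second expression: safe to do in place. Each a.imag[i] is read before
  // it is overwritten, and a.real is still untouched at this point.
  for (std::size_t i = 0; i < n; ++i) {
    a.imag[i] = a.real[i] * v.imag[i] + a.imag[i] * v.real[i];
  }

  // Only now replace the real part.
  a.real = std::move(tmpReal);
}
```

If the first loop wrote directly into `a.real`, the second loop would combine the already-updated real part with the old imaginary part and produce a wrong product.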
| poplar::Tensor partial = | ||
| poplin::matMul(graph, matrix.real, realBatch, prog, | ||
| elemType, debugStr + "/real_matmul", matmulOptions); | ||
| elemType, debugStr + "/real_matmul"); |
Need to keep my change here (matmul options).
| poplin::matMulAcc(graph, partial, 1.f, matrix.imag, imagBatch, prog, | ||
| debugStr + "/imag_matmul", matmulOptions); | ||
| debugStr + "/imag_matmul"); |
| result_odd_remapped.mapLinearly(graph); | ||
| prog.add(copy(result_odd, result_odd_remapped)); | ||
| result_odd = result_odd_remapped; | ||
| } |
Keep my change here: it uses a better layout for small FFTs to improve their performance and memory use.
| auto result_odd_remapped = ComplexTensor(graph, result_even.elementType(), result_even.shape(), "dft_even_remapped"); | ||
| result_odd_remapped.mapLinearly(graph); | ||
| prog.add(copy(result_odd, result_odd_remapped)); | ||
| result_odd = result_odd_remapped; |
Keep this as it improves memory use and perf.
| ComplexTensor FFTBuilder::inverseFourierMatrices( | ||
| std::size_t length, poplar::Type elemType) { | ||
| const double twoPi_over_length = (2.0L / length) * 3.141592653589793238462643383279502884L; |
Keep my change as it improves precision significantly (same below).
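As a rough illustration of the precision point: the angle for each matrix entry can be formed in `long double` from a full-precision value of pi and only narrowed to the element type at the very end. The sketch below is an assumption about how such an entry might be computed, not the repository's implementation. The helper name `inverseFourierEntry` is hypothetical, and it assumes the unnormalised convention exp(+i * 2 * pi * row * col / length) for the inverse transform.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>

// Hypothetical helper: one entry of an unnormalised inverse DFT matrix.
// Assumed convention: exp(+i * 2*pi * row * col / length).
std::complex<float> inverseFourierEntry(std::size_t row, std::size_t col,
                                        std::size_t length) {
  assert(length > 0);
  // Form 2*pi/length in extended precision, as in the diff line above.
  const long double pi = 3.141592653589793238462643383279502884L;
  const long double twoPiOverLength = (2.0L / length) * pi;

  // Keep the angle in long double until the trigonometric functions have
  // been evaluated, then narrow once to the element type.
  const long double angle =
      twoPiOverLength * static_cast<long double>(row * col);
  return std::complex<float>(static_cast<float>(std::cos(angle)),
                             static_cast<float>(std::sin(angle)));
}
```

Narrowing only once, at the end, avoids carrying a single-precision rounding error in the angle through every entry of the matrix; how much this matters in practice depends on the FFT length and element type.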
| availableMemoryProportion(-1.f), flopEstimate(0) {} | ||
| /// Set the proportion of memory available for the inner DFT matrix-multiplies. | ||
| void setAvailableMemoryProportion(float proportion) { availableMemoryProportion = proportion; } |
| @@ -0,0 +1,202 @@ | |||
| // Copyright (c) 2022 Graphcore Ltd. All rights reserved. | |||
If you like, I can just merge the new files and leave everything else unchanged?
| @@ -0,0 +1,202 @@ | |||
| // Copyright (c) 2022 Graphcore Ltd. All rights reserved. | |||
Needs Graphcore copyright notice (copy format from one of the other files).