Simultaneous off-diagonal computation in SBiCG and multi-vector speedup for LOBPCG/SBiCG/TPQ #148
mitsuaki1987 wants to merge 58 commits into develop
# Conflicts: # src/PairExSpin.c
… it is used in LOBPCG.
lobcg_kondo*.sh: Indices for itinerant electrons were incorrect. This was fixed before, but the fix was not applied in these tests.
spectrum_spin_kagome.sh: S+S- excitation.
lobcg_genspin_ladder.sh: D-term with general spin.
spectrum_spin_kagome.sh
…tion. Address the warnings reported by gcc -Wall (delete unused variables and arguments).
Mesh plot -> line segment plot (2 blank lines)
Remove Komega
dynamicalr2k, greenr2k: temperature dependent
…ix_memalign (NOT malloc/calloc). Otherwise HPhi crashes when we use SVE.
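The commit message above is truncated. Assuming it refers to POSIX `posix_memalign`, an aligned allocation for complex vectors could be sketched as below. The helper names `alloc_aligned_cd` and `is_aligned` are hypothetical and are not HPhi's actual API, and the 64-byte alignment used in the usage note is only an illustrative choice.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* expose posix_memalign in <stdlib.h> */
#endif
#include <assert.h>
#include <complex.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not HPhi's API): allocate n complex doubles whose
 * start address is a multiple of `alignment` bytes. Returns NULL on failure.
 * posix_memalign requires `alignment` to be a power of two and a multiple
 * of sizeof(void *). */
static double complex *alloc_aligned_cd(size_t n, size_t alignment) {
  void *p = NULL;
  assert(alignment % sizeof(void *) == 0);
  assert((alignment & (alignment - 1)) == 0);
  if (n == 0) n = 1; /* avoid a zero-size request */
  if (posix_memalign(&p, alignment, n * sizeof(double complex)) != 0)
    return NULL;
  /* Unlike calloc, posix_memalign does not zero the memory. */
  double complex *v = p;
  for (size_t i = 0; i < n; i++) v[i] = 0.0;
  return v;
}

/* Returns 1 if p is aligned to `alignment` bytes, 0 otherwise. */
static int is_aligned(const void *p, size_t alignment) {
  return ((uintptr_t)p % alignment) == 0;
}
```

A call such as `alloc_aligned_cd(ndim, 64)` returns a zeroed buffer that is released with plain `free`.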
@mitsuaki1987
Modification for the file-format change in the spectrum function.
I changed the python script and All.sh so that Tutorial 4 works.
The resulting vectors of the subspace diagonalization should be the same across processes. This caused an error on Fugaku with SVE. Also fix a typo of overlap.
Pull request overview
This PR implements optimizations for complex vector operations in SBiCG, LOBPCG, and TPQ calculations by introducing simultaneous off-diagonal computations and vectorized operations. The changes remove the komega library dependency and refactor the codebase to use multi-state vector arrays instead of single vectors.
Changes:
- Replaced single-state vector operations with multi-state arrays throughout the codebase
- Removed komega library files and dependencies
- Updated documentation to reflect new dynamical Green's function calculation modes and Fourier transformation utilities
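To illustrate the move from single vectors to multi-state arrays, here is a minimal sketch assuming a layout `vec[idim][istate]` backed by one contiguous buffer. All names (`alloc_2d_cd`, `free_2d_cd`, `add_offdiag_all_states`) are hypothetical and are not taken from the PR. The point is that one off-diagonal matrix element is applied to every state in a single contiguous inner loop.

```c
#include <assert.h>
#include <complex.h>
#include <stdlib.h>

/* Hypothetical layout: a[idim][istate], with all states of one basis index
 * adjacent in memory. Returns NULL on allocation failure. */
static double complex **alloc_2d_cd(size_t ndim, size_t nstate) {
  assert(ndim > 0 && nstate > 0);
  double complex **rows = malloc(ndim * sizeof *rows);
  double complex *data = calloc(ndim * nstate, sizeof *data);
  if (!rows || !data) {
    free(rows);
    free(data);
    return NULL;
  }
  for (size_t i = 0; i < ndim; i++) rows[i] = data + i * nstate;
  return rows;
}

static void free_2d_cd(double complex **a) {
  if (a) {
    free(a[0]); /* the contiguous data buffer */
    free(a);    /* the row-pointer table */
  }
}

/* Apply one off-diagonal element to all states at once:
 * out_row[s] += coef * in_row[s] for s = 0..nstate-1.
 * The loop runs over contiguous memory, so a compiler can vectorize it;
 * a BLAS zaxpy call could replace it. */
static void add_offdiag_all_states(size_t nstate, double complex coef,
                                   const double complex *in_row,
                                   double complex *out_row) {
  for (size_t s = 0; s < nstate; s++) out_row[s] += coef * in_row[s];
}
```

With this layout, a term connecting basis states `i` and `j` becomes `add_offdiag_all_states(nstate, h_ij, in[j], out[i])`, so the index search for `j` is done once rather than once per vector.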
Reviewed changes
Copilot reviewed 129 out of 186 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/mltply.c | Refactored to use 2D arrays for multi-state vectors and BLAS operations |
| src/matrixscalapack.c | Changed eigenvalue array from complex to real, removed unused variables |
| src/matrixlapack.c | Removed unused functions, updated eigenvalue handling to use real arrays |
| src/lapack_diag.c | Updated to use corrected eigenvector indexing and real energy arrays |
| src/komega/* | Removed entire komega library directory and files |
| src/input.c | Updated array indexing for Hamiltonian input |
| src/include/* | Updated function signatures to accept multi-state arrays |
| src/global.c | Changed vector declarations from 1D to 2D arrays |
| src/eigenIO.c | Commented out unused I/O functions |
| src/common/setmemory.* | Added allocation functions for unsigned int arrays and 4D complex arrays |
| src/check.c | Changed variable types and updated memory estimates |
| src/bitcalc.c | Changed return type of GetBitGeneral to unsigned int |
| src/StdFace | Updated submodule reference |
| src/SingleExHubbard.c | Updated to use multi-state arrays and BLAS operations |
| src/SingleEx.c | Updated function calls with new signatures |
| src/PowerLanczos.c | Removed entire file |
| src/PairEx.c | Updated function calls with new signatures |
| src/Multiply.c | Refactored for multi-state operations and norm calculations |
| src/MakeIniVec.c | Updated to support multiple random vectors |
| src/Lanczos_EigenVector.c | Updated vector operations for 2D arrays |
| src/Lanczos_EigenValue.c | Updated vector operations and added matrixlapack.h include |
| src/HPhiTrans.c | Changed variable types to unsigned int |
| src/HPhiMain.c | Added CALCSPEC_SCRATCH mode support |
| src/FirstMultiply.c | Updated for multi-state operations and added expectation value calculations |
| src/CheckMPI.c | Changed variable type to unsigned int |
| src/CalcSpectrum*.c | Updated function signatures for multi-state arrays |
| src/CalcByTEM.c | Updated for multi-state operations |
| src/CalcByLanczos.c | Updated vector operations and array indexing |
| src/CalcByFullDiag.c | Major refactoring for Lehmann representation calculation |
| src/CMakeLists.txt | Removed komega subdirectory and dependency |
| src/CG_EigenVector.c | Updated for 2D vector arrays |
| samples/tutorial_4.2/* | Updated for new output file format |
| samples/Spectrum/Scratch/* | Added new sample files |
| doc/* | Extensive documentation updates for new features |
    k = 0;
    for (i = 0; i < xNsize; i++) {
      for (j = 0; j < xNsize; j++) {
        vec[j][i] = a[k];
The eigenvector assignment appears to have transposed indices. Based on the Fortran column-major to C row-major conversion pattern used elsewhere in the file, this should be vec[i][j] = a[k]; to maintain consistency with the column-major output from zheev_.
Suggested change:

    - vec[j][i] = a[k];
    + vec[i][j] = a[k];
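The two index orders differ only in which convention the 2D array follows. The sketch below unpacks a flat column-major buffer, the layout in which LAPACK's `zheev` returns eigenvectors, under both conventions. The function names are hypothetical, and which convention the rest of HPhi expects is not stated in this thread.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* `a` holds an n-by-n matrix in Fortran column-major order:
 * element (row r, column c) is a[c * n + r], and column c is eigenvector c. */

/* Convention A: keep the matrix shape, vec[r][c] = component r of
 * eigenvector c. This matches `vec[j][i] = a[k]` with i as the outer loop. */
static void unpack_as_matrix(size_t n, const double complex *a,
                             double complex **vec) {
  size_t k = 0;
  for (size_t c = 0; c < n; c++)
    for (size_t r = 0; r < n; r++) vec[r][c] = a[k++];
}

/* Convention B: one eigenvector per row, vec[c][r] = component r of
 * eigenvector c. This matches `vec[i][j] = a[k]` with i as the outer loop. */
static void unpack_as_vectors(size_t n, const double complex *a,
                              double complex **vec) {
  size_t k = 0;
  for (size_t c = 0; c < n; c++)
    for (size_t r = 0; r < n; r++) vec[c][r] = a[k++];
}
```

The two results are transposes of each other, so every consumer of `vec` has to agree on one convention; mixing them silently swaps the state index and the component index.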
    for (i = 0; i < i_max; i++) {
      for (j = 0; j < i_max; j++) {
        Ham[i][j] = Ham[i + 1][j + 1];
        v0[i][j] = v0[i + 1][j];
This line copies from v0[i+1][j] which appears to be accessing an out-of-bounds index when i reaches i_max-1. The intent seems unclear - if this is meant to shift data, the loop bounds or indexing should be corrected.
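Whether the read at index `i + 1` is in bounds depends on how many rows were allocated. A minimal sketch of an in-place shift from 1-based to 0-based storage, with the allocation size made explicit, might look like this. The function name and the `nalloc` parameter are hypothetical and are not part of the PR.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Shift data stored at indices 1..n into indices 0..n-1, in place.
 * `nalloc` is the number of rows and columns actually allocated. The read
 * m[i + 1][j + 1] touches index n on the last iteration, so it is valid
 * only if nalloc >= n + 1. */
static void shift_to_zero_based(size_t n, size_t nalloc, double complex **m) {
  assert(nalloc >= n + 1); /* guards the m[i + 1][j + 1] read below */
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      m[i][j] = m[i + 1][j + 1]; /* source is never an already-written cell */
}
```

Because each source cell lies in a later row than any cell written so far, the shift can be done in place without a temporary copy.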
    v0[ham_i][ham_j-1]=dHam_re+I*dHam_im;
    v0[ham_j][ham_i-1]=conj(v0[ham_i][ham_j-1]);
The indexing uses ham_j-1 and ham_i-1 which will cause an out-of-bounds access when ham_i or ham_j equals 0. This appears to be an incorrect transformation from the original 1-based indexing.
Suggested change:

    - v0[ham_i][ham_j-1]=dHam_re+I*dHam_im;
    - v0[ham_j][ham_i-1]=conj(v0[ham_i][ham_j-1]);
    + /* Validate 1-based indices from file before converting to 0-based */
    + if (ham_i < 1 || ham_i > imax || ham_j < 1 || ham_j > imax) {
    +   fprintf(stdoutMPI, "Error: Invalid Hamiltonian index: ham_i=%ld, ham_j=%ld (valid range 1..%ld)\n",
    +           ham_i, ham_j, imax);
    +   fclose(fp);
    +   return -1;
    + }
    + /* Convert from 1-based (file) to 0-based (C arrays) consistently */
    + v0[ham_i-1][ham_j-1] = dHam_re + I * dHam_im;
    + v0[ham_j-1][ham_i-1] = conj(v0[ham_i-1][ham_j-1]);
    NLocSpn = X->Def.NLocSpn;
    //4^Nc*2^Ns
    for(i=0;i<(2*NCond+NLocSpn);i++){
    for(u_loc=0;u_loc <(2*NCond+NLocSpn); u_loc++){
The loop variable u_loc is used but was previously declared for a different purpose in the Kondo case above. This variable reuse makes the code confusing and the variable name doesn't match the loop's purpose of calculating powers of 2.
Unlike the previous one (#66), Lanczos and contiguous memory access are retained.