Skip to content

refactor/core cleanup#3

Merged
t-koretsune merged 6 commits intodevelopfrom
refactor/core-cleanup
Jan 1, 2026
Merged

refactor/core cleanup#3
t-koretsune merged 6 commits intodevelopfrom
refactor/core-cleanup

Conversation

@t-koretsune
Copy link
Copy Markdown
Contributor

  • add logging and documents
  • refactoring IO
  • modify reading routine
  • check immn file extension

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the symwannier core module by adding comprehensive logging support, improving documentation, and consolidating I/O operations. The changes enhance code maintainability and provide better debugging capabilities without altering the underlying computational algorithms.

  • Added structured logging with configurable levels throughout the codebase
  • Introduced a shared io_utils.py module to centralize file I/O operations and eliminate code duplication
  • Enhanced input validation with early error detection in the Wannierize class

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 17 comments.

Show a summary per file
File Description
src/symwannier/win.py Added logging support and docstrings; replaced assertion with descriptive FileNotFoundError
src/symwannier/wannierize.py Added comprehensive logging, input validation method, configurable log levels, and extensive docstrings for core algorithms
src/symwannier/timedata.py Integrated logging for timing outputs; replaced print statements with logger calls
src/symwannier/sym.py Added logging support and docstrings; replaced print statements with logger calls
src/symwannier/nnkp.py Added logging, file existence validation, and docstrings
src/symwannier/mmn.py Refactored file reading to use shared I/O utilities; added IBZ detection via file extension; optimized data parsing with bulk reads
src/symwannier/io_utils.py New shared utility module for opening plain or gzipped text files with consistent error handling
src/symwannier/expand_wannier_inputs.py Integrated logging and updated file readers to use logging
src/symwannier/eig.py Added logging support, docstrings, and improved error messages
src/symwannier/amn.py Refactored to use shared I/O utilities; optimized file reading; added docstrings

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

lines = fp.readlines()
num_bands, nk, num_wann = [ int(x) for x in lines[1].split() ]
dat = np.genfromtxt(lines[2:]).reshape(nk, num_wann, num_bands, 5)
flat = np.fromstring("".join(lines[2:]), sep=" ")
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using np.fromstring is deprecated as of NumPy 1.14 and will be removed in future versions. The recommended replacement is np.frombuffer for bytes. Consider using a more modern approach that doesn't rely on deprecated functions.

Suggested change
flat = np.fromstring("".join(lines[2:]), sep=" ")
flat_str = " ".join(lines[2:])
flat = np.fromiter((float(val) for val in flat_str.split()), dtype=float)

Copilot uses AI. Check for mistakes.
amn_sym = self._symmetrize_Gk_internal(amn)
diff = np.sum(np.abs(amn_sym - amn))/self.sym.nks
print(" symmetrize Gk diff1 = {:15.5e}".format(diff))
self.log.info("symmetrize Gk diff1 = %.5e", diff)
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logger formatting string uses old-style % formatting while most other logging calls in the file use f-strings or format method. For consistency, consider using the logging module's built-in parameter substitution (e.g., self.log.info("symmetrize Gk diff1 = %.5e", diff)) consistently throughout the file.

Suggested change
self.log.info("symmetrize Gk diff1 = %.5e", diff)
self.log.info(f"symmetrize Gk diff1 = {diff:.5e}")

Copilot uses AI. Check for mistakes.
Comment on lines +20 to +23
fp : IO[str]
Opened file-like object in text mode.
used_path : str
Actual path used (plain or .gz).
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation incorrectly describes the return type as IO[str], but the function uses type annotations from the typing module. The actual return type is a tuple Tuple[IO[str], str] as correctly shown in the return type annotation, but the docstring says "fp : IO[str]" and "used_path : str" in separate lines under Returns, which doesn't match standard NumPy docstring format for tuples. The Returns section should clarify this is a tuple or use the tuple notation.

Suggested change
fp : IO[str]
Opened file-like object in text mode.
used_path : str
Actual path used (plain or .gz).
fp_and_path : Tuple[IO[str], str]
A tuple ``(fp, used_path)`` where ``fp`` is the opened file-like
object in text mode and ``used_path`` is the actual path used
(plain or ``.gz``).

Copilot uses AI. Check for mistakes.
Comment on lines 33 to +50
lines = fp.readlines()
num_bands, nk, num_wann = [ int(x) for x in lines[1].split() ]
dat = np.genfromtxt(lines[2:]).reshape(nk, num_wann, num_bands, 5)
flat = np.fromstring("".join(lines[2:]), sep=" ")
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Joining all remaining lines into a single string for parsing is inefficient for large amn files. This creates a potentially very large temporary string object before parsing. Consider using np.loadtxt or np.genfromtxt directly on the file pointer (after skipping headers) to avoid the memory overhead of concatenating all lines into one string.

Copilot uses AI. Check for mistakes.
Parameters
----------
path : str
Base path (without .gz).
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring's description of the path parameter states "Base path (without .gz)" but the function actually accepts paths with or without .gz extension and checks both. The description should clarify that the path can be either the base name or include .gz, or the implementation should enforce that .gz is not part of the input path.

Suggested change
Base path (without .gz).
Path to the file, with or without a ``.gz`` extension. The function
first tries the given path as-is, then falls back to ``path + ".gz"``.

Copilot uses AI. Check for mistakes.
Comment on lines +116 to +117
flat = np.fromstring(" ".join(block_lines), sep=" ")
data = flat.reshape(self.num_bands, self.num_bands, 2)
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Joining all block_lines into a single string just to parse them is inefficient for large datasets. With large num_bands values, this creates unnecessary temporary string objects and copies. Consider using np.loadtxt with a StringIO object or parsing the lines more directly to avoid the memory overhead of string concatenation.

Suggested change
flat = np.fromstring(" ".join(block_lines), sep=" ")
data = flat.reshape(self.num_bands, self.num_bands, 2)
# Parse block_lines directly without concatenating into a single large string
block_array = np.loadtxt(block_lines)
data = block_array.reshape(self.num_bands, self.num_bands, 2)

Copilot uses AI. Check for mistakes.
Comment on lines +184 to +187
if getattr(self.amn, "nk", None) and self.amn.nk != self.nnkp.nk:
raise ValueError("amn nk={} inconsistent with nnkp nk={}".format(self.amn.nk, self.nnkp.nk))

if getattr(self.eig, "nk", None) and self.eig.nk != self.nnkp.nk:
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation uses getattr(self.eig, "nk", None) which will return None if the attribute doesn't exist, but then checks if it's truthy before comparing. If self.eig.nk is 0 (a valid value), this check will incorrectly skip validation. Consider using hasattr(self.eig, "nk") instead to properly check for attribute existence.

Suggested change
if getattr(self.amn, "nk", None) and self.amn.nk != self.nnkp.nk:
raise ValueError("amn nk={} inconsistent with nnkp nk={}".format(self.amn.nk, self.nnkp.nk))
if getattr(self.eig, "nk", None) and self.eig.nk != self.nnkp.nk:
if hasattr(self.amn, "nk") and self.amn.nk != self.nnkp.nk:
raise ValueError("amn nk={} inconsistent with nnkp nk={}".format(self.amn.nk, self.nnkp.nk))
if hasattr(self.eig, "nk") and self.eig.nk != self.nnkp.nk:

Copilot uses AI. Check for mistakes.
sym : Sym, optional
Symmetry data. If provided, Amn is symmetrized/expanded accordingly.
log : logging.Logger, optional
Logger to use; if not provided a module logger is created.
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring contains a typo: "Logger to use; if not provided a module logger is created" is missing a comma. It should read "Logger to use; if not provided, a module logger is created" for proper grammar.

Suggested change
Logger to use; if not provided a module logger is created.
Logger to use; if not provided, a module logger is created.

Copilot uses AI. Check for mistakes.
log_level : int or str, optional
Logging level (e.g., logging.DEBUG, logging.INFO, or 'DEBUG', 'INFO').
log : logging.Logger, optional
Logger to use; if not provided a module logger is created.
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring contains a typo: "Logger instance" could be more descriptive. However, more importantly, line 44 has "Logger to use; if not provided a module logger is created" which is missing a comma. It should read "Logger to use; if not provided, a module logger is created" for proper grammar.

Suggested change
Logger to use; if not provided a module logger is created.
Logger to use; if not provided, a module logger is created.

Copilot uses AI. Check for mistakes.
Comment on lines +184 to +187
if getattr(self.amn, "nk", None) and self.amn.nk != self.nnkp.nk:
raise ValueError("amn nk={} inconsistent with nnkp nk={}".format(self.amn.nk, self.nnkp.nk))

if getattr(self.eig, "nk", None) and self.eig.nk != self.nnkp.nk:
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation uses getattr(self.amn, "nk", None) which will return None if the attribute doesn't exist, but then checks if it's truthy before comparing. If self.amn.nk is 0 (a valid value), this check will incorrectly skip validation. Consider using hasattr(self.amn, "nk") instead to properly check for attribute existence.

Suggested change
if getattr(self.amn, "nk", None) and self.amn.nk != self.nnkp.nk:
raise ValueError("amn nk={} inconsistent with nnkp nk={}".format(self.amn.nk, self.nnkp.nk))
if getattr(self.eig, "nk", None) and self.eig.nk != self.nnkp.nk:
if hasattr(self.amn, "nk") and self.amn.nk is not None and self.amn.nk != self.nnkp.nk:
raise ValueError("amn nk={} inconsistent with nnkp nk={}".format(self.amn.nk, self.nnkp.nk))
if hasattr(self.eig, "nk") and self.eig.nk is not None and self.eig.nk != self.nnkp.nk:

Copilot uses AI. Check for mistakes.
@t-koretsune
Copy link
Copy Markdown
Contributor Author

@copilot open a new pull request to apply changes based on the comments in this thread

Copy link
Copy Markdown

Copilot AI commented Jan 1, 2026

@t-koretsune I've opened a new pull request, #4, to work on those changes. Once the pull request is ready, I'll request review from you.

[WIP] Refactor core module with logging and I/O improvements
@t-koretsune t-koretsune merged commit 781caff into develop Jan 1, 2026
@t-koretsune t-koretsune deleted the refactor/core-cleanup branch January 2, 2026 04:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants