Skip to content
This repository was archived by the owner on Nov 21, 2025. It is now read-only.
This repository was archived by the owner on Nov 21, 2025. It is now read-only.

_download_file_from_run silently does nothing if is_local_rank_zero() returns False #915

@fepegar

Description

@fepegar

This might be dangerous because the way health_azure checks whether is_local_rank_zero might be compatible with Lightning but not with other frameworks, and the files are unexpectedly missing after being "downloaded". This is especially problematic when validate_checksum is True, because one would expect that things are double-checked after downloading.

def _download_file_from_run(
run: Run, filename: str, output_file: Path, validate_checksum: bool = False
) -> Optional[Path]:
"""
Download a single file from an Azure ML Run, optionally validating the content to ensure the file is not
corrupted during download. If running inside a distributed setting, will only attempt to download the file
onto the node with local_rank==0. This prevents multiple processes on the same node from trying to download
the same file, which can lead to errors.
:param run: The AML Run to download associated file for
:param filename: The name of the file as it exists in Azure storage
:param output_file: Local path to which the file should be downloaded
:param validate_checksum: Whether to validate the content from HTTP response
:return: The path to the downloaded file if local rank is zero, else None
"""
if not is_local_rank_zero():
return None
run.download_file(filename, output_file_path=str(output_file), _validate_checksum=validate_checksum)
return output_file

A minimally-invasive change would be logging a warning before returning None. It might also be nice to propagate the paths of the downloaded files up to other functions that call this one. For example, maybe _download_files_from_run and download_files_from_run_id should return the paths of all downloaded files.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions