Shuffling QM9 before limiting #7

Merged
muhrin merged 1 commit into camml-lab:develop from giotre:develop on Mar 18, 2026

giotre (Collaborator) commented Mar 18, 2026

  • Added an option to shuffle QM9 before applying the limit, in src/tensorial/datasets/qm9.py
  • Added a test for the shuffling in test/test_qm9.py
  • Added a test/assets folder with a small subset of QM9 (20 structures) used by test/test_qm9.py

codecov-commenter commented Mar 18, 2026

Codecov Report

❌ Patch coverage is 68.75000% with 5 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
src/tensorial/datasets/qm9.py | 68.75% | 5 Missing ⚠️

Files with missing lines | Coverage Δ
src/tensorial/datasets/qm9.py | 58.64% <68.75%> (ø)

muhrin (Member) left a comment

All good

if limit is not None:
    indices = list(range(len(all_members)))
    if shuffle:
        random.seed(42)

download: bool = True,
limit: int | None = None,
as_graphs: dict | None = None,
shuffle: bool = True,

Default to shuffle = False, and add an rng_seed parameter for the shuffle.

indices = list(range(len(all_members)))
if shuffle:
    random.seed(42)
    random.shuffle(indices)

You might prefer to use:
random.sample(population, k, *, counts=None)
as this allows you to sample from the indices without incurring the memory cost of shuffling the indices in place.

@giotre giotre force-pushed the develop branch 4 times, most recently from 57c95c6 to 717be1d March 18, 2026 14:42
@giotre giotre marked this pull request as draft March 18, 2026 14:49
@giotre giotre marked this pull request as ready for review March 18, 2026 14:49
@giotre giotre marked this pull request as draft March 18, 2026 15:51
@giotre giotre marked this pull request as ready for review March 18, 2026 15:54
@muhrin muhrin merged commit 9f2beb0 into camml-lab:develop Mar 18, 2026
8 checks passed
3 participants