
Conversation

@electricalgorithm
Owner

No description provided.

Removed the phrase 'research-grade' from the project description and adjusted the wording for clarity.
Defaults to 224.
"""
self.files = glob.glob(os.path.join(data_dir, "*.npz"))
self.data_dir = Path(data_dir)
Owner Author

As far as I understand, it's not used anywhere else. No need to preserve the data.

Comment on lines 44 to 47
# Pixels per image (512x512)
# We hardcode this because the dataset structure (1 pixel per row) implies it.
# If it changes, this breaks, but we verified it's 262144 rows per image.
self.rows_per_sample = 512 * 512
Owner Author

It'd be better to make it configurable.
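Roughly what I have in mind: a minimal sketch, assuming we can add a constructor parameter (the class name `PixelRowDataset` and the parameter name `sample_size` are placeholders, not the actual names in this PR):

```python
from pathlib import Path

from torch.utils.data import Dataset


class PixelRowDataset(Dataset):
    def __init__(self, data_dir: str, sample_size: int = 512) -> None:
        self.data_dir = Path(data_dir)
        self.sample_size = sample_size
        # Derived from the configured size instead of a hardcoded 512 * 512,
        # so a different dataset layout only needs a different sample_size.
        self.rows_per_sample = sample_size * sample_size
```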

continue

# Use pyarrow to get metadata
import pyarrow.parquet as pq
Owner Author

Only allow imports at module level.
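For illustration only (the helper name `read_row_count` is made up; the point is just that the pyarrow import lives at the top of the module with the other imports):

```python
# At the top of the module, alongside the other imports:
import pyarrow.parquet as pq


def read_row_count(path: str) -> int:
    """Read only the parquet footer metadata, not the data itself."""
    return pq.read_metadata(path).num_rows
```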

# Convert to numpy and reshape
# Note: .to_numpy() on a ChunkedArray (from slice) is efficient
h_flat = h_slice['intensity'].to_numpy() # (262144,)
h_np = h_flat.reshape(512, 512).astype(np.float32)
Owner Author

No magic numbers.
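Something like this, reusing the configurable sample size suggested above; `slice_to_image` is a hypothetical helper, and only the `intensity` column name is taken from the snippet:

```python
import numpy as np
import pyarrow as pa


def slice_to_image(h_slice: pa.Table, sample_size: int) -> np.ndarray:
    """Reshape one sample's flat intensity column into (sample_size, sample_size)."""
    h_flat = h_slice["intensity"].to_numpy()
    # Both the expected row count and the target shape come from sample_size;
    # no literal 512s or 262144s in the body.
    assert h_flat.shape[0] == sample_size * sample_size
    return h_flat.reshape(sample_size, sample_size).astype(np.float32)
```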

Comment on lines 38 to 45
# Pretrained = True (User Request).
# PROBLEM: Pretrained weights are 3 channels (RGB). We have 2 channels (Real/Imag).
# TIMM automatically adapts conv weights if in_chans != 3?
# Yes, timm usually re-initializes or copies weights.
# But to be safe/optimal for "Physics Aware" where Real/Imag matters:
# Relying on timm's default adaptation (sum/mean) is better than random init.
# Timm's `timm.models.load_checkpoint` helper usually handles this but `create_model` does too.
# `in_chans=2` will trigger timm to adapt the patch embedding layer.
Owner Author

This part is a huge miss. It reads as if we're not sure. We should have done the experiment before making the assumption.
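For example, a throwaway check like the one below would answer the question directly instead of guessing in a comment. It assumes the encoder is a timm ViT; the model name here is only an example and not necessarily the one used in this PR:

```python
import timm
import torch

# Load the same backbone with 3 and with 2 input channels and compare the
# patch-embedding kernels to see what timm actually does for in_chans=2.
m3 = timm.create_model("vit_base_patch16_224", pretrained=True, in_chans=3)
m2 = timm.create_model("vit_base_patch16_224", pretrained=True, in_chans=2)

w3 = m3.patch_embed.proj.weight.detach()  # (embed_dim, 3, 16, 16)
w2 = m2.patch_embed.proj.weight.detach()  # (embed_dim, 2, 16, 16)
print(w3.shape, w2.shape)

# Two quick hypotheses: weights copied as-is, or copied and rescaled by 3/2
# (to preserve the expected activation magnitude). Whichever prints ~0 wins;
# if neither does, timm re-initialized the layer.
print((w2 - w3[:, :2]).abs().mean())
print((w2 - (3 / 2) * w3[:, :2]).abs().mean())
```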

nn.GELU(),
nn.Conv2d(encoder_channels[-4] // 2, out_chans, kernel_size=3, padding=1),
nn.Sigmoid(), # Assuming object is amplitude 0-1
# REMOVED Sigmoid: Real/Imag parts are not bounded [0,1].
Owner Author

No need to emphasise it.
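i.e. the head can just end at the conv, without the activation and without a change-note comment; `make_head` below is a hypothetical helper with placeholder channel arguments:

```python
import torch.nn as nn


def make_head(in_ch: int, out_chans: int = 2) -> nn.Sequential:
    # Real/imag outputs are unbounded, so no output activation here.
    return nn.Sequential(
        nn.GELU(),
        nn.Conv2d(in_ch, out_chans, kernel_size=3, padding=1),
    )
```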

src/train.py Outdated
BATCH_SIZE = 6
LR = 1e-4
NUM_EPOCHS = 5
NUM_EPOCHS = 1 # Reduced for demo/validation purposes
Owner Author

No need to write it down.

@electricalgorithm force-pushed the v2 branch 3 times, most recently from 1f1d445 to 0984c62 on December 12, 2025 at 21:36
We see from experiment 3 that the model cannot predict the objects.
Since the task is mainly removing the noise from the reconstructed but
dirty complex domain, let's start by copying the dirty input to the
output and only removing the noise.
Within this patch, we realized that exp3 is far worse than expected.
That's why we added two new loss components that compare the distances
between phases and between magnitudes.
Signed-off-by: Gyokhan Kochmarla <gokhan.kocmarli@gmail.com>
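A rough sketch of the two proposed loss components, assuming the network outputs two stacked channels (real and imaginary); the function name and the weights in the usage lines are illustrative, not final. The "start from the dirty input" idea would live in the model itself (e.g. as a residual/skip connection), so it is not shown here.

```python
import torch
import torch.nn.functional as F


def complex_losses(pred: torch.Tensor, target: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """pred/target: (B, 2, H, W) with channel 0 = real part, channel 1 = imaginary part."""
    pred_c = torch.complex(pred[:, 0], pred[:, 1])
    tgt_c = torch.complex(target[:, 0], target[:, 1])

    # Distance between magnitudes.
    mag_loss = F.l1_loss(pred_c.abs(), tgt_c.abs())

    # Distance between phases, wrapped into (-pi, pi] so that -pi and +pi count as close.
    phase_diff = torch.angle(pred_c) - torch.angle(tgt_c)
    phase_diff = torch.atan2(torch.sin(phase_diff), torch.cos(phase_diff))
    phase_loss = phase_diff.abs().mean()

    return mag_loss, phase_loss


# Usage (weights are placeholders to be tuned):
# mag_loss, phase_loss = complex_losses(pred, target)
# loss = base_loss + 0.1 * mag_loss + 0.1 * phase_loss
```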