Update nmf_workflow.py by vedatonuryilmaz · Pull Request #1 · vedatonuryilmaz/koptlib

vedatonuryilmaz · 2025-08-14T18:33:29Z

No description provided.

Copilot

Pull Request Overview

This PR optimizes the NMF workflow by fixing memory issues, improving JSON serialization, and enhancing the evaluation process. The changes focus on handling sparse matrices efficiently and preventing memory crashes when processing large datasets.

Removes memory-intensive sparse matrix conversion that was causing crashes
Adds explicit float conversion for JSON serialization compatibility
Improves tqdm integration with joblib for better progress tracking
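The float conversion mentioned in the overview can be sketched as follows. The PR does not show which values needed converting, so this assumes they are numeric objects that `json` rejects (NumPy scalars such as `numpy.float32` behave this way). To keep the sketch dependency-free, `fractions.Fraction` stands in for such a value, and the key `reconstruction_error` is hypothetical.

```python
import json
from fractions import Fraction

# Hypothetical result record. Fraction stands in for any numeric type that
# json cannot serialize directly (an assumption; the PR does not name the type).
scores = {"k": 5, "reconstruction_error": Fraction(3, 8)}

try:
    json.dumps(scores)
    failed = False
except TypeError:
    # json only knows built-in types such as int, float, str, list and dict.
    failed = True

# Option 1: convert explicitly at the point where the record is built.
explicit = json.dumps(
    {"k": scores["k"], "reconstruction_error": float(scores["reconstruction_error"])}
)

# Option 2: let json call float() on anything it cannot serialize itself.
via_hook = json.dumps(scores, default=float)

print(failed, explicit == via_hook)
```

Explicit conversion keeps the change local and visible; the `default=float` hook is shorter but raises on any unsupported value that cannot be turned into a float.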


Copilot · 2025-08-14T18:33:58Z

nmf_workflow.py

        group_sample_indices = [
            idx for idx, sample_id in enumerate(all_samples_list)
-            if sample_to_cancer_type_map.get(sample_id) in group_cancer_codes
+            if (c_type := sample_to_cancer_type_map.get(sample_id)) and c_type[:4] in group_cancer_codes


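The walrus assignment filters out samples with no mapped cancer type, and the slice c_type[:4] is safe for short values: Python slices never raise an IndexError, so a c_type shorter than 4 characters simply yields the shorter string. A length check is therefore not needed to prevent a crash. It only matters if values shorter than 4 characters must never match, in which case the intent can be made explicit: if (c_type := sample_to_cancer_type_map.get(sample_id)) and len(c_type) >= 4 and c_type[:4] in group_cancer_codes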

Suggested change

-            if (c_type := sample_to_cancer_type_map.get(sample_id)) and c_type[:4] in group_cancer_codes
+            if (c_type := sample_to_cancer_type_map.get(sample_id)) and len(c_type) >= 4 and c_type[:4] in group_cancer_codes
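A minimal sketch of how the filter behaves on short and missing values, using made-up sample IDs and codes (none of this data comes from the PR). It shows that slicing past the end of a string truncates rather than raises, so the added `len(c_type) >= 4` check changes the result only when `group_cancer_codes` itself contains codes shorter than 4 characters.

```python
def select_indices(samples, cancer_type_map, group_codes):
    """Return indices of samples whose code prefix (first 4 chars) is in group_codes."""
    return [
        idx
        for idx, sample_id in enumerate(samples)
        # .get() returns None for unmapped samples; the walrus binding lets the
        # truthiness test and the slice share a single lookup.
        if (c_type := cancer_type_map.get(sample_id)) and c_type[:4] in group_codes
    ]


# Hypothetical data for illustration only.
samples = ["s1", "s2", "s3", "s4"]
cancer_type_map = {"s1": "BRCA-US", "s2": "BR", "s3": "LUAD-US"}  # "s4" is unmapped
group_codes = {"BRCA"}

short_slice = "BR"[:4]  # slicing past the end truncates; no IndexError
selected = select_indices(samples, cancer_type_map, group_codes)
print(short_slice, selected)
```

Requires Python 3.8 or later for the `:=` operator.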

Copilot · 2025-08-14T18:33:59Z

nmf_workflow.py

+        selected_samples = [all_samples_list[idx] for idx in group_sample_indices]
+
+    # Export selected sample IDs to preprocessed_data
+    export_path = os.path.join("preprocessed_data", f"selected_samples_{group_name}.json")


The hardcoded "preprocessed_data" directory path may not exist, causing the file write operation to fail. Consider using os.makedirs(os.path.dirname(export_path), exist_ok=True) before writing the file or making the directory path configurable.

Suggested change

-    export_path = os.path.join("preprocessed_data", f"selected_samples_{group_name}.json")
+    export_path = os.path.join("preprocessed_data", f"selected_samples_{group_name}.json")
+    os.makedirs(os.path.dirname(export_path), exist_ok=True)
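The suggestion can be sketched end to end as below. The function name `export_selected_samples` and the `out_dir` parameter are assumptions added for illustration (the PR hardcodes the directory); the usage writes into a temporary directory so the sketch leaves nothing behind.

```python
import json
import os
import tempfile


def export_selected_samples(selected_samples, group_name, out_dir="preprocessed_data"):
    """Write sample IDs to <out_dir>/selected_samples_<group_name>.json and return the path."""
    export_path = os.path.join(out_dir, f"selected_samples_{group_name}.json")
    # Create the target directory first; exist_ok=True makes repeated calls harmless.
    os.makedirs(os.path.dirname(export_path), exist_ok=True)
    with open(export_path, "w", encoding="utf-8") as fh:
        json.dump(selected_samples, fh, indent=2)
    return export_path


# Usage: the nested directory does not exist before the call.
with tempfile.TemporaryDirectory() as tmp:
    target_dir = os.path.join(tmp, "preprocessed_data")
    path = export_selected_samples(["s1", "s3"], "breast", out_dir=target_dir)
    with open(path, encoding="utf-8") as fh:
        round_trip = json.load(fh)

print(os.path.basename(path), round_trip)
```

Making the directory a parameter also covers the second half of the review comment, since callers can point the export somewhere other than the working directory.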

Update nmf_workflow.py

0290057

vedatonuryilmaz requested a review from Copilot August 14, 2025 18:33

Copilot AI reviewed Aug 14, 2025

vedatonuryilmaz wants to merge 1 commit into MVP from fix-paralleljobs
