Migration DFM -> Bridge #2534

Open

abhinavg4 wants to merge 6 commits into main from migration/dfm

Conversation


@abhinavg4 abhinavg4 commented Feb 25, 2026

What does this PR do?

Migrate DFM to Megatron-Bridge (MB).

PR passing all tests in DFM is here: NVIDIA-NeMo/DFM#105

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (e.g. Numba, Pynini, Apex, etc.)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

  • Related to # (issue)

For original authorship, see the DFM repo: https://github.com/NVIDIA-NeMo/DFM
Original Major Contributors for Megatron path: @abhinavg4, @huvunvidia, @sajadn, and @suiyoubi

Migrate the Megatron-based diffusion model code from the DFM repository
(commit 013ceca) into Megatron-Bridge as a self-contained `diffusion/`
module, following the shallow integration plan.

Source mapping:
- dfm/src/megatron/         -> src/megatron/bridge/diffusion/
- dfm/src/common/ (utils)   -> src/megatron/bridge/diffusion/common/
- examples/megatron/        -> examples/diffusion/
- tests/unit_tests/megatron -> tests/diffusion/unit_tests/
- tests/functional_tests/mcore -> tests/diffusion/functional_tests/
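The mapping above can be sketched as a small copy script. This is a minimal illustration, not the script actually used for the migration; the `migrate` helper and the repo-root arguments are assumptions.

```python
# Sketch of the directory migration described above. MAPPING comes from the
# PR description; the migrate() helper and root paths are hypothetical.
import shutil
from pathlib import Path

# Source (in the DFM checkout) -> destination (in the Megatron-Bridge checkout).
MAPPING = {
    "dfm/src/megatron": "src/megatron/bridge/diffusion",
    "dfm/src/common": "src/megatron/bridge/diffusion/common",
    "examples/megatron": "examples/diffusion",
    "tests/unit_tests/megatron": "tests/diffusion/unit_tests",
    "tests/functional_tests/mcore": "tests/diffusion/functional_tests",
}


def migrate(dfm_root: Path, bridge_root: Path) -> None:
    """Copy each DFM subtree into its new home in Megatron-Bridge."""
    for src_rel, dst_rel in MAPPING.items():
        src = dfm_root / src_rel
        if src.is_dir():
            # dirs_exist_ok lets nested destinations (e.g. .../common/) merge.
            shutil.copytree(src, bridge_root / dst_rel, dirs_exist_ok=True)
```

Note that a plain copy drops git history; the PR instead points readers to the DFM repo for original authorship.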

Key structural changes from DFM:
- model/ renamed to models/ (matches MB convention)
- model/*/conversion/ extracted to top-level conversion/ (separates
  bridge/checkpoint-conversion code from model implementation)
- All dfm.src.megatron.* imports rewritten to megatron.bridge.diffusion.*
- All dfm.src.common.* imports rewritten to megatron.bridge.diffusion.common.*
- dfm.src.automodel.* imports left as-is (automodel migrating separately)
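The import rewrites above amount to two prefix substitutions, with `dfm.src.automodel.*` deliberately excluded. A minimal sketch (the regex approach is my assumption; the PR does not include the rewrite script):

```python
# Hypothetical sketch of the import rewrite described above.
import re

# Old prefix -> new prefix; dfm.src.automodel.* is intentionally not listed
# because automodel is migrating separately.
REWRITES = [
    (re.compile(r"\bdfm\.src\.megatron\."), "megatron.bridge.diffusion."),
    (re.compile(r"\bdfm\.src\.common\."), "megatron.bridge.diffusion.common."),
]


def rewrite_imports(source: str) -> str:
    """Apply each prefix substitution to a module's source text."""
    for pattern, replacement in REWRITES:
        source = pattern.sub(replacement, source)
    return source
```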

Models included: DiT, FLUX, WAN (video generation)

copy-pr-bot bot commented Feb 25, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

abhinavg4 (Author)

/ok to test a63854c

  F841 (unused variables) -- removed where clearly safe:
  • neg_t5_embed in inference_dit_model.py
  • device in inference_flux.py
  • error in wan/inference/utils.py (cache_image)
  • n -> _ in wan/rope_utils.py
  • data_batch, x0_from_data_batch in test_edm_pipeline.py
  • original_base in test_flux_hf_pretrained.py
  • total_layers, p_size, vp_size in test_flux_provider.py

  F841 -- kept with `# noqa: F841` (uncertain intent):
  • config = get_model_config(model) in flux_step_with_automodel.py
  • video_latents, loss_mask in flow_matching_pipeline_wan.py

  D101/D103 (missing docstrings) -- added `# noqa` markers to all 36 class/function definitions across 24 files.
  Markdown filename -- renamed README_perf_test.md to README-perf-test.md.
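The two F841 treatments above look like this in practice. The helper and class below are hypothetical stand-ins, not code from the PR:

```python
# Illustration of the lint suppressions described above (names are stand-ins).
def get_model_config(model):
    # Stand-in for the real helper; imagine it might mutate global state.
    return {"name": model}


def step(model):
    # Ruff flags `config` as unused (F841), but the intent is uncertain --
    # the call could have side effects -- so it is kept with a suppression:
    config = get_model_config(model)  # noqa: F841
    return model


class DiTProvider:  # noqa: D101
    # D101 (missing class docstring) suppressed with a marker, as in the PR.
    pass
```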
abhinavg4 (Author)

/ok to test 9242693

abhinavg4 and others added 2 commits February 26, 2026 14:16
…e recipes. Introduce new test files for DiT, FLUX, and WAN pretraining, along with necessary configuration and utility files. Update test structure to enhance coverage and maintainability.
abhinavg4 (Author)

/ok to test cb822f2
