fix: resolve lego task instruction mismatch causing noise output #925
The lego task was generating noise because _resolve_instruction() only
fell back to the task-specific instruction when the request instruction
exactly matched DEFAULT_DIT_INSTRUCTION. When the instruction field
arrived as empty/blank (e.g. lost in form parsing), it would not match
the default and the wrong instruction was passed to the DiT model.
The model was trained with "Generate the {TRACK_NAME} track based on
the audio context:" but received "Fill the audio semantic mask based
on the given conditions:", causing it to produce garbage output.
Changes:
- Widen _resolve_instruction() to also trigger on empty/blank
instruction values, not just exact default matches.
- Register "instruction" in PARAM_ALIASES for explicit form parsing.
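The widened fallback described above can be sketched roughly as follows. This is an illustrative sketch, not the project's actual code: the function signature, the TASK_INSTRUCTIONS table, and the value assigned to DEFAULT_DIT_INSTRUCTION are assumptions inferred from the strings quoted in this PR.

```python
# Illustrative sketch only; names other than DEFAULT_DIT_INSTRUCTION are hypothetical.

# Assumed value: the PR describes this text as the default text2music instruction.
DEFAULT_DIT_INSTRUCTION = "Fill the audio semantic mask based on the given conditions:"

# Hypothetical lookup table; only the lego template is quoted in this PR.
TASK_INSTRUCTIONS = {
    "lego": "Generate the {TRACK_NAME} track based on the audio context:",
}


def resolve_instruction(instruction, task_type, track_name):
    """Pick the instruction text to pass to the DiT model."""
    template = TASK_INSTRUCTIONS.get(task_type)
    if template is None:
        # No task-specific instruction is known, so leave the request value alone.
        return instruction

    # Widened condition: missing or whitespace-only values now count as "not set".
    is_blank = instruction is None or not instruction.strip()
    # Original condition: the request carried the exact default text.
    is_default = instruction == DEFAULT_DIT_INSTRUCTION

    if is_blank or is_default:
        return template.format(TRACK_NAME=track_name)

    # A caller-supplied custom instruction is respected as-is.
    return instruction
```

With this shape, an empty string, a whitespace-only string, None, and the exact default all resolve to the lego template, while any other caller-supplied text is passed through unchanged.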
Walkthrough
Added an "instruction" parameter alias to the request parameter parser, enabling direct retrieval of instruction values from requests. Updated instruction resolution logic to handle missing or blank instructions alongside exact default instruction matches, broadening fallback behavior when task types are present.
Summary
- Widen _resolve_instruction() to fall back to the task-specific instruction when the instruction field is empty/blank, not just when it exactly matches DEFAULT_DIT_INSTRUCTION.
- Register "instruction" in PARAM_ALIASES for explicit form-data parsing.
Root Cause
The lego task was generating noise because the DiT model received the wrong instruction text during inference. The model was trained with "Generate the {TRACK_NAME} track based on the audio context:" but inference was using "Fill the audio semantic mask based on the given conditions:" (the default text2music instruction).
_resolve_instruction() only substituted the task-specific instruction when req.instruction == DEFAULT_DIT_INSTRUCTION. If the instruction field arrived as an empty string from the client form data, it would not match the default and would pass through unchanged, resulting in the wrong instruction reaching the model.
Verification
Before fix: Peak=0.0045 (noise)
After fix: Peak=1.0000 (normal audio signal)
Test plan
- job_generation_setup_test.py tests pass (4/4)
Made with Cursor
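The PR does not say how the Peak values in the verification were measured. As a rough sketch, assuming Peak means the largest absolute sample value of float audio normalized to [-1.0, 1.0], a check like the one below could separate the two cases. The helper names and the 0.01 threshold are hypothetical and chosen only for illustration.

```python
def peak_amplitude(samples):
    """Return the largest absolute sample value, or 0.0 for empty input."""
    return max((abs(s) for s in samples), default=0.0)


def is_near_silent(samples, threshold=0.01):
    """Flag output whose peak stays below an (arbitrary) threshold."""
    return peak_amplitude(samples) < threshold
```

Under these assumptions, a clip peaking at 0.0045 would be flagged as near-silent, while one peaking at 1.0 would not.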