Add text2sample generation task type#863

Draft
Copilot wants to merge 2 commits into main from copilot/support-text2sample-generation

Conversation


Copilot AI commented Mar 17, 2026

Adds text2sample as a first-class task type throughout the stack, enabling short production-sample generation from plain-text descriptions (the workflow used by models like RoyalCities/Foundation-1). It differs from text2music in its DiT instruction and in that it always triggers LLM-driven metadata generation without requiring the caller to pass sample_mode=True.

API / generation pipeline

  • constants.py: registers "text2sample" in TASK_INSTRUCTIONS ("Generate a music sample based on the given conditions:"), GENERATION_MODES_TURBO/BASE, and MODE_TO_TASK_TYPE ("Sample"→"text2sample").
  • llm_generation_inputs.py: when task_type="text2sample", implicitly sets sample_mode=True and promotes req.prompt to sample_query when sample_query is empty, so callers can use the standard prompt field without extra flags.
# API usage — no sample_mode flag required
POST /release_task
{
  "task_type": "text2sample",
  "prompt": "punchy trap hi-hat loop at 140 BPM"
}
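The constants registration and the auto-enable/promotion logic can be sketched as follows. This is a minimal illustration, not the actual source: the GenerationRequest shape and prepare_inputs name are hypothetical stand-ins for the real request object and pipeline entry point, and only the text2sample instruction string is taken from the PR description.

```python
# Sketch of the constants.py registration and the llm_generation_inputs.py
# behavior described above. GenerationRequest and prepare_inputs are
# hypothetical stand-ins, not the project's real names.
from dataclasses import dataclass

TASK_INSTRUCTIONS = {
    # instruction string quoted from the PR description
    "text2sample": "Generate a music sample based on the given conditions:",
}
MODE_TO_TASK_TYPE = {"Sample": "text2sample"}

@dataclass
class GenerationRequest:
    task_type: str = "text2music"
    prompt: str = ""
    sample_query: str = ""
    sample_mode: bool = False

def prepare_inputs(req: GenerationRequest) -> GenerationRequest:
    """Auto-enable sample_mode and promote prompt -> sample_query for text2sample."""
    if req.task_type == "text2sample":
        req.sample_mode = True
        if not req.sample_query and req.prompt:
            req.sample_query = req.prompt
    return req
```

With this behavior, the API request above needs no sample_mode flag: the pipeline derives it from the task type.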

CLI

  • text2sample is now option 7 in the interactive task menu.
  • Automatically sets sample_mode=True; validation updated to allow --sample_mode/--sample_query for this task type.
python cli.py generate --task_type text2sample --sample_query "lofi piano chord loop, Bb minor, 85 BPM"

Gradio UI

  • New Sample mode added to both turbo and base model mode lists, mapping to task_type="text2sample".
  • Shows the same simple-mode query panel as Simple, plus a generate-button row (initially disabled). Button activates after Create Sample succeeds.
  • After Create Sample in Sample mode the UI stays in Sample mode (preserving task_type="text2sample"); in Simple mode it continues to switch to Custom as before.
  • i18n description strings added for all four supported languages.
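The mode wiring above can be sketched as plain functions (names and return shapes are illustrative; the real UI uses Gradio event handlers and component updates):

```python
# Hypothetical sketch of the Sample-mode wiring: mapping the selected UI mode
# to a task_type and gating the generate button on Create Sample success.
MODE_TO_TASK_TYPE = {"Simple": "text2music", "Sample": "text2sample"}

def on_mode_change(mode: str) -> dict:
    """Return UI state for the selected mode; generate button starts disabled."""
    return {
        "task_type": MODE_TO_TASK_TYPE.get(mode, "text2music"),
        "generate_btn_enabled": False,  # enabled only after Create Sample succeeds
    }

def on_create_sample_done(mode: str, success: bool) -> dict:
    """After Create Sample: Sample mode stays put, Simple switches to Custom."""
    next_mode = "Sample" if mode == "Sample" else "Custom"
    return {"mode": next_mode, "generate_btn_enabled": success}
```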

OpenRouter server

  • Message text is routed as sample_query when task_type="text2sample", consistent with sample_mode behavior.
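The routing rule can be sketched as a small helper (the function name and payload shape are hypothetical, illustrating only the field choice described above):

```python
# Hypothetical sketch of the OpenRouter server routing: for text2sample the
# incoming chat message text is forwarded as sample_query instead of prompt.
def build_generation_payload(task_type: str, message_text: str) -> dict:
    payload = {"task_type": task_type}
    if task_type == "text2sample":
        payload["sample_query"] = message_text
    else:
        payload["prompt"] = message_text
    return payload
```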

Tests

  • acestep/api/text2sample_test.py: 7 unit tests covering auto-enable of sample_mode, prompt promotion, non-regression for text2music, and constants correctness.
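Two of these tests might look roughly like this (the bodies are illustrative, and prepare_inputs is a stand-in for the real pipeline entry point; the actual tests live in acestep/api/text2sample_test.py):

```python
# Illustrative unit tests for the auto-enable and non-regression behavior.
import unittest

def prepare_inputs(req: dict) -> dict:
    """Stand-in for the real llm_generation_inputs logic."""
    if req.get("task_type") == "text2sample":
        req["sample_mode"] = True
        if not req.get("sample_query") and req.get("prompt"):
            req["sample_query"] = req["prompt"]
    return req

class Text2SampleTest(unittest.TestCase):
    def test_auto_enables_sample_mode_and_promotes_prompt(self):
        out = prepare_inputs({"task_type": "text2sample", "prompt": "kick loop"})
        self.assertTrue(out["sample_mode"])
        self.assertEqual(out["sample_query"], "kick loop")

    def test_text2music_is_unchanged(self):
        out = prepare_inputs({"task_type": "text2music", "prompt": "a song"})
        self.assertNotIn("sample_mode", out)
        self.assertNotIn("sample_query", out)
```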

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • huggingface.co
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)


Original prompt

This section details the original issue you should resolve.

<issue_title>support text2sample generation</issue_title>
<issue_description>please support this in ace-step 1.5: https://huggingface.co/RoyalCities/Foundation-1</issue_description>

Comments on the Issue (you are @copilot in this section)



- Add `text2sample` to TASK_INSTRUCTIONS, GENERATION_MODES, and MODE_TO_TASK_TYPE in constants.py
- Auto-enable sample_mode in llm_generation_inputs.py when task_type=text2sample; promote prompt to sample_query when sample_query is empty
- Add text2sample (item 7) to CLI interactive task selection with appropriate prompts
- Update CLI validation to allow sample_mode/sample_query for text2sample
- Add 'Sample' mode to Gradio UI (shows simple_mode_group + disabled generate_btn; button enables after Create Sample); task_type stays text2sample after create_sample
- Pass current generation_mode to handle_create_sample so Sample mode preserves task_type
- Handle text2sample in openrouter API server (treats message text as sample_query)
- Add text2sample to Streamlit config GENERATION_MODES
- Add mode_info_sample i18n strings in en/ja/zh/he
- Add 7 unit tests for text2sample behavior in text2sample_test.py

Co-authored-by: ChuxiJ <30956809+ChuxiJ@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add support for text2sample generation in ace-step 1.5" to "Add text2sample generation task type" on Mar 17, 2026
Copilot AI requested a review from ChuxiJ March 17, 2026 14:33

@ChuxiJ left a comment


Code Review

Overall: Well-structured PR that follows existing conventions. A few items to address before merge.

Key Issues

  1. Medium - Restore trailing newlines in i18n JSON files: The diff removes trailing newlines from en.json, he.json, ja.json, and zh.json. This is unnecessary and may conflict with editor configs or linters that expect POSIX-compliant final newlines.

  2. Medium - Missing edge case tests: Add tests for:

    • text2sample with empty prompt AND empty sample_query
    • text2sample with both prompt and sample_query provided
  3. Low - Inconsistent getattr usage: getattr(req, "task_type", "text2music") is used on line ~78, but the next line uses req.task_type directly. Since GenerationRequest always has task_type, just use req.task_type for consistency.

  4. Low - _did_promote_prompt_as_query flag is fragile: Consider replacing with an inline condition check at the consumption site to reduce coupling between distant code sections.

  5. Low - Brittle test assertion: test_text2sample_promotes_prompt_as_query_when_sample_query_empty checks call_args without asserting call_args is not None first, so it could silently pass with None == None.

  6. Low - Duplicate emoji in Streamlit config: text2sample reuses the same 🎼 emoji as "Complete Section". Consider using a distinct emoji.
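For item 5, the fix is to assert the mock was actually called before inspecting call_args (the mock name generate_metadata is illustrative, not the real function under test):

```python
# Illustrative fix for the brittle assertion in item 5: guard against
# call_args being None by asserting the call happened first.
from unittest.mock import MagicMock

generate_metadata = MagicMock()
generate_metadata(sample_query="kick loop")

# If the mock were never called, call_args would be None and a loose
# comparison could pass by accident. Assert the call explicitly first.
generate_metadata.assert_called_once()          # fails loudly if never called
assert generate_metadata.call_args is not None  # explicit guard
assert generate_metadata.call_args.kwargs["sample_query"] == "kick loop"
```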

Minor Notes

  • The print() for disabling use_cot_lyrics should use the logging infrastructure instead
  • OpenRouter callers may get different behavior than direct API callers for empty prompt+sample_query (no "NO USER INPUT" fallback)

No blocking issues. The main items are trailing newlines and edge case test coverage.

