Fix moe by Yunnglin · Pull Request #58 · modelscope/twinkle

Yunnglin · 2026-02-13T04:02:21Z

No description provided.

…fix_moe

gemini-code-assist · 2026-02-13T04:03:20Z

Summary of Changes

Hello @Yunnglin, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces substantial improvements to the server's operational stability and resource efficiency. It refines how tasks are queued and processed, implements more granular control over adapter lifecycles, and updates core configurations to support these changes. The modifications aim to provide a more robust and predictable environment for training and inference, while also streamlining client-side interactions and documentation.

Highlights

Server Robustness and Resource Management: The server's task queueing system has been rearchitected to use per-key queues, improving fairness and throughput. Comprehensive pre-flight checks for rate limiting, maximum input tokens, and batch size validation are now integrated into task scheduling. Adapter lifecycle management has been significantly enhanced with session-based expiration, maximum lifetime settings, and dynamic limit checks, ensuring more efficient resource utilization and stability.
Configuration and Documentation Updates: YAML configurations for both Megatron and Transformer backends have been updated to utilize the dp_size parameter for device mesh definitions, replacing the older mesh and mesh_dim_names syntax. New environment variables, DEVICE_COUNT_PER_PHYSICAL_NODE and TWINKLE_TRUST_REMOTE_CODE, are introduced and documented in both English and Chinese usage guides, providing clearer instructions for server setup and resource allocation.
Client-side Examples and Refactoring: An outdated GRPO training example has been removed, and existing client-side examples (sample.py, self_congnition.py, short_math_grpo.py) have been updated to align with new API patterns, such as using template.encode/decode for prompt handling. Minor refactoring of client-side wrappers for DataLoader, Dataset, and InputProcessor classes improves consistency and JSON data handling.
Gradient Readiness Tracking: New state tracking for 'grad_ready' has been implemented within the model service to ensure that optimization steps (optim_step) are only performed after gradients have been successfully accumulated via a forward_backward pass, preventing erroneous updates.
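
The gradient-readiness rule in the last highlight can be sketched as a per-adapter flag. This is a minimal illustration under stated assumptions, not the code from the pull request: the class `GradReadyTracker` and its method names are hypothetical, and only the names grad_ready, forward_backward, and optim_step come from the summary above.

```python
from typing import Dict


class GradReadyTracker:
    """Hypothetical per-adapter flag that guards optim_step (illustration only)."""

    def __init__(self) -> None:
        self._grad_ready: Dict[str, bool] = {}

    def mark_forward_backward(self, adapter_name: str) -> None:
        # A forward_backward pass has accumulated gradients for this adapter.
        self._grad_ready[adapter_name] = True

    def optim_step(self, adapter_name: str) -> str:
        if not self._grad_ready.get(adapter_name, False):
            raise RuntimeError(
                f'optim_step called for {adapter_name!r} before forward_backward')
        # A real service would apply the optimizer update here.
        self._grad_ready[adapter_name] = False  # gradients are consumed
        return 'stepped'
```

Resetting the flag after each step means a second optim_step without a new forward_backward pass is rejected as well.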


Changelog

.gitignore
- Added 'swanlog/' to the list of ignored files.
.pre-commit-config.yaml
- Expanded pre-commit hook exclusions to include 'examples/', 'cookbook/', and 'src/twinkle_client/' directories.
cookbook/client/tinker/grpo.py
- Removed the GRPO (Group Relative Policy Optimization) training example file.
cookbook/client/tinker/megatron/server_config.yaml
- Updated server configuration for Megatron backend, including queue limits, model max length, max LoRAs, and adapter lifecycle settings.
- Adjusted queue configuration parameters for SamplerManagement and ModelManagement services.
- Added 'max_length' and 'max_loras' arguments to the ModelManagement service.
- Modified 'per_token_adapter_limit' and 'adapter_timeout' in adapter configuration, and introduced 'adapter_max_lifetime'.
cookbook/client/tinker/megatron/server_config_7b.yaml
- Adjusted server configuration for 7B Megatron model, including token per second limits, max input tokens, and adapter timeout/lifetime.
- Modified 'tps_limit' and added 'max_input_tokens' to the queue configuration.
- Reordered and updated parameters within the adapter configuration.
cookbook/client/tinker/sample.py
- Refactored sampling example to use 'Qwen/Qwen2.5-7B-Instruct' as the base model and 'http://localhost:8000' as the server URL.
- Removed 'modelscope.AutoTokenizer' import and replaced tokenizer-based prompt handling with 'template.encode' and 'template.decode'.
- Updated the model path for the sampling client and adjusted sampling temperature.
cookbook/client/tinker/self_congnition.py
- Updated self-cognition example to use 'template.encode' and 'template.decode' for prompt handling, replacing 'tokenizer.apply_chat_template' and 'tokenizer.decode'.
- Removed 'modelscope.AutoTokenizer' import and reordered other imports.
- Updated the 'weight_path' for loading the trained LoRA checkpoint.
cookbook/client/tinker/short_math_grpo.py
- Migrated short math GRPO example to use 'template' for tokenization and decoding, replacing 'tokenizer'.
- Removed 'modelscope.AutoTokenizer' import and reordered other imports.
- Added a check to skip training steps if all advantages are zero.
cookbook/client/twinkle/grpo.py
- Modified GRPO training parameters, reducing 'NUM_GENERATIONS', 'BATCH_SIZE', and 'SYNC_INTERVAL'.
cookbook/client/twinkle/transformer/server_config.yaml
- Updated Transformer backend server configuration to use 'dp_size' for device mesh, replacing 'mesh' and 'mesh_dim_names'.
- Added runtime environment variables ('TWINKLE_TRUST_REMOTE_CODE', 'DEVICE_COUNT_PER_PHYSICAL_NODE') to Ray actor options for ModelManagement, ProcessorManagement, and SamplerManagement.
- Increased 'nproc_per_node' for SamplerManagement and adjusted its 'ranks' configuration.
docs/source_en/Usage Guide/Server and Client/Server.md
- Documented new environment variables ('DEVICE_COUNT_PER_PHYSICAL_NODE', 'TWINKLE_TRUST_REMOTE_CODE') required for server startup.
- Updated the 'Node Rank in YAML Configuration' section to reflect the use of 'dp_size' instead of 'mesh' and 'mesh_dim_names', and clarified that 'ranks' refer to physical GPU card numbers.
- Added example YAML configurations incorporating the new syntax and environment variables.
docs/source_zh/使用指引/服务端和客户端/服务端.md
- Updated Chinese server documentation with new environment variables ('DEVICE_COUNT_PER_PHYSICAL_NODE', 'TWINKLE_TRUST_REMOTE_CODE') and the revised 'device_mesh' configuration syntax.
- Revised example YAML configurations to align with the updated parameter usage.
src/twinkle/model/megatron/megatron.py
- Renamed the 'resume' parameter to 'load_optimizer' in the 'load' method for clarity regarding optimizer state restoration.
src/twinkle/server/tinker/common/compat_base.py
- Improved the 'clean_metrics' function to robustly handle various numeric types (Python, NumPy, PyTorch tensors) and extract float values from common metric strings with units.
src/twinkle/server/tinker/model.py
- Implemented robust adapter lifecycle management with dedicated '_cleanup_adapter' and '_on_adapter_expired' methods.
- Modified 'create_model' to register adapters before adding them to the model and to perform cleanup on failure.
- Removed batch size assertion in 'forward' and 'forward_backward' methods.
- Added 'grad_ready' state tracking for adapters to ensure gradients are accumulated before 'optim_step'.
- Enhanced task scheduling by adding 'batch_size', 'data_world_size', and 'task_type' parameters to 'schedule_task' calls.
src/twinkle/server/tinker/sampler.py
- Added 'os' import for path validation.
- Implemented validation for adapter URI existence before sampling.
- Included 'task_type' in the 'schedule_task' call for sampling operations.
src/twinkle/server/tinker/server.py
- Enhanced server initialization to accept a 'server_config' dictionary and normalize supported models internally.
- Removed redundant supported model normalization logic from the main 'build_server_app' function.
src/twinkle/server/twinkle/model.py
- Integrated '_on_adapter_expired' method for consistent adapter cleanup.
- Adjusted 'add_adapter_to_model' to register the adapter with the manager before adding it to the model.
src/twinkle/server/twinkle/sampler.py
- Updated 'typing.Union' import.
- Streamlined adapter expiration handling by removing manual limit checks.
- Modified '_get_adapter_name' to include 'request.state.request_id' for unique adapter naming.
- Adjusted 'add_adapter_to_sampler' to register the adapter before adding it to the sampler.
src/twinkle/server/utils/adapter_manager.py
- Removed 'TwinkleModel' type hint from the mixin.
- Introduced 'adapter_max_lifetime' for time-to-live expiration of adapters.
- Refactored 'register_adapter' to check limits before registration and added 'session_id' tracking.
- Added '_is_session_alive' method to check session heartbeats for session-based adapter expiration.
- Implemented generic adapter state management methods ('set_adapter_state', 'get_adapter_state', 'pop_adapter_state', 'clear_adapter_state').
- Modified 'touch_adapter' to prevent updating activity for adapters marked as 'expiring'.
- Overhauled '_adapter_countdown_loop' to support session-based expiration, TTL, and more robust cleanup logic.
- Refactored 'check_adapter_limit' to dynamically count active adapters from internal records.
- Removed '_adapter_lock', 'list_adapters', 'assert_adapter_valid', and 'get_adapter_count'.
src/twinkle/server/utils/state.py
- Extended 'ServerState' initialization to accept additional keyword arguments.
- Added 'last_heartbeat' tracking to session creation and a 'get_session_last_heartbeat' method.
- Modified 'get_server_state' to pass additional keyword arguments to the 'ServerState' actor constructor.
src/twinkle/server/utils/task_queue.py
- Rearchitected task queueing to use per-key queues ('_task_queues', '_queue_order') and an event-driven worker ('_new_task_event').
- Added 'max_input_tokens' to 'TaskQueueConfig' for input validation.
- Introduced '_QueuedTask' dataclass to store task details.
- Changed 'schedule_task' to accept a 'coro_factory' and perform pre-flight checks (rate limiting, max tokens, batch size) before queuing.
- Implemented '_fail_queue_tasks_async' and 'fail_pending_tasks_for_model' for robust task cancellation.
- Updated '_queue_worker' to process tasks from multiple queues in a round-robin fashion.
- Adjusted 'get_queue_stats' and 'shutdown_task_queue' to reflect the new queue structure.
src/twinkle_client/__init__.py
- Updated the API key check from 'api_key is None' to 'not api_key' for improved clarity.
src/twinkle_client/dataloader/dataloader.py
- Refactored DataLoader client wrapper to explicitly inherit from 'object' and simplified JSON data passing in HTTP requests.
src/twinkle_client/dataset/base.py
- Refactored Dataset client wrapper to explicitly inherit from 'object' and simplified JSON data passing in HTTP requests.
src/twinkle_client/dataset/iterable_dataset.py
- Refactored IterableDataset client wrapper to explicitly inherit from 'object' and simplified JSON data passing in HTTP requests.
src/twinkle_client/dataset/iterable_packing_dataset.py
- Simplified JSON data passing in HTTP requests within the IterablePackingDataset client wrapper.
src/twinkle_client/dataset/lazy_dataset.py
- Simplified JSON data passing in HTTP requests within the LazyDataset client wrapper.
src/twinkle_client/dataset/packing_dataset.py
- Simplified JSON data passing in HTTP requests within the PackingDataset client wrapper.
src/twinkle_client/model/multi_lora_transformers.py
- Reordered imports for consistency.
src/twinkle_client/processor/base.py
- Refactored InputProcessor client wrapper to explicitly inherit from 'object' and simplified JSON data passing in HTTP requests.
src/twinkle_client/sampler/vllm_sampler.py
- Reordered imports and simplified JSON data passing in HTTP requests within the vLLMSampler client wrapper.
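
The adapter expiration rules listed for src/twinkle/server/utils/adapter_manager.py (session-based expiration, a maximum lifetime, and an inactivity timeout) can be illustrated with one decision function. This is a sketch under assumptions: the function `expiry_reason`, its return values, and the order of the checks are hypothetical. Only the record keys 'created_at' and 'last_activity' and the setting names adapter_timeout and adapter_max_lifetime appear on this page.

```python
from typing import Optional


def expiry_reason(record: dict, now: float, adapter_timeout: float,
                  adapter_max_lifetime: Optional[float],
                  session_alive: bool) -> Optional[str]:
    """Return why an adapter should expire, or None if it may stay (hypothetical)."""
    if not session_alive:
        # The owning session stopped sending heartbeats.
        return 'session_expired'
    if adapter_max_lifetime is not None and now - record['created_at'] >= adapter_max_lifetime:
        # Time-to-live reached, regardless of recent activity.
        return 'max_lifetime'
    if now - record['last_activity'] >= adapter_timeout:
        # No request touched the adapter within the timeout.
        return 'inactive'
    return None
```

Passing `now` in explicitly, instead of calling `time.time()` inside the function, keeps the rule easy to test.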

Activity

No specific review comments or activities were recorded for this pull request since its creation.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a significant refactoring of the server-side adapter management and task queuing systems, bringing more robust lifecycle handling, fairer scheduling, and improved state management. The changes also include updates to client examples and documentation to align with the new server capabilities.

My review has identified a critical concurrency issue in the AdapterManagerMixin due to missing thread synchronization, a bug involving a duplicated method call, and some areas for improvement in documentation clarity and code style in the client library. Addressing these points will greatly improve the stability and maintainability of the system.

gemini-code-assist · 2026-02-13T04:06:24Z

src/twinkle/server/utils/adapter_manager.py

-        # 'inactivity_counter': int}
+        # Dict mapping adapter_name ->
+        # {'token': str, 'session_id': str, 'last_activity': float, 'created_at': float, 'inactivity_counter': int}
        self._adapter_records: dict[str, dict[str, Any]] = {}


The _adapter_records dictionary is accessed and modified from multiple threads without any locking. The _adapter_countdown_loop runs in a separate threading.Thread, while methods like register_adapter, unregister_adapter, and touch_adapter are called from the asyncio event loop that handles requests. This concurrent access to a shared dictionary is not thread-safe and will lead to race conditions and data corruption. Please re-introduce a threading.Lock to protect all accesses to _adapter_records.
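
The locking the reviewer asks for can be sketched as follows. This is a minimal illustration, not the mixin from the pull request: the class `LockedAdapterRecords` and its methods are hypothetical, and only the `_adapter_records` name and the record keys are taken from the diff above. In CPython, iterating a dict while another thread changes its size can raise "RuntimeError: dictionary changed size during iteration", which is why the sketch hands the background loop a copy.

```python
import threading
import time
from typing import Any, Dict


class LockedAdapterRecords:
    """Hypothetical wrapper: every access to the records goes through one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._adapter_records: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str, token: str) -> None:
        now = time.time()
        with self._lock:
            self._adapter_records[name] = {
                'token': token, 'last_activity': now, 'created_at': now}

    def touch(self, name: str) -> bool:
        with self._lock:
            record = self._adapter_records.get(name)
            if record is None:
                return False
            record['last_activity'] = time.time()
            return True

    def unregister(self, name: str) -> bool:
        with self._lock:
            return self._adapter_records.pop(name, None) is not None

    def snapshot(self) -> Dict[str, Dict[str, Any]]:
        # A countdown thread can iterate this copy without holding the lock.
        with self._lock:
            return {name: dict(rec) for name, rec in self._adapter_records.items()}
```

Cleanup work that may be slow, such as unloading an adapter, is better done on the snapshot after the lock is released, so request handlers are not blocked.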

src/twinkle/server/twinkle/model.py

docs/source_en/Usage Guide/Server and Client/Server.md

gemini-code-assist · 2026-02-13T04:06:24Z

src/twinkle_client/dataloader/dataloader.py

-

-class DataLoader:
+class DataLoader(object):


Inheriting from object is redundant in Python 3. This issue is also present in src/twinkle_client/dataset/base.py and src/twinkle_client/processor/base.py. These style issues are likely because src/twinkle_client/ is excluded from pre-commit checks. Please remove the redundant inheritance for cleaner code and consider re-enabling pre-commit for this directory to maintain code quality.

Suggested change

-class DataLoader(object):
+class DataLoader:

Copilot

Pull request overview

This pull request titled "Fix moe" contains extensive changes that go well beyond MoE (Mixture of Experts) fixes. The changes include a major refactoring of the task queue system, adapter lifecycle management improvements, API signature changes, and numerous formatting updates to client-generated files.

Changes:

Major task queue refactoring: Changed from single queue to per-model/per-token queues with coroutine factories instead of coroutines
Adapter lifecycle management: Enhanced session-based expiration, TTL enforcement, and gradient state tracking
API signature changes: Breaking changes to schedule_task (now requires coro_factory), _on_adapter_expired (removed token parameter), and parameter renames
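
The move from a single queue to per-key queues served in turn can be sketched as below. This is an illustration under assumptions, not the worker from the pull request: the class `RoundRobinQueues` is hypothetical, the attribute names `_task_queues` and `_queue_order` are borrowed from the changelog, and tasks are plain strings here instead of coroutine factories.

```python
from collections import deque
from typing import Deque, Dict, Optional


class RoundRobinQueues:
    """Hypothetical per-key queues drained one task per key per turn."""

    def __init__(self) -> None:
        self._task_queues: Dict[str, Deque[str]] = {}
        self._queue_order: Deque[str] = deque()

    def push(self, queue_key: str, task: str) -> None:
        if queue_key not in self._task_queues:
            self._task_queues[queue_key] = deque()
            self._queue_order.append(queue_key)
        self._task_queues[queue_key].append(task)

    def pop_next(self) -> Optional[str]:
        if not self._queue_order:
            return None
        key = self._queue_order.popleft()
        queue = self._task_queues[key]
        task = queue.popleft()
        if queue:
            self._queue_order.append(key)  # more work: go to the back of the line
        else:
            del self._task_queues[key]     # drained: forget the key
        return task
```

With three tasks queued under one key and one task under another, the second key is served after the first key's first task and does not wait behind all three.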

Reviewed changes

Copilot reviewed 31 out of 32 changed files in this pull request and generated 22 comments.


File	Description
src/twinkle_client/__init__.py	Changed API key validation from None check to falsy check
src/twinkle_client/sampler/vllm_sampler.py	Import reordering and formatting changes
src/twinkle_client/processor/base.py	Formatting changes, added trailing whitespace, quote style changes
src/twinkle_client/model/multi_lora_transformers.py	Import reordering
src/twinkle_client/dataset/*.py	Formatting changes, quote style changes, trailing whitespace
src/twinkle_client/dataloader/dataloader.py	Formatting changes, added object inheritance
src/twinkle/server/utils/task_queue.py	Major refactoring: per-queue architecture, coro_factory pattern, preflight checks
src/twinkle/server/utils/state.py	Added session heartbeat tracking, kwargs parameter
src/twinkle/server/utils/adapter_manager.py	Session-based expiration, TTL enforcement, adapter state management, removed token from _on_adapter_expired
src/twinkle/server/twinkle/sampler.py	Adapter expiration handling (signature mismatch bug)
src/twinkle/server/twinkle/model.py	Duplicate remove_adapter call bug
src/twinkle/server/tinker/server.py	Server config handling, normalize_models with incorrect parameter name
src/twinkle/server/tinker/sampler.py	Adapter URI validation logic bug (fails all non-adapter requests)
src/twinkle/server/tinker/model.py	Gradient state tracking, cleanup refactoring, batch validation moved
src/twinkle/server/tinker/common/compat_base.py	Enhanced metrics cleaning with string parsing bug
src/twinkle/model/megatron/megatron.py	Parameter rename from resume to load_optimizer
cookbook/client/tinker/sample.py	Duplicate imports
cookbook/client/tinker/short_math_grpo.py	Updated to use Template instead of tokenizer
cookbook/client/tinker/self_congnition.py	Updated to use Template instead of tokenizer
cookbook/client/twinkle/transformer/server_config.yaml	Config updates: dp_size, environment variables
cookbook/client/tinker/megatron/server_config*.yaml	Adapter timeout, max_lifetime, rate limit changes
cookbook/client/tinker/grpo.py	File deleted
.pre-commit-config.yaml	Expanded exclusion patterns for examples/cookbook
.gitignore	Added swanlog/ directory

Comments suppressed due to low confidence (1)

src/twinkle/server/twinkle/sampler.py:180

This method requires 3 positional arguments, whereas overridden AdapterManagerMixin._on_adapter_expired requires 2. This call correctly calls the base method, but does not match the signature of the overriding method.

        def _on_adapter_expired(self, adapter_name: str, token: str) -> None:


Copilot · 2026-02-13T04:09:16Z

src/twinkle_client/processor/base.py

+
    def __call__(self, inputs: Union[InputFeature, List[InputFeature]], **kwargs):
        response = http_post(
            url=f'{self.server_url}/processors/call',
            json_data={
                'processor_id': self.processor_id,
                'function': '__call__',
-                **{
-                    'inputs': inputs
-                },
+                **{'inputs': inputs},
                **kwargs
-            })
+            }
+        )
        response.raise_for_status()
-        return response.json()['result']
+        return response.json()["result"]
+


Trailing whitespace has been added at lines 42 and 55. While this doesn't affect functionality, it's inconsistent with the file's style and may trigger linting warnings. The pre-commit configuration has been updated to exclude client files, but adding trailing whitespace is still not ideal. Consider removing these trailing spaces.

Copilot · 2026-02-13T04:09:17Z

src/twinkle/server/utils/task_queue.py

    async def schedule_task(
        self,
-        coro: Coroutine,
+        coro_factory: Callable[[], Coroutine],


The signature change from coro: Coroutine to coro_factory: Callable[[], Coroutine] is a breaking API change. All callers of schedule_task must now pass a factory function instead of an already-created coroutine. While all the visible callers in this PR have been updated (e.g., _do_sample() changed to _do_sample), this could break external code or plugins that call schedule_task. Consider whether this breaking change is intended and if it should be documented or versioned appropriately.
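
The difference between passing a coroutine and passing a factory can be shown in a few lines. One plausible motivation, stated here as an assumption and not as the author's reasoning, is that a scheduler with pre-flight checks can reject a task before any coroutine object exists. Python emits a "coroutine ... was never awaited" RuntimeWarning when a created coroutine is discarded, and a factory avoids that. The helper `run_when_allowed` is hypothetical; `_do_sample` mirrors the caller name mentioned above.

```python
import asyncio
from typing import Any, Callable, Coroutine


async def run_when_allowed(coro_factory: Callable[[], Coroutine[Any, Any, Any]],
                           allowed: bool) -> Any:
    if not allowed:
        # Rejected before the coroutine is created, so nothing is left un-awaited.
        return 'rejected'
    return await coro_factory()


async def _do_sample() -> str:
    return 'sampled'


async def main() -> list:
    # Callers pass the function itself (_do_sample), not a call (_do_sample()).
    return [
        await run_when_allowed(_do_sample, allowed=True),
        await run_when_allowed(_do_sample, allowed=False),
    ]


results = asyncio.run(main())
```

A caller that needs arguments can wrap the call, for example with `functools.partial` or a lambda, so the scheduler still receives a zero-argument callable.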

Copilot · 2026-02-13T04:09:17Z

src/twinkle/server/twinkle/model.py

+                # Remove adapter from model
+                self.model.remove_adapter(adapter_name)
+


The _on_adapter_expired method calls self.model.remove_adapter(adapter_name) twice: once at line 204 and again at line 211. This appears to be duplicated logic that could cause issues. The second call will likely fail if the first one succeeded, or it's unnecessary redundancy. One of these calls should be removed.

Suggested change

-                # Remove adapter from model
-                self.model.remove_adapter(adapter_name)

Copilot · 2026-02-13T04:09:17Z

src/twinkle/server/tinker/sampler.py

+                    # Validate adapter URI existence if provided
+                    if not adapter_uri or not os.path.exists(adapter_uri):
+                        return types.RequestFailedResponse(
+                            error=f'Adapter URI {model_path} does not exist. Please check the model_path.',
+                            category=types.RequestErrorCategory.User,
+                        )


The validation at lines 165-169 checks if adapter_uri exists, but this check happens even when model_path is None or when no adapter is needed. The condition should be if adapter_uri and not os.path.exists(adapter_uri): to avoid failing when no adapter URI is intentionally provided. Currently, if adapter_uri is None or empty string, the check not adapter_uri or not os.path.exists(adapter_uri) will always be True, causing all requests without an adapter to fail.
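
The two conditions can be compared side by side. This sketch isolates only the boolean logic; the function names are hypothetical, and whether the surrounding branch already guarantees that an adapter was requested cannot be seen from the excerpt above.

```python
import os
from typing import Optional


def fails_as_written(adapter_uri: Optional[str]) -> bool:
    # Condition from the diff: a missing URI is treated as an error.
    return not adapter_uri or not os.path.exists(adapter_uri)


def fails_as_suggested(adapter_uri: Optional[str]) -> bool:
    # Reviewer's condition: only a provided URI that does not exist is an error.
    return bool(adapter_uri) and not os.path.exists(adapter_uri)
```

For `None` or an empty string the first function returns True and the second returns False; for any non-empty path the two agree.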

Copilot · 2026-02-13T04:09:17Z

src/twinkle/server/tinker/common/compat_base.py

+            s = value.strip()
+            if s:
+                try:
+                    head, unit = s.split()  # ignore unit/tail


At line 101, s.split() without an argument will split on any whitespace and could raise ValueError if the split result doesn't have exactly 2 elements. For example, "123 seconds remaining" would split into 3 elements. The unpacking head, unit = s.split() will fail with ValueError: too many values to unpack. Consider using s.split(maxsplit=1) or handling the ValueError more gracefully.

Suggested change

-                    head, unit = s.split()  # ignore unit/tail
+                    parts = s.split(maxsplit=1)  # split into value and unit/tail
+                    if len(parts) != 2:
+                        raise ValueError("Expected a value and a unit in metric string")
+                    head, unit = parts
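
The unpacking failure and one tolerant alternative can be sketched as follows. The helper `parse_metric` is hypothetical and is not the clean_metrics function from the pull request. Unlike the suggested change, it also accepts a bare number with no unit, which is a design choice made for this illustration.

```python
from typing import Optional

# The original pattern fails when the string has more than two tokens.
try:
    head, unit = '123 seconds remaining'.split()
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)


def parse_metric(value: str) -> Optional[float]:
    """Return the leading number of a metric string, or None if there is none."""
    parts = value.strip().split(maxsplit=1)  # at most: [number, rest]
    if not parts:
        return None
    try:
        return float(parts[0])
    except ValueError:
        return None
```

Returning None keeps the caller in control of what to do with values that are not numeric.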

Copilot · 2026-02-13T04:09:20Z

src/twinkle_client/dataset/iterable_packing_dataset.py

-
-from twinkle.dataset import Dataset, DatasetMeta
+from twinkle_client.http import http_post, heartbeat_manager
+from twinkle.dataset import Dataset


Import of 'Dataset' is not used.

Suggested change

-from twinkle.dataset import Dataset

Copilot · 2026-02-13T04:09:21Z

src/twinkle_client/model/multi_lora_transformers.py

 #   1. Modify the source files in src/twinkle/
 #   2. Run: python client_tools/client_generator.py
 # ============================================================================
+from typing import Any, Optional, Union, Type, Dict, Literal, List


Import of 'List' is not used.
Import of 'Type' is not used.
Import of 'Union' is not used.
Import of 'Literal' is not used.

Suggested change

-from typing import Any, Optional, Union, Type, Dict, Literal, List
+from typing import Any, Optional, Dict

Copilot · 2026-02-13T04:09:21Z

src/twinkle/server/twinkle/sampler.py

 from pydantic import BaseModel, Field
 from ray import serve
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, Union


Import of 'Union' is not used.

Suggested change

-from typing import Any, Dict, List, Optional, Union
+from typing import Any, Dict, List, Optional

Copilot · 2026-02-13T04:09:21Z

src/twinkle/server/utils/task_queue.py

+from dataclasses import dataclass
 from enum import Enum
-from typing import TYPE_CHECKING, Any, Callable, Coroutine, Dict, List, Optional, Tuple
+from typing import TYPE_CHECKING, Any, Callable, Coroutine, Deque, Dict, Optional


Import of 'Dict' is not used.
Import of 'Optional' is not used.

Suggested change

-from typing import TYPE_CHECKING, Any, Callable, Coroutine, Deque, Dict, Optional
+from typing import TYPE_CHECKING, Any, Callable, Coroutine, Deque

Copilot · 2026-02-13T04:09:21Z

src/twinkle/server/utils/task_queue.py

+        try:
+            while queue_key in self._queue_order:
+                self._queue_order.remove(queue_key)
+        except ValueError:


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change

-        except ValueError:
+        except ValueError:
+            # If the queue_key is already absent from _queue_order, we can safely ignore this.

Yunnglin and others added 25 commits February 11, 2026 20:31

update load

d6fbd89

Merge branch 'dev' into fix_moe

d4cf817

update

67111e5

update

947a3e4

update

4e7f30d

update

51b89e0

update

077e206

update

87ce96d

Merge branch 'dev' into fix_moe

ca26436

update

41d92f8

update

80c0fd8

update

16494e0

Merge branch 'dev' into fix_moe

4e3c68b

fix lint

c7b235b

update

82ad72a

update

720aebc

update

01f88f7

update

d6c274d

Merge branch 'fix_moe' of https://github.com/modelscope/twinkle into …

37fcf17

…fix_moe

update

23038ff

Merge branch 'fix_moe' of https://github.com/modelscope/twinkle into …

4f28480

…fix_moe

update

d8fa0b0

update

ca1eeab

fix lint

7b5412b

update

987d89e

Copilot AI review requested due to automatic review settings February 13, 2026 04:02

Copilot started reviewing on behalf of Yunnglin February 13, 2026 04:02

gemini-code-assist bot reviewed Feb 13, 2026

Copilot AI reviewed Feb 13, 2026

Yunnglin added 2 commits February 13, 2026 12:10

fix

65f1f7f

update

45a9cdb

Yunnglin merged commit aa86181 into dev Feb 13, 2026
2 of 4 checks passed

tastelikefeet deleted the fix_moe branch February 13, 2026 09:43

