
Fix moe #58

Merged
Yunnglin merged 27 commits into dev from fix_moe on Feb 13, 2026

Conversation

@Yunnglin
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings February 13, 2026 04:02
@gemini-code-assist
Contributor

Summary of Changes

Hello @Yunnglin, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces substantial improvements to the server's operational stability and resource efficiency. It refines how tasks are queued and processed, implements more granular control over adapter lifecycles, and updates core configurations to support these changes. The modifications aim to provide a more robust and predictable environment for training and inference, while also streamlining client-side interactions and documentation.

Highlights

  • Server Robustness and Resource Management: The server's task queueing system has been rearchitected to use per-key queues, improving fairness and throughput. Comprehensive pre-flight checks for rate limiting, maximum input tokens, and batch size validation are now integrated into task scheduling. Adapter lifecycle management has been significantly enhanced with session-based expiration, maximum lifetime settings, and dynamic limit checks, ensuring more efficient resource utilization and stability.
  • Configuration and Documentation Updates: YAML configurations for both Megatron and Transformer backends have been updated to utilize the dp_size parameter for device mesh definitions, replacing the older mesh and mesh_dim_names syntax. New environment variables, DEVICE_COUNT_PER_PHYSICAL_NODE and TWINKLE_TRUST_REMOTE_CODE, are introduced and documented in both English and Chinese usage guides, providing clearer instructions for server setup and resource allocation.
  • Client-side Examples and Refactoring: An outdated GRPO training example has been removed, and existing client-side examples (sample.py, self_congnition.py, short_math_grpo.py) have been updated to align with new API patterns, such as using template.encode/decode for prompt handling. Minor refactoring of client-side wrappers for DataLoader, Dataset, and InputProcessor classes improves consistency and JSON data handling.
  • Gradient Readiness Tracking: New state tracking for 'grad_ready' has been implemented within the model service to ensure that optimization steps (optim_step) are only performed after gradients have been successfully accumulated via a forward_backward pass, preventing erroneous updates.
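
The per-key scheduling described in the highlights can be sketched as follows. This is a minimal illustration with invented names (`PerKeyTaskQueue`, `schedule`, `worker`), not the actual twinkle implementation; the real pre-flight checks for rate limits, max input tokens, and batch size would run inside `schedule` before the task is ever queued.

```python
import asyncio
from collections import deque


class PerKeyTaskQueue:
    """Round-robin per-key task queue (illustrative sketch, invented names)."""

    def __init__(self):
        self._queues = {}          # key -> deque of (coro_factory, future)
        self._order = deque()      # round-robin order of active keys
        self._new_task = asyncio.Event()

    def schedule(self, key, coro_factory):
        # Pre-flight checks (rate limit, max tokens, batch size) would run
        # here, before the task is queued.
        fut = asyncio.get_running_loop().create_future()
        if key not in self._queues:
            self._queues[key] = deque()
            self._order.append(key)
        self._queues[key].append((coro_factory, fut))
        self._new_task.set()
        return fut

    async def worker(self):
        while True:
            if not self._order:
                self._new_task.clear()
                await self._new_task.wait()
                continue
            key = self._order[0]
            self._order.rotate(-1)          # fairness: this key goes last next round
            coro_factory, fut = self._queues[key].popleft()
            if not self._queues[key]:       # drop drained queues
                del self._queues[key]
                self._order.remove(key)
            try:
                fut.set_result(await coro_factory())
            except Exception as exc:
                fut.set_exception(exc)
```

Passing a factory rather than a coroutine means a task rejected by a pre-flight check never leaves an un-awaited coroutine behind.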


Changelog
  • .gitignore
    • Added 'swanlog/' to the list of ignored files.
  • .pre-commit-config.yaml
    • Expanded pre-commit hook exclusions to include 'examples/', 'cookbook/', and 'src/twinkle_client/' directories.
  • cookbook/client/tinker/grpo.py
    • Removed the GRPO (Group Relative Policy Optimization) training example file.
  • cookbook/client/tinker/megatron/server_config.yaml
    • Updated server configuration for Megatron backend, including queue limits, model max length, max LoRAs, and adapter lifecycle settings.
    • Adjusted queue configuration parameters for SamplerManagement and ModelManagement services.
    • Added 'max_length' and 'max_loras' arguments to the ModelManagement service.
    • Modified 'per_token_adapter_limit' and 'adapter_timeout' in adapter configuration, and introduced 'adapter_max_lifetime'.
  • cookbook/client/tinker/megatron/server_config_7b.yaml
    • Adjusted server configuration for 7B Megatron model, including token per second limits, max input tokens, and adapter timeout/lifetime.
    • Modified 'tps_limit' and added 'max_input_tokens' to the queue configuration.
    • Reordered and updated parameters within the adapter configuration.
  • cookbook/client/tinker/sample.py
    • Refactored sampling example to use 'Qwen/Qwen2.5-7B-Instruct' as the base model and 'http://localhost:8000' as the server URL.
    • Removed 'modelscope.AutoTokenizer' import and replaced tokenizer-based prompt handling with 'template.encode' and 'template.decode'.
    • Updated the model path for the sampling client and adjusted sampling temperature.
  • cookbook/client/tinker/self_congnition.py
    • Updated self-cognition example to use 'template.encode' and 'template.decode' for prompt handling, replacing 'tokenizer.apply_chat_template' and 'tokenizer.decode'.
    • Removed 'modelscope.AutoTokenizer' import and reordered other imports.
    • Updated the 'weight_path' for loading the trained LoRA checkpoint.
  • cookbook/client/tinker/short_math_grpo.py
    • Migrated short math GRPO example to use 'template' for tokenization and decoding, replacing 'tokenizer'.
    • Removed 'modelscope.AutoTokenizer' import and reordered other imports.
    • Added a check to skip training steps if all advantages are zero.
  • cookbook/client/twinkle/grpo.py
    • Modified GRPO training parameters, reducing 'NUM_GENERATIONS', 'BATCH_SIZE', and 'SYNC_INTERVAL'.
  • cookbook/client/twinkle/transformer/server_config.yaml
    • Updated Transformer backend server configuration to use 'dp_size' for device mesh, replacing 'mesh' and 'mesh_dim_names'.
    • Added runtime environment variables ('TWINKLE_TRUST_REMOTE_CODE', 'DEVICE_COUNT_PER_PHYSICAL_NODE') to Ray actor options for ModelManagement, ProcessorManagement, and SamplerManagement.
    • Increased 'nproc_per_node' for SamplerManagement and adjusted its 'ranks' configuration.
  • docs/source_en/Usage Guide/Server and Client/Server.md
    • Documented new environment variables ('DEVICE_COUNT_PER_PHYSICAL_NODE', 'TWINKLE_TRUST_REMOTE_CODE') required for server startup.
    • Updated the 'Node Rank in YAML Configuration' section to reflect the use of 'dp_size' instead of 'mesh' and 'mesh_dim_names', and clarified that 'ranks' refer to physical GPU card numbers.
    • Added example YAML configurations incorporating the new syntax and environment variables.
  • docs/source_zh/使用指引/服务端和客户端/服务端.md
    • Updated Chinese server documentation with new environment variables ('DEVICE_COUNT_PER_PHYSICAL_NODE', 'TWINKLE_TRUST_REMOTE_CODE') and the revised 'device_mesh' configuration syntax.
    • Revised example YAML configurations to align with the updated parameter usage.
  • src/twinkle/model/megatron/megatron.py
    • Renamed the 'resume' parameter to 'load_optimizer' in the 'load' method for clarity regarding optimizer state restoration.
  • src/twinkle/server/tinker/common/compat_base.py
    • Improved the 'clean_metrics' function to robustly handle various numeric types (Python, NumPy, PyTorch tensors) and extract float values from common metric strings with units.
  • src/twinkle/server/tinker/model.py
    • Implemented robust adapter lifecycle management with dedicated '_cleanup_adapter' and '_on_adapter_expired' methods.
    • Modified 'create_model' to register adapters before adding them to the model and to perform cleanup on failure.
    • Removed batch size assertion in 'forward' and 'forward_backward' methods.
    • Added 'grad_ready' state tracking for adapters to ensure gradients are accumulated before 'optim_step'.
    • Enhanced task scheduling by adding 'batch_size', 'data_world_size', and 'task_type' parameters to 'schedule_task' calls.
  • src/twinkle/server/tinker/sampler.py
    • Added 'os' import for path validation.
    • Implemented validation for adapter URI existence before sampling.
    • Included 'task_type' in the 'schedule_task' call for sampling operations.
  • src/twinkle/server/tinker/server.py
    • Enhanced server initialization to accept a 'server_config' dictionary and normalize supported models internally.
    • Removed redundant supported model normalization logic from the main 'build_server_app' function.
  • src/twinkle/server/twinkle/model.py
    • Integrated '_on_adapter_expired' method for consistent adapter cleanup.
    • Adjusted 'add_adapter_to_model' to register the adapter with the manager before adding it to the model.
  • src/twinkle/server/twinkle/sampler.py
    • Updated 'typing.Union' import.
    • Streamlined adapter expiration handling by removing manual limit checks.
    • Modified '_get_adapter_name' to include 'request.state.request_id' for unique adapter naming.
    • Adjusted 'add_adapter_to_sampler' to register the adapter before adding it to the sampler.
  • src/twinkle/server/utils/adapter_manager.py
    • Removed 'TwinkleModel' type hint from the mixin.
    • Introduced 'adapter_max_lifetime' for time-to-live expiration of adapters.
    • Refactored 'register_adapter' to check limits before registration and added 'session_id' tracking.
    • Added '_is_session_alive' method to check session heartbeats for session-based adapter expiration.
    • Implemented generic adapter state management methods ('set_adapter_state', 'get_adapter_state', 'pop_adapter_state', 'clear_adapter_state').
    • Modified 'touch_adapter' to prevent updating activity for adapters marked as 'expiring'.
    • Overhauled '_adapter_countdown_loop' to support session-based expiration, TTL, and more robust cleanup logic.
    • Refactored 'check_adapter_limit' to dynamically count active adapters from internal records.
    • Removed '_adapter_lock', 'list_adapters', 'assert_adapter_valid', and 'get_adapter_count'.
  • src/twinkle/server/utils/state.py
    • Extended 'ServerState' initialization to accept additional keyword arguments.
    • Added 'last_heartbeat' tracking to session creation and a 'get_session_last_heartbeat' method.
    • Modified 'get_server_state' to pass additional keyword arguments to the 'ServerState' actor constructor.
  • src/twinkle/server/utils/task_queue.py
    • Rearchitected task queueing to use per-key queues ('_task_queues', '_queue_order') and an event-driven worker ('_new_task_event').
    • Added 'max_input_tokens' to 'TaskQueueConfig' for input validation.
    • Introduced '_QueuedTask' dataclass to store task details.
    • Changed 'schedule_task' to accept a 'coro_factory' and perform pre-flight checks (rate limiting, max tokens, batch size) before queuing.
    • Implemented '_fail_queue_tasks_async' and 'fail_pending_tasks_for_model' for robust task cancellation.
    • Updated '_queue_worker' to process tasks from multiple queues in a round-robin fashion.
    • Adjusted 'get_queue_stats' and 'shutdown_task_queue' to reflect the new queue structure.
  • src/twinkle_client/__init__.py
    • Updated the API key check from 'api_key is None' to 'not api_key' for improved clarity.
  • src/twinkle_client/dataloader/dataloader.py
    • Refactored DataLoader client wrapper to explicitly inherit from 'object' and simplified JSON data passing in HTTP requests.
  • src/twinkle_client/dataset/base.py
    • Refactored Dataset client wrapper to explicitly inherit from 'object' and simplified JSON data passing in HTTP requests.
  • src/twinkle_client/dataset/iterable_dataset.py
    • Refactored IterableDataset client wrapper to explicitly inherit from 'object' and simplified JSON data passing in HTTP requests.
  • src/twinkle_client/dataset/iterable_packing_dataset.py
    • Simplified JSON data passing in HTTP requests within the IterablePackingDataset client wrapper.
  • src/twinkle_client/dataset/lazy_dataset.py
    • Simplified JSON data passing in HTTP requests within the LazyDataset client wrapper.
  • src/twinkle_client/dataset/packing_dataset.py
    • Simplified JSON data passing in HTTP requests within the PackingDataset client wrapper.
  • src/twinkle_client/model/multi_lora_transformers.py
    • Reordered imports for consistency.
  • src/twinkle_client/processor/base.py
    • Refactored InputProcessor client wrapper to explicitly inherit from 'object' and simplified JSON data passing in HTTP requests.
  • src/twinkle_client/sampler/vllm_sampler.py
    • Reordered imports and simplified JSON data passing in HTTP requests within the vLLMSampler client wrapper.
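
The metric-cleaning change in compat_base.py can be illustrated with a simplified coercion helper. This sketch handles only built-in numbers and unit-suffixed strings; the real `clean_metrics` also covers NumPy scalars and PyTorch tensors, and the name `coerce_metric` is invented here:

```python
def coerce_metric(value):
    """Best-effort float conversion for heterogeneous metric values (sketch)."""
    if isinstance(value, bool):           # bool is an int subclass; handle explicitly
        return float(value)
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        s = value.strip()
        if s:
            head = s.split(maxsplit=1)[0]      # drop a trailing unit, e.g. "1.5 GB"
            try:
                return float(head.rstrip('%'))  # tolerate percentages like "85%"
            except ValueError:
                pass
    return value  # leave non-numeric values untouched
```

Using `split(maxsplit=1)` avoids the unpacking error flagged later in this review for strings with more than one trailing token.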
Activity
  • No specific review comments or activities were recorded for this pull request since its creation.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a significant refactoring of the server-side adapter management and task queuing systems, bringing more robust lifecycle handling, fairer scheduling, and improved state management. The changes also include updates to client examples and documentation to align with the new server capabilities.

My review has identified a critical concurrency issue in the AdapterManagerMixin due to missing thread synchronization, a bug involving a duplicated method call, and some areas for improvement in documentation clarity and code style in the client library. Addressing these points will greatly improve the stability and maintainability of the system.

```python
# Dict mapping adapter_name ->
# {'token': str, 'session_id': str, 'last_activity': float, 'created_at': float, 'inactivity_counter': int}
self._adapter_records: dict[str, dict[str, Any]] = {}
```

critical

The _adapter_records dictionary is accessed and modified from multiple threads without any locking. The _adapter_countdown_loop runs in a separate threading.Thread, while methods like register_adapter, unregister_adapter, and touch_adapter are called from the asyncio event loop that handles requests. This concurrent access to a shared dictionary is not thread-safe and will lead to race conditions and data corruption. Please re-introduce a threading.Lock to protect all accesses to _adapter_records.
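
One way to reintroduce the suggested synchronization is to wrap every access in a `threading.Lock`, roughly as below. The `AdapterRecords` class and its fields are invented for illustration; in the real mixin the dict lives inline on the manager:

```python
import threading
import time


class AdapterRecords:
    """Thread-safe wrapper around the shared adapter-record dict (sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}

    def register(self, name, token, session_id):
        with self._lock:
            if name in self._records:
                raise ValueError(f'adapter {name!r} already registered')
            self._records[name] = {
                'token': token,
                'session_id': session_id,
                'last_activity': time.monotonic(),
            }

    def touch(self, name):
        with self._lock:
            rec = self._records.get(name)
            if rec is not None:
                rec['last_activity'] = time.monotonic()

    def pop_expired(self, timeout):
        # Called from the countdown thread; the lock makes the scan-and-pop atomic
        # with respect to register/touch running on the event loop.
        now = time.monotonic()
        with self._lock:
            expired = [n for n, r in self._records.items()
                       if now - r['last_activity'] > timeout]
            return {n: self._records.pop(n) for n in expired}
```

The key point is that the countdown thread's scan and removal happen under the same lock as registration and touch, so no thread observes a half-updated record.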



```diff
-class DataLoader:
+class DataLoader(object):
```

medium

Inheriting from object is redundant in Python 3. This issue is also present in src/twinkle_client/dataset/base.py and src/twinkle_client/processor/base.py. These style issues are likely because src/twinkle_client/ is excluded from pre-commit checks. Please remove the redundant inheritance for cleaner code and consider re-enabling pre-commit for this directory to maintain code quality.

Suggested change:

```diff
-class DataLoader(object):
+class DataLoader:
```


Copilot AI left a comment


Pull request overview

This pull request titled "Fix moe" contains extensive changes that go well beyond MoE (Mixture of Experts) fixes. The changes include a major refactoring of the task queue system, adapter lifecycle management improvements, API signature changes, and numerous formatting updates to client-generated files.

Changes:

  • Major task queue refactoring: Changed from single queue to per-model/per-token queues with coroutine factories instead of coroutines
  • Adapter lifecycle management: Enhanced session-based expiration, TTL enforcement, and gradient state tracking
  • API signature changes: Breaking changes to schedule_task (now requires coro_factory), _on_adapter_expired (removed token parameter), and parameter renames
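
The gradient state tracking mentioned above can be sketched as a small guard. `GradState` and its method names are hypothetical; in the PR this state lives in the model service and is set by a successful `forward_backward` pass:

```python
class GradState:
    """Sketch of grad_ready gating: optim_step only after forward_backward."""

    def __init__(self):
        self._grad_ready = {}

    def mark_ready(self, adapter):
        # Called after forward_backward has accumulated gradients.
        self._grad_ready[adapter] = True

    def optim_step(self, adapter):
        # pop() consumes the flag, so a second optim_step without a new
        # forward_backward also fails.
        if not self._grad_ready.pop(adapter, False):
            raise RuntimeError(
                f'optim_step called before forward_backward for {adapter}')
        return 'stepped'
```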

Reviewed changes

Copilot reviewed 31 out of 32 changed files in this pull request and generated 22 comments.

Summary per file:
  • src/twinkle_client/__init__.py: Changed API key validation from a None check to a falsy check
  • src/twinkle_client/sampler/vllm_sampler.py: Import reordering and formatting changes
  • src/twinkle_client/processor/base.py: Formatting changes, added trailing whitespace, quote style changes
  • src/twinkle_client/model/multi_lora_transformers.py: Import reordering
  • src/twinkle_client/dataset/*.py: Formatting changes, quote style changes, trailing whitespace
  • src/twinkle_client/dataloader/dataloader.py: Formatting changes, added object inheritance
  • src/twinkle/server/utils/task_queue.py: Major refactoring: per-queue architecture, coro_factory pattern, preflight checks
  • src/twinkle/server/utils/state.py: Added session heartbeat tracking, kwargs parameter
  • src/twinkle/server/utils/adapter_manager.py: Session-based expiration, TTL enforcement, adapter state management, removed token from _on_adapter_expired
  • src/twinkle/server/twinkle/sampler.py: Adapter expiration handling (signature mismatch bug)
  • src/twinkle/server/twinkle/model.py: Duplicate remove_adapter call bug
  • src/twinkle/server/tinker/server.py: Server config handling, normalize_models with incorrect parameter name
  • src/twinkle/server/tinker/sampler.py: Adapter URI validation logic bug (fails all non-adapter requests)
  • src/twinkle/server/tinker/model.py: Gradient state tracking, cleanup refactoring, batch validation moved
  • src/twinkle/server/tinker/common/compat_base.py: Enhanced metrics cleaning with string parsing bug
  • src/twinkle/model/megatron/megatron.py: Parameter rename from resume to load_optimizer
  • cookbook/client/tinker/sample.py: Duplicate imports
  • cookbook/client/tinker/short_math_grpo.py: Updated to use Template instead of tokenizer
  • cookbook/client/tinker/self_congnition.py: Updated to use Template instead of tokenizer
  • cookbook/client/twinkle/transformer/server_config.yaml: Config updates: dp_size, environment variables
  • cookbook/client/tinker/megatron/server_config*.yaml: Adapter timeout, max_lifetime, rate limit changes
  • cookbook/client/tinker/grpo.py: File deleted
  • .pre-commit-config.yaml: Expanded exclusion patterns for examples/cookbook
  • .gitignore: Added swanlog/ directory
Comments suppressed due to low confidence (1)

src/twinkle/server/twinkle/sampler.py:180

```python
def _on_adapter_expired(self, adapter_name: str, token: str) -> None:
```

Comment on lines +42 to +55:

```diff
 def __call__(self, inputs: Union[InputFeature, List[InputFeature]], **kwargs):
     response = http_post(
         url=f'{self.server_url}/processors/call',
         json_data={
             'processor_id': self.processor_id,
             'function': '__call__',
-            **{
-                'inputs': inputs
-            },
+            **{'inputs': inputs},
             **kwargs
-        })
+        }
+    )
     response.raise_for_status()
-    return response.json()['result']
+    return response.json()["result"]
```

(No newline at end of file)

Copilot AI Feb 13, 2026


Trailing whitespace has been added at lines 42 and 55. While this doesn't affect functionality, it's inconsistent with the file's style and may trigger linting warnings. The pre-commit configuration has been updated to exclude client files, but adding trailing whitespace is still not ideal. Consider removing these trailing spaces.

Copilot uses AI. Check for mistakes.
```diff
 async def schedule_task(
     self,
-    coro: Coroutine,
+    coro_factory: Callable[[], Coroutine],
```

Copilot AI Feb 13, 2026


The signature change from coro: Coroutine to coro_factory: Callable[[], Coroutine] is a breaking API change. All callers of schedule_task must now pass a factory function instead of an already-created coroutine. While all the visible callers in this PR have been updated (e.g., _do_sample() changed to _do_sample), this could break external code or plugins that call schedule_task. Consider whether this breaking change is intended and if it should be documented or versioned appropriately.
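
The migration the comment describes, passing a callable instead of an already-created coroutine, looks roughly like this. This is a toy sketch; `schedule_task` here is a stand-in, not the real twinkle implementation:

```python
import asyncio


async def _do_sample():
    return 'ok'


async def schedule_task(coro_factory):
    # The factory lets the queue create the coroutine only when the task is
    # actually dequeued, so a task rejected by a pre-flight check never
    # leaves an un-awaited coroutine behind.
    coro = coro_factory()
    return await coro


async def main():
    # Old style passed an already-created coroutine: schedule_task(_do_sample())
    # New style passes the callable itself:
    return await schedule_task(_do_sample)
```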

Comment on lines +210 to +212:

```python
# Remove adapter from model
self.model.remove_adapter(adapter_name)
```

Copilot AI Feb 13, 2026


The _on_adapter_expired method calls self.model.remove_adapter(adapter_name) twice: once at line 204 and again at line 211. This appears to be duplicated logic that could cause issues. The second call will likely fail if the first one succeeded, or it's unnecessary redundancy. One of these calls should be removed.

Suggested change:

```diff
-        # Remove adapter from model
-        self.model.remove_adapter(adapter_name)
```

Comment on lines +164 to +169:

```python
# Validate adapter URI existence if provided
if not adapter_uri or not os.path.exists(adapter_uri):
    return types.RequestFailedResponse(
        error=f'Adapter URI {model_path} does not exist. Please check the model_path.',
        category=types.RequestErrorCategory.User,
    )
```

Copilot AI Feb 13, 2026


The validation at lines 165-169 checks if adapter_uri exists, but this check happens even when model_path is None or when no adapter is needed. The condition should be if adapter_uri and not os.path.exists(adapter_uri): to avoid failing when no adapter URI is intentionally provided. Currently, if adapter_uri is None or empty string, the check not adapter_uri or not os.path.exists(adapter_uri) will always be True, causing all requests without an adapter to fail.
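
The corrected guard suggested by the comment would only reject a URI that was provided but is missing, for example (a sketch; `validate_adapter_uri` is an invented helper, not part of the PR):

```python
import os


def validate_adapter_uri(adapter_uri):
    """Return an error message only when a URI was given but does not exist."""
    if adapter_uri and not os.path.exists(adapter_uri):
        return f'Adapter URI {adapter_uri} does not exist.'
    return None  # no adapter requested, or the path exists
```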

```python
s = value.strip()
if s:
    try:
        head, unit = s.split()  # ignore unit/tail
```

Copilot AI Feb 13, 2026


At line 101, s.split() without an argument will split on any whitespace and could raise ValueError if the split result doesn't have exactly 2 elements. For example, "123 seconds remaining" would split into 3 elements. The unpacking head, unit = s.split() will fail with ValueError: too many values to unpack. Consider using s.split(maxsplit=1) or handling the ValueError more gracefully.

Suggested change:

```diff
-        head, unit = s.split()  # ignore unit/tail
+        parts = s.split(maxsplit=1)  # split into value and unit/tail
+        if len(parts) != 2:
+            raise ValueError("Expected a value and a unit in metric string")
+        head, unit = parts
```


```diff
-from twinkle.dataset import Dataset, DatasetMeta
 from twinkle_client.http import http_post, heartbeat_manager
+from twinkle.dataset import Dataset
```

Copilot AI Feb 13, 2026


Import of 'Dataset' is not used.

Suggested change:

```diff
-from twinkle.dataset import Dataset
```

```python
# 1. Modify the source files in src/twinkle/
# 2. Run: python client_tools/client_generator.py
# ============================================================================
from typing import Any, Optional, Union, Type, Dict, Literal, List
```

Copilot AI Feb 13, 2026


Import of 'List' is not used.
Import of 'Type' is not used.
Import of 'Union' is not used.
Import of 'Literal' is not used.

Suggested change:

```diff
-from typing import Any, Optional, Union, Type, Dict, Literal, List
+from typing import Any, Optional, Dict
```

```diff
 from pydantic import BaseModel, Field
 from ray import serve
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, Union
```

Copilot AI Feb 13, 2026


Import of 'Union' is not used.

Suggested change:

```diff
-from typing import Any, Dict, List, Optional, Union
+from typing import Any, Dict, List, Optional
```

```diff
 from dataclasses import dataclass
 from enum import Enum
-from typing import TYPE_CHECKING, Any, Callable, Coroutine, Dict, List, Optional, Tuple
+from typing import TYPE_CHECKING, Any, Callable, Coroutine, Deque, Dict, Optional
```

Copilot AI Feb 13, 2026


Import of 'Dict' is not used.
Import of 'Optional' is not used.

Suggested change:

```diff
-from typing import TYPE_CHECKING, Any, Callable, Coroutine, Deque, Dict, Optional
+from typing import TYPE_CHECKING, Any, Callable, Coroutine, Deque
```

```python
try:
    while queue_key in self._queue_order:
        self._queue_order.remove(queue_key)
except ValueError:
```

Copilot AI Feb 13, 2026


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change:

```diff
 except ValueError:
+    # If the queue_key is already absent from _queue_order, we can safely ignore this.
```

@Yunnglin Yunnglin merged commit aa86181 into dev Feb 13, 2026
2 of 4 checks passed
@tastelikefeet tastelikefeet deleted the fix_moe branch February 13, 2026 09:43
