Fix proxy by Yunnglin · Pull Request #87 · modelscope/twinkle

Yunnglin · 2026-03-01T06:32:39Z

PR type

Bug Fix
New Feature
Document Updates
More Models or Datasets Support


gemini-code-assist · 2026-03-01T06:33:03Z

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the internal proxying mechanism within the server by introducing a dedicated ServiceProxy class, centralizing request routing logic. It standardizes the request identification header used for sticky sessions, ensuring consistency across client and server components. Additionally, server module imports have been optimized through lazy loading, and a minor configuration change was made to the ModelManagement service's replica settings.

Highlights

Dedicated Service Proxy Implementation: Introduced a new ServiceProxy class to centralize and improve internal request routing for model and sampler services, handling URL construction, header management, and error handling.
Standardized Request ID Header: Updated the header used for sticky sessions and request identification from serve_multiplexed_model_id to X-Ray-Serve-Request-Id across server-side validation, client-side header building, and documentation.
Server Component Lazy Loading Refactoring: Refactored server-side __init__.py files to utilize _LazyModule for lazy loading of build_*_app functions, improving import performance and module organization.
Configuration Adjustment: Reduced the minimum and maximum replica counts for the ModelManagement service in the server_config.yaml from 2 to 1.
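The URL construction and header management described for ServiceProxy can be illustrated with a minimal sketch. This is not the PR's code: the class name `ServiceProxySketch`, the `route_prefix` parameter, the `/api/v1` prefix, and the exact set of skipped headers are all assumptions made for illustration. The sketch only assumes that `http_options` carries a `host` and a `port`, and that a wildcard bind address should be replaced by loopback so that internal calls stay on the local machine.

```python
# Headers a forwarding layer usually drops: hop-by-hop headers, plus Host and
# Content-Length, which the outgoing HTTP client recomputes. (Assumed set.)
SKIPPED_HEADERS = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade", "host", "content-length",
}


class ServiceProxySketch:
    """Hypothetical stand-in for the PR's ServiceProxy; not its actual code."""

    def __init__(self, http_options=None, route_prefix="/api/v1"):
        opts = http_options or {}
        host = opts.get("host", "127.0.0.1")
        # A wildcard bind address is not a connectable target; use loopback.
        if host in ("0.0.0.0", "::", ""):
            host = "127.0.0.1"
        self.base = f"http://{host}:{opts.get('port', 8000)}"
        self.route_prefix = route_prefix.rstrip("/")

    def build_url(self, service, path):
        """Join base URL, route prefix, service name, and endpoint path."""
        return f"{self.base}{self.route_prefix}/{service.strip('/')}/{path.lstrip('/')}"

    def forward_headers(self, incoming):
        """Copy incoming headers, leaving out the ones in SKIPPED_HEADERS."""
        return {k: v for k, v in incoming.items() if k.lower() not in SKIPPED_HEADERS}


proxy = ServiceProxySketch({"host": "0.0.0.0", "port": 8000})
url = proxy.build_url("model", "/forward")
headers = proxy.forward_headers({"Host": "example.com", "X-Ray-Serve-Request-Id": "abc"})
```

Keeping URL building and header filtering in one small class means each endpoint handler only has to name the target service and path. Error handling, which the summary also attributes to ServiceProxy, is left out of this sketch.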


Changelog

cookbook/client/tinker/custom_service/megatron/server_config.yaml
- Updated min_replicas and max_replicas for ModelManagement deployment from 2 to 1.
docs/source_en/Usage Guide/Server and Client/Tinker-Compatible-Client.md
- Updated documentation to reflect the change of the request ID header from serve_multiplexed_model_id to X-Ray-Serve-Request-Id.
docs/source_zh/使用指引/服务端和客户端/Tinker兼容客户端.md
- Updated Chinese documentation to reflect the change of the request ID header from serve_multiplexed_model_id to X-Ray-Serve-Request-Id.
src/twinkle/server/__init__.py
- Removed direct imports and __all__ entries for build_model_app, build_processor_app, build_sampler_app, and build_server_app.
src/twinkle/server/launcher.py
- Adjusted import paths for build_*_app functions to reflect their new location within twinkle.server.twinkle.
- Added logic to pass http_options to server applications for internal proxy routing.
src/twinkle/server/tinker/__init__.py
- Refactored module imports to use _LazyModule for lazy loading of model, sampler, and server components.
src/twinkle/server/tinker/common/datum.py
- Removed unused List, Union imports.
- Added handling for list type when converting labels_raw to a NumPy array.
src/twinkle/server/tinker/model.py
- Imported wrap_builder_with_device_group_env.
- Applied wrap_builder_with_device_group_env to build_model_app.
src/twinkle/server/tinker/proxy.py
- Added new file proxy.py containing the ServiceProxy class for internal HTTP request routing.
src/twinkle/server/tinker/sampler.py
- Imported wrap_builder_with_device_group_env.
- Applied wrap_builder_with_device_group_env to build_sampler_app.
src/twinkle/server/tinker/server.py
- Removed direct httpx and logging imports, and updated logger initialization.
- Added http_options parameter to build_server_app.
- Replaced internal httpx.AsyncClient and proxy methods with an instance of the new ServiceProxy.
- Updated all model and sampler endpoint handlers to use the ServiceProxy for request forwarding.
- Passed http_options to the TinkerCompatServer initialization.
src/twinkle/server/twinkle/__init__.py
- Refactored module imports to use _LazyModule for lazy loading of model, processor, sampler, and server components.
src/twinkle/server/utils/validation.py
- Updated the expected request header for sticky sessions from serve_multiplexed_model_id to X-Ray-Serve-Request-Id.
src/twinkle_client/http/http_utils.py
- Changed the header key for request ID from serve_multiplexed_model_id to X-Ray-Serve-Request-Id.
src/twinkle_client/utils/patch_tinker.py
- Updated the header key for request ID from serve_multiplexed_model_id to X-Ray-Serve-Request-Id in patched service client initialization.
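The header rename touches both sides of the connection: the client has to send X-Ray-Serve-Request-Id, and the server-side validation has to look for it. The sketch below shows that pairing under stated assumptions. The function names `build_headers` and `get_request_id` are hypothetical and are not taken from http_utils.py or validation.py; only the header name comes from the PR.

```python
REQUEST_ID_HEADER = "X-Ray-Serve-Request-Id"  # header name taken from the PR


def build_headers(request_id, extra=None):
    """Client side (hypothetical helper): attach the request ID to outgoing headers."""
    headers = dict(extra or {})
    headers[REQUEST_ID_HEADER] = request_id
    return headers


def get_request_id(headers):
    """Server side (hypothetical helper): return the request ID or raise ValueError.

    HTTP header names are case-insensitive, so keys are compared in lower case.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get(REQUEST_ID_HEADER.lower())
    if not value:
        raise ValueError(f"Missing required header: {REQUEST_ID_HEADER}")
    return value


sent = build_headers("session-1", {"Accept": "application/json"})
received = get_request_id(sent)
```

Defining the header name once and importing it on both sides is one way to avoid the client and server drifting apart on a rename like this one.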


gemini-code-assist

Code Review

This pull request effectively addresses a proxying issue by ensuring internal requests are correctly routed to localhost. The refactoring of proxy logic into a dedicated ServiceProxy class is a significant improvement for code clarity and maintainability. The change of the sticky session header to X-Ray-Serve-Request-Id is consistently applied across the client, server, and documentation. Additionally, the introduction of lazy loading for modules is a good optimization. I have one suggestion to improve code consistency.

src/twinkle/server/tinker/proxy.py

Copilot

Pull request overview

This PR updates Twinkle’s client/server request routing to improve sticky-session request ID handling and internal proxying for the Tinker-compatible server, aiming to avoid issues caused by proxies and to reduce import overhead.

Changes:

Switch client + server sticky-session header from serve_multiplexed_model_id to X-Ray-Serve-Request-Id.
Refactor Tinker server proxying into a dedicated ServiceProxy that routes internally using http_options.
Introduce lazy imports for twinkle.server.twinkle and twinkle.server.tinker (and adjust launcher imports accordingly).
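The lazy-import change can be pictured with a small module subclass that resolves names on first access. This is a generic sketch of the technique, not the `_LazyModule` used in the repository, whose constructor and behavior are not shown in this PR page. The class name `LazyModuleSketch` and its `attr_to_module` mapping are assumptions.

```python
import importlib
import types


class LazyModuleSketch(types.ModuleType):
    """Hypothetical lazy module: imports the backing module on first attribute access."""

    def __init__(self, name, attr_to_module):
        super().__init__(name)
        # Maps an exported attribute name to the module that defines it.
        self._attr_to_module = dict(attr_to_module)

    def __getattr__(self, attr):
        # Called only when normal lookup fails, i.e. before the value is cached.
        mapping = self.__dict__.get("_attr_to_module", {})
        if attr not in mapping:
            raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(mapping[attr]), attr)
        setattr(self, attr, value)  # cache so later lookups skip __getattr__
        return value


# Usage with a standard-library module standing in for a heavy server module.
lazy = LazyModuleSketch("demo", {"sqrt": "math"})
cached_before = "sqrt" in vars(lazy)
result = lazy.sqrt(9)
cached_after = "sqrt" in vars(lazy)
```

With this pattern, importing the package itself stays cheap, and the cost of importing a builder's dependencies is paid only by callers that actually use that builder.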

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.


File	Description
src/twinkle_client/utils/patch_tinker.py	Updates patched Tinker client headers to send `X-Ray-Serve-Request-Id`.
src/twinkle_client/http/http_utils.py	Updates default HTTP headers to send `X-Ray-Serve-Request-Id`.
src/twinkle/server/utils/validation.py	Middleware now requires `X-Ray-Serve-Request-Id` for sticky sessions.
src/twinkle/server/twinkle/__init__.py	Converts to `_LazyModule`-based lazy imports for server builders.
src/twinkle/server/tinker/server.py	Uses `ServiceProxy`, adds `http_options` plumbed from launcher for internal routing.
src/twinkle/server/tinker/sampler.py	Wraps `build_sampler_app` with device-group env helper within the module.
src/twinkle/server/tinker/proxy.py	New internal proxy module handling URL building, header forwarding, and request forwarding.
src/twinkle/server/tinker/model.py	Wraps `build_model_app` with device-group env helper within the module.
src/twinkle/server/tinker/common/datum.py	Adds list handling when converting labels to numpy arrays.
src/twinkle/server/tinker/__init__.py	Converts to `_LazyModule`-based lazy imports for Tinker builders.
src/twinkle/server/launcher.py	Imports Twinkle builders from `twinkle.server.twinkle` and passes `http_options` into apps.
src/twinkle/server/__init__.py	Removes re-exports of build_* app builders (leaving launcher exports).
cookbook/client/tinker/custom_service/megatron/server_config.yaml	Changes ModelManagement autoscaling replicas from 2→1.

src/twinkle/server/tinker/proxy.py

src/twinkle/server/__init__.py

cookbook/client/tinker/custom_service/megatron/server_config.yaml

src/twinkle/server/tinker/proxy.py

Yunnglin and others added 7 commits March 1, 2026 12:43

update init

9bee81d

update proxy

a42bd71

update proxy

1816c9d

update proxy

42ae691

update

599cb3e

Merge branch 'main' into fix/proxy

074cb91

update

894785f

Copilot AI review requested due to automatic review settings March 1, 2026 06:32

Copilot started reviewing on behalf of Yunnglin March 1, 2026 06:33

gemini-code-assist bot reviewed Mar 1, 2026

src/twinkle/server/tinker/proxy.py (outdated, resolved)

Copilot AI reviewed Mar 1, 2026

tastelikefeet approved these changes Mar 1, 2026

Yunnglin added 2 commits March 2, 2026 10:14

Merge branch 'main' into fix/proxy

37d21ca

update

7634154

Yunnglin merged commit 2c24077 into main Mar 2, 2026
2 of 4 checks passed

Yunnglin merged 9 commits into main from fix/proxy

3 participants
