Releases: kvcache-ai/Mooncake
v0.3.8
What's Changed
- adxl: fix aclrtMemcpyBatch max 4096 limit bug by @ascend-direct-dev in #963
- ci: add non-CUDA release workflow and update documentation by @xiaguan in #969
- fix(transfer_engine): Add notify callback registration in RPC metadata handling by @iBenzene in #966
- [Misc] (Mooncake Backend) Early break if a rank failure can be determined through ping message by @UNIDY2002 in #980
- Revise installation instructions for non-cuda mooncake by @ShangmingCai in #985
- [Store] feat: store key-value data in buckets by @zhuxinjie-nz in #968
- Support AMDGPU (refactor CUDA-alike) by @yeahdongcn in #973
- add log by @ascend-direct-dev in #996
- Feature: support custom key prefix for issue 957 by @uniqueni in #958
- refactor(store): store remove transfer engine internal api usage by @xiaguan in #994
- [TransferEngine] Mitigating performance overhead from large cluster and large bulks by @alogfans in #999
- Bump version to 0.3.7.post1 in pyproject.toml by @ShangmingCai in #984
- [store] feat: add secondary storage usage monitor by @yejj710 in #976
- fix(ci): remove nvlink allocator --ci-build flag by @xiaguan in #1003
- [DOC] Update fig by @stmatengss in #987
- Modify build command for nvlink_allocator by @ShangmingCai in #1001
- fix(ci): add id-token permission and unify PyPI token for release by @xiaguan in #1004
- [Misc] Lazy import `ep` in `mooncake_ep_buffer.py` by @UNIDY2002 in #1014
- Bump version to 0.3.7.post2 in pyproject.toml by @ShangmingCai in #1015
- [Store] Fix CI bugs & Improve log output & Refactor TE Initialization by @ykwd in #1006
- [CI] Fix a CI BUG in PyClientTest:TestSetupExistTransferEngine by @ykwd in #1016
- [Misc] Remove EP's duplicated impl of `getCudaTopologyJson` by @UNIDY2002 in #1009
- [Store]: Cleanup processing objects if transferring timedout (#975) by @nickyc975 in #993
- [Store] support segment level metrics(fix code format of #1029) by @cocktail828 in #1030
- [Store] One Replica Has One Slice by @ykwd in #1032
- [DOC] Update Slack link in README.md by @stmatengss in #1042
- [Doc] Update SGLang Hicache Docs by @ykwd in #1023
- [Store] add choosing endpoint store option by @stmatengss in #1024
- [feat]More KVCache metrics in both master/client side by @Liziqi-77 in #1020
- change adxl log by @ascend-direct-dev in #1039
- Te seperated compilation by @zhaoyongke in #1041
- Fix TCP Transport Handshake Daemon Initialization by @staryxchen in #846
- add batch [put/get] tensor by @XucSh in #1044
- handling cudaMemcpy errors in tcp_transport.cpp by @flying-x in #1057
- [Store] fix: honor MC_MS_FILTERS by applying whitelist before TransferEngine init by @wwq2333 in #1051
- [Doc] Add license badge to README by @stmatengss in #1063
- [DOC] add web api doc by @stmatengss in #1059
- [Chore] Add Contributor Covenant Code of Conduct by @stmatengss in #1056
- Add pull request template for standardized PR submissions by @Copilot in #1065
- [Store|TransferEngine]: use condition-variable based completion instead of busy-polling by @wwq2333 in #1053
- docs: update README by @zhyncs in #1079
- [store] Add disk eviction feature by @Vincent-Bo-ali in #1028
- [Store] MasterMetricManager Returns Zero-Value Variables by @ykwd in #1068
- [RDMA] Fix RDMA device selection to prioritize GIDs with network devices by @uniqueni in #1077
- [CI] add sglang integration test by @stmatengss in #1089
- [EP] Fallback impl of Mooncake EP when IBGDA is unavailable by @UNIDY2002 in #1002
- TCP transport support ipv6 by @LCAIZJ in #1067
- [Store] add version checking between client and server by @stmatengss in #1061
- [TE/Topology] Support device filtering when dumping topology by @popsiclexu in #1087
- [BugFix] Adapt mooncake_connector_v1 to latest vllm by @ZeldaHuang in #1080
- [CI] Add label event by @XucSh in #1108
- Adapt to adxl connection auto release feature by @ascend-direct-dev in #1072
- feat[Store]: Add standalone deployment implementation for Client by @YiXR in #1084
- feat[accl-barex]: add barex_transport by build with USE_BAREX by @ZechaoZhang-beta in #1045
- Update CI by @XucSh in #1111
- [TE/Topology] Enhance PCI distance calculation by considering NUMA node affinity by @popsiclexu in #1086
- [DEV] add pre-commit by @stmatengss in #1124
- [Store] Cancel all negative ret val by @Azure-Tang in #1129
- [Store] fix compilation warning in storage backend by @stmatengss in #1134
- [EP] Support multiple torch versions by @UNIDY2002 in #1098
- feat[Store]: Add multi dummy clients support for real client by @YiXR in #1122
- [Store] Add support for static labels (host IP/cluster name) in client metrics by @cocktail828 in #1081
- [Misc] Add Codeowners by @ykwd in #1135
- fix MC_MAX_EP_PER_CTX doc by @whybeyoung in #1142
- [Bug] fixed bug of master not using glog actually by @SpecterCipher in #1075
- change cmake by @ascend-direct-dev in #1114
- feat[Store]: Refine shm mmap logic and add reconnection for Dummy Client after the Real restarted by @YiXR in #1146
- [mooncake-store]: prevent orphaned bucket data files from leaking dis… by @maheshrbapatu in #1140
- [store] Fix IPv6 link-local address parsing and add IPv6 tests by @Azure-Tang in #1137
- [CI] Install CUDA toolkit on job `test-wheel-ubuntu`, so that the wheel can be built with `USE_CUDA=ON` by @UNIDY2002 in #1156
- [store] add pybind for get_replica_desc by @yejj710 in #1121
- [Store]: Refactor AllocationStrategy implementation for better performance and flexibility by @nickyc975 in #1149
- [Store] Optimize master & client binary size by @YiXR in #1166
- Add a CI test for Mooncake EP Backend (CPU only) by @UNIDY2002 in #1099
- Improve AMD HIP support with hipify-perl by @amd-arozanov in #1154
- [DOC] Add MAINTAINERS.md by @alogfans in #1171
- [MUSA] Enable USE_MNNVL by @yeahdongcn in #1176
- [Store] feat: Add BatchQueryIp API for querying multiple client IPs by @Vincent-Bo-ali in #1162
- [Store] pub_tensor for multiple replica by @zxpdemonio in #1148
- [Store] feat: Implement a FileStorage component to manage the lifecycle of key-value data by @zhuxinjie-nz in #1031
- [Doc] add docs of Mooncake EP integration with SGLang by @UNIDY2002 in #1188
- Pr1 coro rpc core by @JasonZhang517 in #1104
- [TE] Support rdma traffic class by environmental variable by @yafengio in #1187
- [Store] add tp awareness for get_tensor by @XucSh in https://github.com/kvcache-ai/Mooncake/p...
v0.3.7.post2
What's Changed
- [Misc] Lazy import `ep` in `mooncake_ep_buffer.py` by @UNIDY2002 in #1014
- Bump version to 0.3.7.post2 in pyproject.toml by @ShangmingCai in #1015
Full Changelog: v0.3.7.post1...v0.3.7.post2
v0.3.7.post1
What's Changed
- adxl: fix aclrtMemcpyBatch max 4096 limit bug by @ascend-direct-dev in #963
- ci: add non-CUDA release workflow and update documentation by @xiaguan in #969
- fix(transfer_engine): Add notify callback registration in RPC metadata handling by @iBenzene in #966
- [Misc] (Mooncake Backend) Early break if a rank failure can be determined through ping message by @UNIDY2002 in #980
- Revise installation instructions for non-cuda mooncake by @ShangmingCai in #985
- [Store] feat: store key-value data in buckets by @zhuxinjie-nz in #968
- Support AMDGPU (refactor CUDA-alike) by @yeahdongcn in #973
- add log by @ascend-direct-dev in #996
- Feature: support custom key prefix for issue 957 by @uniqueni in #958
- refactor(store): store remove transfer engine internal api usage by @xiaguan in #994
- [TransferEngine] Mitigating performance overhead from large cluster and large bulks by @alogfans in #999
- Bump version to 0.3.7.post1 in pyproject.toml by @ShangmingCai in #984
- [store] feat: add secondary storage usage monitor by @yejj710 in #976
- fix(ci): remove nvlink allocator --ci-build flag by @xiaguan in #1003
- [DOC] Update fig by @stmatengss in #987
- Modify build command for nvlink_allocator by @ShangmingCai in #1001
- fix(ci): add id-token permission and unify PyPI token for release by @xiaguan in #1004
New Contributors
- @iBenzene made their first contribution in #966
- @zhuxinjie-nz made their first contribution in #968
- @yeahdongcn made their first contribution in #973
- @yejj710 made their first contribution in #976
Full Changelog: v0.3.7...v0.3.7.post1
v0.3.7
What's Changed
- [Store] skip null buffer by @XucSh in #812
- [Store] Change Default Value of eviction_high_watermark_ratio and eviction_ratio by @ykwd in #820
- [CI/Build] gate mooncake-store test behind BUILD_UNIT_TESTS option by @peng1999 in #821
- [Doc] Store Integrated to SGLang HiCache by @ykwd in #829
- [TransferEngine]: remove SO_REUSEADDR in findAvailableTcpPort by @doujiang24 in #830
- [Build] Install Python Files by @ykwd in #836
- [TransferEngine] Make ascend TE to be released successfully and support fast recovery from failures through retry by @hjchen2 in #827
- [Build] Install Python Files Patch by @ykwd in #839
- [Doc] SGLang HiCache Intergration by @ykwd in #833
- [Docs] Fix Broken Trace Link by @ykwd in #841
- feat(store): add NUMA node binding support via bind_to_numa_node method by @xiaguan in #823
- docs(deployment): Add Basic Mooncake Store deployment guide by @xiaguan in #825
- refactor(store): use dedicated thread for signal handling by @xiaguan in #840
- add ascend protocol to mooncake store by @ascend-direct-dev in #835
- store: Add json file and improve doc by @201341 in #843
- feat(store): add client heartbeat support for non ha mode by @xiaguan in #845
- Fix typo in issue template by @Zane-Jiang in #858
- Fix nvlink_transport bug: revert #683 by @ShangmingCai in #869
- fix adxl find tcp port bug by @ascend-direct-dev in #856
- chore: bump version to 0.3.6.post1 in pyproject.toml by @ShangmingCai in #870
- [Transfer Engine] Post notify if all transfer tasks are completed by @alogfans in #831
- feat(store): support transfer engine p2phandshake by @xiaguan in #852
- [Chores] Remove Unused Variable by @ykwd in #822
- [TransferEngine] Performance Enhancement for Heterogeneous Ascend via Intelligent Aggregation & Pipeline Design by @zuochunwei in #859
- [Misc] feat: Support external kv_connector for vllm v1 by @dtcccc in #865
- feat(store): disable auto discovery by default, require devices for RDMA by @xiaguan in #877
- Allow customizing RPC port range by @peng1999 in #873
- [Store] Check If Get Completed Within Lease by @ykwd in #778
- [Docs] Update Obsolete Content & Fix Minor Problems by @ykwd in #880
- fix(store): fix memory leak in client_integration_test.cpp by @JINGE-ui in #881
- Refactor(store): Remove BufStatus and segment_name for AllocatedBuffer by @xiaguan in #883
- [Store] Check if Connecting Master Fails by @ykwd in #886
- [TransferEngine] clear all transport mems for fast recovery for ascend transport by @hjchen2 in #847
- [Misc] Mooncake EP & Mooncake Backend by @UNIDY2002 in #805
- [Docs] Update quick start and usage examples by @chestnut-Q in #893
- fix(store): disable persistence instead of returning error by @xiaguan in #892
- [Store]: Get start_time before calling RPC in BatchQuery by @nickyc975 in #896
- feat(store): Add multi threading handle page fault during segment allocation by @xiaguan in #875
- docs(store): restructure and simplify SGLang HiCache integration guide by @xiaguan in #897
- [Store]: Add option to use jemalloc in mooncake store master by @nickyc975 in #902
- [Misc] improvements for mooncake_connector_v1 by @dtcccc in #906
- [TransferEngine] initiator_test script: make it works with P2PHANDSHAKE. by @doujiang24 in #907
- [CI/Build] For Mooncake EP, fix the flag USE_CUDA that was unexpectedly turned off by @UNIDY2002 in #909
- [Misc] For EP, pass device_name instead of nic_id when creating `ep.Buffer` by @UNIDY2002 in #910
- [Doc] Add Mooncake x SGLang Hicache Design and Some Updates by @ykwd in #913
- fix(doc): Add Hicache Design to Index.md by @ykwd in #914
- [TransferEngine] Add Moore Threads GPUs Support by @popsiclexu in #862
- feat(TE): add notify support for sync transfers and expose getNotifies API by @staryxchen in #894
- mooncake-backend chunked transfer by @ympcMark in #911
- ascend direct transport support transfer to multiple destinations in one batch by @ascend-direct-dev in #857
- fix(store): Fix integer overflow in get_into/batch_get_into for values > 4GB by @xiaguan in #920
- [Doc] Add Clarification for STORE_USE_ETCD Compile Option by @ykwd in #927
- [Store] Change log level for batch operation by @stmatengss in #916
- Reduced build-with-ep workflow by @JasonZhang517 in #926
- [Misc] Fix chunked impl of _reduce_scatter_base by @UNIDY2002 in #931
- feat: add batch_put_from_multi_buffers by @LCAIZJ in #929
- [Misc] Fix the shutdown logic of Mooncake Backend by @UNIDY2002 in #933
- [CI/Build] Always build with EP in CI by @UNIDY2002 in #922
- [Integration] feat: introduce barex allocator by @stmatengss in #932
- TE: adxl config without buffer pool by @ascend-direct-dev in #941
- fix(transfer_engine): replace deprecated Json::Reader by @xiaguan in #938
- [CI] Fix CI Error Due to RDMA Fail by @ykwd in #930
- mlx5gda.cpp: add cleanup to destroy ah in mlx5gda_modify_rc_qp_init2rtr by @zhilishui in #945
- [doc] Fix documentation link by @Liziqi-77 in #949
- feat(store_service): support load config from env for mooncake store_service by @Syspretor in #951
- [Misc] For Mooncake Backend, skip transferring to non-active ranks by @UNIDY2002 in #953
- Enable CUDA support in CI configuration by @ShangmingCai in #937
- Bugfix issue 946 by @uniqueni in #947
- Bump version to 0.3.7 in pyproject.toml by @ShangmingCai in #959
- Try to fix the release CI by @UNIDY2002 in #962
New Contributors
- @peng1999 made their first contribution in #821
- @Zane-Jiang made their first contribution in #858
- @dtcccc made their first contribution in #865
- @JINGE-ui made their first contribution in #881
- @nickyc975 made their first contribution in #896
- @popsiclexu made their first contribution in #862
- @ympcMark made their first contribution in #911
- @zhilishui made their first contribution in #945
- @Liziqi-77 made their first contribution in #949
- @Syspretor made their first contribution in #951
- @uniqueni made their first contribution in #947
Full Changelog: v0.3.6...v0.3.7
v0.3.6.post1
What's Changed
- [Store] skip null buffer by @XucSh in #812
- [Store] Change Default Value of eviction_high_watermark_ratio and eviction_ratio by @ykwd in #820
- [CI/Build] gate mooncake-store test behind BUILD_UNIT_TESTS option by @peng1999 in #821
- [Doc] Store Integrated to SGLang HiCache by @ykwd in #829
- [TransferEngine]: remove SO_REUSEADDR in findAvailableTcpPort by @doujiang24 in #830
- [Build] Install Python Files by @ykwd in #836
- [TransferEngine] Make ascend TE to be released successfully and support fast recovery from failures through retry by @hjchen2 in #827
- [Build] Install Python Files Patch by @ykwd in #839
- [Doc] SGLang HiCache Intergration by @ykwd in #833
- [Docs] Fix Broken Trace Link by @ykwd in #841
- feat(store): add NUMA node binding support via bind_to_numa_node method by @xiaguan in #823
- docs(deployment): Add Basic Mooncake Store deployment guide by @xiaguan in #825
- refactor(store): use dedicated thread for signal handling by @xiaguan in #840
- add ascend protocol to mooncake store by @ascend-direct-dev in #835
- store: Add json file and improve doc by @201341 in #843
- feat(store): add client heartbeat support for non ha mode by @xiaguan in #845
- Fix typo in issue template by @Zane-Jiang in #858
- Fix nvlink_transport bug: revert #683 by @ShangmingCai in #869
- fix adxl find tcp port bug by @ascend-direct-dev in #856
- chore: bump version to 0.3.6.post1 in pyproject.toml by @ShangmingCai in #870
New Contributors
- @peng1999 made their first contribution in #821
- @Zane-Jiang made their first contribution in #858
Full Changelog: v0.3.6...v0.3.6.post1
v0.3.6
What's Changed
- feat(store): add batch get buffer support by @xiaguan in #671
- [TransferEngine] optimization: remove request_list parameter from submitTransferTask by @staryxchen in #565
- feat(transfer_engine_bench): Add multi-GPU support by @staryxchen in #675
- [CI/Build] Mooncake-common/common.cmake: Add link flag of pthread by @weinanliu in #681
- [DOC] fix problem in mooncake-store-preview.md by @SgtPepperr in #685
- [Store] metric: add response struct by @stmatengss in #686
- [Store] fix: add client list metrics by @stmatengss in #693
- refactor(offset-allocator): add memory allocation metrics tracking by @xiaguan in #687
- [TransferEngine] Fix build issues & adapt to latest Mooncake changes by @AscendTransport in #684
- docs: add RDMA memory registration troubleshooting guide by @xiaguan in #694
- add instructions for running on AMD GPU by @lihaofd in #689
- [BugFix] Zero Size RDMA Mem Register by @ykwd in #695
- refactor(store): move client buffer implementation to store module by @xiaguan in #700
- refactor(MasterClient): introduce generic RPC invocation helpers by @xiaguan in #697
- [BugFix] Topology Empty Check Bug by @ykwd in #696
- bugfix(nvlink): Add explicit P2P access enablement and error handling for NvlinkTransport by @staryxchen in #683
- [BugFix] Forbid Register Zero Size Memory by @ykwd in #701
- [TransferEngine] feat: Support CXL shared memory, and provide simple unit tests. by @hemist in #670
- code format & enable code format checking in ci by @doujiang24 in #677
- docs: add troubleshooting steps for QP allocation error by @staryxchen in #707
- [Store] Enhance Master Metrics by @ykwd in #705
- [store] feat: add master config by @201341 in #650
- [Store][bind] add new support data types by @stmatengss in #712
- [TransferEngine] feat: add a universal method to get CXL device size automatically by @StepY1aoZz in #715
- Reimplement VRAM buffering in TCP transport by @alogfans in #702
- [Transfer Engine] fix: maximum the memory resource limitation by @stmatengss in #716
- Fix lint error in `transfer_engine_validator.cpp` by @SCDESPERTATE in #720
- Add a topology dumping tool for ease of use by @SCDESPERTATE in #713
- [TransferEngine] Update to support CANN 8.2.RC1 by @AscendTransport in #714
- [Store] Optimize Offset Allocator by @ykwd in #706
- [Store] Add multi-endpoint etcd support for Transfer Engine metadata plugin by @vladnosiv in #729
- Add ability to do RDMA without nvidia-peermem by @misterwilliam in #704
- feat: add source code of MXA-EP by @UNIDY2002 in #726
- refactor(store): move python bidning to `pybind_client` by @xiaguan in #723
- [Bugfix] invalidation of one replica results in deletion of the key by @vladnosiv in #731
- [TransferEngine] exclude packaging ascend precompiled libraries by @hjchen2 in #737
- [Transfer Engine] Metrics: Add total qp metrics by @stmatengss in #738
- Fix deleting buffer that doesn't belong to us by @SzymonOzog in #739
- fix(store): replace CHECK with error handling by @xiaguan in #735
- feat(client): Add client-side metrics for transfer and RPC operations by @xiaguan in #733
- [Store]feat: Migrate Persistence Metadata from Client to Master Service by @SgtPepperr in #690
- Add interface for fuzz match by @XucSh in #734
- refactor(store_py): Replace function with AutoPortBinder RAII class by @xiaguan in #741
- [Docs] Minor Update: Explain Return Value of batch_put_from by @ykwd in #747
- [CI] Avoid Running Deploy Workflow on Forked Repositories by @ykwd in #746
- Fixed cachelib_memory_allocator dependency. by @karya0 in #750
- [Store] Refine Complicated Constructor Parameters by @ykwd in #748
- Update .typos.toml by @ShangmingCai in #756
- add ascend direct transport by @ascend-direct-dev in #740
- Fix typo CI by @ShangmingCai in #757
- [TransferEngine] Add guide in testing Transfer Engine, and remove confusing output in transfer engine by @alogfans in #754
- [TransferEngine] Ascend supports asymmetric amount of registered memory by @hjchen2 in #758
- [Doc] Fix 3FS plugin file link problem by @SgtPepperr in #762
- [Store] Serialize/Deserialize Offset Allocator by @ykwd in #760
- refactor(store): remove garbage collection implementation by @xiaguan in #763
- [Doc] Allocator Performance by @ykwd in #774
- [TE][EndpointStore]: Fix hand_ assignment after evict by @lizhemingi in #768
- [Doc] update doc for *Regex interface by @XucSh in #776
- [Store] add c++ http metadata server in mooncake master by @stmatengss in #766
- [Store] Add replication guarantees by @vladnosiv in #744
- [Store] Break Circular Dependency Between type.h And Other Files by @ykwd in #771
- [DOC] Add all badges by @stmatengss in #781
- [Doc] Add description for CONTRIBUTING.md by @SgtPepperr in #773
- [TE] Fix adxl error code in ascend-direct-transport by @ascend-direct-dev in #764
- [Bugfix] YAML CPP has inline impl in header file which will cause linking error by @alexnails in #784
- [TE] Fix notifs problems in C wrapper by @alogfans in #779
- fix nixl bench bug by @haobayuxi in #788
- feat(store): Add largest free region filtering for allocation by @xiaguan in #785
- Fix Handshake Daemon Initialization Order by @staryxchen in #765
- [Store] Update stress_cluster_benchmark.py (Multi-thread mooncake store benchmark) by @ChaosD in #791
- chore(deps): bump tracing-subscriber from 0.3.18 to 0.3.20 in /mooncake-transfer-engine/rust by @dependabot[bot] in #794
- GIL release for put_tensor and get_tenor by @jerrychenhf in #783
- Fix: ascend direct transport support host addr type by @ascend-direct-dev in #786
- [coro_rpc] use client pool and enable rdma by @qicosmos in #789
- [TransferEngine] Fix SIEVE eviction algorithm for RDMA endpoint store by @KarmaD7 in #767
- [TE][RDMA Transport]: Simplify Transfer Submission Logic by @staryxchen in #772
- [TransferEngine] heterogeneous_ascend support kv-cache transfer between npu and gpu by @zuochunwei in #759
- fix(store): add mutex locks for thread-safe metrics retrieval by @xiaguan in #804
- feat(build): add CI-specific build option with --ci-build flag by @xiaguan in #808
- [Store] Remove unnecessary register buffer from put_tensor by @jerrychenhf in #803
- [CI] Fix Release build_wheel.sh to make python 3.8 auditwheel happy by @mumupika in #801
- [Chores] Offset Allocator Test Fix & Docs Fix by @ykwd in #806
- [Doc] Update docs for a better quick start by @chestnut-Q in https://github.com/k...
v0.3.5
What's Changed
- feat(store): add thread safety analysis with clang annotations by @xiaguan in #538
- feat(master): support rpc server address parameter by @xiaguan in #530
- add notify support by @haobayuxi in #528
- [TE] revert: fix QP reclaim issues by @stmatengss in #543
- chore: bump version to 0.3.4.post1 in pyproject.toml by @ShangmingCai in #544
- [TransferEngine] Add Redis password authentication and DB selection via environment variables by @staryxchen in #512
- feat(store): add batch exist support for master by @xiaguan in #542
- [TransferEngine] Fix side effect of wild location registration by @alogfans in #552
- chore: bump version to 0.3.4.post2 in pyproject.toml by @ShangmingCai in #554
- chore: checkout specific version of yalantinglibs in script by @xiaguan in #555
- [TransferEngine]: fix compilation warning by @201341 in #550
- [TransferEngine] fix segfault when create cq failed by @doujiang24 in #535
- [Integration] feat: expose batch reg API by @stmatengss in #558
- support batch put/get api in python module by @xinranwang17 in #556
- feat(store): add zero copy batch put and get for python binding by @xiaguan in #551
- [TransferEngine] bugfix: ensure proper socket closure in destructor by @staryxchen in #566
- [TransferEngine] Add support to force MNNVL transport by MC_FORCE_MNNVL by @alogfans in #572
- [Store] Add Chaos Tests and Fix Bugs by @ykwd in #568
- Optimize slice handling to accelerate the large batch transfer operation by @SCDESPERTATE in #557
- [P2P Store] Add cuda link option when it is installed by @alogfans in #560
- [DOC] fix: Naming errors in Doc transfer-engine-python.md by @SgtPepperr in #508
- [cmake]fix cmake for centos by @qicosmos in #573
- [Doc] Add pypi install guide in the build doc by @ShangmingCai in #574
- [TransferEngine] Enable Huawei Ascend Transport for TransferEngine by @AscendTransport in #502
- [Misc] Add Issue Template in Github by @scatyf3 in #506
- [DOC] Update API description of mooncake store client by @panli889 in #548
- [DOC] Add Description for High Availability in Store by @ykwd in #576
- Disable memcpy by default and improve stress workload test by @xiaguan in #577
- [Store] Enable Client SSD Offload And Storage Persistence by @SgtPepperr in #437
- [TransferEngine] Fix retry logics in RDMA worker by @alogfans in #417
- refactor: introduce expected pattern for error handling in master service by @xiaguan in #562
- refactor(tests): enhance stress test benchmarking with zero-copy batch by @xiaguan in #586
- [TransferEngine] Enlarge default send/recv message size in etcd by @alogfans in #575
- fixed initall function by @JasonZhang517 in #591
- [DOC]: Add Description for Data Persistence and KVCache offloading in Store by @SgtPepperr in #585
- add support for asynchronous batch transfer to accelerate transfer operation by @SCDESPERTATE in #564
- [Store] Add ungister_buffer api for Store by @SgtPepperr in #596
- feat(topology): improve HCA selection by considering PCIe distance by @staryxchen in #581
- [Store] Soft Pin for Important Object by @ykwd in #587
- docs: add support for LMDeploy by @Risc-lt in #592
- [doc] Update mooncake-store doc by @LuyuZhang00 in #603
- test(client): add batch put test for duplicate keys by @xiaguan in #588
- refactor(store): remove unused value_length from PutStart functions by @xiaguan in #606
- feat(store): add ReplicateConfig support for pybindings by @xiaguan in #608
- [Store] feat: put/get tensor API for store by @stmatengss in #579
- feat(store): add `get_hostname` method for py bindings by @xiaguan in #617
- fix: correctly cleanup local buffer allocation by @xinranwang17 in #590
- [Doc] Update Mooncake Store Docs by @ykwd in #612
- [Mooncake Store] perf: avoid memory copy for rpc service by @qicosmos in #618
- refactor(rpc_service): separate implementation into cpp file by @xiaguan in #620
- implement genNotify interface by @haobayuxi in #600
- [DOC] Update readme by @stmatengss in #629
- [TransferEngine] Fix address already in use by @alogfans in #604
- fix(memory): Prevent integer overflow in getMemoryLocation for large memory regions by @ZeroLiu2018 in #626
- [Build] Fix nvlink allocator compile command by @ShangmingCai in #534
- ci: add --use-nvcc flag to build nvlink.so by @xiaguan in #634
- Revert "implement genNotify interface" by @ShangmingCai in #636
- [BugFix] Prevent SIGSEGV when SliceBuffer is destroyed after DistributedObjectStore::close() by @wwq2333 in #639
- [TransferEngine] Add IPv6 support [2] by @thefacetakt in #628
- [TransferEngine] Ascend Transport: add batch_transfer_sync, Debian support & bug fixes by @AscendTransport in #619
- [Store] Import Offset Allocator by @ykwd in #641
- [TransferEngine] fix the compilation warnings by @LuyuZhang00 in #643
- mooncake-common: add config class by @201341 in #582
- refactor(store): replace memory management with offset allocator by @xiaguan in #642
- [Fix] Support large global segment by @ykwd in #647
- feat(rdma): add device affinity optimization for RDMA performance by @staryxchen in #645
- [TransferEngine] Reimplement #600 posting Notify message after transfer successful by @alogfans in #635
- Remove unused files by @ykwd in #652
- docs(store): add Python API documentation for mooncake store by @xiaguan in #646
- [store] test: add mutil-threads test by @LuyuZhang00 in #611
- fix(store) : fix disk-backed replicas in size calculation and slice allocation(#653) by @SgtPepperr in #655
- refactor(store_py): convert functions to use tl::expected for error by @xiaguan in #651
- fix transfer engine: handle install transport fail by @LCAIZJ in #656
- [DOC] add SGLang RDMA trouble shooting by @stmatengss in #662
- fix(bench): use correct GPU ID in memory registration by @staryxchen in #661
- [TransferEngine] Fix NVlink accuracy drop issue by @alogfans in #663
- [Store] feat: add metadata support for tensor interface by @JasonZhang517 in #625
- [Store] Enlarge the Default KV TTL by @ykwd in #660
- [TransferEngine] Fix compile issue to make CentOS usable + Make ascend_transport timeout configurable by @AscendTransport in #658
- [Store] Master Service Support OffsetAllocater by @ykwd in #657
- [Store]feat: Add 3fs native api plugin for KVCache storage persistence by @SgtPepperr in #610
- Revert "[TransferEngine] Fix NVlink accuracy drop issue" by @ShangmingCai in #665
- feat(ci): enable CUDA support in CI workflow ...
v0.3.4.post2
What's Changed
- [TransferEngine] Add Redis password authentication and DB selection via environment variables by @staryxchen in #512
- feat(store): add batch exist support for master by @xiaguan in #542
- [TransferEngine] Fix side effect of wild location registration by @alogfans in #552
- chore: bump version to 0.3.4.post2 in pyproject.toml by @ShangmingCai in #554
- chore: checkout specific version of yalantinglibs in script by @xiaguan in #555
New Contributors
- @staryxchen made their first contribution in #512
Full Changelog: v0.3.4.post1...v0.3.4.post2
v0.3.4.post1
What's Changed
- feat(store): add thread safety analysis with clang annotations by @xiaguan in #538
- feat(master): support rpc server address parameter by @xiaguan in #530
- add notify support by @haobayuxi in #528
- [TE] revert: fix QP reclaim issues by @stmatengss in #543
- chore: bump version to 0.3.4.post1 in pyproject.toml by @ShangmingCai in #544
New Contributors
- @haobayuxi made their first contribution in #528
Full Changelog: v0.3.4...v0.3.4.post1
v0.3.4
What's Changed
- chore(ci): disable asan in release workflow by @xiaguan in #493
- chore: bump version to 0.3.3.post1 in pyproject.toml by @ShangmingCai in #494
- [TransferEngine] Optimize custom allocator function name by @ShangmingCai in #497
- chore: bump version to 0.3.3.post2 in pyproject.toml by @ShangmingCai in #498
- fix(transfer-task): fix error hanlding logic in transfer task by @xiaguan in #503
- [MooncakeIntegration] Fix find class id by @jellor in #500
- [Build] add TE bench into wheel package by @stmatengss in #514
- [Build] add nvlink hook into python package dir for local build by @ShangmingCai in #517
- [Build] Optimize nvlink allocator build logic and fix name issue by @ShangmingCai in #523
- [Build] Add allocator class to support nvlink for more use-cases by @ShangmingCai in #524
- [TransferEngine] Change option use_nvlink to use_mnnvl to clarify the usage by @ShangmingCai in #525
- [Bugfix] Fix missing option and sglang integration doc by @ShangmingCai in #526
- [Build] Skip etcd go package compilation by default by @ykwd in #520
- [Build] Deprecate stale adaptor usage to reduce whl package size by @ShangmingCai in #529
- [TransferEngine] Disabling auto-delete QP trying to avoid the availabilty problem by @alogfans in #483
- use kWildcardLocation instead of hardcode "cpu:0" to recognize cpu numa node automatically. by @doujiang24 in #527
- [Build] Optimize store build control for wheel and local build by @ShangmingCai in #531
- add support for batch transfer to accelerate transfer operation by @ssssnow in #499
- [Store] High Availability V2: Client Failover by @ykwd in #501
- feat(store): add zero-copy operations for python binding by @xiaguan in #532
- chore: bump version to 0.3.4 in pyproject.toml by @ShangmingCai in #533
Full Changelog: v0.3.3...v0.3.4