Releases: kvcache-ai/Mooncake

v0.3.8

26 Dec 09:25
823064a

v0.3.7.post2

04 Nov 04:40
b6a841d

Full Changelog: v0.3.7.post1...v0.3.7.post2

v0.3.7.post1

03 Nov 08:04
7c22adb

Full Changelog: v0.3.7...v0.3.7.post1

v0.3.7

25 Oct 02:56
9e4f96b

What's Changed

  • [Store] skip null buffer by @XucSh in #812
  • [Store] Change Default Value of eviction_high_watermark_ratio and eviction_ratio by @ykwd in #820
  • [CI/Build] gate mooncake-store test behind BUILD_UNIT_TESTS option by @peng1999 in #821
  • [Doc] Store Integrated to SGLang HiCache by @ykwd in #829
  • [TransferEngine]: remove SO_REUSEADDR in findAvailableTcpPort by @doujiang24 in #830
  • [Build] Install Python Files by @ykwd in #836
  • [TransferEngine] Make ascend TE to be released successfully and support fast recovery from failures through retry by @hjchen2 in #827
  • [Build] Install Python Files Patch by @ykwd in #839
  • [Doc] SGLang HiCache Integration by @ykwd in #833
  • [Docs] Fix Broken Trace Link by @ykwd in #841
  • feat(store): add NUMA node binding support via bind_to_numa_node method by @xiaguan in #823
  • docs(deployment): Add Basic Mooncake Store deployment guide by @xiaguan in #825
  • refactor(store): use dedicated thread for signal handling by @xiaguan in #840
  • add ascend protocol to mooncake store by @ascend-direct-dev in #835
  • store: Add json file and improve doc by @201341 in #843
  • feat(store): add client heartbeat support for non ha mode by @xiaguan in #845
  • Fix typo in issue template by @Zane-Jiang in #858
  • Fix nvlink_transport bug: revert #683 by @ShangmingCai in #869
  • fix adxl find tcp port bug by @ascend-direct-dev in #856
  • chore: bump version to 0.3.6.post1 in pyproject.toml by @ShangmingCai in #870
  • [Transfer Engine] Post notify if all transfer tasks are completed by @alogfans in #831
  • feat(store): support transfer engine p2phandshake by @xiaguan in #852
  • [Chores] Remove Unused Variable by @ykwd in #822
  • [TransferEngine] Performance Enhancement for Heterogeneous Ascend via Intelligent Aggregation & Pipeline Design by @zuochunwei in #859
  • [Misc] feat: Support external kv_connector for vllm v1 by @dtcccc in #865
  • feat(store): disable auto discovery by default, require devices for RDMA by @xiaguan in #877
  • Allow customizing RPC port range by @peng1999 in #873
  • [Store] Check If Get Completed Within Lease by @ykwd in #778
  • [Docs] Update Obsolete Content & Fix Minor Problems by @ykwd in #880
  • fix(store): fix memory leak in client_integration_test.cpp by @JINGE-ui in #881
  • Refactor(store): Remove BufStatus and segment_name for AllocatedBuffer by @xiaguan in #883
  • [Store] Check if Connecting Master Fails by @ykwd in #886
  • [TransferEngine] clear all transport mems for fast recovery for ascend transport by @hjchen2 in #847
  • [Misc] Mooncake EP & Mooncake Backend by @UNIDY2002 in #805
  • [Docs] Update quick start and usage examples by @chestnut-Q in #893
  • fix(store): disable persistence instead of returning error by @xiaguan in #892
  • [Store]: Get start_time before calling RPC in BatchQuery by @nickyc975 in #896
  • feat(store): Add multi threading handle page fault during segment allocation by @xiaguan in #875
  • docs(store): restructure and simplify SGLang HiCache integration guide by @xiaguan in #897
  • [Store]: Add option to use jemalloc in mooncake store master by @nickyc975 in #902
  • [Misc] improvements for mooncake_connector_v1 by @dtcccc in #906
  • [TransferEngine] initiator_test script: make it works with P2PHANDSHAKE. by @doujiang24 in #907
  • [CI/Build] For Mooncake EP, fix the flag USE_CUDA that was unexpectedly turned off by @UNIDY2002 in #909
  • [Misc] For EP, pass device_name instead of nic_id when creating ep.Buffer by @UNIDY2002 in #910
  • [Doc] Add Mooncake x SGLang Hicache Design and Some Updates by @ykwd in #913
  • fix(doc): Add Hicache Design to Index.md by @ykwd in #914
  • [TransferEngine] Add Moore Threads GPUs Support by @popsiclexu in #862
  • feat(TE): add notify support for sync transfers and expose getNotifies API by @staryxchen in #894
  • mooncake-backend chunked transfer by @ympcMark in #911
  • ascend direct transport support transfer to multiple destinations in one batch by @ascend-direct-dev in #857
  • fix(store): Fix integer overflow in get_into/batch_get_into for values > 4GB by @xiaguan in #920
  • [Doc] Add Clarification for STORE_USE_ETCD Compile Option by @ykwd in #927
  • [Store] Change log level for batch operation by @stmatengss in #916
  • Reduced build-with-ep workflow by @JasonZhang517 in #926
  • [Misc] Fix chunked impl of _reduce_scatter_base by @UNIDY2002 in #931
  • feat: add batch_put_from_multi_buffers by @LCAIZJ in #929
  • [Misc] Fix the shutdown logic of Mooncake Backend by @UNIDY2002 in #933
  • [CI/Build] Always build with EP in CI by @UNIDY2002 in #922
  • [Integration] feat: introduce barex allocator by @stmatengss in #932
  • TE: adxl config without buffer pool by @ascend-direct-dev in #941
  • fix(transfer_engine): replace deprecated Json::Reader by @xiaguan in #938
  • [CI] Fix CI Error Due to RDMA Fail by @ykwd in #930
  • mlx5gda.cpp: add cleanup to destroy ah in mlx5gda_modify_rc_qp_init2rtr by @zhilishui in #945
  • [doc] Fix documentation link by @Liziqi-77 in #949
  • feat(store_service): support load config from env for mooncake store_service by @Syspretor in #951
  • [Misc] For Mooncake Backend, skip transferring to non-active ranks by @UNIDY2002 in #953
  • Enable CUDA support in CI configuration by @ShangmingCai in #937
  • Bugfix issue 946 by @uniqueni in #947
  • Bump version to 0.3.7 in pyproject.toml by @ShangmingCai in #959
  • Try to fix the release CI by @UNIDY2002 in #962

Full Changelog: v0.3.6...v0.3.7

v0.3.6.post1

20 Sep 03:31
356d99f

What's Changed

  • [Store] skip null buffer by @XucSh in #812
  • [Store] Change Default Value of eviction_high_watermark_ratio and eviction_ratio by @ykwd in #820
  • [CI/Build] gate mooncake-store test behind BUILD_UNIT_TESTS option by @peng1999 in #821
  • [Doc] Store Integrated to SGLang HiCache by @ykwd in #829
  • [TransferEngine]: remove SO_REUSEADDR in findAvailableTcpPort by @doujiang24 in #830
  • [Build] Install Python Files by @ykwd in #836
  • [TransferEngine] Make ascend TE to be released successfully and support fast recovery from failures through retry by @hjchen2 in #827
  • [Build] Install Python Files Patch by @ykwd in #839
  • [Doc] SGLang HiCache Integration by @ykwd in #833
  • [Docs] Fix Broken Trace Link by @ykwd in #841
  • feat(store): add NUMA node binding support via bind_to_numa_node method by @xiaguan in #823
  • docs(deployment): Add Basic Mooncake Store deployment guide by @xiaguan in #825
  • refactor(store): use dedicated thread for signal handling by @xiaguan in #840
  • add ascend protocol to mooncake store by @ascend-direct-dev in #835
  • store: Add json file and improve doc by @201341 in #843
  • feat(store): add client heartbeat support for non ha mode by @xiaguan in #845
  • Fix typo in issue template by @Zane-Jiang in #858
  • Fix nvlink_transport bug: revert #683 by @ShangmingCai in #869
  • fix adxl find tcp port bug by @ascend-direct-dev in #856
  • chore: bump version to 0.3.6.post1 in pyproject.toml by @ShangmingCai in #870

Full Changelog: v0.3.6...v0.3.6.post1

v0.3.6

10 Sep 07:46
be89497

What's Changed

  • feat(store): add batch get buffer support by @xiaguan in #671
  • [TransferEngine] optimization: remove request_list parameter from submitTransferTask by @staryxchen in #565
  • feat(transfer_engine_bench): Add multi-GPU support by @staryxchen in #675
  • [CI/Build] Mooncake-common/common.cmake: Add link flag of pthread by @weinanliu in #681
  • [DOC] fix problem in mooncake-store-preview.md by @SgtPepperr in #685
  • [Store] metric: add response struct by @stmatengss in #686
  • [Store] fix: add client list metrics by @stmatengss in #693
  • refactor(offset-allocator): add memory allocation metrics tracking by @xiaguan in #687
  • [TransferEngine] Fix build issues & adapt to latest Mooncake changes by @AscendTransport in #684
  • docs: add RDMA memory registration troubleshooting guide by @xiaguan in #694
  • add instructions for running on AMD GPU by @lihaofd in #689
  • [BugFix] Zero Size RDMA Mem Register by @ykwd in #695
  • refactor(store): move client buffer implementation to store module by @xiaguan in #700
  • refactor(MasterClient): introduce generic RPC invocation helpers by @xiaguan in #697
  • [BugFix] Topology Empty Check Bug by @ykwd in #696
  • bugfix(nvlink): Add explicit P2P access enablement and error handling for NvlinkTransport by @staryxchen in #683
  • [BugFix] Forbid Register Zero Size Memory by @ykwd in #701
  • [TransferEngine] feat: Support CXL shared memory, and provide simple unit tests. by @hemist in #670
  • code format & enable code format checking in ci by @doujiang24 in #677
  • docs: add troubleshooting steps for QP allocation error by @staryxchen in #707
  • [Store] Enhance Master Metrics by @ykwd in #705
  • [store] feat: add master config by @201341 in #650
  • [Store][bind] add new support data types by @stmatengss in #712
  • [TransferEngine] feat: add a universal method to get CXL device size automatically by @StepY1aoZz in #715
  • Reimplement VRAM buffering in TCP transport by @alogfans in #702
  • [Transfer Engine] fix: maximum the memory resource limitation by @stmatengss in #716
  • Fix lint error in transfer_engine_validator.cpp by @SCDESPERTATE in #720
  • Add a topology dumping tool for ease of use by @SCDESPERTATE in #713
  • [TransferEngine] Update to support CANN 8.2.RC1 by @AscendTransport in #714
  • [Store] Optimize Offset Allocator by @ykwd in #706
  • [Store] Add multi-endpoint etcd support for Transfer Engine metadata plugin by @vladnosiv in #729
  • Add ability to do RDMA without nvidia-peermem by @misterwilliam in #704
  • feat: add source code of MXA-EP by @UNIDY2002 in #726
  • refactor(store): move python binding to pybind_client by @xiaguan in #723
  • [Bugfix] invalidation of one replica results in deletion of the key by @vladnosiv in #731
  • [TransferEngine] exclude packaging ascend precompiled libraries by @hjchen2 in #737
  • [Transfer Engine] Metrics: Add total qp metrics by @stmatengss in #738
  • Fix deleting buffer that doesn't belong to us by @SzymonOzog in #739
  • fix(store): replace CHECK with error handling by @xiaguan in #735
  • feat(client): Add client-side metrics for transfer and RPC operations by @xiaguan in #733
  • [Store]feat: Migrate Persistence Metadata from Client to Master Service by @SgtPepperr in #690
  • Add interface for fuzz match by @XucSh in #734
  • refactor(store_py): Replace function with AutoPortBinder RAII class by @xiaguan in #741
  • [Docs] Minor Update: Explain Return Value of batch_put_from by @ykwd in #747
  • [CI] Avoid Running Deploy Workflow on Forked Repositories by @ykwd in #746
  • Fixed cachelib_memory_allocator dependency. by @karya0 in #750
  • [Store] Refine Complicated Constructor Parameters by @ykwd in #748
  • Update .typos.toml by @ShangmingCai in #756
  • add ascend direct transport by @ascend-direct-dev in #740
  • Fix typo CI by @ShangmingCai in #757
  • [TransferEngine] Add guide in testing Transfer Engine, and remove confusing output in transfer engine by @alogfans in #754
  • [TransferEngine] Ascend supports asymmetric amount of registered memory by @hjchen2 in #758
  • [Doc] Fix 3FS plugin file link problem by @SgtPepperr in #762
  • [Store] Serialize/Deserialize Offset Allocator by @ykwd in #760
  • refactor(store): remove garbage collection implementation by @xiaguan in #763
  • [Doc] Allocator Performance by @ykwd in #774
  • [TE][EndpointStore]: Fix hand_ assignment after evict by @lizhemingi in #768
  • [Doc] update doc for *Regex interface by @XucSh in #776
  • [Store] add c++ http metadata server in mooncake master by @stmatengss in #766
  • [Store] Add replication guarantees by @vladnosiv in #744
  • [Store] Break Circular Dependency Between type.h And Other Files by @ykwd in #771
  • [DOC] Add all badges by @stmatengss in #781
  • [Doc] Add description for CONTRIBUTING.md by @SgtPepperr in #773
  • [TE] Fix adxl error code in ascend-direct-transport by @ascend-direct-dev in #764
  • [Bugfix] YAML CPP has inline impl in header file which will cause linking error by @alexnails in #784
  • [TE] Fix notifs problems in C wrapper by @alogfans in #779
  • fix nixl bench bug by @haobayuxi in #788
  • feat(store): Add largest free region filtering for allocation by @xiaguan in #785
  • Fix Handshake Daemon Initialization Order by @staryxchen in #765
  • [Store] Update stress_cluster_benchmark.py (Multi-thread mooncake store benchmark) by @ChaosD in #791
  • chore(deps): bump tracing-subscriber from 0.3.18 to 0.3.20 in /mooncake-transfer-engine/rust by @dependabot[bot] in #794
  • GIL release for put_tensor and get_tensor by @jerrychenhf in #783
  • Fix: ascend direct transport support host addr type by @ascend-direct-dev in #786
  • [coro_rpc] use client pool and enable rdma by @qicosmos in #789
  • [TransferEngine] Fix SIEVE eviction algorithm for RDMA endpoint store by @KarmaD7 in #767
  • [TE][RDMA Transport]: Simplify Transfer Submission Logic by @staryxchen in #772
  • [TransferEngine] heterogeneous_ascend support kv-cache transfer between npu and gpu by @zuochunwei in #759
  • fix(store): add mutex locks for thread-safe metrics retrieval by @xiaguan in #804
  • feat(build): add CI-specific build option with --ci-build flag by @xiaguan in #808
  • [Store] Remove unnecessary register buffer from put_tensor by @jerrychenhf in #803
  • [CI] Fix Release build_wheel.sh to make python 3.8 auditwheel happy by @mumupika in #801
  • [Chores] Offset Allocator Test Fix & Docs Fix by @ykwd in #806
  • [Doc] Update docs for a better quick start by @chestnut-Q in https://github.com/k...

v0.3.5

25 Jul 03:29
392fea7

What's Changed

  • feat(store): add thread safety analysis with clang annotations by @xiaguan in #538
  • feat(master): support rpc server address parameter by @xiaguan in #530
  • add notify support by @haobayuxi in #528
  • [TE] revert: fix QP reclaim issues by @stmatengss in #543
  • chore: bump version to 0.3.4.post1 in pyproject.toml by @ShangmingCai in #544
  • [TransferEngine] Add Redis password authentication and DB selection via environment variables by @staryxchen in #512
  • feat(store): add batch exist support for master by @xiaguan in #542
  • [TransferEngine] Fix side effect of wild location registration by @alogfans in #552
  • chore: bump version to 0.3.4.post2 in pyproject.toml by @ShangmingCai in #554
  • chore: checkout specific version of yalantinglibs in script by @xiaguan in #555
  • [TransferEngine]: fix compilation warning by @201341 in #550
  • [TransferEngine] fix segfault when create cq failed by @doujiang24 in #535
  • [Integration] feat: expose batch reg API by @stmatengss in #558
  • support batch put/get api in python module by @xinranwang17 in #556
  • feat(store): add zero copy batch put and get for python binding by @xiaguan in #551
  • [TransferEngine] bugfix: ensure proper socket closure in destructor by @staryxchen in #566
  • [TransferEngine] Add support to force MNNVL transport by MC_FORCE_MNNVL by @alogfans in #572
  • [Store] Add Chaos Tests and Fix Bugs by @ykwd in #568
  • Optimize slice handling to accelerate the large batch transfer operation by @SCDESPERTATE in #557
  • [P2P Store] Add cuda link option when it is installed by @alogfans in #560
  • [DOC] fix: Naming errors in Doc transfer-engine-python.md by @SgtPepperr in #508
  • [cmake]fix cmake for centos by @qicosmos in #573
  • [Doc] Add pypi install guide in the build doc by @ShangmingCai in #574
  • [TransferEngine] Enable Huawei Ascend Transport for TransferEngine by @AscendTransport in #502
  • [Misc] Add Issue Template in Github by @scatyf3 in #506
  • [DOC] Update API description of mooncake store client by @panli889 in #548
  • [DOC] Add Description for High Availability in Store by @ykwd in #576
  • Disable memcpy by default and improve stress workload test by @xiaguan in #577
  • [Store] Enable Client SSD Offload And Storage Persistence by @SgtPepperr in #437
  • [TransferEngine] Fix retry logics in RDMA worker by @alogfans in #417
  • refactor: introduce expected pattern for error handling in master service by @xiaguan in #562
  • refactor(tests): enhance stress test benchmarking with zero-copy batch by @xiaguan in #586
  • [TransferEngine] Enlarge default send/recv message size in etcd by @alogfans in #575
  • fixed initall function by @JasonZhang517 in #591
  • [DOC]: Add Description for Data Persistence and KVCache offloading in Store by @SgtPepperr in #585
  • add support for asynchronous batch transfer to accelerate transfer operation by @SCDESPERTATE in #564
  • [Store] Add unregister_buffer api for Store by @SgtPepperr in #596
  • feat(topology): improve HCA selection by considering PCIe distance by @staryxchen in #581
  • [Store] Soft Pin for Important Object by @ykwd in #587
  • docs: add support for LMDeploy by @Risc-lt in #592
  • [doc] Update mooncake-store doc by @LuyuZhang00 in #603
  • test(client): add batch put test for duplicate keys by @xiaguan in #588
  • refactor(store): remove unused value_length from PutStart functions by @xiaguan in #606
  • feat(store): add ReplicateConfig support for pybindings by @xiaguan in #608
  • [Store] feat: put/get tensor API for store by @stmatengss in #579
  • feat(store): add get_hostname method for py bindings by @xiaguan in #617
  • fix: correctly cleanup local buffer allocation by @xinranwang17 in #590
  • [Doc] Update Mooncake Store Docs by @ykwd in #612
  • [Mooncake Store] perf: avoid memory copy for rpc service by @qicosmos in #618
  • refactor(rpc_service): separate implementation into cpp file by @xiaguan in #620
  • implement genNotify interface by @haobayuxi in #600
  • [DOC] Update readme by @stmatengss in #629
  • [TransferEngine] Fix address already in use by @alogfans in #604
  • fix(memory): Prevent integer overflow in getMemoryLocation for large memory regions by @ZeroLiu2018 in #626
  • [Build] Fix nvlink allocator compile command by @ShangmingCai in #534
  • ci: add --use-nvcc flag to build nvlink.so by @xiaguan in #634
  • Revert "implement genNotify interface" by @ShangmingCai in #636
  • [BugFix] Prevent SIGSEGV when SliceBuffer is destroyed after DistributedObjectStore::close() by @wwq2333 in #639
  • [TransferEngine] Add IPv6 support [2] by @thefacetakt in #628
  • [TransferEngine] Ascend Transport: add batch_transfer_sync, Debian support & bug fixes by @AscendTransport in #619
  • [Store] Import Offset Allocator by @ykwd in #641
  • [TransferEngine] fix the compilation warnings by @LuyuZhang00 in #643
  • mooncake-common: add config class by @201341 in #582
  • refactor(store): replace memory management with offset allocator by @xiaguan in #642
  • [Fix] Support large global segment by @ykwd in #647
  • feat(rdma): add device affinity optimization for RDMA performance by @staryxchen in #645
  • [TransferEngine] Reimplement #600 posting Notify message after transfer successful by @alogfans in #635
  • Remove unused files by @ykwd in #652
  • docs(store): add Python API documentation for mooncake store by @xiaguan in #646
  • [store] test: add multi-threads test by @LuyuZhang00 in #611
  • fix(store) : fix disk-backed replicas in size calculation and slice allocation(#653) by @SgtPepperr in #655
  • refactor(store_py): convert functions to use tl::expected for error by @xiaguan in #651
  • fix transfer engine: handle install transport fail by @LCAIZJ in #656
  • [DOC] add SGLang RDMA trouble shooting by @stmatengss in #662
  • fix(bench): use correct GPU ID in memory registration by @staryxchen in #661
  • [TransferEngine] Fix NVlink accuracy drop issue by @alogfans in #663
  • [Store] feat: add metadata support for tensor interface by @JasonZhang517 in #625
  • [Store] Enlarge the Default KV TTL by @ykwd in #660
  • [TransferEngine] Fix compile issue to make CentOS usable + Make ascend_transport timeout configurable by @AscendTransport in #658
  • [Store] Master Service Support OffsetAllocator by @ykwd in #657
  • [Store]feat: Add 3fs native api plugin for KVCache storage persistence by @SgtPepperr in #610
  • Revert "[TransferEngine] Fix NVlink accuracy drop issue" by @ShangmingCai in #665
  • feat(ci): enable CUDA support in CI workflow ...

v0.3.4.post2

25 Jun 08:23
3765bae

What's Changed

  • [TransferEngine] Add Redis password authentication and DB selection via environment variables by @staryxchen in #512
  • feat(store): add batch exist support for master by @xiaguan in #542
  • [TransferEngine] Fix side effect of wild location registration by @alogfans in #552
  • chore: bump version to 0.3.4.post2 in pyproject.toml by @ShangmingCai in #554
  • chore: checkout specific version of yalantinglibs in script by @xiaguan in #555

Full Changelog: v0.3.4.post1...v0.3.4.post2

v0.3.4.post1

23 Jun 11:28
810828c

Full Changelog: v0.3.4...v0.3.4.post1

v0.3.4

20 Jun 10:58
8d07a23

What's Changed

  • chore(ci): disable asan in release workflow by @xiaguan in #493
  • chore: bump version to 0.3.3.post1 in pyproject.toml by @ShangmingCai in #494
  • [TransferEngine] Optimize custom allocator function name by @ShangmingCai in #497
  • chore: bump version to 0.3.3.post2 in pyproject.toml by @ShangmingCai in #498
  • fix(transfer-task): fix error handling logic in transfer task by @xiaguan in #503
  • [MooncakeIntegration] Fix find class id by @jellor in #500
  • [Build] add TE bench into wheel package by @stmatengss in #514
  • [Build] add nvlink hook into python package dir for local build by @ShangmingCai in #517
  • [Build] Optimize nvlink allocator build logic and fix name issue by @ShangmingCai in #523
  • [Build] Add allocator class to support nvlink for more use-cases by @ShangmingCai in #524
  • [TransferEngine] Change option use_nvlink to use_mnnvl to clarify the usage by @ShangmingCai in #525
  • [Bugfix] Fix missing option and sglang integration doc by @ShangmingCai in #526
  • [Build] Skip etcd go package compilation by default by @ykwd in #520
  • [Build] Deprecate stale adaptor usage to reduce whl package size by @ShangmingCai in #529
  • [TransferEngine] Disabling auto-delete QP trying to avoid the availability problem by @alogfans in #483
  • use kWildcardLocation instead of hardcode "cpu:0" to recognize cpu numa node automatically. by @doujiang24 in #527
  • [Build] Optimize store build control for wheel and local build by @ShangmingCai in #531
  • add support for batch transfer to accelerate transfer operation by @ssssnow in #499
  • [Store] High Availability V2: Client Failover by @ykwd in #501
  • feat(store): add zero-copy operations for python binding by @xiaguan in #532
  • chore: bump version to 0.3.4 in pyproject.toml by @ShangmingCai in #533

Full Changelog: v0.3.3...v0.3.4