[zipsync] Add new tool to efficiently pack and unpack cache entries #5361
Merged
20 commits
e869fcc zipsync (bmiddha)
5b30861 update readme (bmiddha)
118be3d rush change (bmiddha)
0acedae address pr feedback (bmiddha)
58f1f2b add bin/zipsync (bmiddha)
0e19d5e pr feedback (bmiddha)
cb9cd73 split pack and unpack (bmiddha)
b4352ff fixup benchmark (bmiddha)
e0dc252 fix unlink (bmiddha)
c614e31 fixup benchmark (bmiddha)
9eac307 fixup benchmark (bmiddha)
1a709e3 pr feedback (bmiddha)
8c33fbd fixup benchmark (bmiddha)
ac896b2 try zstd (bmiddha)
bd3e701 fixup (bmiddha)
c8c1bb7 fix tests (bmiddha)
9433c9a revert build cache integration (bmiddha)
a0c3125 docs (bmiddha)
a897191 pr feedback (bmiddha)
f7164c6 pr feedback (bmiddha)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,32 @@ | ||
| # THIS IS A STANDARD TEMPLATE FOR .npmignore FILES IN THIS REPO. | ||
|
|
||
| # Ignore all files by default, to avoid accidentally publishing unintended files. | ||
| * | ||
|
|
||
| # Use negative patterns to bring back the specific things we want to publish. | ||
| !/bin/** | ||
| !/lib/** | ||
| !/lib-*/** | ||
| !/dist/** | ||
|
|
||
| !CHANGELOG.md | ||
| !CHANGELOG.json | ||
| !heft-plugin.json | ||
| !rush-plugin-manifest.json | ||
| !ThirdPartyNotice.txt | ||
|
|
||
| # Ignore certain patterns that should not get published. | ||
| /dist/*.stats.* | ||
| /lib/**/test/ | ||
| /lib-*/**/test/ | ||
| *.test.js | ||
|
|
||
| # NOTE: These don't need to be specified, because NPM includes them automatically. | ||
| # | ||
| # package.json | ||
| # README.md | ||
| # LICENSE | ||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # DO NOT MODIFY ABOVE THIS LINE! Add any project-specific overrides below. | ||
| # --------------------------------------------------------------------------- |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| @rushstack/zipsync | ||
|
|
||
| Copyright (c) Microsoft Corporation. All rights reserved. | ||
|
|
||
| MIT License | ||
|
|
||
| Permission is hereby granted, free of charge, to any person obtaining | ||
| a copy of this software and associated documentation files (the | ||
| "Software"), to deal in the Software without restriction, including | ||
| without limitation the rights to use, copy, modify, merge, publish, | ||
| distribute, sublicense, and/or sell copies of the Software, and to | ||
| permit persons to whom the Software is furnished to do so, subject to | ||
| the following conditions: | ||
|
|
||
| The above copyright notice and this permission notice shall be | ||
| included in all copies or substantial portions of the Software. | ||
|
|
||
| THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, | ||
| EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF | ||
| MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND | ||
| NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE | ||
| LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION | ||
| OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION | ||
| WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| # @rushstack/zipsync | ||
|
|
||
| zipsync is a focused tool for packing and unpacking build cache entries using a constrained subset of the ZIP format for high performance. It optimizes the common scenario where most files already exist in the target location and are unchanged. | ||
|
|
||
| ## Goals & Rationale | ||
|
|
||
| - **Optimize partial unpack**: Most builds reuse the majority of previously produced outputs. Skipping rewrites preserves filesystem and page cache state. | ||
| - **Only write when needed**: Files whose content already matches the archive are not rewritten, which means fewer syscalls. | ||
| - **Integrated cleanup**: Removes the need for a separate `rm -rf` pass; extra files and empty directories are pruned automatically. | ||
| - **ZIP subset**: Archives remain valid ZIP files, so malware scanners and other standard tooling can still read them. | ||
| - **Fast inspection**: The central directory can be enumerated without inflating the entire archive (unlike tar+gzip). | ||
|
|
||
| ## How It Works | ||
|
|
||
| ### Pack Flow | ||
|
|
||
| ``` | ||
| for each file F | ||
| write LocalFileHeader(F) | ||
| stream chunks: | ||
| read -> hash + crc + maybe compress -> write | ||
| finalize compressor | ||
| write DataDescriptor(F) | ||
| add metadata entry (same pattern) | ||
| write central directory records | ||
| ``` | ||
|
|
||
| ### Unpack Flow | ||
|
|
||
| ``` | ||
| load archive -> parse central dir -> read metadata | ||
| scan filesystem & delete extraneous entries | ||
| for each entry (except metadata): | ||
| if unchanged (sha1 matches) => skip | ||
| else extract (decompress if needed) | ||
| ``` | ||
|
|
||
| ## Why ZIP (vs tar + gzip) | ||
|
|
||
| Pros for this scenario: | ||
|
|
||
| - Central directory enables cheap listing without decompressing entire payload. | ||
| - Widely understood / tooling-friendly (system explorers, scanners, CI tooling). | ||
| - Per-file compression keeps selective unpack simple (no need to inflate all bytes). | ||
|
|
||
| Trade-offs: | ||
|
|
||
| - Tar+gzip can exploit cross-file redundancy for better compressed size in datasets with many similar files. |
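The unpack flow described in the README above (delete extraneous entries, skip files whose SHA-1 already matches, write the rest) can be sketched roughly as follows. This is a minimal illustration and not the zipsync implementation: `ExpectedFile`, `SyncResult`, `sha1Hex`, and `syncDirectory` are hypothetical names, an in-memory `Map` stands in for the archive and its metadata entry, and only a flat directory is handled.

```typescript
import { createHash } from 'node:crypto';
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical stand-in for one archive entry: expected content plus its SHA-1.
interface ExpectedFile {
  sha1: string;
  content: Buffer;
}

// Hypothetical summary of what the sync pass did.
interface SyncResult {
  written: string[];
  skipped: string[];
  deleted: string[];
}

function sha1Hex(data: Buffer): string {
  return createHash('sha1').update(data).digest('hex');
}

// Bring a flat directory in line with `expected`, writing only what changed.
function syncDirectory(dir: string, expected: Map<string, ExpectedFile>): SyncResult {
  const result: SyncResult = { written: [], skipped: [], deleted: [] };
  fs.mkdirSync(dir, { recursive: true });

  // Step 1: delete anything on disk that the archive does not list.
  for (const name of fs.readdirSync(dir)) {
    if (!expected.has(name)) {
      fs.rmSync(path.join(dir, name), { recursive: true, force: true });
      result.deleted.push(name);
    }
  }

  // Step 2: skip files whose on-disk hash matches; write the rest.
  for (const [name, entry] of expected) {
    const filePath: string = path.join(dir, name);
    if (fs.existsSync(filePath) && sha1Hex(fs.readFileSync(filePath)) === entry.sha1) {
      result.skipped.push(name);
    } else {
      fs.writeFileSync(filePath, entry.content);
      result.written.push(name);
    }
  }
  return result;
}

// Usage: one unchanged file, one stale file, and one file missing from disk.
const demoDir: string = fs.mkdtempSync(path.join(os.tmpdir(), 'zipsync-sketch-'));
fs.writeFileSync(path.join(demoDir, 'kept.txt'), 'same');
fs.writeFileSync(path.join(demoDir, 'stale.txt'), 'old');
const same: Buffer = Buffer.from('same');
const fresh: Buffer = Buffer.from('new');
const demoResult: SyncResult = syncDirectory(
  demoDir,
  new Map<string, ExpectedFile>([
    ['kept.txt', { sha1: sha1Hex(same), content: same }],
    ['added.txt', { sha1: sha1Hex(fresh), content: fresh }]
  ])
);
console.log(demoResult);
```

In this sketch the unchanged file is never opened for writing, which is the property the README's "Optimize partial unpack" goal relies on. Hashing still requires reading the file, so the saving is in writes rather than reads.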
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| #!/usr/bin/env node | ||
| require('../lib/start.js'); |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| { | ||
| "extends": "local-node-rig/profiles/default/config/jest.config.json", | ||
| "setupFilesAfterEnv": ["<rootDir>/config/jestSymbolDispose.js"] | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT license. | ||
| // See LICENSE in the project root for license information. | ||
|
|
||
| const disposeSymbol = Symbol('Symbol.dispose'); | ||
| const asyncDisposeSymbol = Symbol('Symbol.asyncDispose'); | ||
|
|
||
| Symbol.asyncDispose ??= asyncDisposeSymbol; | ||
| Symbol.dispose ??= disposeSymbol; |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| { | ||
| // The "rig.json" file directs tools to look for their config files in an external package. | ||
| // Documentation for this system: https://www.npmjs.com/package/@rushstack/rig-package | ||
| "$schema": "https://developer.microsoft.com/json-schemas/rig-package/rig.schema.json", | ||
|
|
||
| "rigPackageName": "local-node-rig" | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT license. | ||
| // See LICENSE in the project root for license information. | ||
|
|
||
| const nodeTrustedToolProfile = require('local-node-rig/profiles/default/includes/eslint/flat/profile/node-trusted-tool'); | ||
| const friendlyLocalsMixin = require('local-node-rig/profiles/default/includes/eslint/flat/mixins/friendly-locals'); | ||
|
|
||
| module.exports = [ | ||
| ...nodeTrustedToolProfile, | ||
| ...friendlyLocalsMixin, | ||
| { | ||
| files: ['**/*.ts', '**/*.tsx'], | ||
| languageOptions: { | ||
| parserOptions: { | ||
| tsconfigRootDir: __dirname | ||
| } | ||
| } | ||
| } | ||
| ]; |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| { | ||
| "name": "@rushstack/zipsync", | ||
| "version": "0.0.0", | ||
| "description": "CLI tool for creating and extracting ZIP archives with intelligent filesystem synchronization", | ||
| "repository": { | ||
| "type": "git", | ||
| "url": "https://github.com/microsoft/rushstack.git", | ||
| "directory": "apps/zipsync" | ||
| }, | ||
| "bin": { | ||
| "zipsync": "./bin/zipsync" | ||
| }, | ||
| "license": "MIT", | ||
| "scripts": { | ||
| "start": "node lib/start", | ||
| "build": "heft build --clean", | ||
| "_phase:build": "heft run --only build -- --clean", | ||
| "_phase:test": "heft run --only test -- --clean" | ||
| }, | ||
| "dependencies": { | ||
| "@rushstack/terminal": "workspace:*", | ||
| "@rushstack/ts-command-line": "workspace:*", | ||
| "typescript": "~5.8.2", | ||
| "@rushstack/lookup-by-path": "workspace:*" | ||
| }, | ||
| "devDependencies": { | ||
| "@rushstack/heft": "workspace:*", | ||
| "eslint": "~9.25.1", | ||
| "local-node-rig": "workspace:*" | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,123 @@ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT license. | ||
| // See LICENSE in the project root for license information. | ||
|
|
||
| import { CommandLineParser } from '@rushstack/ts-command-line/lib/providers/CommandLineParser'; | ||
| import type { | ||
| CommandLineFlagParameter, | ||
| IRequiredCommandLineStringParameter, | ||
| IRequiredCommandLineChoiceParameter, | ||
| IRequiredCommandLineStringListParameter | ||
| } from '@rushstack/ts-command-line/lib/index'; | ||
| import type { ConsoleTerminalProvider } from '@rushstack/terminal/lib/ConsoleTerminalProvider'; | ||
| import type { ITerminal } from '@rushstack/terminal/lib/ITerminal'; | ||
|
|
||
| import type { IZipSyncMode, ZipSyncOptionCompression } from './zipSyncUtils'; | ||
| import { pack, unpack } from './index'; | ||
|
|
||
| export class ZipSyncCommandLineParser extends CommandLineParser { | ||
| private readonly _debugParameter: CommandLineFlagParameter; | ||
| private readonly _verboseParameter: CommandLineFlagParameter; | ||
| private readonly _modeParameter: IRequiredCommandLineChoiceParameter<IZipSyncMode>; | ||
| private readonly _archivePathParameter: IRequiredCommandLineStringParameter; | ||
| private readonly _baseDirParameter: IRequiredCommandLineStringParameter; | ||
| private readonly _targetDirectoriesParameter: IRequiredCommandLineStringListParameter; | ||
| private readonly _compressionParameter: IRequiredCommandLineChoiceParameter<ZipSyncOptionCompression>; | ||
| private readonly _terminal: ITerminal; | ||
| private readonly _terminalProvider: ConsoleTerminalProvider; | ||
|
|
||
| public constructor(terminalProvider: ConsoleTerminalProvider, terminal: ITerminal) { | ||
| super({ | ||
| toolFilename: 'zipsync', | ||
| toolDescription: '' | ||
| }); | ||
|
|
||
| this._terminal = terminal; | ||
| this._terminalProvider = terminalProvider; | ||
|
|
||
| this._debugParameter = this.defineFlagParameter({ | ||
| parameterLongName: '--debug', | ||
| parameterShortName: '-d', | ||
| description: 'Show the full call stack if an error occurs while executing the tool' | ||
| }); | ||
|
|
||
| this._verboseParameter = this.defineFlagParameter({ | ||
| parameterLongName: '--verbose', | ||
| parameterShortName: '-v', | ||
| description: 'Show verbose output' | ||
| }); | ||
|
|
||
| this._modeParameter = this.defineChoiceParameter<IZipSyncMode>({ | ||
| parameterLongName: '--mode', | ||
| parameterShortName: '-m', | ||
| description: | ||
| 'The mode of operation: "pack" to create a zip archive, or "unpack" to extract files from a zip archive', | ||
| alternatives: ['pack', 'unpack'], | ||
| required: true | ||
| }); | ||
|
|
||
| this._archivePathParameter = this.defineStringParameter({ | ||
| parameterLongName: '--archive-path', | ||
| parameterShortName: '-a', | ||
| description: 'Zip file path', | ||
| argumentName: 'ARCHIVE_PATH', | ||
| required: true | ||
| }); | ||
|
|
||
| this._targetDirectoriesParameter = this.defineStringListParameter({ | ||
| parameterLongName: '--target-directory', | ||
| parameterShortName: '-t', | ||
| description: 'Target directories to pack or unpack', | ||
| argumentName: 'TARGET_DIRECTORIES', | ||
| required: true | ||
| }); | ||
|
|
||
| this._baseDirParameter = this.defineStringParameter({ | ||
| parameterLongName: '--base-dir', | ||
| parameterShortName: '-b', | ||
| description: 'Base directory for relative paths within the archive', | ||
| argumentName: 'BASE_DIR', | ||
| required: true | ||
| }); | ||
|
|
||
| this._compressionParameter = this.defineChoiceParameter<ZipSyncOptionCompression>({ | ||
| parameterLongName: '--compression', | ||
| parameterShortName: '-z', | ||
| description: | ||
| 'Compression strategy when packing. "deflate" and "zstd" attempts compression for every file (keeps only if smaller); "auto" first skips likely-compressed types before attempting "deflate" compression; "store" disables compression.', | ||
| alternatives: ['store', 'deflate', 'zstd', 'auto'], | ||
| required: true | ||
| }); | ||
| } | ||
|
|
||
| protected override async onExecuteAsync(): Promise<void> { | ||
| if (this._debugParameter.value) { | ||
| // eslint-disable-next-line no-debugger | ||
| debugger; | ||
| this._terminalProvider.debugEnabled = true; | ||
| this._terminalProvider.verboseEnabled = true; | ||
| } | ||
| if (this._verboseParameter.value) { | ||
| this._terminalProvider.verboseEnabled = true; | ||
| } | ||
| try { | ||
| if (this._modeParameter.value === 'pack') { | ||
| pack({ | ||
| terminal: this._terminal, | ||
| archivePath: this._archivePathParameter.value, | ||
| targetDirectories: this._targetDirectoriesParameter.values, | ||
| baseDir: this._baseDirParameter.value, | ||
| compression: this._compressionParameter.value | ||
| }); | ||
| } else if (this._modeParameter.value === 'unpack') { | ||
| unpack({ | ||
| terminal: this._terminal, | ||
| archivePath: this._archivePathParameter.value, | ||
| targetDirectories: this._targetDirectoriesParameter.values, | ||
| baseDir: this._baseDirParameter.value | ||
| }); | ||
| } | ||
| } catch (error) { | ||
| this._terminal.writeErrorLine('\n' + error.stack); | ||
| } | ||
| } | ||
| } | ||
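Going by the parameters defined in the parser above, a command line for the tool could be assembled as in the sketch below. The `ZipsyncArgs` interface and the `buildZipsyncArgs` helper are hypothetical and exist only for illustration; only the flag names and the allowed choices come from the parser. The sketch also assumes that a string-list parameter is supplied by repeating its flag once per value.

```typescript
type ZipsyncMode = 'pack' | 'unpack';
type ZipsyncCompression = 'store' | 'deflate' | 'zstd' | 'auto';

// Hypothetical options object mirroring the parser's required parameters.
interface ZipsyncArgs {
  mode: ZipsyncMode;
  archivePath: string;
  targetDirectories: string[];
  baseDir: string;
  compression: ZipsyncCompression;
}

// Hypothetical helper: turns the options into an argv array for the CLI.
function buildZipsyncArgs(args: ZipsyncArgs): string[] {
  const argv: string[] = ['--mode', args.mode, '--archive-path', args.archivePath];
  // Assumption: the list parameter is passed by repeating the flag per directory.
  for (const dir of args.targetDirectories) {
    argv.push('--target-directory', dir);
  }
  argv.push('--base-dir', args.baseDir, '--compression', args.compression);
  return argv;
}

const packArgs: string[] = buildZipsyncArgs({
  mode: 'pack',
  archivePath: 'cache.zip',
  targetDirectories: ['lib', 'dist'],
  baseDir: '.',
  compression: 'auto'
});

// Prints: zipsync --mode pack --archive-path cache.zip --target-directory lib --target-directory dist --base-dir . --compression auto
console.log(['zipsync', ...packArgs].join(' '));
```

Note that the parser marks `--compression` as required for both modes, even though `onExecuteAsync` only forwards it to `pack`.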
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| // Jest Snapshot v1, https://goo.gl/fbAQLP | ||
|
|
||
| exports[`CLI Tool Tests should display help for "zipsync --help" 1`] = ` | ||
| " | ||
| zipsync 0.0.0 - https://rushstack.io | ||
|
|
||
| usage: zipsync [-h] [-d] [-v] -m {pack,unpack} -a ARCHIVE_PATH -t | ||
| TARGET_DIRECTORIES -b BASE_DIR -z {store,deflate,zstd,auto} | ||
|
|
||
|
|
||
| Optional arguments: | ||
| -h, --help Show this help message and exit. | ||
| -d, --debug Show the full call stack if an error occurs while | ||
| executing the tool | ||
| -v, --verbose Show verbose output | ||
| -m {pack,unpack}, --mode {pack,unpack} | ||
| The mode of operation: \\"pack\\" to create a zip archive, | ||
| or \\"unpack\\" to extract files from a zip archive | ||
| -a ARCHIVE_PATH, --archive-path ARCHIVE_PATH | ||
| Zip file path | ||
| -t TARGET_DIRECTORIES, --target-directory TARGET_DIRECTORIES | ||
| Target directories to pack or unpack | ||
| -b BASE_DIR, --base-dir BASE_DIR | ||
| Base directory for relative paths within the archive | ||
| -z {store,deflate,zstd,auto}, --compression {store,deflate,zstd,auto} | ||
| Compression strategy when packing. \\"deflate\\" and | ||
| \\"zstd\\" attempts compression for every file (keeps | ||
| only if smaller); \\"auto\\" first skips | ||
| likely-compressed types before attempting \\"deflate\\" | ||
| compression; \\"store\\" disables compression. | ||
|
|
||
| [1mFor detailed help about a specific command, use: zipsync <command> -h[22m | ||
| " | ||
| `; |