Commit 665c353
Parallelize file uploads in fs cp command. (#4132)
## What changes are proposed in this pull request?
This PR improves the performance of the `databricks fs cp` command when
copying directories by parallelizing file uploads. The command uses 8
concurrent workers by default, but the number can be changed with the
`--concurrency` flag.
Implementation details:
- **No ordering guarantee:** Files are now copied in parallel with no
guaranteed order (previously sequential).
- **Fail-fast on errors:** If any file copy fails, the context is
cancelled and remaining operations are stopped (first error is
returned).
- **Retry responsibility:** The implementation does not retry failed
operations; this remains the responsibility of the underlying `Filer`
implementation as before.
**Why `--concurrency`?** No strong preference here; there does not seem
to be an existing pattern in the CLI for controlling concurrency in other
places. This is the flag name used in most Go tools, but I'm happy to use
something else.
## How is this tested?
Added acceptance tests to exercise most code paths, plus unit tests to
validate that context cancellation and propagation work properly.
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent 2e6dca4 commit 665c353
File tree
26 files changed, +530 -39 lines changed
- acceptance/cmd/fs/cp
- dir-to-dir
- localdir
- file-to-dir
- file-to-file
- input-validation
- cmd/fs
- libs/testserver