
Commit f36486f

Preserve empty lines in config-remote-sync (#4514)
## Changes
A YAML round-trip strips empty lines when the diff is applied to the file. To fix that, we mark empty lines with a marker comment and then restore them after processing.

## Why
Empty-line removals cause unexpected diffs.

## Tests
Added more acceptance tests and a unit test for the preservation logic to cover edge cases. Updated output in other config-remote-sync tests is expected, because empty lines were previously removed.
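The mark-and-restore idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from this commit: the marker text `# __EMPTY_LINE__` and the names `mark_empty_lines`, `restore_empty_lines`, and `lossy_round_trip` are hypothetical, and `lossy_round_trip` is a stand-in that simply drops blank lines the way a YAML round-trip might. The sketch also does not treat blank lines inside block scalars specially, which a real implementation would have to consider.

```python
# Hypothetical marker; the marker text actually used by the commit is not shown here.
MARKER = "# __EMPTY_LINE__"


def mark_empty_lines(text: str) -> str:
    """Replace every blank line with a marker comment, keeping its line ending."""
    out = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        if body.strip() == "":
            # Keep only the original newline characters after the marker.
            out.append(MARKER + line[len(body):])
        else:
            out.append(line)
    return "".join(out)


def restore_empty_lines(text: str) -> str:
    """Turn marker comments back into empty lines.

    The comparison ignores surrounding whitespace, in case the round-trip
    re-indented the comment.
    """
    out = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        if body.strip() == MARKER:
            out.append(line[len(body):])
        else:
            out.append(line)
    return "".join(out)


def lossy_round_trip(text: str) -> str:
    """Stand-in for a YAML round-trip that drops blank lines but keeps comments."""
    return "".join(l for l in text.splitlines(keepends=True) if l.strip())


if __name__ == "__main__":
    src = "bundle:\n  name: demo\n\nresources:\n  jobs: {}\n"
    processed = lossy_round_trip(mark_empty_lines(src))
    print(restore_empty_lines(processed) == src)
```

Because the blank line travels through the round-trip as a comment, it is restored afterwards. Whitespace-only lines come back as fully empty lines in this sketch.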
1 parent db40140 commit f36486f

File tree

14 files changed: +531 -134 lines changed

acceptance/bundle/config-remote-sync/cli_defaults/output.txt

Lines changed: 4 additions & 21 deletions
@@ -29,39 +29,22 @@ Resource: resources.pipelines.pipeline1
 >>> diff.py databricks.yml.backup databricks.yml
 --- databricks.yml.backup
 +++ databricks.yml
-@@ -1,9 +1,7 @@
-bundle:
-name: test-bundle-[UNIQUE_NAME]
--
-targets:
-default:
-mode: development
--
-resources:
-jobs:
-@@ -13,4 +11,5 @@
+@@ -13,4 +13,5 @@
 interval: 1
 unit: DAYS
 + pause_status: UNPAUSED
 tasks:
 - task_key: main
-@@ -21,5 +20,8 @@
-node_type_id: [NODE_TYPE_ID]
+@@ -22,4 +23,8 @@
 num_workers: 1
--
+
 + tags:
 + dev: default_tag_changed
 + max_concurrent_runs: 5
 + name: Custom Job Name
 job2:
 tasks:
-@@ -31,5 +33,4 @@
-node_type_id: [NODE_TYPE_ID]
-num_workers: 1
--
-pipelines:
-pipeline1:
-@@ -38,2 +39,4 @@
+@@ -38,2 +43,4 @@
 - notebook:
 path: /Users/{{workspace_user_name}}/notebook
 + channel: PREVIEW

acceptance/bundle/config-remote-sync/config_edits/output.txt

Lines changed: 2 additions & 14 deletions
@@ -34,26 +34,14 @@ Resource: resources.jobs.my_job
 >>> diff.py databricks.yml.backup databricks.yml
 --- databricks.yml.backup
 +++ databricks.yml
-@@ -1,5 +1,4 @@
-bundle:
-name: test-bundle-[UNIQUE_NAME]
--
-resources:
-jobs:
-@@ -13,5 +12,4 @@
-node_type_id: [NODE_TYPE_ID]
-num_workers: 1
--
-targets:
-default:
-@@ -24,5 +22,5 @@
+@@ -24,5 +24,5 @@
 - success@example.com
 on_failure:
 - - config-failure@example.com
 + - remote-failure@example.com
 parameters:
 - name: catalog
-@@ -35,7 +33,6 @@
+@@ -35,7 +35,6 @@
 unit: DAYS
 tags:
 - env: config-production

acceptance/bundle/config-remote-sync/formatting_preserved/databricks.yml.tmpl

Lines changed: 12 additions & 1 deletion
@@ -2,18 +2,29 @@
 bundle:
   name: test-bundle-$UNIQUE_NAME
 
-# Resources section with extra spacing
+
+# Resources section with extra spacing (consecutive blank lines above)
 resources:
   jobs:
     my_job:
       # Comment about max concurrent runs
       max_concurrent_runs: 1
 
+      description: |
+        Main ETL job for data processing.
+
+        Runs daily at 2am UTC.
+
       # Task configuration
       tasks:
         - task_key: main
+
+          description: >-
+            Main processing task
+            that runs the notebook.
           notebook_task:
             notebook_path: /Users/{{workspace_user_name}}/notebook
+
           new_cluster:
             spark_version: $DEFAULT_SPARK_VERSION
             node_type_id: $NODE_TYPE_ID

acceptance/bundle/config-remote-sync/formatting_preserved/output.txt

Lines changed: 25 additions & 21 deletions
@@ -3,12 +3,14 @@ Deploying resources...
 Updating deployment state...
 Deployment complete!
 
-=== Modify max_concurrent_runs from 1 to 5
+=== Modify max_concurrent_runs, description, and add timeout
 === Detect and save changes
 Detected changes in 1 resource(s):
 
 Resource: resources.jobs.my_job
+description: replace
 max_concurrent_runs: replace
+timeout_seconds: add
 
 
 
@@ -17,31 +19,33 @@ Resource: resources.jobs.my_job
 >>> diff.py databricks.yml.backup databricks.yml
 --- databricks.yml.backup
 +++ databricks.yml
-@@ -2,5 +2,4 @@
-bundle:
-name: test-bundle-[UNIQUE_NAME]
--
-# Resources section with extra spacing
-resources:
-@@ -8,6 +7,5 @@
+@@ -9,10 +9,10 @@
 my_job:
 # Comment about max concurrent runs
 - max_concurrent_runs: 1
--
 + max_concurrent_runs: 5
+
+- description: |
+- Main ETL job for data processing.
++ description: |-
++ Updated ETL job.
+
+- Runs daily at 2am UTC.
++ Runs hourly.
+
 # Task configuration
-tasks:
-@@ -19,10 +17,8 @@
-node_type_id: [NODE_TYPE_ID]
-num_workers: 1 # inline comment about workers
--
-# Tags for categorization
-tags:
-env: dev # environment tag
-team: data-eng
--
-# Flow-style formatting (should be preserved)
-parameters:
+@@ -21,6 +21,5 @@
+
+description: >-
+- Main processing task
+- that runs the notebook.
++ Main processing task that runs the notebook.
+notebook_task:
+notebook_path: /Users/{{workspace_user_name}}/notebook
+@@ -40,2 +39,3 @@
+- {name: catalog, default: main}
+- {name: schema, default: dev}
++ timeout_seconds: 3600
 
 >>> [CLI] bundle destroy --auto-approve
 The following resources will be deleted:

acceptance/bundle/config-remote-sync/formatting_preserved/script

Lines changed: 3 additions & 1 deletion
@@ -11,9 +11,11 @@ touch dummy.whl
 $CLI bundle deploy
 job_id="$(read_id.py my_job)"
 
-title "Modify max_concurrent_runs from 1 to 5"
+title "Modify max_concurrent_runs, description, and add timeout"
 edit_resource.py jobs $job_id <<EOF
 r["max_concurrent_runs"] = 5
+r["description"] = "Updated ETL job.\n\nRuns hourly."
+r["timeout_seconds"] = 3600
 EOF
 
 title "Detect and save changes"

acceptance/bundle/config-remote-sync/job_fields/output.txt

Lines changed: 3 additions & 10 deletions
@@ -25,13 +25,7 @@ Resource: resources.jobs.my_job
 >>> diff.py databricks.yml.backup databricks.yml
 --- databricks.yml.backup
 +++ databricks.yml
-@@ -1,5 +1,4 @@
-bundle:
-name: test-bundle-[UNIQUE_NAME]
--
-resources:
-jobs:
-@@ -8,13 +7,19 @@
+@@ -8,13 +8,19 @@
 on_success:
 - success@example.com
 + no_alert_for_skipped_runs: true
@@ -58,14 +52,13 @@ Resource: resources.jobs.my_job
 + - samples.nyctaxi.trips
 environments:
 - environment_key: default
-@@ -31,5 +36,6 @@
+@@ -31,4 +37,6 @@
 node_type_id: [NODE_TYPE_ID]
 num_workers: 1
--
 + tags:
 + team: data
+
 targets:
-default:
 
 >>> [CLI] bundle destroy --auto-approve
 The following resources will be deleted:

acceptance/bundle/config-remote-sync/job_multiple_tasks/output.txt

Lines changed: 5 additions & 17 deletions
@@ -21,13 +21,7 @@ Resource: resources.jobs.my_job
 >>> diff.py databricks.yml.backup databricks.yml
 --- databricks.yml.backup
 +++ databricks.yml
-@@ -1,5 +1,4 @@
-bundle:
-name: test-bundle-[UNIQUE_NAME]
--
-resources:
-jobs:
-@@ -13,13 +12,11 @@
+@@ -13,13 +13,11 @@
 node_type_id: [NODE_TYPE_ID]
 num_workers: 1
 - - task_key: d_task
@@ -47,7 +41,7 @@ Resource: resources.jobs.my_job
 + task_key: e_task
 - task_key: c_task
 notebook_task:
-@@ -28,7 +25,8 @@
+@@ -28,7 +26,8 @@
 spark_version: 13.3.x-snapshot-scala2.12
 node_type_id: [NODE_TYPE_ID]
 - num_workers: 2
@@ -58,12 +52,6 @@ Resource: resources.jobs.my_job
 + timeout_seconds: 3600
 - task_key: a_task
 notebook_task:
-@@ -40,5 +38,4 @@
-depends_on:
-- task_key: c_task
--
-rename_task_job:
-tasks:
 Uploading bundle files to /Workspace/Users/[USERNAME]/.bundle/test-bundle-[UNIQUE_NAME]/default/files...
 Deploying resources...
 Updating deployment state...
@@ -88,7 +76,7 @@ Resource: resources.jobs.rename_task_job
 >>> diff.py databricks.yml.backup2 databricks.yml
 --- databricks.yml.backup2
 +++ databricks.yml
-@@ -40,14 +40,14 @@
+@@ -42,14 +42,14 @@
 rename_task_job:
 tasks:
 - - task_key: b_task
@@ -109,14 +97,14 @@ Resource: resources.jobs.rename_task_job
 + - task_key: b_task_renamed
 notebook_task:
 notebook_path: /Users/{{workspace_user_name}}/d_task
-@@ -58,5 +58,5 @@
+@@ -60,5 +60,5 @@
 - task_key: c_task
 depends_on:
 - - task_key: b_task
 + - task_key: b_task_renamed
 notebook_task:
 notebook_path: /Users/{{workspace_user_name}}/c_task
-@@ -67,7 +67,14 @@
+@@ -69,7 +69,14 @@
 - task_key: a_task
 notebook_task:
 - notebook_path: /Users/{{workspace_user_name}}/a_task

acceptance/bundle/config-remote-sync/job_pipeline_task/output.txt

Lines changed: 2 additions & 11 deletions
@@ -21,23 +21,14 @@ Resource: resources.pipelines.my_pipeline
 >>> diff.py databricks.yml.backup databricks.yml
 --- databricks.yml.backup
 +++ databricks.yml
-@@ -1,14 +1,12 @@
-bundle:
-name: test-bundle-[UNIQUE_NAME]
--
-resources:
-pipelines:
+@@ -6,5 +6,5 @@
 my_pipeline:
 name: test-pipeline-[UNIQUE_NAME]
 - development: false
 + development: true
 libraries:
 - notebook:
-path: /Users/{{workspace_user_name}}/notebook
--
-jobs:
-my_job:
-@@ -17,3 +15,3 @@
+@@ -17,3 +17,3 @@
 pipeline_task:
 pipeline_id: ${resources.pipelines.my_pipeline.id}
 - full_refresh: false

acceptance/bundle/config-remote-sync/multiple_resources/output.txt

Lines changed: 4 additions & 9 deletions
@@ -22,29 +22,24 @@ Resource: resources.jobs.job_two
 >>> diff.py databricks.yml.backup databricks.yml
 --- databricks.yml.backup
 +++ databricks.yml
-@@ -1,9 +1,8 @@
-bundle:
-name: test-bundle-[UNIQUE_NAME]
--
-resources:
+@@ -5,5 +5,5 @@
 jobs:
 job_one:
 - max_concurrent_runs: 1
 + max_concurrent_runs: 5
 tasks:
 - task_key: main
-@@ -14,7 +13,8 @@
-node_type_id: [NODE_TYPE_ID]
+@@ -15,6 +15,8 @@
 num_workers: 1
--
+
 + tags:
 + team: data
 job_two:
 - max_concurrent_runs: 2
 + max_concurrent_runs: 10
 tasks:
 - task_key: main
-@@ -25,2 +25,4 @@
+@@ -25,2 +27,4 @@
 node_type_id: [NODE_TYPE_ID]
 num_workers: 1
 + tags:

acceptance/bundle/config-remote-sync/output_json/output.txt

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ Deployment complete!
 {
 "path": "[TEST_TMP_DIR]/databricks.yml",
 "originalContent": "bundle:\n name: test-bundle-[UNIQUE_NAME]\n\nresources:\n jobs:\n test_job:\n max_concurrent_runs: 1\n tasks:\n - task_key: main\n notebook_task:\n notebook_path: /Users/{{workspace_user_name}}/notebook\n new_cluster:\n spark_version: 13.3.x-snapshot-scala2.12\n node_type_id: [NODE_TYPE_ID]\n num_workers: 1\n",
-"modifiedContent": "bundle:\n name: test-bundle-[UNIQUE_NAME]\nresources:\n jobs:\n test_job:\n max_concurrent_runs: 3\n tasks:\n - task_key: main\n notebook_task:\n notebook_path: /Users/{{workspace_user_name}}/notebook\n new_cluster:\n spark_version: 13.3.x-snapshot-scala2.12\n node_type_id: [NODE_TYPE_ID]\n num_workers: 1\n tags:\n env: test\n"
+"modifiedContent": "bundle:\n name: test-bundle-[UNIQUE_NAME]\n\nresources:\n jobs:\n test_job:\n max_concurrent_runs: 3\n tasks:\n - task_key: main\n notebook_task:\n notebook_path: /Users/{{workspace_user_name}}/notebook\n new_cluster:\n spark_version: 13.3.x-snapshot-scala2.12\n node_type_id: [NODE_TYPE_ID]\n num_workers: 1\n tags:\n env: test\n"
 }
 ],
 "changes": {
"changes": {
