Commit 42bd421

Add SPOG bundle support: account_id, host URL parsing, workspace_id disambiguation (#4825)
## Why

SPOG (Single Pane of Glass) hosts serve multiple workspaces from a single URL (e.g. `api.databricks.com` instead of separate per-workspace hosts). Bundles need to support this model so users can:

1. Set `account_id` in their bundle workspace config.
2. Paste SPOG URLs with query parameters (e.g. `https://host.databricks.com/?o=12345`) directly into `databricks.yml`.
3. Have profile resolution work when multiple profiles share the same SPOG host.

Without this, bundles on SPOG hosts fail with "multiple profiles matched" errors because the CLI can't distinguish between profiles that share the same host.

## Changes

**Before:** Bundles had no `account_id` field, no query parameter parsing from host URLs, and profile resolution only matched by host (erroring on duplicates with no way to disambiguate).

**Now:** Three capabilities added:

### 1. `account_id` field in workspace config

Added `AccountID` to the `Workspace` struct, passed through to the SDK config. Users can now write:

```yaml
workspace:
  host: https://api.databricks.com
  account_id: "abc123"
  workspace_id: "6051921418418893"
```

### 2. Host URL query parameter extraction

`Workspace.Client()` now calls `NormalizeHostURL()` before building the SDK config. This extracts `?o=` (workspace_id), `?a=` (account_id), and their long-form aliases from the host URL, then strips the query params.

**Why in `Client()` and not just in the mutator pipeline?** The bundle flow calls `WorkspaceClientE()` (which calls `Client()`) in `configureBundle` *before* `phases.Initialize()` runs. If extraction only happened in the Initialize mutator, workspace_id would be empty when the profile loader runs, and disambiguation would fail. The mutator is kept as a secondary pass to ensure the bundle config stays clean for any code that reads `b.Config.Workspace.Host` directly.

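The extraction step can be illustrated with a minimal, self-contained sketch. This is not the CLI's implementation: `hostParams`, `extractHostQueryParams`, `workspace`, and `normalizeHostURL` are hypothetical stand-ins for `auth.ExtractHostQueryParams` and `Workspace.NormalizeHostURL()`, and preferring the short alias when both `?o=` and `?workspace_id=` are present is an assumption made only for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostParams is a hypothetical stand-in for the value returned by
// auth.ExtractHostQueryParams; the real type may differ.
type hostParams struct {
	Host        string
	WorkspaceID string
	AccountID   string
}

// firstNonEmpty returns the first non-empty string, or "" if there is none.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// extractHostQueryParams reads ?o= / ?workspace_id= and ?a= / ?account_id=
// from a host URL and returns the host with the query string removed.
func extractHostQueryParams(host string) hostParams {
	u, err := url.Parse(host)
	if err != nil || u.Host == "" || u.RawQuery == "" {
		// Unparseable, or nothing to extract: leave the host untouched.
		return hostParams{Host: host}
	}
	q := u.Query()
	p := hostParams{
		WorkspaceID: firstNonEmpty(q.Get("o"), q.Get("workspace_id")),
		AccountID:   firstNonEmpty(q.Get("a"), q.Get("account_id")),
	}
	u.RawQuery = ""
	u.Fragment = ""
	p.Host = strings.TrimSuffix(u.String(), "/")
	return p
}

// workspace is a simplified, hypothetical version of the bundle Workspace struct.
type workspace struct {
	Host        string
	WorkspaceID string
	AccountID   string
}

// normalizeHostURL strips query params from Host and fills only the fields
// that are still empty, so explicitly configured values are preserved.
func (w *workspace) normalizeHostURL() {
	p := extractHostQueryParams(w.Host)
	w.Host = p.Host
	if w.WorkspaceID == "" {
		w.WorkspaceID = p.WorkspaceID
	}
	if w.AccountID == "" {
		w.AccountID = p.AccountID
	}
}

func main() {
	w := workspace{
		Host:        "https://spog.databricks.com/?o=999&a=abc123",
		WorkspaceID: "explicit",
	}
	w.normalizeHostURL()
	fmt.Printf("%+v\n", w)
}
```

In this sketch the explicit `WorkspaceID` survives while `AccountID` is filled from the URL, which mirrors the precedence cases covered by `TestWorkspaceNormalizeHostURL`.
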
**Why explicit fields take precedence over URL params?** A user who writes both `workspace_id: "111"` and `host: https://x/?o=222` clearly intended the explicit field. URL params are a convenience for copy-paste from browser URLs.

### 3. Profile resolution by workspace_id

When multiple profiles match the same host, the loader now uses `workspace_id` to disambiguate. The matching order is: host first, then workspace_id as a fallback.

**Why host first, not workspace_id first?** Most existing configs don't have workspace_id set yet. Running host matching first preserves existing behavior for the long tail of configs. workspace_id only kicks in when host matching is ambiguous. This was discussed with @pietern and @andrewnester, who preferred keeping the same detection logic with an additional disambiguation fallback over two separate code paths.

**Why fall back to the original error when workspace_id doesn't match?** If a user has two profiles for host X with workspace_ids 111 and 222, but the bundle specifies workspace_id 999, the "multiple profiles matched" error is more helpful than "no matching profiles" since it tells them which profiles exist.

### Other changes

- Added `workspace_id` and `account_id` to the auth interpolation warning in `NoInterpolationInAuthConfig`. These fields now participate in profile resolution at auth time, so `workspace_id: ${var.my_ws_id}` would be unresolved and cause silent failures.
- Updated bundle JSON schema and annotations for the new `account_id` field.

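The matching order described in section 3 (host first, `workspace_id` only as a tie-breaker, original ambiguity error as the fallback) can be sketched as follows. This is a simplified illustration, not the loader's code: `profile`, `resolveProfile`, `multipleProfilesError`, and `exampleProfiles` are hypothetical names, profiles are held in a slice rather than read from an INI file, and a host with no matching profile is surfaced as an error here for simplicity.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// profile is a hypothetical, simplified view of one .databrickscfg section.
type profile struct {
	Name        string
	Host        string
	WorkspaceID string
}

var errNoMatch = errors.New("no matching profiles")

// multipleProfilesError names every profile that is still ambiguous.
func multipleProfilesError(ps []profile) error {
	names := make([]string, len(ps))
	for i, p := range ps {
		names[i] = p.Name
	}
	return fmt.Errorf("multiple profiles matched: %s", strings.Join(names, ", "))
}

// resolveProfile matches by host first and consults workspaceID only when
// more than one profile shares that host.
func resolveProfile(profiles []profile, host, workspaceID string) (string, error) {
	var byHost []profile
	for _, p := range profiles {
		if p.Host == host {
			byHost = append(byHost, p)
		}
	}
	switch len(byHost) {
	case 0:
		return "", errNoMatch
	case 1:
		return byHost[0].Name, nil
	}

	ambiguous := multipleProfilesError(byHost)
	if workspaceID == "" {
		return "", ambiguous
	}

	var byID []profile
	for _, p := range byHost {
		if p.WorkspaceID == workspaceID {
			byID = append(byID, p)
		}
	}
	switch len(byID) {
	case 1:
		return byID[0].Name, nil
	case 0:
		// No profile has this workspace_id: report the host-level ambiguity,
		// which lists the profiles that do exist.
		return "", ambiguous
	default:
		return "", multipleProfilesError(byID)
	}
}

// exampleProfiles mirrors the SPOG fixtures added to the test config file.
var exampleProfiles = []profile{
	{Name: "spog-ws1", Host: "https://spog.databricks.com", WorkspaceID: "111"},
	{Name: "spog-ws2", Host: "https://spog.databricks.com", WorkspaceID: "222"},
	{Name: "spog-dup1", Host: "https://spog-dup.databricks.com", WorkspaceID: "333"},
	{Name: "spog-dup2", Host: "https://spog-dup.databricks.com", WorkspaceID: "333"},
}

func main() {
	for _, id := range []string{"222", "999", ""} {
		name, err := resolveProfile(exampleProfiles, "https://spog.databricks.com", id)
		fmt.Printf("workspace_id=%q profile=%q err=%v\n", id, name, err)
	}
}
```

Because the `workspace_id` filter only runs on profiles that already matched the host, configs without a `workspace_id` take exactly the same path as before.
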
## Test plan

- [x] Unit tests for `NormalizeHostURL` mutator (8 cases: empty, no params, ?o=, ?a=, ?account_id=, ?workspace_id=, explicit precedence)
- [x] Unit tests for `Workspace.NormalizeHostURL()` method
- [x] Regression test: `TestWorkspaceClientNormalizesHostBeforeProfileResolution` verifies that `Client()` normalizes the host URL and populates workspace_id before profile resolution runs (the critical path)
- [x] Unit tests for workspace_id disambiguation in the profile loader (5 cases: disambiguate to first/second profile, same workspace_id error, no workspace_id fallback, no match fallback)
- [x] `make checks` passes
- [x] `make lintfull` passes (0 issues)
- [x] All existing tests continue to pass

1 parent 720e2af commit 42bd421

9 files changed, +260 -2 lines changed


bundle/config/validate/interpolation_in_auth_config.go

Lines changed: 4 additions & 0 deletions
@@ -42,6 +42,10 @@ func (f *noInterpolationInAuthConfig) Apply(ctx context.Context, b *bundle.Bundl
 		"azure_tenant_id",
 		"azure_environment",
 		"azure_login_app_id",
+
+		// Unified host specific attributes.
+		"account_id",
+		"workspace_id",
 	}

 	diags := diag.Diagnostics{}
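
As a rough illustration of why these two fields join the list: any auth field whose value still contains a `${...}` reference when authentication runs cannot take part in profile resolution. The sketch below is an assumption-laden simplification, not the validator's code. `interpolationRe`, `authFields`, and `interpolatedAuthFields` are hypothetical names, the regular expression is looser than the bundle's real interpolation syntax, and the workspace config is modeled as a plain map.

```go
package main

import (
	"fmt"
	"regexp"
)

// interpolationRe loosely matches ${...} references. The bundle's actual
// interpolation grammar is stricter; this pattern is only for illustration.
var interpolationRe = regexp.MustCompile(`\$\{[^}]+\}`)

// authFields is a small subset of the checked fields, taken from the hunk above.
var authFields = []string{"azure_tenant_id", "account_id", "workspace_id"}

// interpolatedAuthFields returns the auth fields whose values still contain
// an unresolved ${...} reference, in the order of authFields.
func interpolatedAuthFields(workspace map[string]string) []string {
	var found []string
	for _, field := range authFields {
		if interpolationRe.MatchString(workspace[field]) {
			found = append(found, field)
		}
	}
	return found
}

func main() {
	ws := map[string]string{
		"account_id":   "abc123",
		"workspace_id": "${var.my_ws_id}",
	}
	for _, field := range interpolatedAuthFields(ws) {
		fmt.Printf("warning: %s uses variable interpolation and will not be resolved for authentication\n", field)
	}
}
```
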

bundle/config/workspace.go

Lines changed: 24 additions & 0 deletions
@@ -4,6 +4,7 @@ import (
 	"os"
 	"path/filepath"

+	"github.com/databricks/cli/libs/auth"
 	"github.com/databricks/cli/libs/databrickscfg"
 	"github.com/databricks/databricks-sdk-go"
 	"github.com/databricks/databricks-sdk-go/config"
@@ -43,6 +44,7 @@ type Workspace struct {

 	// Unified host specific attributes.
 	ExperimentalIsUnifiedHost bool   `json:"experimental_is_unified_host,omitempty"`
+	AccountID                 string `json:"account_id,omitempty"`
 	WorkspaceID               string `json:"workspace_id,omitempty"`

 	// CurrentUser holds the current user.
@@ -124,6 +126,7 @@ func (w *Workspace) Config() *config.Config {

 		// Unified host
 		Experimental_IsUnifiedHost: w.ExperimentalIsUnifiedHost,
+		AccountID:                  w.AccountID,
 		WorkspaceID:                w.WorkspaceID,
 	}

@@ -137,7 +140,28 @@ func (w *Workspace) Config() *config.Config {
 	return cfg
 }

+// NormalizeHostURL extracts query parameters from the host URL and populates
+// the corresponding fields if not already set. This allows users to paste SPOG
+// URLs (e.g. https://host.databricks.com/?o=12345) directly into their bundle
+// config. Must be called before Config() so the extracted fields are included
+// in the SDK config used for profile resolution and authentication.
+func (w *Workspace) NormalizeHostURL() {
+	params := auth.ExtractHostQueryParams(w.Host)
+	w.Host = params.Host
+	if w.WorkspaceID == "" {
+		w.WorkspaceID = params.WorkspaceID
+	}
+	if w.AccountID == "" {
+		w.AccountID = params.AccountID
+	}
+}
+
 func (w *Workspace) Client() (*databricks.WorkspaceClient, error) {
+	// Extract query parameters (?o=, ?a=) from the host URL before building
+	// the SDK config. This ensures workspace_id and account_id are available
+	// for profile resolution during EnsureResolved().
+	w.NormalizeHostURL()
+
 	cfg := w.Config()

 	// If only the host is configured, we try and unambiguously match it to

bundle/config/workspace_test.go

Lines changed: 81 additions & 0 deletions
@@ -73,6 +73,87 @@ func TestWorkspaceResolveProfileFromHost(t *testing.T) {
 	})
 }

+func TestWorkspaceNormalizeHostURL(t *testing.T) {
+	t.Run("extracts workspace_id from query param", func(t *testing.T) {
+		w := Workspace{
+			Host: "https://spog.databricks.com/?o=12345",
+		}
+		w.NormalizeHostURL()
+		assert.Equal(t, "https://spog.databricks.com", w.Host)
+		assert.Equal(t, "12345", w.WorkspaceID)
+	})
+
+	t.Run("extracts both workspace_id and account_id", func(t *testing.T) {
+		w := Workspace{
+			Host: "https://spog.databricks.com/?o=605&a=abc123",
+		}
+		w.NormalizeHostURL()
+		assert.Equal(t, "https://spog.databricks.com", w.Host)
+		assert.Equal(t, "605", w.WorkspaceID)
+		assert.Equal(t, "abc123", w.AccountID)
+	})
+
+	t.Run("explicit workspace_id takes precedence", func(t *testing.T) {
+		w := Workspace{
+			Host:        "https://spog.databricks.com/?o=999",
+			WorkspaceID: "explicit",
+		}
+		w.NormalizeHostURL()
+		assert.Equal(t, "https://spog.databricks.com", w.Host)
+		assert.Equal(t, "explicit", w.WorkspaceID)
+	})
+
+	t.Run("explicit account_id takes precedence", func(t *testing.T) {
+		w := Workspace{
+			Host:      "https://spog.databricks.com/?a=from-url",
+			AccountID: "explicit-account",
+		}
+		w.NormalizeHostURL()
+		assert.Equal(t, "https://spog.databricks.com", w.Host)
+		assert.Equal(t, "explicit-account", w.AccountID)
+	})
+
+	t.Run("no-op for host without query params", func(t *testing.T) {
+		w := Workspace{
+			Host: "https://normal.databricks.com",
+		}
+		w.NormalizeHostURL()
+		assert.Equal(t, "https://normal.databricks.com", w.Host)
+		assert.Empty(t, w.WorkspaceID)
+	})
+}
+
+func TestWorkspaceClientNormalizesHostBeforeProfileResolution(t *testing.T) {
+	// Regression test: Client() must normalize the host URL (strip ?o= and
+	// populate WorkspaceID) before building the SDK config and resolving
+	// profiles. This ensures workspace_id is available for disambiguation.
+	setupWorkspaceTest(t)
+
+	err := databrickscfg.SaveToProfile(t.Context(), &config.Config{
+		Profile:     "ws1",
+		Host:        "https://spog.databricks.com",
+		Token:       "token1",
+		WorkspaceID: "111",
+	})
+	require.NoError(t, err)
+
+	err = databrickscfg.SaveToProfile(t.Context(), &config.Config{
+		Profile:     "ws2",
+		Host:        "https://spog.databricks.com",
+		Token:       "token2",
+		WorkspaceID: "222",
+	})
+	require.NoError(t, err)
+
+	// Host with ?o= should be normalized and workspace_id used to disambiguate.
+	w := Workspace{
+		Host: "https://spog.databricks.com/?o=222",
+	}
+	client, err := w.Client()
+	require.NoError(t, err)
+	assert.Equal(t, "ws2", client.Config.Profile)
+}
+
 func TestWorkspaceVerifyProfileForHost(t *testing.T) {
 	// If both a workspace host and a profile are specified,
 	// verify that the host configured in the profile matches

bundle/internal/schema/annotations.yml

Lines changed: 3 additions & 0 deletions
@@ -415,6 +415,9 @@ github.com/databricks/cli/bundle/config.Target:
     "description": |-
       The Databricks workspace for the target.
 github.com/databricks/cli/bundle/config.Workspace:
+  "account_id":
+    "description": |-
+      The Databricks account ID.
   "artifact_path":
     "description": |-
       The artifact path to use within the workspace for both deployments and workflow runs

bundle/schema/jsonschema.json

Lines changed: 4 additions & 0 deletions
Some generated files are not rendered by default.

libs/databrickscfg/loader.go

Lines changed: 42 additions & 1 deletion
@@ -98,7 +98,21 @@ func (l profileFromHostLoader) Configure(cfg *config.Config) error {
 	if err == errNoMatchingProfiles {
 		return nil
 	}
-	if err, ok := err.(errMultipleProfiles); ok {
+
+	// If multiple profiles match the same host and we have a workspace_id,
+	// try to disambiguate by matching workspace_id.
+	if names, ok := AsMultipleProfiles(err); ok && cfg.WorkspaceID != "" {
+		originalErr := err
+		match, err = l.disambiguateByWorkspaceID(ctx, configFile, host, cfg.WorkspaceID, names)
+		if err == errNoMatchingProfiles {
+			// workspace_id didn't match any of the host-matching profiles.
+			// Fall back to the original ambiguity error.
+			log.Debugf(ctx, "workspace_id=%s did not match any profiles for host %s: %v", cfg.WorkspaceID, host, names)
+			err = originalErr
+		}
+	}
+
+	if _, ok := AsMultipleProfiles(err); ok {
 		return fmt.Errorf(
 			"%s: %w: please set DATABRICKS_CONFIG_PROFILE or provide --profile flag to specify one",
 			host, err)
@@ -120,6 +134,33 @@ func (l profileFromHostLoader) Configure(cfg *config.Config) error {
 	return nil
 }

+// disambiguateByWorkspaceID filters the profiles that matched a host by workspace_id.
+func (l profileFromHostLoader) disambiguateByWorkspaceID(
+	ctx context.Context,
+	configFile *config.File,
+	host string,
+	workspaceID string,
+	profileNames []string,
+) (*ini.Section, error) {
+	log.Debugf(ctx, "Multiple profiles matched host %s, disambiguating by workspace_id=%s", host, workspaceID)
+
+	nameSet := make(map[string]bool, len(profileNames))
+	for _, name := range profileNames {
+		nameSet[name] = true
+	}
+
+	return findMatchingProfile(configFile, func(s *ini.Section) bool {
+		if !nameSet[s.Name()] {
+			return false
+		}
+		key, err := s.GetKey("workspace_id")
+		if err != nil {
+			return false
+		}
+		return key.Value() == workspaceID
+	})
+}
+
 func (l profileFromHostLoader) isAnyAuthConfigured(cfg *config.Config) bool {
 	// If any of the auth-specific attributes are set, we can skip profile resolution.
 	for _, a := range config.ConfigAttributes {

libs/databrickscfg/loader_test.go

Lines changed: 79 additions & 0 deletions
@@ -164,3 +164,82 @@ func TestAsMultipleProfilesReturnsFalseForNil(t *testing.T) {
 	assert.False(t, ok)
 	assert.Nil(t, names)
 }
+
+func TestLoaderDisambiguatesByWorkspaceID(t *testing.T) {
+	cfg := config.Config{
+		Loaders: []config.Loader{
+			ResolveProfileFromHost,
+		},
+		ConfigFile:  "profile/testdata/databrickscfg",
+		Host:        "https://spog.databricks.com",
+		WorkspaceID: "111",
+	}
+
+	err := cfg.EnsureResolved()
+	require.NoError(t, err)
+	assert.Equal(t, "spog-ws1", cfg.Profile)
+	assert.Equal(t, "spog-ws1", cfg.Token)
+}
+
+func TestLoaderDisambiguatesByWorkspaceIDSecondProfile(t *testing.T) {
+	cfg := config.Config{
+		Loaders: []config.Loader{
+			ResolveProfileFromHost,
+		},
+		ConfigFile:  "profile/testdata/databrickscfg",
+		Host:        "https://spog.databricks.com",
+		WorkspaceID: "222",
+	}
+
+	err := cfg.EnsureResolved()
+	require.NoError(t, err)
+	assert.Equal(t, "spog-ws2", cfg.Profile)
+	assert.Equal(t, "spog-ws2", cfg.Token)
+}
+
+func TestLoaderErrorsOnMultipleMatchesWithSameWorkspaceID(t *testing.T) {
+	cfg := config.Config{
+		Loaders: []config.Loader{
+			ResolveProfileFromHost,
+		},
+		ConfigFile:  "profile/testdata/databrickscfg",
+		Host:        "https://spog-dup.databricks.com",
+		WorkspaceID: "333",
+	}
+
+	err := cfg.EnsureResolved()
+	require.Error(t, err)
+	assert.ErrorContains(t, err, "multiple profiles matched: spog-dup1, spog-dup2")
+}
+
+func TestLoaderErrorsOnMultipleMatchesWithoutWorkspaceID(t *testing.T) {
+	// Without workspace_id, multiple host matches still error as before.
+	cfg := config.Config{
+		Loaders: []config.Loader{
+			ResolveProfileFromHost,
+		},
+		ConfigFile: "profile/testdata/databrickscfg",
+		Host:       "https://spog.databricks.com",
+	}
+
+	err := cfg.EnsureResolved()
+	require.Error(t, err)
+	assert.ErrorContains(t, err, "multiple profiles matched: spog-ws1, spog-ws2")
+}
+
+func TestLoaderNoWorkspaceIDMatchFallsThrough(t *testing.T) {
+	// workspace_id doesn't match any of the host-matching profiles.
+	// Falls back to the original host ambiguity error.
+	cfg := config.Config{
+		Loaders: []config.Loader{
+			ResolveProfileFromHost,
+		},
+		ConfigFile:  "profile/testdata/databrickscfg",
+		Host:        "https://spog.databricks.com",
+		WorkspaceID: "999",
+	}
+
+	err := cfg.EnsureResolved()
+	require.Error(t, err)
+	assert.ErrorContains(t, err, "multiple profiles matched: spog-ws1, spog-ws2")
+}

libs/databrickscfg/profile/file_test.go

Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ func TestLoadProfilesMatchWorkspace(t *testing.T) {
 	profiler := FileProfilerImpl{}
 	profiles, err := profiler.LoadProfiles(ctx, MatchWorkspaceProfiles)
 	require.NoError(t, err)
-	assert.Equal(t, []string{"DEFAULT", "query", "foo1", "foo2"}, profiles.Names())
+	assert.Equal(t, []string{"DEFAULT", "query", "foo1", "foo2", "spog-ws1", "spog-ws2", "spog-dup1", "spog-dup2"}, profiles.Names())
 }

 func TestLoadProfilesMatchAccount(t *testing.T) {

libs/databrickscfg/profile/testdata/databrickscfg

Lines changed: 22 additions & 0 deletions
@@ -22,3 +22,25 @@ account_id = abc
 [foo2]
 host = https://foo
 token = foo2
+
+# SPOG profiles sharing the same host but with different workspace_ids
+[spog-ws1]
+host = https://spog.databricks.com
+workspace_id = 111
+token = spog-ws1
+
+[spog-ws2]
+host = https://spog.databricks.com
+workspace_id = 222
+token = spog-ws2
+
+# SPOG profiles with same host and same workspace_id (ambiguous)
+[spog-dup1]
+host = https://spog-dup.databricks.com
+workspace_id = 333
+token = spog-dup1
+
+[spog-dup2]
+host = https://spog-dup.databricks.com
+workspace_id = 333
+token = spog-dup2
