
Add --force-refresh support for Databricks CLI token fetching #1377

Open
mihaimitrea-db wants to merge 1 commit into main from mihaimitrea-db/stack/cli-force-refresh

mihaimitrea-db (Contributor) commented Mar 31, 2026

🥞 Stacked PR

Use this link to review incremental changes.


Summary

Pass --force-refresh to the Databricks CLI auth token command so the SDK always receives a fresh token instead of a potentially stale one from the CLI's internal cache.

See: databricks/cli#4767

Why

The SDK manages its own token caching via Refreshable. When the SDK decides it needs a new token and shells out to databricks auth token, the CLI may return a cached token that is about to expire (or has already expired from the SDK's perspective). This creates unnecessary refresh failures and retry loops.

The CLI recently added a --force-refresh flag (databricks/cli#4767) that bypasses its internal cache. By using this flag, the SDK is guaranteed a freshly minted token every time it asks for one, eliminating the stale-token problem.

What changed

Interface changes

None. CliTokenSource and DatabricksCliTokenSource are not part of the public API surface.

Behavioral changes

  • The SDK now passes --force-refresh when invoking databricks auth token with a --profile argument. If the CLI is too old to support this flag, the SDK falls back to the plain --profile command (and then to --host if --profile is also unsupported).
  • --force-refresh is only paired with --profile, never with --host. When --host is the only selector (no profile configured), no --force-refresh is attempted.
  • A warning is logged when falling back: "Databricks CLI does not support --force-refresh. Please upgrade your CLI to the latest version."
  • The existing --profile → --host fallback warning is unchanged.
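
The selection rules above can be sketched as a small pure function. This is a simplified illustration under stated assumptions, not the SDK's code: `build_commands` and `CliCommands` are hypothetical names, the host-only arguments are reduced to a bare `--host` (the SDK's `_build_host_args` may add more), and the `ValueError` for a missing selector is invented for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CliCommands:
    """Candidate commands, in the order they would be attempted (hypothetical type)."""

    force_cmd: Optional[List[str]]     # --profile + --force-refresh, or None
    cmd: List[str]                     # primary command
    fallback_cmd: Optional[List[str]]  # --host fallback for CLIs without --profile


def build_commands(cli_path: str, profile: Optional[str], host: Optional[str]) -> CliCommands:
    if profile:
        args = ["auth", "token", "--profile", profile]
        # --force-refresh is only ever paired with --profile.
        force_cmd = [cli_path, *args, "--force-refresh"]
        fallback = [cli_path, "auth", "token", "--host", host] if host else None
        return CliCommands(force_cmd, [cli_path, *args], fallback)
    if not host:
        # Assumption for this sketch: one of the two selectors must be present.
        raise ValueError("either profile or host must be set")
    # Host-only: no --force-refresh command is built at all.
    return CliCommands(None, [cli_path, "auth", "token", "--host", host], None)
```

With a profile configured, `force_cmd` is populated and tried first; with only a host, it stays `None` and the plain `--host` command is the sole candidate.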

Internal changes

  • DatabricksCliTokenSource now holds a _force_cmd field (--profile + --force-refresh). This is None when no profile is configured.
  • DatabricksCliTokenSource.refresh() tries _force_cmd first. On "unknown flag: --force-refresh" or "unknown flag: --profile", it logs a warning and delegates to super().refresh(), which preserves the existing cmd → fallback_cmd chain. When _force_cmd is None, it delegates directly to super().refresh().
  • CliTokenSource is unchanged. The existing cmd/fallback_cmd/refresh() logic is preserved as-is.
  • Azure CLI callers are unchanged; AzureCliTokenSource does not use _force_cmd.
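
A minimal, self-contained sketch of that control flow follows. The function `refresh`, its injected `run` callable, and the inner `legacy_refresh` helper are hypothetical stand-ins; in particular, the exact condition under which the existing `CliTokenSource.refresh()` switches from `cmd` to `fallback_cmd` is an assumption here, not taken from the SDK.

```python
import logging
from typing import Callable, List, Optional

logger = logging.getLogger(__name__)

UNKNOWN_FORCE = "unknown flag: --force-refresh"
UNKNOWN_PROFILE = "unknown flag: --profile"


def refresh(
    run: Callable[[List[str]], str],
    force_cmd: Optional[List[str]],
    cmd: List[str],
    fallback_cmd: Optional[List[str]],
) -> str:
    """Try force_cmd, then fall back through cmd and fallback_cmd (sketch only)."""

    def legacy_refresh() -> str:
        # Stand-in for super().refresh(): cmd first, then fallback_cmd.
        # Assumption: the fallback is taken only when --profile is rejected.
        try:
            return run(cmd)
        except IOError as e:
            if fallback_cmd is not None and UNKNOWN_PROFILE in str(e):
                return run(fallback_cmd)
            raise

    if force_cmd is None:
        # Host-only configuration: --force-refresh is never attempted.
        return legacy_refresh()
    try:
        return run(force_cmd)
    except IOError as e:
        msg = str(e)
        if UNKNOWN_FORCE in msg or UNKNOWN_PROFILE in msg:
            logger.warning(
                "Databricks CLI does not support --force-refresh. "
                "Please upgrade your CLI to the latest version."
            )
            return legacy_refresh()
        # Any other failure (for example a real auth error) is not a downgrade signal.
        raise
```

Injecting `run` keeps the sketch free of subprocess calls, so the downgrade order can be exercised with a plain function that raises `IOError` for the flags an old CLI would reject.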

How is this tested?

Unit tests in tests/test_credentials_provider.py:

  • test_force_refresh_tried_first_with_profile: _force_cmd succeeds, no further commands are tried.
  • test_host_only_skips_force_refresh: when only host is configured, --force-refresh is not used.
  • test_force_refresh_fallback_when_unsupported: _force_cmd fails with "unknown flag: --force-refresh", verifies fallback to plain --profile command.
  • test_profile_fallback_when_unsupported: _force_cmd fails with "unknown flag: --profile" (very old CLI), verifies fallback cascades through --profile to --host.
  • test_two_step_downgrade_both_flags_unsupported: both --force-refresh and --profile fail, verifies the full chain: _force_cmd → cmd → fallback_cmd.
  • test_real_auth_error_does_not_trigger_fallback: non-flag error from _force_cmd is raised immediately without fallback.
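
The shape of two of these tests can be illustrated without the SDK by using `unittest.mock.Mock` with `side_effect`. `FakeTokenSource` below is a hypothetical, heavily reduced stand-in for `DatabricksCliTokenSource`; the real tests patch `databricks.sdk.credentials_provider._run_subprocess` instead of injecting a callable.

```python
from unittest.mock import Mock


class FakeTokenSource:
    """Hypothetical stand-in: only the force/plain command pair is modelled."""

    def __init__(self, run):
        self._run = run
        self._force_cmd = ["databricks", "auth", "token", "--profile", "p", "--force-refresh"]
        self._cmd = self._force_cmd[:-1]

    def refresh(self):
        try:
            return self._run(self._force_cmd)
        except IOError as e:
            if "unknown flag: --force-refresh" in str(e):
                return self._run(self._cmd)
            raise


def test_force_refresh_fallback_when_unsupported():
    # First call raises the unknown-flag error, second call returns a token.
    run = Mock(side_effect=[IOError("Error: unknown flag: --force-refresh"), "token"])
    assert FakeTokenSource(run).refresh() == "token"
    assert run.call_count == 2
    assert "--force-refresh" not in run.call_args[0][0]


def test_real_auth_error_does_not_trigger_fallback():
    run = Mock(side_effect=IOError("cache: databricks OAuth is not configured for this host"))
    try:
        FakeTokenSource(run).refresh()
    except IOError as e:
        assert "databricks OAuth is not configured" in str(e)
    else:
        raise AssertionError("expected IOError")
    # The error is surfaced after a single attempt; no downgrade is tried.
    assert run.call_count == 1
```

Asserting on `call_count` is what distinguishes a genuine downgrade (two calls) from an error that must propagate immediately (one call).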

@mihaimitrea-db
Contributor Author

Range-diff: main (1d03f6d -> 9e82d18)
NEXT_CHANGELOG.md
@@ -0,0 +1,10 @@
+diff --git a/NEXT_CHANGELOG.md b/NEXT_CHANGELOG.md
+--- a/NEXT_CHANGELOG.md
++++ b/NEXT_CHANGELOG.md
+ ## Release v0.102.0
+ 
+ ### New Features and Improvements
++* Pass `--force-refresh` to the Databricks CLI `auth token` command so the SDK always receives a freshly minted token instead of a potentially stale cached one. Falls back gracefully on older CLIs that do not support the flag.
+ 
+ ### Security
+ 
\ No newline at end of file
databricks/sdk/credentials_provider.py
@@ -63,8 +63,7 @@
 +            flag = self._get_unsupported_flag(e)
 +            if flag in self._KNOWN_CLI_FLAGS:
 +                logger.warning(
-+                    "Databricks CLI does not support %s. "
-+                    "Please upgrade your CLI to the latest version.",
++                    "Databricks CLI does not support %s. " "Please upgrade your CLI to the latest version.",
 +                    flag,
 +                )
 +                token = super().refresh()
pyrefly.toml
@@ -0,0 +1,15 @@
+diff --git a/pyrefly.toml b/pyrefly.toml
+new file mode 100644
+--- /dev/null
++++ b/pyrefly.toml
++project-includes = [
++   "**/*.py"
++]
++
++project-excludes = []
++
++search-path = []
++
++disable-search-path-heuristics = true
++ignore-missing-imports = ["*"]
++ignore-errors-in-generated-code = true
\ No newline at end of file
tests/__init__.pyc
@@ -0,0 +1,3 @@
+diff --git a/tests/__init__.pyc b/tests/__init__.pyc
+new file mode 100644
+Binary files /dev/null and b/tests/__init__.pyc differ
\ No newline at end of file
tests/test_credentials_provider.py
@@ -121,9 +121,7 @@
 +        ts = self._make_token_source()
 +
 +        mock_run = mocker.patch("databricks.sdk.credentials_provider._run_subprocess")
-+        mock_run.side_effect = self._make_process_error(
-+            "cache: databricks OAuth is not configured for this host"
-+        )
++        mock_run.side_effect = self._make_process_error("cache: databricks OAuth is not configured for this host")
 +
 +        with pytest.raises(IOError) as exc_info:
 +            ts.refresh()

Reproduce locally: git range-diff 34d6184..1d03f6d 34d6184..9e82d18 | Disable: git config gitstack.push-range-diff false

mihaimitrea-db force-pushed the mihaimitrea-db/stack/cli-force-refresh branch from 9e82d18 to 3195f5d on March 31, 2026 14:09
@mihaimitrea-db
Contributor Author

Range-diff: main (9e82d18 -> 3195f5d)
NEXT_CHANGELOG.md
@@ -1,9 +1,9 @@
 diff --git a/NEXT_CHANGELOG.md b/NEXT_CHANGELOG.md
 --- a/NEXT_CHANGELOG.md
 +++ b/NEXT_CHANGELOG.md
- ## Release v0.102.0
- 
  ### New Features and Improvements
+ * Add support for unified hosts. A single configuration profile can now be used for both account-level and workspace-level operations when the host supports it and both `account_id` and `workspace_id` are available. The `experimental_is_unified_host` flag has been removed; unified host detection is now automatic.
+ * Accept `DATABRICKS_OIDC_TOKEN_FILEPATH` environment variable for consistency with other Databricks SDKs (Go, CLI, Terraform). The previous `DATABRICKS_OIDC_TOKEN_FILE` is still supported as an alias.
 +* Pass `--force-refresh` to the Databricks CLI `auth token` command so the SDK always receives a freshly minted token instead of a potentially stale cached one. Falls back gracefully on older CLIs that do not support the flag.
  
  ### Security

Reproduce locally: git range-diff 34d6184..9e82d18 78914fa..3195f5d | Disable: git config gitstack.push-range-diff false

mihaimitrea-db changed the title from "Add best-effort --force-refresh support for databricks-cli auth" to "Add --force-refresh support for Databricks CLI token fetching" on Mar 31, 2026
mihaimitrea-db force-pushed the mihaimitrea-db/stack/cli-force-refresh branch from 3195f5d to 4115f37 on March 31, 2026 15:12
@mihaimitrea-db
Contributor Author

Range-diff: main (3195f5d -> 4115f37)
databricks/sdk/credentials_provider.py
@@ -1,34 +1,14 @@
 diff --git a/databricks/sdk/credentials_provider.py b/databricks/sdk/credentials_provider.py
 --- a/databricks/sdk/credentials_provider.py
 +++ b/databricks/sdk/credentials_provider.py
- import os
- import pathlib
- import platform
-+import re
- import subprocess
- import sys
- import threading
  
  
  class CliTokenSource(oauth.Refreshable):
-+    _UNKNOWN_FLAG_RE = re.compile(r"unknown flag: (--[a-z-]+)")
 +
      def __init__(
          self,
          cmd: List[str],
-             message = "\n".join(filter(None, [stdout, stderr]))
-             raise IOError(f"cannot get access token: {message}") from e
  
-+    @staticmethod
-+    def _get_unsupported_flag(error: IOError) -> Optional[str]:
-+        """Extract the flag name if the error is an 'unknown flag' CLI rejection."""
-+        match = CliTokenSource._UNKNOWN_FLAG_RE.search(str(error))
-+        return match.group(1) if match else None
-+
-     def refresh(self) -> oauth.Token:
-         try:
-             return self._exec_cli_command(self._cmd)
- 
          fallback_cmd = None
          if cfg.profile:
 -            # When profile is set, use --profile as the primary command.
@@ -45,11 +25,8 @@
          # get_scopes() defaults to ["all-apis"] when nothing is configured, which would
          # cause false-positive mismatches against every token that wasn't issued with
          # exactly ["all-apis"]. Only validate when scopes are explicitly set (either
-             fallback_cmd=fallback_cmd,
          )
  
-+    _KNOWN_CLI_FLAGS = {"--force-refresh", "--profile"}
-+
      def refresh(self) -> oauth.Token:
 -        # The scope validation lives in refresh() because this is the only method that
 -        # produces new tokens (see Refreshable._token assignments). By overriding here,
@@ -60,11 +37,11 @@
 +        try:
 +            token = self._exec_cli_command(self._force_cmd)
 +        except IOError as e:
-+            flag = self._get_unsupported_flag(e)
-+            if flag in self._KNOWN_CLI_FLAGS:
++            err_msg = str(e)
++            if "unknown flag: --force-refresh" in err_msg or "unknown flag: --profile" in err_msg:
 +                logger.warning(
-+                    "Databricks CLI does not support %s. " "Please upgrade your CLI to the latest version.",
-+                    flag,
++                    "Databricks CLI does not support --force-refresh. "
++                    "Please upgrade your CLI to the latest version."
 +                )
 +                token = super().refresh()
 +            else:
tests/test_credentials_provider.py
@@ -128,13 +128,6 @@
 +        assert "databricks OAuth is not configured" in str(exc_info.value)
 +        assert mock_run.call_count == 1
 +
-+    def test_get_unsupported_flag_extracts_flag(self):
-+        """The classifier correctly parses the flag name from CLI error output."""
-+        get = credentials_provider.CliTokenSource._get_unsupported_flag
-+        assert get(IOError("Error: unknown flag: --force-refresh")) == "--force-refresh"
-+        assert get(IOError("Error: unknown flag: --profile")) == "--profile"
-+        assert get(IOError("some other error")) is None
-+
 +
  # Tests for cloud-agnostic hosts and removed cloud checks
  class TestCloudAgnosticHosts:

Reproduce locally: git range-diff 78914fa..3195f5d 78914fa..4115f37 | Disable: git config gitstack.push-range-diff false

mihaimitrea-db self-assigned this on Apr 1, 2026
mihaimitrea-db force-pushed the mihaimitrea-db/stack/cli-force-refresh branch from 4115f37 to 469cb44 on April 1, 2026 08:44
@mihaimitrea-db
Contributor Author

Range-diff: main (4115f37 -> 469cb44)
pyrefly.toml
@@ -1,15 +0,0 @@
-diff --git a/pyrefly.toml b/pyrefly.toml
-new file mode 100644
---- /dev/null
-+++ b/pyrefly.toml
-+project-includes = [
-+   "**/*.py"
-+]
-+
-+project-excludes = []
-+
-+search-path = []
-+
-+disable-search-path-heuristics = true
-+ignore-missing-imports = ["*"]
-+ignore-errors-in-generated-code = true
\ No newline at end of file
tests/__init__.pyc
@@ -1,3 +0,0 @@
-diff --git a/tests/__init__.pyc b/tests/__init__.pyc
-new file mode 100644
-Binary files /dev/null and b/tests/__init__.pyc differ
\ No newline at end of file

Reproduce locally: git range-diff 78914fa..4115f37 78914fa..469cb44 | Disable: git config gitstack.push-range-diff false

mihaimitrea-db force-pushed the mihaimitrea-db/stack/cli-force-refresh branch from 469cb44 to 32c2cbd on April 1, 2026 08:57
@mihaimitrea-db
Contributor Author

Range-diff: main (469cb44 -> 32c2cbd)
databricks/sdk/credentials_provider.py
@@ -8,23 +8,19 @@
      def __init__(
          self,
          cmd: List[str],
+             cli_path = self.__class__._find_executable(cli_path)
  
          fallback_cmd = None
++        self._force_cmd = None
          if cfg.profile:
 -            # When profile is set, use --profile as the primary command.
 -            # The profile contains the full config (host, account_id, etc.).
              args = ["auth", "token", "--profile", cfg.profile]
 -            # Build a --host fallback for older CLIs that don't support --profile.
++            self._force_cmd = [cli_path, *args, "--force-refresh"]
              if cfg.host:
                  fallback_cmd = [cli_path, *self.__class__._build_host_args(cfg)]
          else:
-             args = self.__class__._build_host_args(cfg)
- 
-+        self._force_cmd = [cli_path, *args, "--force-refresh"]
-+
-         # get_scopes() defaults to ["all-apis"] when nothing is configured, which would
-         # cause false-positive mismatches against every token that wasn't issued with
-         # exactly ["all-apis"]. Only validate when scopes are explicitly set (either
          )
  
      def refresh(self) -> oauth.Token:
@@ -34,18 +30,21 @@
 -        # when the cached token expires. This catches cases where a user re-authenticates
 -        # mid-session with different scopes.
 -        token = super().refresh()
-+        try:
-+            token = self._exec_cli_command(self._force_cmd)
-+        except IOError as e:
-+            err_msg = str(e)
-+            if "unknown flag: --force-refresh" in err_msg or "unknown flag: --profile" in err_msg:
-+                logger.warning(
-+                    "Databricks CLI does not support --force-refresh. "
-+                    "Please upgrade your CLI to the latest version."
-+                )
-+                token = super().refresh()
-+            else:
-+                raise
++        if self._force_cmd is None:
++            token = super().refresh()
++        else:
++            try:
++                token = self._exec_cli_command(self._force_cmd)
++            except IOError as e:
++                err_msg = str(e)
++                if "unknown flag: --force-refresh" in err_msg or "unknown flag: --profile" in err_msg:
++                    logger.warning(
++                        "Databricks CLI does not support --force-refresh. "
++                        "Please upgrade your CLI to the latest version."
++                    )
++                    token = super().refresh()
++                else:
++                    raise
          if self._requested_scopes:
              self._validate_token_scopes(token)
          return token
\ No newline at end of file
pyrefly.toml
@@ -0,0 +1,15 @@
+diff --git a/pyrefly.toml b/pyrefly.toml
+new file mode 100644
+--- /dev/null
++++ b/pyrefly.toml
++project-includes = [
++   "**/*.py"
++]
++
++project-excludes = []
++
++search-path = []
++
++disable-search-path-heuristics = true
++ignore-missing-imports = ["*"]
++ignore-errors-in-generated-code = true
\ No newline at end of file
tests/__init__.pyc
@@ -0,0 +1,3 @@
+diff --git a/tests/__init__.pyc b/tests/__init__.pyc
+new file mode 100644
+Binary files /dev/null and b/tests/__init__.pyc differ
\ No newline at end of file
tests/test_credentials_provider.py
@@ -41,9 +41,9 @@
 +        expiry = (datetime.now() + timedelta(hours=1)).strftime("%Y-%m-%dT%H:%M:%S")
 +        return json.dumps({"access_token": access_token, "token_type": "Bearer", "expiry": expiry})
 +
-+    def test_force_refresh_always_tried_first(self, mocker):
-+        """refresh() always tries --force-refresh first."""
-+        ts = self._make_token_source()
++    def test_force_refresh_tried_first_with_profile(self, mocker):
++        """When profile is configured, refresh() tries --force-refresh first."""
++        ts = self._make_token_source(profile="my-profile")
 +
 +        mock_run = mocker.patch("databricks.sdk.credentials_provider._run_subprocess")
 +        mock_run.return_value = Mock(stdout=self._valid_response_json("refreshed").encode())
@@ -54,10 +54,26 @@
 +
 +        cmd = mock_run.call_args[0][0]
 +        assert "--force-refresh" in cmd
++        assert "--profile" in cmd
++
++    def test_host_only_skips_force_refresh(self, mocker):
++        """When only host is configured, --force-refresh is not used."""
++        ts = self._make_token_source()
++
++        mock_run = mocker.patch("databricks.sdk.credentials_provider._run_subprocess")
++        mock_run.return_value = Mock(stdout=self._valid_response_json("token").encode())
++
++        token = ts.refresh()
++        assert token.access_token == "token"
++        assert mock_run.call_count == 1
 +
++        cmd = mock_run.call_args[0][0]
++        assert "--force-refresh" not in cmd
++        assert "--host" in cmd
++
 +    def test_force_refresh_fallback_when_unsupported(self, mocker):
 +        """Old CLI without --force-refresh: falls back to cmd without --force-refresh."""
-+        ts = self._make_token_source()
++        ts = self._make_token_source(profile="my-profile")
 +
 +        mock_run = mocker.patch("databricks.sdk.credentials_provider._run_subprocess")
 +        mock_run.side_effect = [
tests/test_config.py
@@ -1,11 +0,0 @@
-diff --git a/tests/test_config.py b/tests/test_config.py
---- a/tests/test_config.py
-+++ b/tests/test_config.py
- 
- def test_config_deep_copy(monkeypatch, mocker, tmp_path):
-     mocker.patch(
--        "databricks.sdk.credentials_provider.CliTokenSource.refresh",
-+        "databricks.sdk.credentials_provider.CliTokenSource._exec_cli_command",
-         return_value=oauth.Token(
-             access_token="token",
-             token_type="Bearer",
\ No newline at end of file
tests/test_core.py
@@ -1,27 +0,0 @@
-diff --git a/tests/test_core.py b/tests/test_core.py
---- a/tests/test_core.py
-+++ b/tests/test_core.py
- 
- def test_databricks_cli_credential_provider_installed_new(config, monkeypatch, tmp_path, mocker):
-     get_mock = mocker.patch(
--        "databricks.sdk.credentials_provider.CliTokenSource.refresh",
-+        "databricks.sdk.credentials_provider.CliTokenSource._exec_cli_command",
-         return_value=Token(
-             access_token="token",
-             token_type="Bearer",
-     config, monkeypatch, tmp_path, mocker, token_claims, configured_scopes, auth_type, expect
- ):
-     mocker.patch(
--        "databricks.sdk.credentials_provider.CliTokenSource.refresh",
-+        "databricks.sdk.credentials_provider.CliTokenSource._exec_cli_command",
-         return_value=Token(access_token=_make_jwt(token_claims), token_type="Bearer", expiry=datetime(2023, 5, 22)),
-     )
-     write_large_dummy_executable(tmp_path)
- 
- def test_databricks_cli_scope_validation_error_message(config, monkeypatch, tmp_path, mocker):
-     mocker.patch(
--        "databricks.sdk.credentials_provider.CliTokenSource.refresh",
-+        "databricks.sdk.credentials_provider.CliTokenSource._exec_cli_command",
-         return_value=Token(
-             access_token=_make_jwt({"scope": "all-apis"}), token_type="Bearer", expiry=datetime(2023, 5, 22)
-         ),
\ No newline at end of file

Reproduce locally: git range-diff 78914fa..469cb44 78914fa..32c2cbd | Disable: git config gitstack.push-range-diff false

mihaimitrea-db force-pushed the mihaimitrea-db/stack/cli-force-refresh branch from 32c2cbd to 194256c on April 1, 2026 09:00
@mihaimitrea-db
Contributor Author

Range-diff: main (32c2cbd -> 194256c)
pyrefly.toml
@@ -1,15 +0,0 @@
-diff --git a/pyrefly.toml b/pyrefly.toml
-new file mode 100644
---- /dev/null
-+++ b/pyrefly.toml
-+project-includes = [
-+   "**/*.py"
-+]
-+
-+project-excludes = []
-+
-+search-path = []
-+
-+disable-search-path-heuristics = true
-+ignore-missing-imports = ["*"]
-+ignore-errors-in-generated-code = true
\ No newline at end of file
tests/__init__.pyc
@@ -1,3 +0,0 @@
-diff --git a/tests/__init__.pyc b/tests/__init__.pyc
-new file mode 100644
-Binary files /dev/null and b/tests/__init__.pyc differ
\ No newline at end of file

Reproduce locally: git range-diff 78914fa..32c2cbd 78914fa..194256c | Disable: git config gitstack.push-range-diff false

When the SDK's cached CLI token is stale, try `databricks auth token
--force-refresh` to get a freshly minted token from the IdP. If the
installed CLI is too old to recognise the flag, fall back to regular
`auth token` and remember the capability for future refreshes.

Centralise unknown-flag detection in CliTokenSource._exec_cli_command()
via UnsupportedCliFlagError so the same classifier is reused by both the
legacy --profile fallback and the new --force-refresh downgrade path in
DatabricksCliTokenSource.

See: databricks/cli#4767
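
One way the centralised classifier and the remembered capability described in this commit message could fit together is sketched below. Only the name `UnsupportedCliFlagError` and the regular expression come from this PR; the exception's `flag` attribute, the free function `exec_cli_command`, the `ForceRefreshSource` class, and the injected `run` callable are assumptions made for illustration.

```python
import re
from typing import Callable, List

_UNKNOWN_FLAG_RE = re.compile(r"unknown flag: (--[a-z-]+)")


class UnsupportedCliFlagError(IOError):
    """Raised when the CLI rejects a flag it does not know (fields are assumed)."""

    def __init__(self, flag: str, message: str):
        super().__init__(message)
        self.flag = flag


def exec_cli_command(run: Callable[[List[str]], str], cmd: List[str]) -> str:
    # Single place where "unknown flag" output is turned into a typed error.
    try:
        return run(cmd)
    except IOError as e:
        match = _UNKNOWN_FLAG_RE.search(str(e))
        if match:
            raise UnsupportedCliFlagError(match.group(1), str(e)) from e
        raise


class ForceRefreshSource:
    """Hypothetical token source that remembers whether --force-refresh works."""

    def __init__(self, run: Callable[[List[str]], str], cmd: List[str]):
        self._run = run
        self._cmd = cmd
        self._force_supported = True  # optimistic until the CLI rejects the flag

    def refresh(self) -> str:
        if self._force_supported:
            try:
                return exec_cli_command(self._run, [*self._cmd, "--force-refresh"])
            except UnsupportedCliFlagError as e:
                if e.flag != "--force-refresh":
                    raise
                # Remember the downgrade so later refreshes skip the flag.
                self._force_supported = False
        return exec_cli_command(self._run, self._cmd)
```

Against an old CLI, the first `refresh()` costs two invocations and every later one costs a single invocation, which is the benefit of caching the capability rather than probing on each refresh.
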
mihaimitrea-db force-pushed the mihaimitrea-db/stack/cli-force-refresh branch from 194256c to cd6c876 on April 1, 2026 09:09
@mihaimitrea-db
Contributor Author

Range-diff: main (194256c -> cd6c876)
NEXT_CHANGELOG.md
@@ -4,7 +4,7 @@
  ### New Features and Improvements
  * Add support for unified hosts. A single configuration profile can now be used for both account-level and workspace-level operations when the host supports it and both `account_id` and `workspace_id` are available. The `experimental_is_unified_host` flag has been removed; unified host detection is now automatic.
  * Accept `DATABRICKS_OIDC_TOKEN_FILEPATH` environment variable for consistency with other Databricks SDKs (Go, CLI, Terraform). The previous `DATABRICKS_OIDC_TOKEN_FILE` is still supported as an alias.
-+* Pass `--force-refresh` to the Databricks CLI `auth token` command so the SDK always receives a freshly minted token instead of a potentially stale cached one. Falls back gracefully on older CLIs that do not support the flag.
++* Pass `--force-refresh` to the Databricks CLI `auth token` command so the SDK always receives a freshly minted token instead of a potentially stale cached one.
  
  ### Security
  
\ No newline at end of file

Reproduce locally: git range-diff 78914fa..194256c 78914fa..cd6c876 | Disable: git config gitstack.push-range-diff false

@github-actions

github-actions bot commented Apr 1, 2026

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-py

Inputs:

  • PR number: 1377
  • Commit SHA: cd6c876f1e8d8ed9e8606b8e49b5034440136df1

Checks will be approved automatically on success.
