Add revoke Vault token functionality to OIDC lookup creds#176
Add revoke Vault token functionality to OIDC lookup creds#176webknjaz merged 39 commits intoansible:develfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a module-level logger, a best-effort Changes
Sequence Diagram(s)sequenceDiagram
participant Backend as "kv_backend / ssh_backend"
participant Auth as "hashivault.handle_auth"
participant Vault as "Vault API"
participant Revoke as "hashivault.revoke_token"
Backend->>Auth: request token (OIDC / workload identity)
Auth->>Vault: auth request (OIDC/workload)
Vault-->>Auth: token + metadata
Auth-->>Backend: return token & client/session
Backend->>Vault: perform backend operation (KV read / SSH sign)
Vault-->>Backend: backend response
Backend->>Revoke: finally -> revoke_token(token, namespace, cacert)
Revoke->>Vault: POST auth/token/revoke-self (X-Vault-Token, X-Vault-Namespace)
Vault-->>Revoke: revoke response (ignored)
Revoke-->>Backend: returns (exceptions suppressed)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/unit/credentials/hashivault_test.py (1)
447-506: Add a static-token regression case.These new backend tests only assert revocation for
workload_identity_token, butkv_backend()andssh_backend()are also used by the non-OIDC plugins. Please add a case showing thatrevoke_token()is not called when authentication comes from a caller-suppliedtoken, so this regression stays covered once the production fix lands.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/credentials/hashivault_test.py` around lines 447 - 506, Add tests mirroring test_kv_backend_revokes_jwt_token and test_ssh_backend_revokes_jwt_token but using a caller-supplied 'token' key instead of 'workload_identity_token'; call hashivault.kv_backend(...) and hashivault.ssh_backend(...) with kwargs containing 'token', mock hashivault.handle_auth and requests.Session responses and CertFiles as in the existing tests, and assert mock_revoke.assert_not_called() (i.e., revoke_token is not invoked) to ensure no revocation occurs for static tokens; reference kv_backend, ssh_backend, revoke_token, and handle_auth when locating where to add these new test cases.src/awx_plugins/credentials/hashivault.py (1)
523-524: Don't suppress revocation failures without any signal.A blanket
except Exception: passmakes this security control a silent no-op whenever the revoke call is misconfigured or Vault starts rejecting it, and it also hides non-network bugs in the cleanup path. Keep the call best-effort, but catch request/IO failures explicitly and emit at least a debug or warning log.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/awx_plugins/credentials/hashivault.py`:
- Around line 645-647: The finally blocks call revoke_token(token, **kwargs)
unconditionally, which revokes caller-supplied static Vault tokens; change the
flow so only tokens minted by our plugin are revoked: have handle_auth() (or the
code path that creates a new Vault token) set a marker like
kwargs['_generated_by_plugin'] = True when it issues a temporary token, and in
the finally blocks (where revoke_token is invoked) only call revoke_token if
kwargs.pop('_generated_by_plugin', False) is True (or otherwise check a local
caller_supplied_token boolean captured before auth); update both revoke
locations (the revoke_token calls around lines referenced) to use that guard so
caller-managed tokens are never revoked.
---
Nitpick comments:
In `@tests/unit/credentials/hashivault_test.py`:
- Around line 447-506: Add tests mirroring test_kv_backend_revokes_jwt_token and
test_ssh_backend_revokes_jwt_token but using a caller-supplied 'token' key
instead of 'workload_identity_token'; call hashivault.kv_backend(...) and
hashivault.ssh_backend(...) with kwargs containing 'token', mock
hashivault.handle_auth and requests.Session responses and CertFiles as in the
existing tests, and assert mock_revoke.assert_not_called() (i.e., revoke_token
is not invoked) to ensure no revocation occurs for static tokens; reference
kv_backend, ssh_backend, revoke_token, and handle_auth when locating where to
add these new test cases.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7e3dd99c-e09e-491f-a557-9cb589d8e0ee
📒 Files selected for processing (2)
src/awx_plugins/credentials/hashivault.pytests/unit/credentials/hashivault_test.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/awx_plugins/credentials/hashivault.py (1)
649-651:⚠️ Potential issue | 🔴 CriticalDon't revoke caller-managed Vault tokens.
These backends are shared by the OIDC and static-token plugins, and
handle_auth()returnskwargs['token']unchanged on Lines 434-435. As written, a credential configured with a long-lived Vault token will revoke that token after the first lookup/sign; only tokens minted for this request should be cleaned up.Suggested change
def kv_backend(**kwargs): + caller_supplied_token = bool(kwargs.get('token')) token = handle_auth(**kwargs) @@ finally: # Revoke token to minimize token lifetime and improve security posture - revoke_token(token, **kwargs) + if not caller_supplied_token: + revoke_token(token, **kwargs) def ssh_backend(**kwargs): + caller_supplied_token = bool(kwargs.get('token')) token = handle_auth(**kwargs) @@ finally: # Revoke token to minimize token lifetime and improve security posture - revoke_token(token, **kwargs) + if not caller_supplied_token: + revoke_token(token, **kwargs)Also applies to: 698-700
🧹 Nitpick comments (1)
src/awx_plugins/credentials/hashivault.py (1)
523-524: Make revoke failures observable.If TLS or request construction breaks here, token cleanup silently stops working with no signal. A debug log keeps the best-effort behavior while making regressions diagnosable.
Suggested change
+import logging + import requests @@ +logger = logging.getLogger(__name__) + @@ - except Exception: - pass + except Exception: + logger.debug('Vault token revocation failed', exc_info=True)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/awx_plugins/credentials/hashivault.py` around lines 523 - 524, Replace the silent swallow at the revoke failure site (the "except Exception: pass" in src/awx_plugins/credentials/hashivault.py) with a best-effort debug log: catch the exception as a variable (e.g. "except Exception as e"), call the module's existing logger (use the logger instance used elsewhere in this file, e.g. "logger" or "self.logger") to emit a debug-level message including the exception details (or use exc_info=True) and then continue to preserve the best-effort revoke behavior; this makes TLS/request construction or other revoke errors observable while keeping the current fallback flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/awx_plugins/credentials/hashivault.py`:
- Around line 523-524: Replace the silent swallow at the revoke failure site
(the "except Exception: pass" in src/awx_plugins/credentials/hashivault.py) with
a best-effort debug log: catch the exception as a variable (e.g. "except
Exception as e"), call the module's existing logger (use the logger instance
used elsewhere in this file, e.g. "logger" or "self.logger") to emit a
debug-level message including the exception details (or use exc_info=True) and
then continue to preserve the best-effort revoke behavior; this makes
TLS/request construction or other revoke errors observable while keeping the
current fallback flow.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 54062183-19e0-4fb4-a6ce-2cfd50798c84
📒 Files selected for processing (2)
src/awx_plugins/credentials/hashivault.pytests/unit/credentials/hashivault_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/credentials/hashivault_test.py
bc0e305 to
63a540b
Compare
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (96.92%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. |
6405fbb to
20f46a1
Compare
|
Notes from review call:
|
| return {'role': kwargs.get('jwt_role'), 'jwt': workload_identity_token} | ||
|
|
||
|
|
||
| def revoke_token(token: str, **kwargs): |
There was a problem hiding this comment.
This shouldn't accept arbitrary keword args. Make them all known. Especially, since this is a private API that's only used internally. The existing catch-all pattern, however terrible, only exists in the callables that are public API, AFAICS. I see no reason to additionally obscure the logic here.
Additionally, add a single underscore to the name to show that it's indeed private.
There was a problem hiding this comment.
Here's a better signature that makes it obvious what's to be passed, makes sure it's always called w/ the kwargs syntax and helps understand typing expectations:
| def revoke_token(token: str, **kwargs): | |
| def _revoke_token(*, token: str, url: str, cacert: str | None, namespace: str) -> None: |
There was a problem hiding this comment.
This has been addressed in this commit 410a0a7
| with contextlib.suppress(Exception): | ||
| backend(**backend_kwargs) |
There was a problem hiding this comment.
This needs justification so that the reader would know why you're doing this here.
There was a problem hiding this comment.
Right, I figured how to remove this and fix the tests in 5126a1f
| mock_handle_auth = mocker.patch.object( | ||
| hashivault, | ||
| 'handle_auth', | ||
| return_value='test_token', | ||
| ) | ||
| mock_revoke = mocker.patch.object(hashivault, 'revoke_token') | ||
| mocker.patch('requests.Session') | ||
| mocker.patch.object(hashivault, 'CertFiles') | ||
| mocker.patch.object(hashivault, 'raise_for_status') |
There was a problem hiding this comment.
Remember to always use autospec in your mocks.
| mocker.patch.object(hashivault, 'CertFiles') | ||
| mocker.patch.object(hashivault, 'raise_for_status') | ||
|
|
||
| backend = getattr(hashivault, backend_func) |
There was a problem hiding this comment.
Instead of using dynamic trampoline logic for looking up things that are always there and don't need fallbacks, pass these objects as params by accessing the attributes directly, like test_non_oidc_plugins_have_no_internal_fields does, for example.
There was a problem hiding this comment.
@webknjaz How would you type the function parameter? I've tried these options and neither of them satisfy the linter:
- backend_func: object,
- backend_func: _t.Callable[..., str],
- backend_func: _t.Callable[..., _t.Any],
- backend_func: _t.Callable[dict[str, str], str],
There was a problem hiding this comment.
The last one should be fine. But it won't work unless you actually type the definition itself.
| } | ||
| if secret_backend: | ||
| path_segments = [secret_backend, 'data', secret_path] | ||
| def kv_backend(**kwargs): # noqa: PLR0915 |
There was a problem hiding this comment.
Unjustified linting suppressions must not be added. They must always be paired with explanatory comments, expecially those without human-readable identifiers with a bunch of numbers that don't really mean anything to the reader.
Anyway, this one doesn't look justified regardless. It seems like you want to have a cleanup in a bunch of high-level functions. This can likely be solved elegantly via @contextlib.contextmanager, which produces not only as a CM but the result can also be used as a function decorator too, saving us a lot of unnecessary indentation.
While writing this I'd drafted smth like this:
import contextlib as _ctx
import functools as _ft
# the above goes to the top of the module ^
...
@_ctx.contextmanager
def _clean_up_oidc_token(*args, **kwargs):
try:
yield
finally:
if 'workload_identity_token' not in kwargs:
return
_revoke_token(
token=...,
url=kwargs['url'],
cacert=kwargs.get('cacert'),
namespace=kwargs.get('namespace'),
)
def _auto_revoke_oidc_token(call_the_decorated_function, /):
@_ft.wraps(call_the_decorated_function)
def decorate_the_function_with_cleanup(*args, **kwargs):
with _clean_up_oidc_token(*args, **kwargs):
call_the_decorated_function(*args, **kwargs)
return decorate_the_function_with_cleanup
@_auto_revoke_oidc_token # <-- apply this to other callables the same way
def kv_backend(**kwargs):
... # the original contents hereBut then I realized that the token's only available within the plugin functions. So next I realized that the cleanup is actually quite coupled with the auth handling logic so handle_auth() should go into a CM:
import contextlib as _ctx
import functools as _ft
# the above goes to the top of the module ^
...
@_ctx.contextmanager
def _lifetime_bound_token(**kwargs) -> str:
token_is_revocable = 'workload_identity_token' in kwargs # e.g. ephemeral vault token
auth_token = handle_auth(**kwargs)
try:
yield auth_token
finally:
if not token_is_revocable:
return
_revoke_token(
token=...,
url=kwargs['url'],
cacert=kwargs.get('cacert'),
namespace=kwargs.get('namespace'),
)
def _inject_auth_token(call_the_decorated_function, /):
@_ft.wraps(call_the_decorated_function)
def decorate_the_function_with_cleanup(**kwargs):
with _lifetime_bound_token(**kwargs) as http_auth_token:
call_the_decorated_function(token=http_auth_token, **kwargs)
return decorate_the_function_with_cleanup
@_inject_auth_token
def kv_backend(*, token: str, **kwargs):
... # the original contents here, with the `handle_auth()` call removed
@_inject_auth_token
def ssh_backend(*, token: str, **kwargs):
... # the original contents here, with the `handle_auth()` call removedImportant
The above is sketched in-browser so it may contain typos/cosmetic linting errors
but should be good to go otherwise.
There was a problem hiding this comment.
Much a cleaner solution, thanks for taking the time to get this review in @webknjaz . I appreciate it a lot!!
There was a problem hiding this comment.
Yeah I was thinking of a similar contextmanager function:
import contextlib as _ctx
@_ctx.contextmanager
def vault_token(**kwargs):
"""Context manager that yields a Vault token and revokes it on exit if obtained via workload identity."""
token = handle_auth(**kwargs)
try:
yield token
finally:
# Only revoke tokens obtained via workload identity authentication
if kwargs.get('workload_identity_token'):
url = urljoin(kwargs['url'], 'v1/auth/token/revoke-self')
sess = requests.Session()
sess.headers['X-Vault-Token'] = token
if kwargs.get('namespace'):
sess.headers['X-Vault-Namespace'] = kwargs['namespace']
with CertFiles(kwargs.get('cacert')) as cert:
resp = sess.post(url, verify=cert, timeout=30)
resp.raise_for_status() # see my other commentThen
token = handle_auth(**kwargs)
...becomes
with vault_token(**kwargs) as token:
...There was a problem hiding this comment.
@dleehr my suggestion has abstraction layers separated and doesn't put things from multiple layers together. It's verbose to make said layers obvious. And the external decorator is a better fit for the handlers because (1) it doesn't add indentation which is better for maintainability, not just for the diff in review and (2) the auth is a common separate process so it doesn't make sense semantically to make it a part of the handlers but something that is performed externally. This can also be scaled to other plugins by moving to a helper module later if needed. Note that explicit args is better then the unclear semantics of dict.get('thing_name', None) which is one of "arg is missing, or arg is None, or arg is an empty string, or arg is anything falsy" that's not really explained in the condensed style that is overly indented internally as well.
I mentioned this here https://github.com/ansible/awx-plugins/pull/176/changes#r2988429724 and here https://github.com/ansible/awx-plugins/pull/176/changes#r2988429724.
There was a problem hiding this comment.
Context manager implementation, unindenting of the functions and explicit parameter listing in 410a0a7
Modify the tests so they use the new context manager approach for testing Removed logging module and function Removed try-except in most scenarios (the vault token CM still uses it)
811a713 to
05044b8
Compare
- Hidden imports to not clutter global namespace - Added context manager as decorator for kv and ssh backends - Removed the usage of kwargs into explicit function argument names for kv and ssh backends Co-Authored-By: Sviatoslav Sydorenko <webknjaz@redhat.com>
This is the actual use internally and so we shouldn't allow a mix of invocation styles for consistency.
Previously, it was passed through mutating an unpacked dict but it's more readable to pass it into the call directly.
The stdlib docs suggest that this is the way but this somehow does not influence how MyPy processes typing.
PEP 612 defines `typing.ParamSpec` but leaves out the problem of using `typing.Concatenate` for declaring keyword-only arguments that would not be considered a part of a decorator's generic keyword arguments as out of the scope [[1]]. This means that we cannot annotate decorators that add or remove arguments when wrapping decorated functions accurately. We, however, can attempt using type narrowing to let MyPy know what's definitely not expected to be passed into the wrapper function, which this patch does through the `assert` statements. [1]: https://peps.python.org/pep-0612/#concatenating-keyword-parameters
This patch reduces the number of duplicate checks in the test function as means of refactoring and reducing the control flow branches.
This allows preserving signatures of the decorated function annotations without leaking the token managed on a dedicated abstration layer and removing it from the context upon the lifetime exit.
This is needed due to mandatory imports exceeding the limit. The previously existing module structure pattern encapsulates big chunks of implementation in one module. In the future, we should look into breaking it up into smaller parts so that the suppression would not be needed anymore.
05044b8 to
f5203c6
Compare
AAP-64519
Summary by CodeRabbit
New Features
Bug Fixes
Tests