Skip to content

Conversation

@QSparks
Copy link

@QSparks QSparks commented Jul 31, 2023

Added a function that will take a session object and return a string of the database connection with the password masked.

Example:
Target database: "postgresql+psycopg2://scott:tiger@localhost/test"
Returned string: "postgresql+psycopg2://scott:***@localhost/test"

resolves #156

Copy link
Contributor

@jameshiebert jameshiebert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you've covered most of the bases here, Quintin. Great work! I think that your approach should mostly work, modulo a few comments that I've put inline. I also think that there are a couple other modules that you may need to cover for database logging: align.py and infer.py. If you refer to slides 5-11 in this presentation, it explains the stages of crmprtd. align and insert both require database interaction (as does infer, but it was written after the presentation). Check those out and see if there are places which could use connection string information as extras.

Additionally, I'd like you to look at the test suite and see if there are ways that a few of the tests can check an ensure that the connection string is being logged. Try to use "Test Driven Development": i.e. write your test first (and presumably it will fail), and then write the code that makes the test pass.

Overall, good job!

@QSparks QSparks requested a review from jameshiebert August 8, 2023 19:15
Copy link
Contributor

@jameshiebert jameshiebert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments to consider, but I think that we're pretty close :)

)


def records_contain_db_connection(test_session, caplog):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell you exactly why, but importing from conftest is not something that is done. Fixtures are defined in conftest.py, but not utility functions.

Maybe you could create this as a utility fixture as explained in this SO answer? I agree w/ the first commenter that it "feels a bit hacky", but there don't appear to be any obviously better options.

@rod-glover have you run across a requirement like this (sharing a test helper function across test modules) in your pytest trials?

A possible explanation of why it hasn't come up to date (to my knowledge) may be that we are encouraged to keep our unit tests concise and simple. And if your test conditions are so complicated that they require more logic in an external function, then maybe they need to be simplified.

There's an argument to be made for either approach. I think I'll be happy however you choose to proceed.

Copy link
Contributor

@rod-glover rod-glover Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have, and I have used three methods: the hacky one, a slightly clunky one (A), and another maybe less clunky one (B).

Clunky A is to define a fixture that returns the helper function. Then use the fixture as normal. Best to scope such fixtures broadly, i.e., session scope.

Clunky B is to treat the test directory (or any subdirectory of it) as a package, with an __init__.py. Put helper functions there, and import them using relative imports. For example

from . import a_helper_function

from .. import  another_helper_function

Alternatively, create a module in such a package and import from it.

from .helpers import yet_another

Right now B is my preferred setup.

UPDATE: Should have read the SO answer first. It is a variation of clunky A, which reads in my code more like:

def helper_function:
    def f():
       #...

   return f

I don't bother with the Helpers class; that seems ... unweildy and unnecessary unless you are importing a ton of helper functions ... in which case you have to ask why so many.

Copy link
Contributor

@rod-glover rod-glover Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A possible explanation of why it hasn't come up to date (to my knowledge) may be that we are encouraged to keep our unit tests concise and simple. And if your test conditions are so complicated that they require more logic in an external function, then maybe they need to be simplified.

In general, I like this prinicple. If you write simple functions/methods, and compose them in straightforward ways, then they are easier to understand, test and maintain. That said, some functions need somewhat complicated tests, and/or the same helper function is needed in several different places. So I treat this principle with a certain pragmatism.

Plus I do use helper functions fairly often. YMMV.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, great input!


def records_contain_db_connection(test_session, caplog):
for record in caplog.records:
if "database" in record.__dict__:
Copy link
Contributor

@jameshiebert jameshiebert Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be wrong here, but I believe that you can just use the idiom if "database" in record and do not have to specifically access the __dict__ attribute. See: https://docs.python.org/3/reference/expressions.html#membership-test-operations

One thing that you don't check here, that maybe you should is whether the record is found in the correct level of logging. E.g. you check that a log entry is found, but it's possible that it's at a higher level of logging then specified.

for record in caplog.records:
if "database" in record.__dict__:
logged_db = getattr(record, "database", {})
if logged_db == test_session.bind.url.render_as_string(hide_password=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be able to simplify this to:

return record.database == test_session.bind.url.render_as_string(hide_password=True)

@QSparks QSparks requested a review from jameshiebert August 9, 2023 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Identify target database in log output

3 participants