-
Notifications
You must be signed in to change notification settings - Fork 2
i156: Adding the target database connection string for the insert stage logging info #175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you've covered most of the bases here, Quintin. Great work! I think that your approach should mostly work, modulo a few comments that I've put inline. I also think that there are a couple other modules that you may need to cover for database logging: align.py and infer.py. If you refer to slides 5-11 in this presentation, it explains the stages of crmprtd. align and insert both require database interaction (as does infer, but it was written after the presentation). Check those out and see if there are places which could use connection string information as extras.
Additionally, I'd like you to look at the test suite and see if there are ways that a few of the tests can check an ensure that the connection string is being logged. Try to use "Test Driven Development": i.e. write your test first (and presumably it will fail), and then write the code that makes the test pass.
Overall, good job!
jameshiebert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments to consider, but I think that we're pretty close :)
tests/conftest.py
Outdated
| ) | ||
|
|
||
|
|
||
| def records_contain_db_connection(test_session, caplog): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't tell you exactly why, but importing from conftest is not something that is done. Fixtures are defined in conftest.py, but not utility functions.
Maybe you could create this as a utility fixture as explained in this SO answer? I agree w/ the first commenter that it "feels a bit hacky", but there don't appear to be any obviously better options.
@rod-glover have you run across a requirement like this (sharing a test helper function across test modules) in your pytest trials?
A possible explanation of why it hasn't come up to date (to my knowledge) may be that we are encouraged to keep our unit tests concise and simple. And if your test conditions are so complicated that they require more logic in an external function, then maybe they need to be simplified.
There's an argument to be made for either approach. I think I'll be happy however you choose to proceed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have, and I have used three methods: the hacky one, a slightly clunky one (A), and another maybe less clunky one (B).
Clunky A is to define a fixture that returns the helper function. Then use the fixture as normal. Best to scope such fixtures broadly, i.e., session scope.
Clunky B is to treat the test directory (or any subdirectory of it) as a package, with an __init__.py. Put helper functions there, and import them using relative imports. For example
from . import a_helper_function
from .. import another_helper_function
Alternatively, create a module in such a package and import from it.
from .helpers import yet_another
Right now B is my preferred setup.
UPDATE: Should have read the SO answer first. It is a variation of clunky A, which reads in my code more like:
def helper_function:
def f():
#...
return f
I don't bother with the Helpers class; that seems ... unweildy and unnecessary unless you are importing a ton of helper functions ... in which case you have to ask why so many.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A possible explanation of why it hasn't come up to date (to my knowledge) may be that we are encouraged to keep our unit tests concise and simple. And if your test conditions are so complicated that they require more logic in an external function, then maybe they need to be simplified.
In general, I like this prinicple. If you write simple functions/methods, and compose them in straightforward ways, then they are easier to understand, test and maintain. That said, some functions need somewhat complicated tests, and/or the same helper function is needed in several different places. So I treat this principle with a certain pragmatism.
Plus I do use helper functions fairly often. YMMV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, great input!
tests/conftest.py
Outdated
|
|
||
| def records_contain_db_connection(test_session, caplog): | ||
| for record in caplog.records: | ||
| if "database" in record.__dict__: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could be wrong here, but I believe that you can just use the idiom if "database" in record and do not have to specifically access the __dict__ attribute. See: https://docs.python.org/3/reference/expressions.html#membership-test-operations
One thing that you don't check here, that maybe you should is whether the record is found in the correct level of logging. E.g. you check that a log entry is found, but it's possible that it's at a higher level of logging then specified.
tests/conftest.py
Outdated
| for record in caplog.records: | ||
| if "database" in record.__dict__: | ||
| logged_db = getattr(record, "database", {}) | ||
| if logged_db == test_session.bind.url.render_as_string(hide_password=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be able to simplify this to:
return record.database == test_session.bind.url.render_as_string(hide_password=True)
Added a function that will take a session object and return a string of the database connection with the password masked.
Example:
Target database: "postgresql+psycopg2://scott:tiger@localhost/test"
Returned string: "postgresql+psycopg2://scott:***@localhost/test"
resolves #156