Skip to content

Conversation

@kristjaneerik
Copy link
Collaborator

Here's an implementation of stor.xopen which works just like stor.open for regular files, but uses a layer of gzip.GzipFile for files ending in .gz such that you don't have to worry about compression, it happens behind the scenes.

Not sure if I did it correctly with respect to forcing the mode etc..

The portion of the test that runs with stor.xopen(stor.join(self.drive, 'A/C/utf8_file_with_unicode.txt'), 'rb') as xfp: doesn't work if the mode is r. I suspect it's something to do with the mocking rather than the implementation, since the code above it with mode r works. ¯\_(ツ)_/¯

Suggestions for better approaches to testing appreciated!

Inspiration from https://github.com/marcelm/xopen. Would be great if we could just use that, but I'm not sure we can.

@jtratner
Copy link
Contributor

jtratner commented Apr 20, 2018 via email

@jtratner
Copy link
Contributor

jtratner commented Apr 20, 2018 via email

@kristjaneerik
Copy link
Collaborator Author

I added an emoji to the test file.

All of these give PASSED as the result for the single test being run:

tox -e py27 -- stor/tests/test_swift.py::TestSwiftShared::test_xopen_gzip
tox -e py27 -- stor/tests/test_swift.py::TestSwiftShared::test_xopen_regular
tox -e py36 -- stor/tests/test_swift.py::TestSwiftShared::test_xopen_gzip
tox -e py36 -- stor/tests/test_swift.py::TestSwiftShared::test_xopen_regular

@jtratner
Copy link
Contributor

jtratner commented Apr 20, 2018 via email

if self.endswith('.gz'):
if mode == 'r':
mode = 'rb'
if mode == 'w':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these codepaths aren't actually tested (that's what is causing CI to fail).

Additionally, you need to pass the original mode to the gzip.GzipFile object, so perhaps you want something like:

if mode in ('r', 'rb'): fp_mode='rb'
if mode in ('w', 'wb'): fp_mode = 'wb'
fp = self.open(fp_mode, *args, **kwargs)
gzfp = gzip.GzipFile(mode=mode, fileobj=fp)

Additionally, GzipFile's docs say that it doesn't automatically close the underlying file object - which means that if you do something like this:

with stor.xopen('s3://unauthed-bucket/myfile.csv.gz') as fp:
    fp.write('somedata')

the exception will not bubble up to the user (I believe)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants