Conversation


@maciej maciej commented Dec 12, 2025

An attempt at reducing the disk space usage of the published module. See #1658 for the issue description.

The code was written by ChatGPT 5.1 Codex, using OpenCode with the following prompt:

```
The project is currently using test_data/largefile.txt for some of its tests.
It's a large ~70MiB file that when compressed with zstd (see the untracked changes) takes up only 15kiB.
In order to reduce the working directory size and the size of the published module change every instance
of where the largefile.txt is used in tests into the zstd equiv. Few things you'll have to remember:
the compress package is currently used as an indirect dependency, we'll need to run go mod tidy
to bring it into the direct deps section, tests cannot be run without setting up Snowflake DB;
I think we won't be able to do it here. Try reading from the zstd file wrapping any readers
with transparent zstd decompression. Try not to decompress the file in place.
```

The zstd-compressed file decompresses exactly to `largefile.txt`, but I nonetheless suggest that code reviewers verify this claim.

I did not set up tests locally for this change.

@maciej maciej requested a review from a team as a code owner December 12, 2025 12:24
@github-actions


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a Pull Request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

```go
}

func openLargeTestFile(sourceDir string) (io.ReadCloser, error) {
	filePath := filepath.Join(sourceDir, "test_data", "largefile.txt.zstd")
```
Collaborator


As mentioned, if we want to fix it, I would prefer having randomly generated content instead of a compressed file. In my opinion, it would be best to:

  1. remove largefile.txt
  2. generate random content if the file does not exist
  3. save it in test_data dir
  4. add it to gitignore

Author


@sfc-gh-pfus OK, fair enough. I'm happy to contribute, but I'd appreciate it if you'd point me to how to run the tests (Snowflake setup); I'm a tourist here, just caring about small Go mod caches.

```
$ go mod why github.com/snowflakedb/gosnowflake
# github.com/snowflakedb/gosnowflake
[INTERNAL | REDACTED]
github.com/burningalchemist/sql_exporter/cmd/sql_exporter
github.com/burningalchemist/sql_exporter
github.com/snowflakedb/gosnowflake
```

We don't use the lib directly ourselves; it's a sql_exporter requirement. I'll fork or tweak sql_exporter if I need to.

Collaborator


Well, to run integration tests locally, you have to have a Snowflake account; a trial should be good enough. Our tests are disabled for external contributions by default, but when we have something close to a final version, I'll trigger them manually.

Our tests create a random schema in SF during init, but this can be disabled by setting the SKIP_SETUP=true env variable. Because your changes are mostly about what happens before a connection is made, my suggestion is (if you don't want to start a trial):

  1. Set SKIP_SETUP=true
  2. Temporarily add printing of the query text (to ensure file paths are correct); remove it when your change is ready
  3. Run a test
  4. Observe manually whether a correct file is created in the directory and matches the printed query text

Author

maciej commented Dec 19, 2025

I have read the CLA Document and I hereby sign the CLA
