Skip to content

Various S3 improvements#16

Open
klauspost wants to merge 11 commits intoor-shachar:mainfrom
klauspost:s3-improvements
Open

Various S3 improvements#16
klauspost wants to merge 11 commits intoor-shachar:mainfrom
klauspost:s3-improvements

Conversation

@klauspost
Copy link

@klauspost klauspost commented Dec 29, 2024

This breaks the placement on older versions. It will probably also break when using versions prior to 1.24 RC.

  • Add custom endpoint via GOCACHE_S3_URL.
  • Add auth with session token via GOCACHE_S3_SESSION_TOKEN
  • Use a default us-east-1 region.
  • Fix ObjectID -> OutputID field rename
  • Fix TimeNanos -> Time field rename
  • Replace GOCACHE_CACHE_KEY with GOCACHE_S3_PREFIX.
  • Add compression to >8KB files on S3.
  • Use first 3 bytes of hash for prefix.
  • Fix NaN in timekeeping (and just ignore time not spent doing requests)
  • Update all dependencies
  • Forks bytes.Buffer to make it seekable, since non-TLS requests will include checksum.
  • Remove arch/os from path.

If you want to test on our playground you can try...

GOCACHE_S3_BUCKET=testbucket
GOCACHE_S3_URL=https://play.min.io
GOCACHE_S3_ACCESS_KEY=minioadmin
GOCACHE_S3_SECRET_KEY=minioadmin

* Add custom endpoint via `GOCACHE_AWS_URL`.
* Add auth with session token via `GOCACHE_AWS_SESSION_TOKEN`
* Use a default `us-east-1` region.
* Fix ObjectID -> OutputID field rename
* Fix TimeNanos -> Time field rename
* Replace `GOCACHE_CACHE_KEY` with `GOCACHE_S3_PREFIX`.
* Add compression to >8KB files on S3.
* Use first 3 bytes of hash for prefix.
* Fix NaN in timekeeping (and just ignore time not spent doing requests)
* Update all dependencies
@klauspost
Copy link
Author

It is completely broken on Windows: golang/go#71059

RC1 also breaks if the experiment isn't enabled due to the renamed ObjectID field.

@klauspost
Copy link
Author

@or-shachar I was thinking, if there is a reason to (forcefully) separate on arch/platform.

AFAICT the files are pretty much unique per platform anyway.

We can either set it as metadata or tags (which will make it available to lifecycle filters).

@or-shachar
Copy link
Owner

Thanks @klauspost ! Any chance you can pull changes from upstream, and add 1.24rc1 to the test matrix (github action files)?

@klauspost
Copy link
Author

Main merged and workflows updated, but I expect you need to approve them to run.

@or-shachar
Copy link
Owner

mm... I was sure that I got the pull request trigger right. Can you kindly pull changes / rebase again?

@klauspost
Copy link
Author

Merged + pushed. Seems like changes to workflows aren't being picked up before they are merged.

@or-shachar
Copy link
Owner

Not sure why... I'll need to read a little more to figure it out. Will continue tomorrow. If you have an idea - let me know.

Copy link
Owner

@or-shachar or-shachar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick comment about the workflows change

Copy link
Owner

@or-shachar or-shachar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to review the bytes operations but here is some questions I gathered.

envVarS3AwsSessionToken = "GOCACHE_S3_SESSION_TOKEN"
envVarS3BucketName = "GOCACHE_S3_BUCKET"
envVarS3CacheKey = "GOCACHE_CACHE_KEY"
envVarS3Prefix = "GOCACHE_S3_PREFIX"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

breaking change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. See discussion above. Honestly I thought I was talking to you until I had already sent in the change.

Comment on lines +1 to +3
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this ? external source file? why is it checked in here? should we import it from somewhere else?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forks bytes.Buffer to make it seekable, since non-TLS requests will include checksum.

The aws SDK needs an input that is seekable to work with HTTP endpoints. So I forked bytes.Buffer to include that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants