Skip to content
This repository was archived by the owner on Jun 6, 2018. It is now read-only.

Make Sbtix faster#40

Closed
vquintin wants to merge 3 commits intonightkr:masterfrom
vquintin:faster
Closed

Make Sbtix faster#40
vquintin wants to merge 3 commits intonightkr:masterfrom
vquintin:faster

Conversation

@vquintin
Copy link
Copy Markdown

@vquintin vquintin commented Jun 6, 2018

This PR aims to make sbtix generation faster.
It depends on #37.
The interesting bit is the last commit (2d7125c).

The modifications are short, I only made the I/O more efficient.

I ran sbtix-gen-all on typelevel cats:

  • Before it took 1136s
  • Now it takes 113s

This is 10x faster on this particular project.

nightkr and others added 3 commits May 5, 2018 18:45
This commit improves I/O related to artefact checksum computation.

It easily brings great performance improvements.

I ran `sbtix-gen-all` on typelevel cats:
 - Before it took 1136s
 - Now it takes 113s
This is 10x faster on this particular project.
@nightkr
Copy link
Copy Markdown
Owner

nightkr commented Jun 6, 2018

Wow, I have no idea why it was taking the round-trip through that file. That should absolutely be dropped.

The semaphore is there to avoid annoying the Maven repo with too many simultaneous requests. It can probably be increased a bit safely, but I wouldn't be comfortable removing it entirely.

I'm not comfortable depending on sun.* internals (though a 10x improvement would definitely be worth having a fast path for). I'm also curious about what difference it would make to steam bytes into the hash rather than do it all at once in the end.

@nightkr
Copy link
Copy Markdown
Owner

nightkr commented Jun 6, 2018

Also: I'm in the process of moving the project to GitLab due to the recent acquisition. Issues and PRs are supposed to be migrated, but I'm not sure how it handles changes after the migration has started.

@Ma27
Copy link
Copy Markdown

Ma27 commented Jun 6, 2018

@vquintin awesome! I'll try to have a look at this as well in the afternoon :-)

Also: I'm in the process of moving the project to GitLab due to the recent acquisition. Issues and PRs are supposed to be migrated, but I'm not sure how it handles changes after the migration has started.

👍 👍 👍

@vquintin
Copy link
Copy Markdown
Author

vquintin commented Jun 6, 2018

Is the semaphore there because CacheFetch_WithCollector is used concurrently ? If so, I will put the semaphore back.

Regarding the sun.* import. I did this to get something quickly. I can remove it and instead:

  • Add a dependency on Apache IOUtils
  • Or reimplement it (it is not very long) to avoid the dependency

I think streaming would bring two things:

  • Lower memory usage: in this PR the code loads all the downloaded artifacts on the heap. If the artifact is too big, we run out of heap. We can get away with it as long as the artifacts are not too big
  • Theoretically, it could reduce the time of download + hash computation to the time download if the hash computation is done concurrently. There, I think the download time is orders of magnitude bigger than the hash computation time so it would not bring much. Of course, to be sure, we should measure 😉

Don't forget to communicate when the GitLab migration is done so we can move there 👍

@nightkr
Copy link
Copy Markdown
Owner

nightkr commented Jun 6, 2018

The migration is now done: https://gitlab.com/teozkr/Sbtix/merge_requests/40

@vquintin vquintin closed this by deleting the head repository Nov 30, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants