Although most algorithms are efficient enough when calculating a hash for a single file, you will notice a performance impact when hashing many large files, because they read the whole file.
Fast Hash aims for better speed by reusing any hashing algorithm you want but applying it only to specific chunks of the file.
Fast Hash uses MessageDigest internally to calculate the hash, so you can use any algorithm supported by MessageDigest, such as MD5, SHA1, or SHA-256.
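Which algorithm names are accepted depends on the security providers installed in your JDK. The following is a minimal sketch, independent of Fast Hash, that uses only standard JDK calls to list the available MessageDigest algorithms and to check a single name. The class and method names (ListDigestAlgorithms, available, isSupported) are illustrative and not part of this project.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import java.util.Set;
import java.util.TreeSet;

// Illustrative helper, not part of Fast Hash.
class ListDigestAlgorithms {

    // Returns the MessageDigest algorithm names registered with the installed providers, sorted.
    static Set<String> available() {
        return new TreeSet<>(Security.getAlgorithms("MessageDigest"));
    }

    // Returns true if MessageDigest.getInstance accepts the given algorithm name.
    static boolean isSupported(String algorithm) {
        try {
            MessageDigest.getInstance(algorithm);
            return true;
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Print every algorithm name this JDK can provide.
        available().forEach(System.out::println);
    }
}
```

Checking a name with isSupported before creating a Fast Hash instance is one way to fail early on a typo in the algorithm name.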
Fast Hash is inspired by Oshash but implemented differently. Here is how it works:
- It reads up to a configurable chunk size from the beginning
- It reads the file size
- It reads up to a configurable chunk size from the end
- It digests the three parts to generate the hash for the file
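The steps above can be sketched with plain JDK classes. This is an illustration of the idea only, not the actual Fast Hash implementation: the class name ChunkedHashSketch, the encoding of the file size as 8 big-endian bytes, the order in which the parts are fed to the digest, and the handling of files smaller than the chunk size (the head and tail chunks simply overlap) are all assumptions, so the resulting values should not be expected to match what Fast Hash produces.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch, not the Fast Hash source code.
class ChunkedHashSketch {

    // Digests the first chunk, the file size, and the last chunk of a file.
    static String hash(Path file, String algorithm, int chunkSize)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long size = raf.length();
            // Read fewer bytes when the file is smaller than the chunk size.
            int len = (int) Math.min(chunkSize, size);
            byte[] buffer = new byte[len];

            // 1. Up to chunkSize bytes from the beginning.
            raf.seek(0);
            raf.readFully(buffer, 0, len);
            digest.update(buffer, 0, len);

            // 2. The file size, encoded as 8 big-endian bytes (assumed encoding).
            digest.update(ByteBuffer.allocate(Long.BYTES).putLong(size).array());

            // 3. Up to chunkSize bytes from the end.
            raf.seek(size - len);
            raf.readFully(buffer, 0, len);
            digest.update(buffer, 0, len);
        }
        // Render the digest as a lowercase hex string.
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hash a small temporary file so the sketch can be run as-is.
        Path tmp = Files.createTempFile("chunked-hash", ".bin");
        Files.write(tmp, new byte[256 * 1024]);
        System.out.println(hash(tmp, "SHA-1", 64 * 1024));
        Files.delete(tmp);
    }
}
```

Because only the two ends and the size are digested, the amount of data read stays bounded by twice the chunk size regardless of file size. The trade-off is that a change in the middle of a file that keeps its size the same is not detected by this scheme.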
The SHA1 algorithm with a chunk size of 64 * 1024 should work well for most cases.
This project is built with Maven
Prerequisites: JDK 11+ and Maven
- Clone the repo
git clone https://github.com/marwenlahmar/fast-hash.git
- Install
mvn clean install
To get an instance of Fast Hash:
FastHash fastHash = FastHash.getInstance("SHA1", 64 * 1024);
Fast Hash instances are thread safe.
To calculate the hash:
String hash = fastHash.digest(seekableDataStream);
Fast Hash takes a SeekableDataStream as its argument and provides two implementations of that interface: ByteArrayDataStream and FileDataStream.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Marwen Lahmar - @lahmar_marwen - marwen.lahmar@gmail.com
Project Link: https://github.com/marwenlahmar/fast-hash