-
Notifications
You must be signed in to change notification settings - Fork 3
[refer #1132] Change ModifierId representation #22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Pull Request Test Coverage Report for Build 95
💛 - Coveralls |
aslesarenko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add tests that checks that new implementation of ModifierId:
- equals or not equal if and only if the corresponding Base16 strings are equal
- new modifierOrdering is equivalent with ordering of Base16 strings.
| // This is much more efficient than hashing whole array or String. | ||
| // We can use the first 4 bytes and convert them into Int. | ||
| override def hashCode: Int = { | ||
| val len = math.min(hashBytes.length, 4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@knizhnik I assume hashBytes is always at least 32 bytes. If it is not, then ModifierId should not be used in a first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, but there ARE such places. At least in tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class ArrayCmp {
static int myhashfunc(byte[] hashBytes) {
int len = Math.min(hashBytes.length, 4);
int hash = 0;
for (int i = 0; i < len; i++) {
hash = (hash << 8) | (hashBytes[i] & 0xFF);
}
return hash;
}
static int yourhashfunc(byte[] hashBytes) {
return hashBytes[0] << 24 | (hashBytes[1] & 0xFF) << 16 | (hashBytes[2] & 0xFF) << 8 | (hashBytes[3] & 0xFF);
}
public static void main(String[] args)
{
byte[] a = new byte[32];
long x = 0;
long start = System.currentTimeMillis();
for (int i = 0; i < 1000000000; i++) {
x += myhashfunc(a);
}
long stop = System.currentTimeMillis();
System.out.println("Elapsed time for my hash: " + (stop - start));
start = stop;
for (int i = 0; i < 1000000000; i++) {
x += myhashfunc(a);
}
stop = System.currentTimeMillis();
System.out.println("Elapsed time your hash: " + (stop - start));
}
}
time java ArrayCmp
Elapsed time for my hash: 2789
Elapsed time your hash: 3386
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@knizhnik Some issues with this benchmark:
- Looks like you don't call yourhashfunc
- initialize
awith real hash value - need to warm up JIT, use JMH for that, or ScalaMeter at least
- print accumulated x value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I was also surprised by this result.
Now yourfunc s ~10x times faster.
Yes, this is a good point, 3 days definitely deserve a detailed description. |
I think the tests should be fixed. /cc @kushti |
|
|
||
| object ModifierId extends TaggedType[String] | ||
| type ModifierId = ModifierId.Type | ||
| /** Represents hash based id of a modifier (see [Content-addressable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not see correspondence to content-addressable storage, modifier id is not necessarily derived from the content of the modifier only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then this reference can be removed I guess.
|
|
||
| implicit val modifierOrdering : Ordering[ModifierId] = new Ordering[ModifierId] { | ||
| // TODO optimize: use java.util.Arrays.compare after JDK8 support is dropped | ||
| def compare(a: ModifierId, b: ModifierId): Int = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is unsigned comparison, let's state it clearly in the comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the tests should be fixed. /cc @kushti
There is the following extension:
Extension(id: b9e2c5321993b37fb34608d08ae66c516120fed6c381a771e7e96790b64adff5, headerId: 743f7b91da88f5781bd1cdc457e8d2e50931bdc9ec0a3ec759e2b6e9e7c1b29c, fields: Vector(0005 -> 00000064, 0001 -> 001312d0, 0006 -> 000007d0, 0002 -> 00000168, 0007 -> 00000064, 0003 -> 00080000, 007b -> 00000001, 0008 -> 00000064, 0004 -> 000f4240, 007c -> 0000))
And here it is checked using ModifierId hash function:
.validate(exDuplicateKeys, extension.fields.map(kv => bytesToId(kv._1)).distinct.length == extension.fields.length, extension.encodedId)
Is it a valid extension body? If so, should I replace this check not to use bytesToId function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please replace bytestoId with just Base16.encode
| */ | ||
| case class ModifierId(hashBytes: Array[Byte]) { | ||
| // This is much more efficient than hashing whole array or String. | ||
| // We can use the first 4 bytes and convert them into Int. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aslesarenko @knizhnik let's take LAST 4 bytes, as in case of proof-of-work hash with partial collision may start with zero bits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point.
No description provided.