Skip to content

Conversation

@knizhnik
Copy link

No description provided.

@coveralls
Copy link

coveralls commented Jun 15, 2020

Pull Request Test Coverage Report for Build 95

  • 15 of 23 (65.22%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-2.3%) to 81.5%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/main/scala/scorex/util/package.scala 15 23 65.22%
Totals Coverage Status
Change from base Build 83: -2.3%
Covered Lines: 163
Relevant Lines: 200

💛 - Coveralls

Copy link
Collaborator

@aslesarenko aslesarenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests that checks that new implementation of ModifierId:

  1. equals or not equal if and only if the corresponding Base16 strings are equal
  2. new modifierOrdering is equivalent with ordering of Base16 strings.

// This is much more efficient than hashing whole array or String.
// We can use the first 4 bytes and convert them into Int.
override def hashCode: Int = {
val len = math.min(hashBytes.length, 4)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knizhnik I assume hashBytes is always at least 32 bytes. If it is not, then ModifierId should not be used in a first place.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but there ARE such places. At least in tests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public class ArrayCmp {
	static int myhashfunc(byte[] hashBytes) {
		int len = Math.min(hashBytes.length, 4);
		int hash = 0;
		for (int i = 0; i < len; i++) {
			hash = (hash << 8) | (hashBytes[i] & 0xFF);
		}
		return hash;
	}

	static int yourhashfunc(byte[] hashBytes) {
		return hashBytes[0] << 24 | (hashBytes[1] & 0xFF) << 16 | (hashBytes[2] & 0xFF) << 8 | (hashBytes[3]  & 0xFF);
	}

	public static void main(String[] args)
	{
		byte[] a = new byte[32];
	    long x = 0;
		long start = System.currentTimeMillis();
		for (int i = 0; i < 1000000000; i++) {
			x += myhashfunc(a);
		}
		long stop = System.currentTimeMillis();
		System.out.println("Elapsed time for my hash: "  + (stop - start));
		start = stop;
		for (int i = 0; i < 1000000000; i++) {
			x += myhashfunc(a);
		}
		stop = System.currentTimeMillis();
		System.out.println("Elapsed time your hash: "  + (stop - start));
	}
}

time java ArrayCmp
Elapsed time for my hash: 2789
Elapsed time your hash: 3386

Copy link
Collaborator

@aslesarenko aslesarenko Jun 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knizhnik Some issues with this benchmark:

  1. Looks like you don't call yourhashfunc
  2. initialize a with real hash value
  3. need to warm up JIT, use JMH for that, or ScalaMeter at least
  4. print accumulated x value

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was also surprised by this result.
Now yourfunc s ~10x times faster.

@aslesarenko
Copy link
Collaborator

aslesarenko commented Jun 18, 2020

Array.utils.compare performs signed byte comparison which is not compatible
with comparison of Base16 strings.
This is why I have to implement comparison myself.
Actually I spent three days trying to understand why all testes are passed by node load is failed.
Sorry, I should mention it in comment.

Yes, this is a good point, 3 days definitely deserve a detailed description.
Please also create the corresponding tests, I mentioned them in the review.

@aslesarenko
Copy link
Collaborator

There are places where ModifierId is used for just two-bytes key ("0005").

I think the tests should be fixed. /cc @kushti


object ModifierId extends TaggedType[String]
type ModifierId = ModifierId.Type
/** Represents hash based id of a modifier (see [Content-addressable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not see correspondence to content-addressable storage, modifier id is not necessarily derived from the content of the modifier only.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then this reference can be removed I guess.


implicit val modifierOrdering : Ordering[ModifierId] = new Ordering[ModifierId] {
// TODO optimize: use java.util.Arrays.compare after JDK8 support is dropped
def compare(a: ModifierId, b: ModifierId): Int = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is unsigned comparison, let's state it clearly in the comments

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the tests should be fixed. /cc @kushti
There is the following extension:

Extension(id: b9e2c5321993b37fb34608d08ae66c516120fed6c381a771e7e96790b64adff5, headerId: 743f7b91da88f5781bd1cdc457e8d2e50931bdc9ec0a3ec759e2b6e9e7c1b29c, fields: Vector(0005 -> 00000064, 0001 -> 001312d0, 0006 -> 000007d0, 0002 -> 00000168, 0007 -> 00000064, 0003 -> 00080000, 007b -> 00000001, 0008 -> 00000064, 0004 -> 000f4240, 007c -> 0000)) 

And here it is checked using ModifierId hash function:

  .validate(exDuplicateKeys, extension.fields.map(kv => bytesToId(kv._1)).distinct.length == extension.fields.length, extension.encodedId)

Is it a valid extension body? If so, should I replace this check not to use bytesToId function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace bytestoId with just Base16.encode

*/
case class ModifierId(hashBytes: Array[Byte]) {
// This is much more efficient than hashing whole array or String.
// We can use the first 4 bytes and convert them into Int.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aslesarenko @knizhnik let's take LAST 4 bytes, as in case of proof-of-work hash with partial collision may start with zero bits

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants