Skip to content

Conversation

@Fate-JH
Copy link
Contributor

@Fate-JH Fate-JH commented Feb 23, 2017

The purpose of this PR was going to be the necessary quantitization functions. The scope of the changes in the PR greatly exceeds that.

The three main functions to which to pay attention are BitStream::readQuantitizedFloat, BitStream::writeQuantitizedFloat, and util/compareFloats. I also wrote tests for the former two, while the test for the latter function is worked into the read test. It's worth noting that C++ does handle casual decimal rounding differently than Scala. It's also worth noting that the decompiled client code contains four, maybe five, quantitization algorithms. Though they all do roughly the same thing, and would arrive at roughly the same result, each one has subtle differences in organization of statements or conditions.

The original BitStream.h has been deconstructed into a BitStream.cpp source file and a BitStream.h header file. Of note, however, the templated read and write function definitions were left in the header file and the header file alone. It worked fine when the BitStream class was all in one file; but, separated like this, the defintions can only be in one file easily. All class and function documentation is currently with the declarations whereas only inline comments are left in the definitions. In the process of separating the file with @tfarley's permission, I also sorted out the other #include statements in a number of files in common that intersected either bitstream.h or util.h thus that the remaining ones are mostly necessary.

@Fate-JH
Copy link
Contributor Author

Fate-JH commented Feb 23, 2017

Amusing.
Where lexico is shorthand for the modification of floats to integers in compareFloats:
lexico(260.5) + 100 < lexico(-265.0) was true (which is wrong);
but,
lexico(260.5) + lexico(100) < lexico(-265.0) is false (which is correct).

Unlike the float values which turn into different integer values, that 100 is already an integer and the casting doesn't actually change its value as far as the raw math is concerned.

Copy link
Contributor

@tfarley tfarley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good, thanks. I know some of my comments might seem nitpicky, but I want to make sure we're staying fairly consistent in our style and formatting or things will get out of hand quickly.

}

void testBitstreamReadQuantitizedFloatLimits() {
static std::vector<uint8_t> expectedBuf = std::vector<uint8_t>({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some tabs mixed in with the spaces in this PR, e.g. here. Please make sure you are being consistent, or you end up with inconsistent indentations like this.
Tools->Options->Text Editor->C/C++->Tabs->Insert Spaces in VC++ can ensure spaces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll clean this up.

lastError(Error::NONE) {

}
BitStream(std::vector<uint8_t>&);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the parameter names in these header declarations, otherwise it becomes really difficult to tell what each parameter is without flipping back and forth between the .h and .cpp. Doc gen tools also have trouble if they aren't named.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting them all back in.

assertEqual( compareFloats(z, 91.1576f, 100), 0);
}

void testBitstreamWriteQuantitizedFloat() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test as well as the *Limits version below should use hardcoded values for x/y/z as in your read test, and only use bitstream write. I like the idea of this test though, I think we should keep it but it's really more like testing that read <-> write is an identity operation.

Copy link
Contributor Author

@Fate-JH Fate-JH Feb 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main issue I had going into the write test was precision. I took the same values I had calculated for the read test, tried to use them for the write test, only to discover that the the previous's test wasn't displaying the full decimal precision during an assert failure. That output was two places after the decimal and then rounded, whereas the actual number was four or so, unrounded.

For this write test, using hard numbers, the x-value was fine, but the y-value was off by 2, and the z-value was off by 1, iirc, a matter of some thousandths digit.

Edit: for the limits write test, I think you are referring to the second part, after the rewind the stream comment? The first part of the test suffices the condition that the written bitstream has to match the sample bitstream. IO can just move that second part to its own test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm just thinking that there are 3 kinds of tests here: 1) that reading data produces expected float values, 2) that writing float values produces expected data, and 3) that reading data to floats and then writing those floats produces the data you started with (aka nothing 'lost' when reading + writing). The write tests you have are more like the 3rd case, but I suppose that transitively covers the 2nd case - so as long as the read test passes, and this read-then-write test passes, you can assume that write is working. So perhaps it's good as-is.

if(streamBitPos >= sizeBits) {
return 0;
}
else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep our else-newlining consistent, so one of us needs to change our settings. I like to keep them on the same line to prevent this strange kind of dangling closing brace.
Tools->Options->Text Editor->C/C++->Formatting->New Lines->Position of Keywords->uncheck Place 'else' on a new line in VC++ will make it stop shunting them to a new line.

Copy link
Contributor Author

@Fate-JH Fate-JH Feb 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like putting them on newlines to create a visual interrupt between changes in control statement blocks. If the next part is an else if, that also means the conditional doesn't flush vertically with the previous block. For now, I'll conform.

When I collapse an if...else... statement's block where the closing parenthesis for one block has the beginning of the next block on the same line (like you're suggesting), the collapsed-to statement gets pushed up to immediately follow after the previous statement. It looks like if (something) { ... } else if (somethingElse) { all on one line when the if block is collapsed. Is there a formatting rule that starts the next un-collapsed block on a new line?

Copy link
Contributor

@tfarley tfarley Feb 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think VS has an option for that specific case, although I'm wondering if it might not autoformat if you newline it manually. Hopefully we shouldn't have too many single-line conditionals that it would be a problem. That's the main thing I'm worried about actually - not so much the specifics of what particular style we actually use, but just that VS autoformatting will sprinkle all sorts of irrelevant little changes into the commit/PR diffs and make them harder to read if we have different settings.


size_t BitStream::getRemainingBits() const {
size_t sizeBits = getSizeBits();
if(streamBitPos >= sizeBits) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep our spaces after control flow statements consistent.
Tools->Options->Text Editor->C/C++->Formatting->Spacing->Spacing for control blocks->check Insert space between keyword... in VC++ will keep the space. I believe this is the default option anyway, unless you are using a different version of VC++ or something.

Copy link
Contributor Author

@Fate-JH Fate-JH Feb 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correcting, in a big way.

The VS auto-formatter was being a pain for me and I felt it had way too many settings to want to fiddle with, so I admit to just having turned the whole thing off at some earlier point. =P I can't avoid that anymore. (The said setting was off for me, in any case, so even if the auto-formatter had previously been "on" ...)

readBytes((uint8_t*)str.data(), strLen * 2);
}

void BitStream::readQuantitizedFloat(float& data, const size_t numBits, const float max, const float min) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to const-ify parameters that are passed by value like size_t and float here. We definitely want const for pointers and references where appropriate so that we can be sure that it won't modify caller data, but whether the function is allowed to manipulate its own copies of the parameters should be an implementation detail left to the function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only const now left in, of the three functions, is the one next to the data parameter in writeQuantitizedFloat.

The error becomes a count of "the maximum number of lexico-numbers allowed between equal lexico-a and lexico-b."
To avoid encountering a NaN value, error can not risk being too big, however.
*/
if(epsilon >= 4194304) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you leave a comment explaining the meaning of the magic number 4194304?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually borrowed from a collaborated project I haven't touched since my college days. I do recall someone else used the basic convert and dereference method and here I just reframed. I recall that float NaNs were the reason but the specifics beyond "avoiding a float NaN value" are lost on me right now.

Before there's any kind of sign-off on this, let me see what I can dig up by searching for keywords that get jogged from my memory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read your other comment - seems to make sense to me (somewhat :) ). No problem if there isn't some relatively easy meaning for this, as it would be somewhat of an insane epsilon to use in our cases anyway (as you noted). I think it's totally fine to leave as-is, was just wondering if there was some quick explanation.

throw std::invalid_argument("float comparison - error margin too big");
}

int epsilon2 = *(int*)&epsilon;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this as far as I can tell - make your epsilon parameter an int instead of a size_t and you should be able to get rid of epsilon2.

Copy link
Contributor Author

@Fate-JH Fate-JH Feb 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did change it to size_t just to get rid of a negative test. Huh. Going back ...

@Fate-JH
Copy link
Contributor Author

Fate-JH commented Feb 24, 2017

Okay, it didn't take long to search for, to be honest, but I used "compare floats two's complement." Out came this page which may very well be where the original algorithm came from. The advice is outdated as of 2012, however, as the author linked to this revision which is, furthermore, part of a blog.

The explanation for the integer epsilon is actually something called ULPs or "Units in the Last Place." The same explanation I left in the code from memory holds up otherwise. It's then number of representable float values between two other comparable float values. Oddly enough, that's not terriblty different from what quantitizing float values should be doing for us. I say "should." The 4194304 is connected to NaN values but not in the way as I remember. The most common NaN produced by floating point math seems to be 0xFFC00000, which is about four million ULPs away from the nearest float infinity value and the caution is to avoid being in striking distance of that infinity, or of zero.

I'm still dissecting the update but the recommendation looks to be a mixture of normal epsilon and ULPs and to use whatever is most appropriate for one's application. In the common cases of quantitized floats we have, the count between any two whole numbers in the float is:

  • 32 (-256 to 256 @ 14 bits)
  • 64 (0 to 1024 @ 16 bits)
  • 128 (0 to 8129 @ 20 bits)

The first one is velocity. For the latter two, that's a surprising amount of fidelity between each meter mark. All the quant floats function really needs to check is that we don't go to zero. The tests comparison function has more general situations that just near the limits.

@Fate-JH
Copy link
Contributor Author

Fate-JH commented Feb 26, 2017

After reviewing the code for the decompiled client, I noticed how the client claims to be extracting a "QuantitizedDouble" and then casting it into a float value I figured, why the heck not. The precision is good, especially when small precision errors at the level of a float seem to be able to turn 0xCA16 into 0xC916.

Also, I scaled back the comparison support function. At the moment, what we need is something for testing, and fabs should be enough for that. We can convert compareDecimals into three preprocessor directives for test.h if you think that will be better than leaving it in util.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants