In the lpar17 benchmark suite (and jayhorn-set), the SatHanoi.java test case appears to actually be unsafe. The comment states // result and the counter should be the same! but the program asserts false if the values are equal (UnsatHanoi appears to assert equality correctly). Removing the call to Random.nextInt and replacing it with a value in the range [1,31] confirms that SatHanoi indeed hits an assertion failure, and is therefore unsafe (similarly, UnsatHanoi does not appear to throw an exception). Am I misinterpreting the Sat/Unsat labels here?