Skip to content
Matthew Flatt edited this page Feb 15, 2023 · 43 revisions

This page is a glorified scratchpad for me to keep track of issues (or potential issues) I encounter as I review the racket fork of Chez Scheme. It’s also grown to include a more general to-do list, reminders, and changes we should consider (but which we may ultimately decide not to make). I’ve attempted to organize things to categorize items by whether they need followup (and by whom).

The file is probably is more useful (to me, at least) when viewed in something that really understands Org Mode (like, say, emacs), but the GitHub wiki rendering is ok… except that it doesn’t render the TODO tags in headers, which I am using to mark items that have been RESOLVED. So that’s less than ideal. (Here’s a test of how well various Org-mode features are supposed on GitHub.) In addition, I’ve hidden resolved/completed items under org nodes marked with COMMENT, which makes them not render in the output (including the GitHub page renderer).

Also, the unless GitHub suddenly gains the ability to expand and collapse sections in Org Mode documents, it’s going to be easier to read in an editor that lets you treat the file like a tree instead of a big flat document with some parts rendered in a bigger font than others.

To Do

These items are outstanding, with a clear action to be taken

Documentation

Review release notes

fasl-read with externals

Bad things happen if the vector of “externals” does not have the correct length. This I think this needs to have some error checking and well-defined behavior, at least in “safe” mode.

MF: Added a check, along with some clarifications in the documentation and implementation about how fixnums and immediates are not passed to the externals predicate. And those changes got tied up with some other corrections and improvements to make boot files more deterministic with cross compiling from a 32-bit machine to a 64-bit machine and vice versa. That commit could use further review.

JT: keeping this in the TO-DO section until someone (with experience in fasl and cross compilation) can take a look at that commit.

Outstanding

These items are outstanding (as in “unresolved”, not “most excellent” - although they may be that as well), with no consensus on what action(s) (if any) should be taken.

List operations in the face of concurrent modification

Not strictly related to merging the Racket fork, but captured here since it’s partially addressed by a commit in the Racket fork.

As noted in cisco#599, the behavior of certain list operations has no clear semantics in the presence of concurrent modification to the list structure. Commit 26f79b57 addresses the possibility of a crash (via invalid memory reference) specifically in list?. The tortoise/hare idiom is pervasive throughout the list operations, so one might reasonably assume that the same or similar deficiencies exist in the other places that it is used. (Note that in most of the other operations these variables are called fast and slow.) Also note that it is possible to construct a sequence of concurrent edits that result in the algorithm looping forever.

MF: I really should have noticed that the problem is more pervasive. I advocate fixing it everywhere, as reflected in 6a164b34. Except that it seems I still missed the one in “syntax.ss”, at least. (Edit: 1729721b51) (If “4.ss” is about do-continuation-marks-first, I don’t think that has the same issue.)

This leads to the following questions:

Do we accept the changes in 26f79b57?

I don’t think there’s much argument that avoiding an invalid memory reference is better than not, even when you’re already off in undefined-behavior-land. So I think this comes down to two questions:

Are we ok with the performance impact?

As noted, it’s likely to be small, since the inserted if branch is easily predicted. I imagine some of the maintainers will want to compare benchmarks, if Andy has managed to get those in some more usable shape.

MF: One attempt to measure performance is here.

Is a solution that still allows for infinite loops ok?

I hope so, since I don’t really see a way around this that isn’t horribly expensive.

Do we need to change other code using this idiom?

list-length uses this idiom (with the tortoise variable name), and there are 14 other instances that use slow as the variable (mostly in 5_2.ss, but one each in 4.ss and syntax.ss).

Ideally we’d evaluate each one and decide whether that particular algorithm is succeptible to errors in the presence of concurrent modification, but that seems like a lot of work. Copying the lengthy comment repeatedly seems not great, and the extra text will make the overall algorithm less clear. One could imagine defining a macro to encapsulate the logic and accompanying comment; changing each call site to, say, (advance-tortoise tortoise hare).

MF: As I tried this, it seemed most workable to adapt each copy, partly because a non-pair is handled in varying ways.

How should the documentation be updated?

What updates, if any, should we make to the documentation? At one extreme, we could attempt to document the thread-safety of every builtin procedure. At the other extreme (disregarding the possibility of making no changes at all), we could simply add text to the Engines introduction to note that mutations to shared structures require care (similar to the language already present in the Thread System introduction).

box-immoble

Maybe should be immobile-box for symmetry with make-immmobile-vector and make-immobile-bytevector?

MF: box-immobile is meant to be like box-immutable. Maybe box-immutable was a bad choice, too, but that would be a change to Chez Scheme at this point.

begin-unsafe

As a user, I’d expect this to work in definition context as well (and be confused when it doesn’t, even though the documentation doesn’t say that it does). The implementation for that looks a bit more complicated, but users don’t care about that.

No mats.

MF: For now, I’ve just tried to sweep this under the rug by changing it to a $begin-unsafe system form.

JT: I think that’s a good idea. I feel like there should be better way to achieve what you described in the email thread, but I can’t seem to think of it. I think maybe coding in c++ for work is giving me brain damage.

assert-unreachable

Why no “who”, message, and optional irritants args? (And presumably we’d want an assert-unreachablef version to specify a format string as the message, too).

I guess this is supposed to be more like assert than assertion-violation, but it seems a shame not have a form that both allows a user-defined message and has the “unreachable” property.

MF: I originally avoided this because I didn’t want to think about what (assert-unreachable who "oops") should mean if who turns out not to be a symbol. But it seems that the misuse of an unsafe primitive is allowed to do anything, anyway, including acting like a valid use of assert-unreachable. So, I’ll think more, but likely make this change.

MF: I now remember this road that I’ve been down before. The problem is not (assert-unreachable who ....) where who might not be a symbol, but something like (assert-unreachable (get-who) ....) where get-who might raise an exception. In that case, the call to assert-unreachable is not reached. The optimizer pass that recognizes unsafe-unreadable to prune other code could try to make sure the arguments are simple enough, but that makes the intended use of the function more fragile. (Also, it turns out to be in a pass where the simple-enough predicate is not readily available.)

So why not an assert-unreachable syntactic form that behaves differently depending on whether it’s used in safe or unsafe context? The problem is that it’s intended to be used as unsafe within an otherwise safe context. Absent begin-unsafe, there’s not a simple way to use an assert-unreachable form that way.

That’s why assert-unreachable is a function: so it can be referenced with #3%. And the complications of dealing with arguments are why it doesn’t take them. There may be another way around that I’m not seeing, even on a second try.

JT: Iiinteresting. That all makes sense, but I feel fundamentally unsatisfied with this implementation. (From a user standpoint, regardless of any opinions as an implementer.)

I think at the root of it all is that if some ostensibly unreachable code is actually reached in safe mode, there is no feedback to the user about which unreachable code was executed. The assert form reports a source location (if one is available). So, as a user, I would be wondering why bother using the assert-unreachable form at all rather than writing my own meta-cond and using a regular assert in “safe mode” so that I get useful information if my code is broken. From a user’s perspective at the very least we should allow a string literal to identify which assert-unreachable was actually reached.

With all that said, a zero-argument form is forwards compatible with forms that accept additional arguments, should we figure out how to implement them in the future.

compile-to-port

Probably resolved, but previous discussion follows.

Documentation appears incorrect for the two new signatures, due to not including the (currently undocumented) machine and hostop args (see compile.ss) before the new args. Not sure if I should add the extra signatures and fix up the documenation or if they’re undocumented for a reason. It’s already a little ridiculous to have 6 different signatures documented, but what’s a couple more.

Alternately, we could move the previously undocumented arguments to the end. That’s a backwards-incompatible change, but a change to an undocumented “feature”.

MF: I updated the documentation here to match the current implementation, even though I have otherwise been leaving documentation changes to your edits.

reference bytevectors

possible name changes extracted out of the text below: Rename bytevector-reference-{ref,set!} to reference-bytevector-reference-{ref,set!}, since they only work on reference bytevectors? (It does seem a bit redundant to have “reference” in the name twice.)

If I understand this feature correctly, a reference bytevector is just a bytevector that’s handled specially by the garbage collector. It feels like it’s not 100% fully baked around the edges - or at least the documentation is incomplete around the edges. It’s pretty easy to force an invalid memory reference, so the documentation should at least be clear about (a) in which functions this can happen, and (b) exactly where the “undefined behavior” line is.

I think most of the issues around this stem from the fact that you can do unaligned reads and writes in the bytevector. (Setting aside the fact that you can use any of the other bytevector set functions to arbitrarily corrupt addresses.) I assume it’s that way for consistency with the other bytevector ref/set functions, which is probably for the best (although one could imagine an API where the offset is given in terms of pointer word size rather than bytes). One might also imagine run-time checks to disallow offsets that aren’t aligned if interrupts are currently enabled. The expository text seems to indicate that bad things can happen if there’s garbage in the bytevector, but that wasn’t one of the ways I managed to trigger an error.

I kinda want to rename these to reference-bytevector-reference-{get,set!}, since they only work on reference bytevectors, but it does seem a bit redundant to have “reference” in the name twice.

(bytevector-reference-set! bv n obj) is just (bytevector-uNN-set! bv n (object->reference-address obj) (native-endianness)) with NN appropriate for the current machine’s pointer width, right? (Plus the check that it’s a reference bytevector.) And similarly for ~bytevector-reference-ref?

It’s not entirely clear to me whether “reference to a Scheme Object” also includes immediates, but I think that everything works as described if it does. Probably not worth trying to make any distinction in the docs.

I’ve attempted to explicitly identify the places where there can be undefined behavior (e.g., invalid memory references) in my copyediting of the docs.

MF: (bytevector-reference-set! bv n obj) is the same as (bytevector-uNN-set! bv n (object->reference-address obj) (native-endianness)) only if interrupts are disabled to prevent a GC in between the steps, right?

Continuation Marks

*This section has only the “unresolved” or not-yet-acted-upon bits of the discussion *

I don’t really understand the last sentence in the opening paragraph:

When a continuation is captured with call/cc, only the marks of the rest of the continuation are captured, and continuation-next-marks returns the captured marks.

Part of the issue is the phrase “rest of the continuation”. I’m pretty sure I know what it means, but I don’t think it’s defined anywhere. The rest of the issue is that I’m not sure why the last clause is there. Isn’t it just the logical consequence of the first part of the sentence? Is there some particular reason it’s being called out again explicitly?

MF: Its just meant to be a pointer to the `continuation-next-marks` procedure, a kind of forward reference alongside the mentions of the other procedure names. I agree about defining “rest of the continuation”.

Time complexity of continuation-mark-set->iterator is also incomplete. There must be some component that’s proportional to the size of key-vector.

Obtaining an iterator from continuation-mark-set->iterator takes constant time. Each call to an iterator takes time proportional to the size of continuation mark tables that are traversed to find one of the keys in key-vector.

MF: agreed

Backwards-incompatible changes

A major version bump is a time when we can make backwards-incompatible changes, if there’s a good reason to do so. If we do, then the change needs to be clearly documented in the release notes.

immutable quoted values?

Nothing to do with the Racket fork, but a major version bump would be a good time to introduce this backwards-incompatible change: make the objects “returned” by quoted values be immutable (as in the reified run-time property of whatever object type), not just “immutable” (as in the specification says the value should not be modified but doesn’t specify behavior if the program attempts to modify them).

For example, make (immutable-vector '#(1 2 3)) return #t.

MF: Since we don’t have an immutable variant for pairs and some other datatypes, enforcement of immutability would be pretty limited. Instead of distnguishing the values where we happen to have immutable variants, it seems easier, clearer, and not much worse to leave this alone. (FWIW, Racket does enforce immutability, but that’s combined with an avoidance of reader syntax for datatypes that have only mutable values.)

Hashtable cell on immutable hashtabes

As noted in the (now hidden) resolved issue about inconsistent documentation regarding the -cell and -ref-cell functions on various kinds of hashtables, the documentation says the cdr of those pairs “should” not be changed. We could instead either reject immutable hashtables, or return immutable pairs (although I’m not really sure why anyone would want that).

The current implementation definitely allows modification of an immutable hashtable, as shown in the following transcript:

> (eq-hashtable-cell iht 'a 5)
(a . 1)
> (eq-hashtable-set! iht 'a 2)
Exception in eq-hashtable-set!: #<eq hashtable> is not mutable
Type (debug) to enter the debugger.
> (set-cdr! (eq-hashtable-cell iht 'a 5) 3)
> (eq-hashtable-cell iht 'a 5)
(a . 3)

MF: I changed all the hashtable ...-cell and ...-cells functions to require a mutable hashtable.

JT: Any particular reason the changes to hashtable-cell put the mutablility check inside the case rather than before it? I would assume that doing the check before instead of on each branch will produce better code, and since this is already a backwards incompatible change I don’t know why we’d need to preserve the error message produced when trying to get a cell from an immutable symbol hashtable using a non-symbol key. (Related: the mutability check appears to be missing in the symbol hashtable case in hashtable-ref-cell, which, of course, wouldn’t have happened if the check was before the case.)

Release notes should probably be more specific about exactly which procedures have different behavior now. I can take that as a to-do, if you haven’t already done it by the time I get around to it.

MF: Thanks for improving the mutability check and release notes. Yes, the original goal was just to preserve the order of checking.