Skip to content

Proposal: remove multi-line strings from the language #25

@ahelwer

Description

@ahelwer

Another one for the "features SANY has had since the 1990s that nobody knew about before looking at the code this year" category. Issue tlaplus/tlaplus#802, to get SANY to support backticks in strings, was thought to be a simple one good for new contributors and was highlighted in one of the monthly development updates. New contributor @moiseev very competently took on the issue and discovered that actually SANY did support backticks in strings, as long as they were balanced by a closing apostrophe; like:

balanced == "this `string' contains `backticks'"

If a closing apostrophe was not given, SANY will read to the very end of the file or the next time it encounters an apostrophe (say, the prime operator) resulting in a confusing error.

@moiseev also discovered this backtick/apostrophe balancing had ramifications for the behavior of the tla2tex translator program. tla2tex maps backticks & apostrophes in strings to UK English-style inverted commas ‘ and ’. However, tla2tex does not require that the backticks & apostrophes be balanced; this is a constraint enforced only by SANY.

@calvin then noticed something further: SANY allows newlines, tabs, and other blank space characters in between backticks and apostrophes! Although the highlighting will be broken in the below snippet (since the tree-sitter grammar does not support this) the following will be parsed by SANY:

---- MODULE Test ----
EXTENDS TLC

MultiLineString == "`
this
is
a
multi
line
string
'"

ASSUME Print(MultiLineString, TRUE)
====

If checked by TLC, it will print out "`\nthis\nis\na\nmulti\nline\nstring\n'" so this is supported all the way up the stack. So, TLA+ has multi-line strings! Who knew!?

There are two questions for the Specification Language Committee (SLC) to answer:

  1. Should TLA+ require matching backticks to apostrophes in strings, or allow arbitrary unbalanced backticks & apostrophes in strings?
  2. Should TLA+ support multi-line strings as currently implemented in the Java tools?

None of us could find any reference to backticks in strings in the various language standards documents. It is also possible that the multi-line string functionality is actually an unintended bug in the backtick/apostrophe matching regex.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions