@mabruzzo
Collaborator

This introduces a data structure called FrozenStringIdxBiMap. It is a bidirectional map (aka a bidirectional dictionary) that maps between a set of n unique strings (keys) and a set of n unique indexes (the integers 0 through n-1), in either direction.

Note

The implementation of the data structure in this PR is overly simplistic (lookups have O(N) complexity rather than O(1) complexity). This was intentional, to make the PR easier to review.

PR #270 proposed a very similar data structure that has a much faster implementation. It's on my to-do list to redo PR #270 and replace the implementation proposed here. (I'm just going to need to make a few tweaks to the creation/destruction functions so that the API is consistent with what we propose in this PR.)
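
To make the interface and the O(N) trade-off concrete, here is a minimal sketch of a linear-scan bidirectional map. It is not the code in this PR: the names `SketchBiMap`, `sketch_new`, `sketch_drop`, `sketch_idx_from_key`, and `sketch_key_from_idx` are hypothetical, and the choices to copy the keys and to reject duplicates are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch; NOT the API or implementation from this PR. */
typedef struct {
  int n;       /* number of keys currently stored */
  char** keys; /* keys[i] is the string mapped to index i */
} SketchBiMap;

/* Free the map and the key copies it owns (accepts NULL). */
void sketch_drop(SketchBiMap* m) {
  if (m == NULL) return;
  for (int i = 0; i < m->n; i++) free(m->keys[i]);
  free(m->keys);
  free(m);
}

/* Build a map from n unique strings (the strings are copied). Returns NULL
 * on allocation failure or if a duplicate key is found. */
SketchBiMap* sketch_new(const char* const* keys, int n) {
  assert(n >= 0 && (keys != NULL || n == 0));
  SketchBiMap* m = (SketchBiMap*)malloc(sizeof *m);
  if (m == NULL) return NULL;
  m->n = 0;
  /* allocate n + 1 entries so that n == 0 never requests zero bytes */
  m->keys = (char**)calloc((size_t)n + 1, sizeof *m->keys);
  if (m->keys == NULL) { free(m); return NULL; }
  for (int i = 0; i < n; i++) {
    /* reject duplicates so the mapping stays one-to-one */
    for (int j = 0; j < i; j++) {
      if (strcmp(keys[i], m->keys[j]) == 0) { sketch_drop(m); return NULL; }
    }
    size_t len = strlen(keys[i]) + 1;
    m->keys[i] = (char*)malloc(len);
    if (m->keys[i] == NULL) { sketch_drop(m); return NULL; }
    memcpy(m->keys[i], keys[i], len);
    m->n = i + 1; /* keys[0..i] are now owned by the map */
  }
  return m;
}

/* key -> index: linear scan, O(N) string comparisons; -1 if absent. */
int sketch_idx_from_key(const SketchBiMap* m, const char* key) {
  for (int i = 0; i < m->n; i++) {
    if (strcmp(m->keys[i], key) == 0) return i;
  }
  return -1;
}

/* index -> key: O(1) array access; NULL if idx is out of range. */
const char* sketch_key_from_idx(const SketchBiMap* m, int idx) {
  return (idx >= 0 && idx < m->n) ? m->keys[idx] : NULL;
}
```

For a map built from `{"density", "energy", "metal_density"}`, `sketch_idx_from_key(m, "energy")` returns 1 and `sketch_key_from_idx(m, 1)` returns `"energy"`. Only the string-to-index direction needs a search, which is where a faster implementation would differ.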

I intend to use this as a building block to implement other types (namely types with a map-like interface) and initialization code. There are some obvious applications related to the dynamic API. On a couple of occasions I have wanted this kind of data structure and wrote crude workarounds instead.

The underlying implementation is based on a hash table. To implement this, I ported a class (from C++ to C) that I previously implemented in Enzo-E to perform the same function (called StringIndRdOnlyMap). In addition to porting the logic from C++ to C (see footnote 1), I also added logic to handle a few edge cases. For context, I originally wrote the Enzo-E class to make use of more specialized logic, which should make it faster than a solution that makes use of more generic types like std::map from the C++ standard library.
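
As a rough illustration of how a hash table can serve a fixed set of string keys, here is a minimal sketch. It is not the ported StringIndRdOnlyMap logic: the `HashSketch` type, the `hash_sketch_*` function names, the FNV-1a hash, linear probing, and the load factor of at most 0.5 are all assumptions chosen for brevity. The sketch also borrows the key pointers rather than copying them, so the caller must keep the strings alive.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch; the names, the FNV-1a hash, and linear probing are
 * assumptions for illustration and are not taken from this PR. */
typedef struct {
  int n;             /* number of keys */
  int capacity;      /* number of slots: a power of 2, at least 2 * n */
  const char** keys; /* keys[i] is the (borrowed) string with index i */
  int* slots;        /* slots[s] holds a key index, or -1 if empty */
} HashSketch;

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t fnv1a(const char* s) {
  uint32_t h = 2166136261u;
  for (; *s != '\0'; s++) {
    h ^= (unsigned char)*s;
    h *= 16777619u;
  }
  return h;
}

void hash_sketch_drop(HashSketch* m) {
  if (m == NULL) return;
  free(m->keys);
  free(m->slots);
  free(m);
}

/* Build the table once; it is never resized afterwards ("frozen").
 * Returns NULL on bad input, allocation failure, or a duplicate key. */
HashSketch* hash_sketch_new(const char* const* keys, int n) {
  assert(keys != NULL || n == 0);
  if (n < 0 || n > INT_MAX / 4) return NULL;
  HashSketch* m = (HashSketch*)malloc(sizeof *m);
  if (m == NULL) return NULL;
  m->n = n;
  m->capacity = 2;
  while (m->capacity < 2 * n) m->capacity *= 2;
  m->keys = (const char**)malloc(((size_t)n + 1) * sizeof *m->keys);
  m->slots = (int*)malloc((size_t)m->capacity * sizeof *m->slots);
  if (m->keys == NULL || m->slots == NULL) { hash_sketch_drop(m); return NULL; }
  for (int s = 0; s < m->capacity; s++) m->slots[s] = -1;

  uint32_t mask = (uint32_t)m->capacity - 1u;
  for (int i = 0; i < n; i++) {
    uint32_t s = fnv1a(keys[i]) & mask;
    while (m->slots[s] >= 0) { /* probe until an empty slot is found */
      if (strcmp(m->keys[m->slots[s]], keys[i]) == 0) { /* duplicate key */
        hash_sketch_drop(m);
        return NULL;
      }
      s = (s + 1u) & mask;
    }
    m->keys[i] = keys[i];
    m->slots[s] = i;
  }
  return m;
}

/* key -> index: O(1) probes on average; returns -1 if absent. The loop
 * terminates because at least half of the slots are always empty. */
int hash_sketch_idx_from_key(const HashSketch* m, const char* key) {
  uint32_t mask = (uint32_t)m->capacity - 1u;
  for (uint32_t s = fnv1a(key) & mask;; s = (s + 1u) & mask) {
    int idx = m->slots[s];
    if (idx < 0) return -1;
    if (strcmp(m->keys[idx], key) == 0) return idx;
  }
}

/* index -> key: O(1) array access; NULL if idx is out of range. */
const char* hash_sketch_key_from_idx(const HashSketch* m, int idx) {
  return (idx >= 0 && idx < m->n) ? m->keys[idx] : NULL;
}
```

Because the key set never changes after construction, a table like this needs no deletion or resizing logic, which is one way a specialized structure can be simpler than a general-purpose map.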

Footnotes

  1. With the benefit of hindsight, the effort to convert from C++ to C took way more time than I expected and it probably wasn’t worth it. In fact, it may make more sense

There are some unfortunate consequences, but if we don't do this you end
up with some pretty confusing-looking code... (the confusion primarily
arises if new_FrozenKeyIdxBiMap or drop_FrozenKeyIdxBiMap has different
behavior from other functions with similar names). I really think it
would be better to make this into a simple C++ class with a constructor
and destructor, but that's a topic for another time.
@mabruzzo mabruzzo changed the base branch from main to newchem-cpp November 19, 2025 15:59