-
Notifications
You must be signed in to change notification settings - Fork 70
Finalize Maze Generation #946
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Use list `pop` and `append` methods instead
|
A quick summary of what I did or tried to do:
Upcoming:
Unfortunatly, I was not able to get away with a single Despite that, I am especially excited about the partitioning parameters. With them I could adjust the mazes predictively. |
Debilski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work. I haven’t looked at it in too much detail but I’ve added added a few comments.
A few more general remarks: The Partition class looks slightly over-engineered, but if it allows us to expose some partition properties in an invariant way, then I guess it makes sense. :)
I think I would move all logic to the split function and then remove the need of having to use a rng in __init__. (Or the other way round: We could already do the whole splitting in __init__ as well and keep the sub-partitions in a list and evaluate the whole tree …)
Some further design ideas regarding the algorithm (these do not need to be implemented now but I think some ideas are useful to keep in mind while writing the code):
- Currently we split horizontal, vertical, horizontal, … I think we could also split in an irregular pattern using the rng (with some bias maybe)
- There is no reason for splitting in halves or is there? So we might also have more than two sub-partitions (again perhaps drawn with a biased rng). (Maybe irregular splitting is also equivalent to splitting in more than two sub-partitions?)
Instead of being the area inside these walls
e03d037 to
321492e
Compare
The sampling still shuffles candidates for removal instead of sampling the candidates to keep directly
In `u`-`v`-space, the inner partition wall is always vertical, which means that the wall position is always sampled on the `u` axis and the wall length is measured along the `v`-axis
|
@Debilski Thanks for your feedback! I felt the same and hoped for exactly that: proper invariant definitions of partition measures. Unfortunately, in the end, the encapsulation in the Despite that, your feedback remains valuable for the other design decisions I took and will take.
I would come back to this once we consider parametrization of the maze generation. 👍
More than two sub-partitions wouldn't conceptually make a difference, if we broke up the alternating vertical-horizontal pattern. |
|
In 0ac6d44, I got rid of all the Also, I think my obsession with variable naming payed off here, clearifying variable relations and aligning similar lines of code. (I hope you like it 😏) |
|
I updated all comments regarding the half maze generation alongside an ASCII art of a partition. The next thing before breaking compatibility with NumPy mazes would be to fix the
|
Connections between border gap pairs, which would be connected after mirroring, emulate the continuation on the right maze
|
Great news! The chamber finding now works solely on one maze half. 😎 The clue was to add edges between the border gaps which would also be connected after the maze was mirrored. I included an ASCII art for this in the code comments: That also improved performance: reference on
|
|
For my part, we should be ready for breaking NumPy compatibility. 🚀 |
|
@otizonaizit That looks weird, but it's indeed in line with our definition of chambers 😁 The chambers we are interested in are actually the "side" ones which have a single articulation point in common with the "main" chamber. We can thereby go on with life - or reason about making the total food count dependent on the chamber size, i.e. giving the number of food pellets in terms of percentages instead of absolute numbers. This would avoid such surprises, but actually I like these kind of quirky cases. |
otizonaizit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I am really liking this. It is work well invested, I am sure our future selves are going to be for ever grateful that this has been redesigned with legibility in mind :-)
I left some minor/middle-sized comments.
|
|
||
|
|
||
| def find_trapped_tiles(graph, width, include_chambers=False): | ||
| def find_trapped_tiles(graph, gaps, include_chambers=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we be consistent with the way stuff is defined in this module and call this find_chambers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know what you mean, but this particular name clashes with the return types: this function returns the union of all side chamber tiles and, on demand, the individual side chambers as well.
But I also don't have a solution to that. 🤔
|
|
||
|
|
||
| def find_trapped_tiles(graph, width, include_chambers=False): | ||
| def find_trapped_tiles(graph, gaps, include_chambers=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that gaps is not a good name for the argument, or we need a doc-string that explains what it means. In general, maybe this could be by default None, so that find_chambers can be tested and used independently of our particular use case? Not sure if the generalization is useful, but still thinking about us in 5 years looking back, I think it will be odd if we don't explain this or make it general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
find_trapped_tiles should be useful for users, right? Wasn't it used before for the bots?
Therefore, a generalized function would be nice to have, I think.
I unfortunately don't know how to do this properly (/wrt naming, signature, functionality).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m only reading with one eye (can check more in a few days I hope), but I’d suggest to remove the inlude_chambers from the function and do this separately. It doesn’t feel right to have the type of the returned value depend on an argument.
| # only the main chamber covers both sides | ||
| # our own mazes should only have one central chamber | ||
| # but other configurations could have more than one | ||
| if (chamber & gaps): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I get the logic. It clearly works, so I just need help to understand why. So, we have the following maze:
####################
# .....XX #
# #####XX #
# XX #
# XX #
# XX #
# XX##### #
# XX #
####################
the tiles with the dots are within a chamber, are they not? And this chamber intersects the border (denoted by X), right? So how can this be automatically assigned to the main chamber?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before, this condition was
if min_x < width // 2 <= max_x:checking the left-most and right-most tiles of a chamber yielded by networkx.biconnected_components.
As it is chamber, by definition we know that those two tiles are seemlessly connected.
In essence, only a main chamber crosses the border.
We can rephrase this: When there is any tile of that chamber intersecting with the border gaps, then this is considered a main chamber.
The new condition
if (chamber & gaps):for the half maze is equivalent to that, when every border gap on the left will be connected to a border gap on the right.
In our case, this is always given by the border being centrosymmetric.
Regarding your example: Assuming you mean
X X
| |
########## ##########
# ...... #
# ###### #
# #
# #
# #
# ###### #
# ...... #
########## ##########
then you are right. The dots mark a chamber in a half maze, but on a full maze we consider this to be part of the main chamber (it has two entrances and crosses the border). Actually, networkx.biconnected_components yields the dots as multiple chambers:
# (9, 1) is a border gap
[(8, 1), (9, 1)]
[(7, 1), (8, 1)]
[(6, 1), (7, 1)]
[(5, 1), (6, 1)]
[(4, 1), (5, 1)]
# (3, 1) is deleted as it is also included in the big chamber
[(3, 1), (4, 1)]
# [ plus the big chamber, left out for brevity ]
because removing any of those (articulation) points would split the graph into two separated subgraphs.
For completeness, we now add our fake edge to emulate the right (mirrored) maze half:
X
|
##########
# ───┐
# ###### │
# │
# │
# │
# │
# ───┘
##########
and we can do that like this because our border is centrosymmetric. We guarantee left and right border gaps to exist as mutual pairs only.
The nodes which were articulation points before are no articulation points anymore since they don't split the graph into individual components when removed: the chamber is now a part of the main chamber by definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@otizonaizit Does my answer help?
| ngaps = max(1, ngaps) | ||
|
|
||
| # copy framing walls to avoid side effects | ||
| walls = walls.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one I don't get. It is so that you can return walls at the end of the function that can then get intersected with the one you passed? I don't know, I think pure functions are not a Python requirement, so maybe here, given that the function is called add_inner_walls, we can just modify walls in place and do not return anything? We can even all it insert_inner_walls if you want to be even more clear that you are messing with the original set of walls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This came from this review conversation.
And I thought to remember that pure functions are desired in Pelita code.
I wanted to make this function just returning inner walls, so that I can unite these with the framing walls. However, the binary space partitioning depends on outer walls and it does not make sense to have a collection of all walls within that function and keeping another set with only the inner walls to return, just to unite those with the outer walls again outside the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have convention of favoring pure functions in Pelita. It may be a personal reference, but I personally think it is unpythonic to copy the variable at the beginning of a function, then change it in place, and then return it to then merge it with the thing you passed in to begin with. It seems to me much more pythonic to just give the function a proper name, change the thing in place and return None.
|
|
||
| # Generate a wall with gaps at the border between the two homezones | ||
| # | ||
| # BORDER SAMPLING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above: we are punching holes, we are not sampling
| gaps.add(lower) | ||
|
|
||
| # edges between those gaps which would be connected after mirroring | ||
| edges.add((upper, lower)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this trick, but I am afraid this will be very confusing in the long run. Maybe it is enough to find a better name for this? After all these are "fake" edges, in the sense that they are not going to be there once we have the full maze, right? It should be clear that this here is a trick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It definitely took me some time to get it.
Yes, these edges are fake and not in the graph representation of the maze in a game run.
Renaming is always a good start. I will try.
| # emulate the right maze side by wiring up pairs of left border gaps which | ||
| # would be connected after mirroring: | ||
| # | ||
| # ############ | ||
| # # ───┐ | ||
| # # # │ | ||
| # # ──┐│ | ||
| # # # ││ | ||
| # # ─┐││ | ||
| # # ─┘││ | ||
| # # # ││ | ||
| # # ──┘│ | ||
| # # # │ | ||
| # # ───┘ | ||
| # ############ | ||
| # | ||
| # motivation: mitigate border gaps being detected as individual chambers, | ||
| # which would make it impossible to detect the main chamber | ||
| # assumption: border gaps are sampled centrosymmetric with always a | ||
| # wall segment in the middle on odd heights | ||
| graph.add_edges_from(edges) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given this explanation, maybe you can refer to it from the code above where you introduce the variable edges?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't like that to be that separated, but I need to figure out a better code structure - if there is any.
Otherwise yes, mutual reference of both code blocks clarifies the relation.
| if neighbor not in walls: | ||
| # check if the new node is free and within the bounds as | ||
| # the walls might not be closed | ||
| if neighbor not in walls and neighbor[0] < width and neighbor[1] < height: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this could be changed so that the if exits faster when one condition is false? Wouldn't this be the same logic?
if neighbor in walls or neighbor[0] >= width or neighbor[1] >= height:
continueGiven that this function is one of the bottlenecks in maze generation but also, unfortunately, also for Bot, because the graph gets generated once for each Bot instance, it would be cool to optimize as much as possible. I am not sure my suggestion above does indeed improve things, but I guess it should?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked, and the changes in the condition are not significant to the speed as far as I can tell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, then it doesn't matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an idea how to speed up walls_to_graph further?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the game it should only run once per team (and otherwise be cached), so I think it is ok. Also the function spends 60% of the time in graph.add_edge which we don’t have any control over. At best, we can therefore half the time.
OK, so let's go on with our lives :-) |
|
@otizonaizit I feel like the I am thinking about stuffing all of the code into |
|
Hmm, as long as stuff is still testable... And that the effort does not become over engineering... Because yes, it is useful for stuff to be more general but not everything must be general for a hypothetical use case that is very unlikely to materialize. I thought that a function that is useful to reuse while writing a bot may be And even if that function were reusable, I wouldn't advertise it, or it will make the whole idea of having tunnels and chambers much less challenging for the students. So, the code should be very readable and transparent, because we want it to be of educational grade if students look at it, but we are not inviting people to reuse it in some other random projects. |
| # minimum partition width | ||
| MIN_WIDTH = 5 | ||
|
|
||
| # minimum partition height | ||
| MIN_HEIGHT = 5 | ||
|
|
||
| # partition padding for wall position sampling | ||
| PADDING = 2 | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For future work, once the design has settled: I assume that these constants are – in principle – configurable. If so, I think it would make sense to add a check in generate_maze that the variables make sense and fail with a sensible error message. The maze generation should then ideally not fail anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, they can be added as keyword arguments to the main function, which makes it a bit uglier because then you have to pass them around. But I don't see the need to make them configurable. We haven't felt the need to change them until now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean to keep them as constants here and not expect it to be user-configurable. More like as an experimentation playground (and as a tool to gain some understanding on how the maze algorithm works).
That said, MIN_WIDTH and MIN_HEIGHT work in u-v space so I am not sure these would be the right names here.
Also, we currently seem to have the minimum size for a partition (which is the term here for a rectangular area that still contains a wall – and that must contain a wall): So if I increase the minimum width to a high number, I can still end up with very narrow alleys because the wall could be placed at the side of a partition.
Is there a reason why it is done like this? We could also have the splitting mechanism check that the walls are placed such that the resulting parts are big enough. (And if they are not then no split occurs.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MIN_WIDTH currently means "minimal splittable width" and MIN_HEIGHT "minimal wall length for a splittable partition". Maybe MIN_SPLITTABLE_WIDTH and MIN_SPLITTING_WALL_LENGTH are better, but I would invest a good amount of time (again) to come up with better fitting short constant names.
if I increase the minimum width to a high number, I can still end up with very narrow alleys
Currently, you would tune PADDING for that, which is effectively the hard size minimum of a partition.
Is there a reason why it is done like this? We could also have the splitting mechanism check that the walls are placed such that the resulting parts are big enough. (And if they are not then no split occurs.)
With the current implementation, you can realize a wider variation of chamber sizes. If you wanted the bigger chambers to be even bigger, but also have some very small chambers here and there, you would just increase MIN_WIDTH and leave PADDING as is. If you wanted to have bigger chambers overall, increase PADDING - and possibly MIN_WIDTH, too - with respect to the implicit relation MIN_WIDTH > 2 * PADDING (otherwise the code throws an exception).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would renaming like this improve that?
| old | new |
|---|---|
MIN_WIDTH |
MIN_SPLITTABLE_WIDTH |
PADDING |
MIN_WIDTH |
In code, we would need to add an appropriate offset to the new MIN_WIDTH to resemble the old PADDING.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Debilski And what would be the meaning of a minimal partition width for you:
- the partition with walls (in line with the partition definition including surrounding walls; the minimal value allowed would be
3) or - the inner of the partition, i.e. the chamber size (against current partition definition, but more intuitive; minimal value allowed would be
1)?
| rng = default_rng(rng) | ||
|
|
||
| if k < len(nodes): | ||
| return set(rng.sample(sorted(nodes), k=k)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for sorting the nodes first? Also, is this function really needed? Just using rng.sample in the code doesn’t seem to be any worse to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is just temporary until we decide to break numpy compatibility. This will happen soon in this very same PR. For now we keep it like this so that we are sure that refactoring does not change the generated mazes, that until now, when generated with the same seed, look exactly the same as they used to look when generated with numpy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah, I forgot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this sample_nodes function in distribute_food only. Here it could be the case that we have, say, 10 tiles (n), but want to distribute 30 food pellets (k) on it. Since the number of samples is greater than the size of the population to sample from (k > n), random.sample throws an error; it requires 0 <= k <= n.
sample on the other hand is the temporary one for NumPy compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for sorting the nodes first?
random.sample requires the population to be a list, but nodes is a set in our case. Additionally, sorted ensures the same order of elements across runs and thereby reproducability despite randomness.
| if include_chambers: | ||
| subgraphs = graph.subgraph(chamber_tiles) | ||
| chambers = list(nx.connected_components(subgraphs)) | ||
| else: | ||
| chambers = [] | ||
|
|
||
| return chamber_tiles, chambers | ||
| return chamber_tiles, chambers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a new function. (Also it isn’t used in the code currently.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do that, but then we should make clear somehow that both functions must use the same graph object.
| # then we need to rewrite mirror to mirror a set of coordinates around the center | ||
| # by discarding the lower part of the border | ||
| # start with a full wall at the left side of the border | ||
| pos = width // 2 - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this name confusing. This seems to be something specific but calling it pos makes me think that it gets updated at a later point (also it is not a pos which should be 2d).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yes, I see. What about these:
xborderxxvaluexval
and also for pos in u-v-space in add_inner_walls accordingly?
| PADDING = 2 | ||
|
|
||
|
|
||
| def mirror(nodes, width, height): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not what a mirror does :)
Also, maybe take a shape instead of w, h?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. The proper term for this symmetry operation here is "inversion" with respect to a center of symmetry.
I thought about passing that - the center of symmetry - instead of width and height or shape to be consistent with the function rename, but with that comes just more unnecessary complexity: calculating the center of symmetry from width and height, possible float operations, vector manipulations.
So, I guess the next best thing is shape.
But then, this is inconsistent with generate_maze. 🙄
| p += 1 | ||
| # insert a wall only if there is some space around it in the | ||
| # orthogonal `u`-direction, otherwise move on with the next partition | ||
| if ulen < rng.randint(MIN_WIDTH, MIN_WIDTH + 2): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a magical coin flip.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without that variable size check, the mazes look kind of the same with equally sized areas, just arranged differently.
Maybe the magic goes away when we define the 2 in a VARIATION constant?
| transform = identity if vertical else transposition | ||
|
|
||
| # map `x`-`y`-coordinates into `u`-`v`-space where the inner wall is | ||
| # always vertical |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* perpendicular to the u-plane
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, or "oriented in v-direction".
| # wmax pmax | ||
| # | ||
| # | ||
| # Note: the inner wall is always vertical. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don’t use ‘vertical’ here. Otherwise it is very confusing with the variable vertical, which is still in xy space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point! I would use "oriented in v-direction" instead of "vertical".
| xmin, ymin = pmin | ||
| xmax, ymax = pmax |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should not be needed if we strictly work in uv space, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without that, I would need to write
umin, vmin = transform(*pmin)
umax, vmax = transform(*pmax)instead of
(umin, umax), (vmin, vmax) = transform((xmin, xmax), (ymin, ymax))to get the u- and v-ranges from the x- and y-ranges, right? 🤔
Then, I prefer the current implementation, as I can directly see the mapping x, y -> u, v and I have to call transform only once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Debilski Or have I misunderstood your question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Yeah, your reasoning makes sense. But I’ve now thought about the lines here:
wmin = transform(pos, vmin)
wmax = transform(pos, vmax)
And I believe it would all in all be more natural to have transform take a tuple that gets transposed instead of two separate variables.



Closes #945