r/adventofcode Apr 20 '26

Other [2018 Day 20] In Review (A Regular Map)

Today we find ourselves suddenly inside the newly built North Pole complex, which is a maze.

This one was particularly fun. Instead of abusing regex to get an answer, we're given a big regex and asked to understand its extent. Which is the description of the maze as relative directions, including branches (and even empty branches like (NEWS|)). The regex in my input does fully visit every room of a 100x100 maze, starting just off the center.

Not that I bothered with walls and doors like in the examples... I just did the rooms and trusted it was a maze. I did a recursive descent parser on the parenthesized expression. Tracking the coordinates visited, length, and depth. Of particular interest are those empty branches... the other part is always some path that doubles back to the same spot (as seen in the example text with NEWS, WNSE and SWEN). Basically tracing out a dead-end before moving things further along. It makes sure that everything got covered.

I always like writing a parser, and this one was interesting, with its connection to both regex and grids.

4 Upvotes

16 comments sorted by

2

u/DelightfulCodeWeasel Apr 20 '26 edited Apr 20 '26

I did the facility exploration as a search queue after some preprocessing on the regex. For every '(' I store both the pattern index of the closing ')' and the pattern index for each of the branch choices:

    case '(':
        for (size_t branchTarget : regex.Branches[patternIndex])
        {
            queue.Add({ branchTarget + 1, roomLocation });
        }
        patternIndex = regex.Pattern.size();
        break;

    case '|':
    case ')':
        queue.Add({ regex.ScopeEnd[patternIndex] + 1, roomLocation });
        patternIndex = regex.Pattern.size();
        break;

Probably going to have to re-think that for the memory constrained version though, there are just over 100,000 unique states explored which is going to be a minimum of 600kb 400kb.

For my input there are just over 1,000 open brackets, so an explicit stack on a recursive descent parser should fit okay. What was your runtime like?

3

u/e_blake Apr 20 '26 edited Apr 20 '26

My recursive-descent parser in m4 takes a mere 120ms, with a single pass through the input one byte a time. Each '(' pushes x, y, and distance, each ')' pops all three, and each '|' calls pop and then push. ^, $, and newline are ignored, and the remaining 4 characters perform motion along a grid of x,y points storing distance - parts 1 and 2 can be computed in tandem by the right conditions each time I detect the first time a room is entered. It's already fast, but could probably be made slightly faster by storing the grid as a 1D array instead of x and y as independent variables. This works because the regex doesn't contain any paths that form a loop - any branch appears to merely be to a dead end that retraces steps (things would be a bit messier if loops in the maze were possible).

2

u/DelightfulCodeWeasel Apr 20 '26 edited Apr 20 '26

I was just thinking about room and door storage a few minutes ago!

I think the scheme I'm going to go with is just have a single 64kb buffer: we know that the each axis doesn't vary by more than 100 units in any direction, so the byte pair can be used as a direct look-up in the 1D array and we just need 4 of those bits per byte to encode if NSEW are possible. Feels a little wasteful, but 1,000 rooms will only need a ~4kb hash table and ~2kb search queue for the BFS.

EDIT: Wait... There are 10,000 rooms :) Should still be able to fit under 200kb all-in though!

3

u/ednl Apr 20 '26

I did exactly what Blake described, so I need 10,000 x 4 bytes (2x 1 for position, 2 for distance). Max stack size was only 225 for my input (same 4 bytes per entry).

Except the array is VERY naively being kept sorted (binary search, move bytes out of the way with memmove). A hashmap would work 1000x faster but possibly take twice as much space.

Runs in less than 4 ms on a RPi5. https://github.com/ednl/adventofcode/blob/main/2018/20.c

3

u/DelightfulCodeWeasel Apr 20 '26

Nice! I was still stuck thinking in terms of splitting it into two phases: construct the facility and then BFS. But of course you don't really need to have an in-memory representation of the facility at all if you're counting steps as you go through the regex.

For a bit more memory I think 128kb would get you a flat array of uint16_t distances, initialised to max, indexed directly by the room XY bytes.

3

u/e_blake Apr 21 '26 edited Apr 21 '26

12.5kbytes to store a bitmap of one bit per location once visited. You don't need to store distances anywhere but the stack - just track those as you recurse. Then a stack depth of 225 adds 450 bytes for distance rewinding, and 450 bytes for x/y location rewinding. The entire solution should be doable in under 16k (although a bitmap is not necessarily as fast as a perfect hash of 100k*2 bytes by using distance as your witness of a position being seen).  Unless I'm missing something where a room can be reached by more than one path, so you really would need to store a best distance per room.

[Edit] Confirmed - when I turned my grid into just bools instead of integers, my code still got the right answer thanks to tracking length on the stack at each branching point.

3

u/ednl Apr 21 '26 edited Apr 21 '26

Unless I'm missing something where a room can be reached by more than one path, so you really would need to store a best distance per room.

That was precisely the reason I thought I had to store the distance for every room. To my mind, the explanation suggested strongly that anything was possible. But I didn't check.

Now I checked, and for my input there were LOTS but each time the distance is greater on subsequent visits. This means no loops except the localised ones from (...|). So yeah, one "seen" bit is enough. Not sure we could have known beforehand.

3

u/DelightfulCodeWeasel Apr 21 '26

I've added in a quick per-room visit count and I'm seeing the same room being hit multiple times (some of them nearly a couple of hundred times) but that could just be an artefact of the way my traversal is working. They also might be getting hit at exactly the same step count, I don't have that value to hand during traversal without a bit of a rewrite.

I have a hunch that the furthest room for part 1 is likely to have a single unique visit, so it's that border of 1,000 steps for part 2 where it's likely to make a difference.

3

u/ednl Apr 21 '26

I was still editing so you may have missed it: the way I check, there were also lots of revisits (not sure about multiple revisits of the same room, didn't check that) but each time the distance was greater.

3

u/DelightfulCodeWeasel Apr 21 '26

I did miss it; I was busy pinching and [modifying your traversal] to tally up visit counts and visit distances :)

For my input I'm seeing a lot rooms being revisited, and nearly 1,000 rooms being visited at different distances. But for the critical boundary of 1,000 steps it doesn't look like there are any rooms that straddle the boundary.

The stack traversal doesn't have the same shortest-first guarantee as BFS, so we would be relying on the niceness of the input if we weren't storing distances.

EDIT: I'm not seeing any cases where a room is re-visited at a shorter distance in my input either.

→ More replies (0)

1

u/DelightfulCodeWeasel Apr 21 '26 edited Apr 21 '26

I am assuming that you can reach the same room by multiple paths, yeah. The wording of the puzzle seems to imply that it might be possible. Do you have leaked inputs for this one to be able to check that?

2

u/e_blake Apr 21 '26 edited Apr 21 '26

The regex was long enough that I'm not finding examples of it pasted directly in the megathread.  And I don't feel it appropriate to try and scrape repositories linked from the megathread just to see if someone accidentally checked in their input file (not to mention scouring one repository at a time doesn't scale as well as reading the megathread). But I did get to test on a second input in 2018 because my daughter tried doing some of the puzzles that year, and let me access her inputs at the time. That was still before the current edict reminding us to be more careful about not sharing inputs. 

I agree that the wording makes it sound ambiguous on whether the input is non-looping - but I also figure that since my input does not loop, that Eric's generator didn't create loops for anyone's input (otherwise, some people got a much easier puzzle than others). There are "multiple paths" for substrings like "(NEWS|)" both returning to the same point, but that is not a loop where two paths converge to lower the distance to an already-visited room where more than the stack would be needed.

2

u/ednl Apr 23 '26 edited Apr 24 '26

I opted for the somewhat convoluted insert function because you don't know the dimensions of the grid, so even the static size (with margin) is with hindsight, as is using signed chars for the xy-coordinates. The starting point might not be near the middle, or it might not be a square, or it might be very big where you don't visit every room in a rectangle. Conceptually and for programmer convenience too if your language has it, you should use a set. On the other hand if you go the route of "seen" bits and are also happy to go with the int8_t "gamble", you can use the 2 chars as a uint16 index into a bitfield of 8192 bytes (13 bit size, 3 bit as index into a byte).

Edit: OK, I made it as an alt version! Runs in 144 µs on a Raspberry Pi 5. https://github.com/ednl/adventofcode/blob/main/2018/20alt.c Edit2: added comments and a minor cleanup.

1

u/terje_wiig_mathisen Apr 22 '26

My solution here once again did it "by the book", with a Vec<Vec<u16>> (initialized to u16::MAX) to store the 500x500 cell maze, this needs half a megabyte of memory, but would be able to handle multiply-connected paths during the BFS traversal. Runtime 2 ms on Surface.