r/adventofcode Apr 19 '26

Other [2018 Day 19] In Review (Go With The Flow)

Today we get back to working on the assembly language of our device. This time we find out that the instruction pointer can be bound to one of the regular registers, so now we have flow control. And the note about using opcodes other than set and add for that having "truly fascinating" effects is somewhat appropriate. As the jump to end the execution is done via squaring the IP (sending it well past the end). Which is tamer than bit operations like the mouse-over talks about.

As for what the rest of code... well, this time the code to calculate the values for the two parts is done after the main code block, so the first statement jumps over it. The code block does sum of divisors of the calculated value in a simple and very inefficient way (using nested 1 to x loops, so the value for part 2 which over 10 million becomes 100 trillion loops). You can get this via reading the assembly, or just outputting the registers whenever register 0 changes (the target). For mine that gets you:

[    1     7     1   896     1   896]
[    3     7     2   448     1   896]
[    7     7     4   224     1   896]
[   14     7     7   128     1   896]
...

Register 1 is the IP, and we see that the accumulation happens only on 7 which is addr 2 0 0. It's not too hard to guess what's happening.

So this really is a standard assembly problem... part 1 is easily simulated, part 2 not so much. Sum of divisors is easy to calculate in a number of ways. So I just broke on instruction 1, grabbed the value and did the pretty standard one (find the factors... product of (pe+1 - 1) / (p - 1) for all the pe).

3 Upvotes

2 comments sorted by

2

u/e_blake Apr 19 '26 edited Apr 19 '26

Unlike some other assembly puzzles, this one absolutely needs a peephole optimization or other reverse analysis speedup to be tractable for part 2. My m4 implementation took 26 seconds on part 1 before lifting my sum-of-divisors code from 2015 day 20, which dropped runtime to around 10ms. At which point, jumping from part 1 to part 2 added no appreciable runtime (more overhead in reading the file than in performing O(sqrt n) computation even with part 2's larger n - compared to the brute force O(n2)). But it did cost me some hair-pulling on how to rewrite my implementation to stick with just 32-bit math since the numbers being factored are bigger here (for example, computing a*b/c had to be rewritten a*(b/c) after ensuring b was evenly divisible, to avoid the intermediate value overflowing).

2

u/ednl Apr 20 '26 edited Apr 21 '26

I didn't need 64-bit numbers using the prime product function that /u/musifter mentioned:

// Ref.: https://en.wikipedia.org/wiki/Divisor_function#Formulas_at_prime_powers
//   sigma1(n) = prod[(p^(a+1) - 1)/(p - 1)]
//   product over every prime where a is maximum p^a that divides n
//   (if p is not a divisor of n, product term is 1)
static uint sumofdivisors(uint x)
{
    // Special case: prime factor = 2
    uint pa = 2;  // p^(a+1) starting with 2^1
    while (!(x & 1)) {
        pa <<= 1;
        x >>= 1;
    }
    uint prod = pa - 1;  // = (2^(a+1)-1)/(2-1)
    // Other potential prime factors
    for (uint p = 3; p <= x; p += 2) {
        pa = p;  // p^(a+1) where a=0
        while (!(x % p)) {
            pa *= p;
            x /= p;
        }
        if (pa != p)  // mathematically unnecessary test but speeds it up
            prod *= (pa - 1) / (p - 1);
    }
    return prod;
}

Edit: sorry, got a and a+1 confused in the comments.