Continuing my series showing Advent of Code solutions in MiniScript, last night's challenge — Day 7 — was in interesting one. We're given a terminal session showing a bunch of cd and ls commands, from which we need to reconstruct a file hierarchy, and then search that for some aggregate directory sizes.

More specifically: for Part A of the challenge, we needed to report the sum of all directories containing less than 100000 bytes. And for Part B, we needed to report the size of the smallest directory that was at least big enough that, if deleted, it would provide a given amount of free space on the disk.

A file hierarchy is clearly a tree structure, but for some reason, I quickly abandoned making an actual tree of little node objects for a different approach. But after completing the challenge and chatting with my spouse — who is doing these challenges in Chapel — I couldn't shake the feeling that I had done it the hard way.

So today, I'll share both solutions!

Solution #1: Directory Map

The data structure I actually chose last night was a map in which each key was a complete directory path, and the corresponding value was a list of [name, size] pairs. In the case of a directory, the pair was actually [name, "dir"] instead.

As I read the input file, I kept track of the current directory path as a string, using Mini Micro's built-in file module to find child and parent paths as needed. Then to find the total size of a directory, I wrote a recursive dirSize function that added up all the sizes of its children (calling itself in the case of any entry with "dir" for a size). A similar recursive function was used to find all the small directories.

The complete program — after a little morning-after cleanup — comes to 89 lines, including both parts of the challenge.

f = file.open("input.txt")
//f = file.open("example.txt")

contents = {} // key: path; value: list of [name, size]
curDir = "/"
curEntries = []
contents[curDir] = curEntries

while not f.atEnd
    line = f.readLine
    if line == null then break // end of file
    print line
    m = line.match("$ cd {subdir}")
    if m then
        if m.subdir == "/" then
            curDir = "/"
        else if m.subdir == ".." then
            curDir = file.parent(curDir)
        else
            newDir = file.child(curDir, m.subdir)
            curDir = newDir
        end if
        if contents.hasIndex(curDir) then
            curEntries = contents[curDir]
        else
            curEntries = []
            contents[curDir] = curEntries
        end if
    else if line == "$ ls" then
        continue
    else
        // file line
        sizeOrDir = line.split(" ")[0]
        name = line.split(" ")[1]
        if sizeOrDir == "dir" then
            curEntries.push [name, "dir"]
        else
            curEntries.push [name, sizeOrDir.val]
        end if
    end if
end while
f.close

dirSize = function(path)
    entries = contents[path]
    sum = 0
    for e in entries
        if e[1] == "dir" then
            sum = sum + dirSize(file.child(path, e[0]))
        else
            sum = sum + e[1]
        end if
    end for
    return sum
end function

resultA = []
findSmallDirs = function(path)
    for e in contents[path]
        if e[1] == "dir" then
            subpath = file.child(path, e[0])
            sz = dirSize(subpath)
            if sz <= 100000 then
                print "Small dir: " + subpath + " of size: " + sz
                resultA.push sz
            end if
            findSmallDirs subpath
        end if
    end for
end function

findSmallDirs "/"
print "final answer (part A): " + resultA.sum

diskSize = 70000000
spaceUsed = dirSize("/")
curFree = diskSize - spaceUsed
spaceRequired = 30000000
needed = spaceRequired - curFree
print "Need to find " + needed + " bytes"

// Now find smallest dir that's at least needed bytes
best = null
for d in contents.indexes
    sz = dirSize(d)
    if sz < needed then continue
    if best == null or sz < best then best = sz
end for
print "final answer (part B): " + best

It worked. But it wasn't super fast to code; it took me 28 minutes, the longest time yet. And my final rank was 1016, not quite reaching my goal of the top 1000.

Solution #2: A Tree of Nodes

So this morning, I coded up the other approach, of making a proper tree. I defined a Node class, which would be both the base class and the class instantiated for any file; and a DirNode subclass, which represents a directory. DirNodes keep a map of their children (from name to Node). We need to be able to get from a child to its parent, in order to process the cd .. commands in the input, so I gave DirNode an addChild method that would both add the child to its map, and set the child's parent to itself.

Finally, while a file Node would have a size member set to its own size, DirNode.size was assigned a function that added up the sizes of all its children. (Notice that in MiniScript, it's perfectly fine for a subclass to override a simple value with a function, or vice versa.)

The code for reading and processing the input came out very similar. The subsequent code was simpler, though. Given any node, we can simply check its .size, and not worry about whether it's a file or a directory. (However I did find I had no quick way to find all the directory nodes; so I ended up adding an allDirs list, and pushing each new DirNode onto that as it is made.)

The complete program comes to 87 lines — almost the same as last night's version.

f = file.open("input.txt")
//f = file.open("example.txt")

Node = {}
Node.name = ""
Node.size = 0

DirNode = new Node
DirNode.children = null // map from name to Node
DirNode.addChild = function(node)
    self.children[node.name] = node
    node.parent = self
end function
DirNode.size = function
    sum = 0
    for c in self.children.values
        sum = sum + c.size
    end for
    return sum
end function

root = new DirNode
root.name = ""
root.children = {}

curDir = root
allDirs = []

while not f.atEnd
    line = f.readLine
    if line == null then break // end of file
    print line
    m = line.match("$ cd {subdir}")
    if m then
        if m.subdir == "/" then
            curDir = root
        else if m.subdir == ".." then
            curDir = curDir.parent
        else
            curDir = curDir.children[m.subdir]
        end if
    else if line == "$ ls" then
        continue
    else
        // file line
        sizeOrDir = line.split(" ")[0]
        name = line.split(" ")[1]
        if sizeOrDir == "dir" then
            noob = new DirNode
            noob.children = {}
            allDirs.push noob
        else
            noob = new Node
            noob.size = sizeOrDir.val
        end if
        noob.name = name
        curDir.addChild noob
    end if
end while
f.close

// find sum of small dirs
resultA = []
for e in allDirs
    sz = e.size
    if sz <= 100000 then
        print "Small dir: " + e.name + " of size: " + sz
        resultA.push sz
    end if
end for
print "final answer (part A): " + resultA.sum

diskSize = 70000000
spaceUsed = root.size
curFree = diskSize - spaceUsed
spaceRequired = 30000000
needed = spaceRequired - curFree
print "Need to find " + needed + " bytes"

// Now find smallest dir that's at least needed bytes
best = null
for d in allDirs
    sz = d.size
    if sz < needed then continue
    if best == null or sz < best then best = sz
end for
print "final answer (part B): " + best

Conclusion

In the end, the two approaches were of similar length and complexity. The node approach feels cleaner somehow, but when actually doing the challenge for time, all that matters is the fastest (not always the cleanest) path to an answer. Would the tree approach have been faster? Hard to say.

As always, I come away from one of these challenges thinking about what sort of library code could have made it faster. Perhaps some methods for searching, filtering, and applying-methods-to trees? With those in hand, the tree approach would certainly have been faster than my custom map-to-lists code.

Once again, the Advent of Code challenge was a fun and thought-provoking exercise. I look forward to the days ahead!

Advent of Code (in MiniScript), Day 7

Solution #1: Directory Map

Solution #2: A Tree of Nodes

Conclusion