47. The Greatest Puzzle Generation

I promised in the last blog post that I would write a bit about how optimizing the solution-finder paved the way for puzzle generation.

The good news is that I worked on the game a lot in the past week. The bad news is that I still haven’t gotten the offline puzzle generator entirely working. Since it’s a relatively low priority task, and I wanted to ship another quick build to a few people over the weekend, I set the puzzle generator aside in order to make improvements to the starting flow in two of the areas of the game.

Once you have a puzzle solver, making a puzzle generator is actually pretty straightforward. Really all you have to do is plug in random layouts into the solver until you find some that are solvable.

This is already good enough to generate a bunch of solvable puzzles, but it’s a rather blunt tool. It will generate duplicate puzzles, and it doesn’t tell us which puzzles might be worth looking at.

No Dupes

To prevent duplicates, we could generate puzzles in a way that never produces duplicates in the first place. However, this could mean that for any given stretch of time, we could be stuck in a local minima of uninteresting puzzles.

Finding interesting puzzles usually involves a lot of intuition, but generating them could be considered analogous in some ways to a hill-climbing problem. I don’t want to lean much into the analogy, so I’ll just say that since we are hoping to find interesting puzzles that are not very similar to each other, it is good to be as random as possible.

With random puzzles, the only way to prevent duplicates is to keep track of all previously generated layouts. This way, when we want to generate a new layout, we just keep generating layouts until we find one that we have not used already. Since we are already keeping a list of all solvable puzzles, we can easily keep an additional list of all puzzles tested, and check new layouts against the list.

The only real concern here is a trade-off between memory usage and performance. Keeping the list of all tested puzzles in memory will eventually use a lot of memory, but writing the list to disk and parsing that list over and over could be time consuming.

For the time being, I’ve decided to go with the in-memory approach. This simplifies a few things for now, and I can always write things out to disk later.

I chose to use C#’s built-in HashSet collection type to store all the previously generated panels. The benefit of this data structure is that it should be the fastest built-in type for checking if a certain item is on the list (although I have not profiled this).

Currently, puzzles are stored as several arrays. However, for testing the puzzle generator, I’ve just been generating one of these arrays: the symbols array, which stores any symbols that might be on the panel and is just a 1-dimensional integer array of size width*height. (I use a single dimensional array because I’m relying on Unity’s built-in serialization to save the panels and it does not natively support multi-dimensional arrays. I could take the time to write a custom serializer, but I’ve just been accepting the cognitive overhead of doing x+width*y every time I want to do a lookup.)

An annoyance that I immediately encountered when attempting to store generated panels in the HashSet is that when doing the following:

bool done_generating = false;
while(!done_generating)
{
    //symbols is an int[]
    //previousPanels is a HashSet<int[]>
    symbols = GenerateNewPanel();
    if(!previousPanels.Contains(symbols)) 
    {
        //The panel is unique!
        done_generating = true; 
    }
}

previousPanels.Contains() would sometimes let duplicate panels through. That is to say, it would declare that a certain panel state was not in the HashSet when it really should have been.

I would have expected to get hash collision problems, and have panels get rejected as duplicates when they were not, but for some reason the opposite problem arose. I’m not sure why this is. Perhaps there is something funny going on with the implicit call to symbols.GetHashCode() since it’s an integer array (You can find the implementation in the reference source code, but I don’t understand it enough to tell what went wrong). Perhaps I don’t understand hash generation well enough in general.

In any case, I averted my eyes from whatever madness was happening there, and just passed a string in instead and moved along.

One other important thing to consider in my specific use case is that puzzles can sometimes be rotated or reflected duplicates. What I mean is that the following three puzzles are identical, as the solution gets rotated and reflected along with the symbols.

rotational_reflectional_symmetry.png
Left: A Panel;   Center: Left, Rotated 90° Clockwise;   Right: Left, Mirrored Vertically;

Checking for these duplicates is really just a matter of comparing all possible manipulations of a generated panel to the HashSet before declaring it unique.

The Cream of the Crop

Once you’ve got a list of unique puzzles, the problem then becomes how to sort through the many generated puzzles to find ones that are interesting.

This is the thing that I have mostly not finished. I began by keeping track of how many solutions any given puzzle had, as generally speaking, puzzles with only one solution are more interesting to me (although not always more interesting to put in the game, for various peculiar reasons) than ones that have several solutions.

However, I ran into a strange bug where the number of solutions is being incorrectly reported during puzzle generation. I spent a few hours trying to track down some fault in my generation code, or in the way in which I was serializing the finished puzzles to disk.

My best hypothesis at the moment for the location of the bug is that it has to do with the  solution validation code. When checking if a puzzle is solved or not, I am assuming that the panel has been completely initialized. This is usually a safe assumption because panels don’t have symbols moving around at runtime (yet?) and because if I move them around in the editor, the panel gets reinitialized when I’m done.

However, inside the puzzle generator, I’m not re-initializing the entire panel when the symbols change (primarily as an optimization). This may be causing the solution checking to fail because some state has not been properly set.

More on that later. I won’t guarantee that I will be back in the next post to talk more about puzzle generation, as I am beginning to hack away at the tasklist for the next real test build(!)

The list is only 11 items long, but they’re big items.

Leave a comment