68. In The Mountains of Madness (Win32 Wrangling Across the API Boundary)

As I mentioned in the previous blog post, I ran into some issues with mapping the mouse movement from the screen into the game. One of those issues required venturing outside the bounds of the Unity API.

So, for the longest time, I had a certain number of playtesters who were reporting that the mouse movement in the game was very sluggish for them. Unfortunately, I was not able to reproduce this issue until I purchased another monitor. It’s possible that the issue still exists in other circumstances and for other reasons, but in this blog post I will be covering what I did to at least resolve the issue that I could reproduce.

The Issue

“…Unity does not always properly tell you the resolution information for the monitor that the game is running on, and instead shows you the resolution of the “main monitor” as it is set in the display settings in Windows.”

–Me, earlier

Although Unity provides us with a Screen structure that contains information corresponding to the resolution of the screen and the size of the game window, this information is not always correct for our purposes. I was wrong however, in the previous post, in asserting that it always pulls info from the main monitor; what it actually does depends on whether the game is running in windowed mode or fullscreen mode.

If the game is running in windowed mode, then Screen.currentResolution is the native resolution of the monitor that the game is currently running on. However, if the game is running in fullscreen mode, then currentResolution will always be the same as the game’s internal rendering resolution, regardless of the final output resolution.

For example, if we are running the game fullscreen at 1920×1080 but the display is 3840×2160, even though Unity upscales the game to 3840×2160 in order to make it fullscreen, currentResolution will be 1920×1080.

This is a big problem, because in order to scale our mouse movement from screen space into world space, we need to know the ratio between the size of the screen pixels and game pixels. In fullscreen mode, if the player is running below their native monitor resolution, a single pixel in the game’s internal resolution will correspond to multiple pixels on the screen because of upscaling.

Funnily enough, even though we can get this ratio in windowed mode, it is totally unnecessary there, as the game is rendered unscaled in windowed mode.
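To make that ratio concrete, here is a minimal sketch in plain C# with no Unity types. The MouseScale class and its method names are made up for illustration and are not from the game's code. It only shows the arithmetic, assuming the fullscreen image is stretched to fill the whole monitor.

```csharp
using System;

// Hypothetical helper, not from the game's code: it only illustrates the
// ratio between physical screen pixels and the game's internal pixels.
public static class MouseScale
{
    // How many screen pixels cover one game pixel along one axis.
    public static float ScreenPixelsPerGamePixel(int monitorSize, int renderSize)
    {
        if (renderSize <= 0) throw new ArgumentOutOfRangeException(nameof(renderSize));
        return (float)monitorSize / renderSize;
    }

    // Convert a mouse movement measured in screen pixels into game pixels.
    public static float ToGamePixels(float screenDelta, int monitorSize, int renderSize)
    {
        return screenDelta / ScreenPixelsPerGamePixel(monitorSize, renderSize);
    }
}
```

With the numbers from the example above (a 3840 pixel wide monitor and a 1920 pixel wide render), the ratio is 2, so a mouse movement of 10 screen pixels corresponds to 5 game pixels.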

The Solution

Because I couldn’t reproduce the issue for the longest time, my initial hunch here was that solving this problem would involve coming up with some other way to handle mouse input appropriately.

I hoped that the new input system for Unity would allow me to solve this issue. However, my initial experiments showed some kind of cumulative performance issue which would cause the game to become unplayable after about 30 minutes or so. (I am not sure if this issue has been resolved at this point, but in any case, I decided to pursue other possibilities for fixing the mouse movement.)

Upon finally reproducing the issue, and coming to the diagnosis that I mentioned in the previous section of this article, I set about trying to get the information about the monitor in some other way.

There are other functions in the Unity API, but none of them were very helpful. For example, you can easily find information about all the monitors using the Display class, but there is no information about which monitor the game is currently running on. Display.main is simply the main system monitor according to Windows. (This was the cause of my earlier confused reporting about Screen.)

So I did what any confused programmer would do at this point; I googled the problem. This led me to this thread on the Unity forums, and some very confusing code.

There’s no real good way around saying it, I just copied and pasted that code and tried to work from there. This was after locating the only thing I could find written officially by Unity about this type of functionality, which was really no help at all.

(I also found a Unity Answers thread with some similar code to the thread on the forums.)

So, in hopes of explaining what I have learned in a better way, and adding one more random thing to the internet about how to handle calls to the Win32 API from Unity, I will post the entire class I wrote and we’ll go through the code as far as I can explain it.

First, the code:

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using System;
using System.Runtime.InteropServices;

// Asks Win32 for the size of the monitor that the game window is on.
public static class MonitorInfo
{
    // Handle of the active window attached to the calling thread.
    [DllImport("user32.dll")]
    private static extern IntPtr GetActiveWindow();
    // Handle of the monitor that the given window is on.
    [DllImport("user32.dll")]
    private static extern IntPtr MonitorFromWindow(IntPtr hwnd, int flags);
    
    // Mirrors the Win32 RECT struct (four 32-bit integers).
    [StructLayout(LayoutKind.Sequential)]
    public struct RECT
    {
        public int left;
        public int top;
        public int right;
        public int bottom;
    }
    
    // Mirrors the Win32 MONITORINFO struct. cbSize must hold the struct size.
    [StructLayout(LayoutKind.Sequential)]
    public class MONITORINFO
    {
        public int cbSize = Marshal.SizeOf(typeof(MONITORINFO));
        public RECT rcMonitor = new RECT(); // full monitor rectangle
        public RECT rcWork = new RECT();    // work area rectangle
        public int dwFlags = 0;
    }
    
    [DllImport("user32.dll", CharSet = CharSet.Auto)] [return: MarshalAs( UnmanagedType.Bool )]
    private static extern bool GetMonitorInfo(IntPtr hmonitor, [In, Out] MONITORINFO info);
    
    
    public class monitor
    {
        public int width, height;
    }
    
    public static monitor current;   // last known size of the current monitor
    static MONITORINFO info;
    public static bool isValid;      // true if the last Win32 query succeeded
    
    // Refreshes current and isValid; callers decide how often to call this.
    public static void Update()
    {
        if(info == null) info = new MONITORINFO();
        isValid = GetMonitorInfo(MonitorFromWindow(GetActiveWindow(), 0), info);
        if(isValid)
        {
            if(current == null) current = new monitor();
            current.width = info.rcMonitor.right - info.rcMonitor.left;
            current.height = info.rcMonitor.bottom - info.rcMonitor.top;
        }
    }
}

Looking at the boilerplate at the top of the file, it’s pretty standard fare. The one line we really need in order to interface between Unity’s API and the Win32 API layer is the following:

using System.Runtime.InteropServices;

Next, in our class we need to declare prototypes for the external functions that we plan on calling. With the [DllImport] attribute, we also tell the runtime to load these functions from the user32.dll file when the game runs.

[DllImport("user32.dll")]
private static extern IntPtr GetActiveWindow();
[DllImport("user32.dll")]
private static extern IntPtr MonitorFromWindow(IntPtr hwnd, int flags);

These function definitions are based on the interfaces specified for these functions on MSDN:

GetActiveWindow()

HWND GetActiveWindow();

MonitorFromWindow()

HMONITOR MonitorFromWindow(
  HWND  hwnd,
  DWORD dwFlags
);

A big part of the annoyance here is these types that Microsoft uses for their API calls. What the heck is an HWND? Well, unfortunately I do have some experience with the terrible Windows API, so I know that HWND is a Handle to a WiNDow. Similarly for HMONITOR, which is a handle to a monitor. And by handle, they just mean an opaque pointer-sized value: 32 bits in a 32-bit process and 64 bits in a 64-bit one.

God knows how you’re supposed to find this out, but the type that we’re supposed to use in C# to deal with pointer-sized values is IntPtr, which always matches the pointer size of the running process.

Okay, so that just leaves DWORD to figure out.

Well, a DWORD, or Double WORD, is just a 32-bit unsigned integer. This is essentially based on the idea that the processor word size is 16 bits (which it is not, but probably was at the time the Windows API was designed).

Anyway, moving on, we can just use int in our C# code, as the C# int is also 32 bits. Strictly speaking, uint is the exact match for an unsigned DWORD, but the flag values here are tiny, so int works just as well. That gives us the definitions below:

private static extern IntPtr GetActiveWindow();
private static extern IntPtr MonitorFromWindow(IntPtr hwnd, int flags);

private and static may not be necessary for you, but in my case this is part of a static class so that it can be easily accessed at global scope from all my other classes in Unity. Static classes in C# require all members to be declared static (and unfortunately it doesn’t do this automatically, we actually have to type static a billion times). I also chose to make these members private because this makes this all a bit less worry-prone as a globally scoped class.
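As a side note, the 0 that gets passed as the flags argument later on is one of three named values from the Win32 headers. Here is a minimal sketch of the same two imports using uint and named constants. The class name NativeMonitorLookup and the method GetCurrentMonitorHandle are made up for illustration. The constant values are the ones defined in winuser.h.

```csharp
using System;
using System.Runtime.InteropServices;

// Illustrative sketch only: NativeMonitorLookup and GetCurrentMonitorHandle
// are hypothetical names, not part of the game's code.
public static class NativeMonitorLookup
{
    // Flag values for MonitorFromWindow, as defined in winuser.h.
    public const uint MONITOR_DEFAULTTONULL    = 0x00000000; // return NULL if the window is on no monitor
    public const uint MONITOR_DEFAULTTOPRIMARY = 0x00000001; // fall back to the primary monitor
    public const uint MONITOR_DEFAULTTONEAREST = 0x00000002; // fall back to the nearest monitor

    [DllImport("user32.dll")]
    private static extern IntPtr GetActiveWindow();

    // DWORD is unsigned, so uint is the exact C# counterpart.
    [DllImport("user32.dll")]
    private static extern IntPtr MonitorFromWindow(IntPtr hwnd, uint dwFlags);

    // Windows only: calling this on another platform fails because
    // user32.dll does not exist there.
    public static IntPtr GetCurrentMonitorHandle(uint flags = MONITOR_DEFAULTTONULL)
    {
        return MonitorFromWindow(GetActiveWindow(), flags);
    }
}
```

The default of MONITOR_DEFAULTTONULL matches the literal 0 used in my class. Whether one of the fallback flags would be a better choice depends on how you want to handle a window that is not on any monitor.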

So, next we have a couple of structure definitions:

[StructLayout(LayoutKind.Sequential)]
public struct RECT
{
    public int left;
    public int top;
    public int right;
    public int bottom;
}
[StructLayout(LayoutKind.Sequential)]
public class MONITORINFO
{
    public int cbSize = Marshal.SizeOf(typeof(MONITORINFO));
    public RECT rcMonitor = new RECT();
    public RECT rcWork = new RECT();
    public int dwFlags = 0;
}

These are, again, based on the MSDN documentation for these structures.

Rect

typedef struct tagRECT {
  LONG left;
  LONG top;
  LONG right;
  LONG bottom;
} RECT, *PRECT, *NPRECT, *LPRECT;

MonitorInfo

typedef struct tagMONITORINFO {
  DWORD cbSize;
  RECT  rcMonitor;
  RECT  rcWork;
  DWORD dwFlags;
} MONITORINFO, *LPMONITORINFO;

It is tempting to use long in the C# code for the RECT members, since the Win32 definition says LONG, but that would actually be wrong: a Win32 LONG is 32 bits, while a C# long is 64 bits. So int is the correct match here, not just a convenient one.

We also need to use the [StructLayout] attributes, because normally the runtime is free to re-order the fields of a class in memory for efficiency purposes. In our case we need the elements of these structures to be in the exact order that they are in the Win32 API. (C# structs are already laid out sequentially by default, so the attribute on RECT is only there to be explicit, but on the MONITORINFO class it is required.)

One strange detail is this line:

public int cbSize = Marshal.SizeOf(typeof(MONITORINFO));

This is code that I copied from the forum post I mentioned earlier. The Marshal class is part of the System.Runtime.InteropServices namespace that we included earlier, and it is used to convert between managed and unmanaged memory types.

Marshal.SizeOf returns the size in bytes that our MONITORINFO class will occupy once it is marshaled into unmanaged memory, which is 40 bytes here (ten 32-bit fields). The Win32 documentation for GetMonitorInfo requires cbSize to be set to this size before the call, and the function uses that value to tell whether it was handed a MONITORINFO or the larger MONITORINFOEX. Note that C#’s built-in sizeof is not a substitute. If we change this line to….

public int cbSize = sizeof(typeof(MONITORINFO));

…then Unity will refuse to compile it, because sizeof expects a type name rather than a typeof expression, and it only works on unmanaged value types, not on a class like this one.
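A quick way to check what Marshal.SizeOf actually hands back is to run the same definitions outside of Unity. This is a minimal sketch, and the wrapper names SizeCheck and ReportedSize are made up for illustration. The two nested types are copied from the class above.

```csharp
using System;
using System.Runtime.InteropServices;

// Illustrative sketch only: SizeCheck and ReportedSize are hypothetical names.
public static class SizeCheck
{
    [StructLayout(LayoutKind.Sequential)]
    public struct RECT
    {
        public int left;
        public int top;
        public int right;
        public int bottom;
    }

    [StructLayout(LayoutKind.Sequential)]
    public class MONITORINFO
    {
        // Size in bytes of this class once marshaled to unmanaged memory.
        public int cbSize = Marshal.SizeOf(typeof(MONITORINFO));
        public RECT rcMonitor = new RECT();
        public RECT rcWork = new RECT();
        public int dwFlags = 0;
    }

    // Returns the value that ends up in cbSize for a fresh instance.
    public static int ReportedSize()
    {
        return new MONITORINFO().cbSize;
    }
}
```

ReportedSize() returns 40: one 4-byte cbSize, two RECTs of 16 bytes each, and one 4-byte dwFlags.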

Okay, moving on. Now we have this whopper of a function prototype definition.

[DllImport("user32.dll", CharSet = CharSet.Auto)] [return: MarshalAs( UnmanagedType.Bool )]
private static extern bool GetMonitorInfo(IntPtr hmonitor, [In, Out] MONITORINFO info);

This is, of course, still based on the definition on the MSDN page:

GetMonitorInfoA()

BOOL GetMonitorInfoA(
  HMONITOR      hMonitor,
  LPMONITORINFO lpmi
);

“Wait, hold on, why is this called GetMonitorInfoA?”

Answering that will also answer why we need to have this CharSet = CharSet.Auto attribute as part of the dll import.

There are two versions of GetMonitorInfo in the Win32 API, one for ANSI characters (GetMonitorInfoA, which is the legacy version) and one for wide UTF-16 characters (GetMonitorInfoW). Setting CharSet = CharSet.Auto leaves the choice between the two to the runtime.

“But wait, why on earth are we worried about text?”

We only have to care about this because this function can also fill in a MONITORINFOEX structure, which contains a string holding the device name of the monitor. In our case, we are just throwing that data away and using the smaller MONITORINFO struct, but we still have to support it as part of our function prototype definition.

*sigh*

Another oddity as part of the Attribute definition is this:

[return: MarshalAs( UnmanagedType.Bool )]

Why do we have to marshal bools? Because the two sides disagree on what a bool is: the Win32 BOOL is a 4-byte integer, whereas a C# bool is a single byte. The function returns a BOOL specifying whether or not the call succeeded, and the attribute tells the runtime to convert it across the API boundary. (This conversion is also the default for P/Invoke return values, so the attribute is mostly there to be explicit.)

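The only detail that might be confusing about the actual definition of the function is the [In, Out] attribute. Because MONITORINFO is declared as a class, it is a reference type and is already passed across the API boundary as a pointer. [In, Out] tells the runtime to copy the data in both directions, so that our cbSize goes in and the filled-in rectangles come back out. Changing it to ref does not work, because a ref to a class would pass a pointer to a pointer.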
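For comparison, the other common way to write this import is to declare MONITORINFO as a struct, in which case ref is the right way to pass it, since a struct is a value type and ref turns it into a single pointer. This is a sketch of that alternative, not what the game uses. The names MonitorInfoByRef, CreateRequest, and TryGetSize are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Illustrative alternative only: MonitorInfoByRef, CreateRequest and
// TryGetSize are hypothetical names, not part of the game's code.
public static class MonitorInfoByRef
{
    [StructLayout(LayoutKind.Sequential)]
    public struct RECT
    {
        public int left;
        public int top;
        public int right;
        public int bottom;
    }

    // A struct this time, so it is passed with ref instead of [In, Out].
    [StructLayout(LayoutKind.Sequential)]
    public struct MONITORINFO
    {
        public int cbSize;
        public RECT rcMonitor;
        public RECT rcWork;
        public int dwFlags;
    }

    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool GetMonitorInfo(IntPtr hMonitor, ref MONITORINFO lpmi);

    // Struct fields cannot rely on initializers here, so cbSize is set by hand.
    public static MONITORINFO CreateRequest()
    {
        MONITORINFO info = new MONITORINFO();
        info.cbSize = Marshal.SizeOf(typeof(MONITORINFO));
        return info;
    }

    // Windows only. Returns false (and zero sizes) if the Win32 call fails.
    public static bool TryGetSize(IntPtr hMonitor, out int width, out int height)
    {
        MONITORINFO info = CreateRequest();
        bool ok = GetMonitorInfo(hMonitor, ref info);
        width = ok ? info.rcMonitor.right - info.rcMonitor.left : 0;
        height = ok ? info.rcMonitor.bottom - info.rcMonitor.top : 0;
        return ok;
    }
}
```

The class version above avoids copying the struct around and lets the field initializer fill in cbSize automatically, which is probably why the forum code was written that way. Either form describes the same 40 bytes to Win32.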

At this point, the rest of the code should be fairly understandable, if you have any experience with Unity C# coding:

public class monitor
{
    public int width, height;
}
    
public static monitor current;
static MONITORINFO info;
public static bool isValid;
    
public static void Update()
{
    if(info == null) info = new MONITORINFO();
    isValid = GetMonitorInfo(MonitorFromWindow(GetActiveWindow(), 0), info);
    if(isValid)
    {
        if(current == null) current = new monitor();
        current.width = info.rcMonitor.right - info.rcMonitor.left;
        current.height = info.rcMonitor.bottom - info.rcMonitor.top;
    }
}

One thing that’s worth noting is that I keep track of an isValid bool publicly, so that I can always check if the calls to the Win32 API returned valid data before I go around using it.

Implementation

So with all that done, we can now change the code that handles the mouse scaling to the following:

Vector2 ScreenRes = new Vector2(Screen.width, Screen.height);
if(MonitorInfo.isValid) ScreenRes = new Vector2(MonitorInfo.current.width, MonitorInfo.current.height);

This means that as long as we are able to get some valid information back from the Win32 API, we will use that. If not, we will fall back to the Screen structure that Unity provides for us, which may be wrong in some cases.

Hope you learned something!

66. Burnout and Walking Animations

I plan on posting these video development logs on a weekly basis over at my YouTube channel. I may post some occasional reminders here going forward, but I’d rather keep this written devlog as its own separate thing rather than simply as a cross-posting of the videos. So, if you don’t want to miss any of the video dev logs, I recommend you subscribe to my YouTube channel.

However, since you’re here instead of at YouTube, I’ll reward you with a few sneak peeks into what I left out of the video log.

This past week, I’ve been working on relatively small polish features, which is a bit of a continuation of the work I did last week with footstep sounds (still have some more of those to do actually). I think this is partly as a way to give myself a break from tearing up major chunks of the game to add artwork. But even if it feels like less progress, these small things propagate across the entire game and affect your interactions throughout.

One of these small improvements is to the walking animations. The previous animations were serviceable, however when I added running animations, the running animations looked significantly better in comparison. So I added more frames to the walking animation and made some small tweaks. You can see the comparison between the old (left) and new (right) animations below:

I still want to add animations for when you move diagonally, and hope to get to that next. But I think even this goes some ways towards giving the game a more polished feeling.

I did a few other fun things, but I’ll save those for next week’s video. Hope to see you then. 🙂

35. Thoughts on Accessibility

3&5

So, I’ve finally finished up auditing all the puzzles in the game. Essentially, just solving them all and taking notes on what might need to be improved or removed. Overall, I think the game is in a pretty good place moving forward. There are some areas that I’m pretty happy with as is, but there is still a lot of work to be done to improve some other areas. Obviously this is just considering design work. Let’s not even get into how behind the curve I am from an aesthetic point of view.

Early Accessibility

This week, I chatted with a deaf accessibility advocate for games. This was an interesting and challenging conversation, and has left me thinking a little bit about what’s involved in making a game more accessible, and how that intersects with the design of Taiji. Obviously, I think that accessibility is an important and often ill-addressed concern, and my goal with the game is to never make a puzzle difficult for reasons that have nothing to do with its subject matter.

However, I think there is some tension between accessibility concerns and the pursuit of particular subject matter. As an example, if you want to do puzzles that are “about sound”, deaf people will unfortunately, but necessarily, be excluded.

It is easy to see why making a puzzle that relies on pure audio cues is a bad move from an accessibility standpoint. In many cases, it would be easy to make additional visual cues, and to not do so can simply be chalked up to laziness.

But my big question is, are there particular cases when not adding those cues can be justified? And if there are, what are they?

Some of my thoughts on this are a bit hard to clarify without specifically addressing and spoiling some of the puzzles planned for Taiji. Even still, after discussing the details of the puzzles with the aforementioned accessibility advocate, they did not seem particularly convinced that there was any tension other than my laziness and lack of care.

Similar concerns arose in the wake of the release of The Witness. There are some puzzles in that game which people who are color-blind or hard of hearing will have trouble with, or may simply find impossible without just looking up the solutions. Because of this, the designer, Jonathan Blow, has been called callous, “ableist”, or at best unconcerned about accessibility. The last one is particularly strange, considering the game received a PC patch to add a click-to-move mode, and nearly all of the puzzles in the game (including many that use color as a cue) are designed to be as accessible as possible; symmetry puzzles that care about colored hexagons use cyan and yellow, for example. In my last point of defense of Jon Blow before I move on, he has been quoted saying that he wanted to ship the game with a puzzle which only color-blind people would be able to solve, however in this case he was hamstrung by the poor color consistency of most display technology.

So, what is the reason to make those kinds of decisions? To create puzzles which by their very nature exclude certain people from fully enjoying your game? Jon Blow says it is because the puzzles in The Witness are all “about things.” I agree with this sentiment, but it perhaps requires a bit more clarification to even make sense.

In many puzzle games, the “point” of the puzzles is essentially to be a challenge. The puzzles are meant to be hard for the player to solve, and probably fun as well. In games like The Witness (or Taiji, for that matter) the point of the puzzles is significantly subtler. The puzzles are intended to be interesting and about something real. By “real” I mean that the subject matter is, at best, not confined simply to the game. The thinking that a player will do when solving the puzzles can be taken with them back into the real world.

It may seem a bit callous, but it should go without saying that both sound and color are phenomena that really exist, even if some people cannot experience them.

This can seem to lead down a path of reckless disregard for other people, so I think it is also very important to emphasize that what I’m talking about here—both in my own case and the case of The Witness—are puzzle games. These are games that, by design, will exclude people who are simply not intelligent enough to complete all of the puzzles in the game.

So What?

Perhaps all of this can just be seen as a long elaboration intended to serve as an excuse for my laziness, or “ableism”, or perhaps some other unidentified flaw of character. However, I still have not addressed the core issue:

What am I going to do about accessibility?

Currently the plan is to make the game as accessible as I can. When puzzles involve the use of color for separation, but are not explicitly about color, I will endeavor to choose colors that will work for as many people as possible for that purpose.

But what about when puzzles are explicitly about those things? What about puzzles about sound, for example?

In these cases, I will primarily design the puzzles with the intent of pursuing the subject matter that interests me. If I have to choose between a puzzle which can be made accessible or painting myself into an inaccessible but more interesting corner, I will most likely choose the corner.

However, I also intend to provide some level of assistance when possible. This, in itself is a tricky proposition, because I do not simply want to condescend to players who are using the assistance options. In essence, adding accessibility seems as though it will often amount to designing an entirely different set of puzzles. The puzzles therefore must be interesting in their own unique ways, and must endeavor to be analogous to, and at least as challenging as, the inaccessible puzzles.

Perhaps this will not be enough, or not even be possible. After all, I am mostly opining here without having fully designed any set of puzzles like this. But I think this is the best way for me to balance these two ideals: accessibility and truth.

Addendum

I want to clarify a couple things: What exactly is the difference between a puzzle which is about sound or color and one that simply uses it as a cue? And why do I think that the former cannot be made accessible without simply designing a different set of puzzles?

To do so will require spoiling a couple puzzles. The Witness is a hugely broad puzzle game, so I can actually find examples in both the positive and the negative without talking about any other game.

Spoiler Warning: If you have not completed the two areas shown below in The Witness, then avoid reading further.

The Keep (left) and The Jungle (right) in The Witness

The Keep

So, as an example of puzzles which simply use audio as a cue, but are not really about sound, we have the one from The Keep. The third hedge maze puzzle in the front courtyard must be entirely solved by listening to the loudness of your footsteps while walking on the gravel pavement inside it. The maze on the panel matches the shape of the hedge maze which contains it, and particularly crunchy spots found while walking through the physical maze denote an area to be avoided when drawing a corresponding path through the maze on the panel.

The reason that I say this is not a puzzle about sound, is that it would work just as well if you simply wrote out separate captions for the footstep sounds. The softer ones being written as “crunch“, and the louder ones being written as “CRUNCH“. The puzzle would essentially work the same.

It is important here to note the difference between subtitles and captions. Subtitles only show you spoken dialogue, whereas captions will also give you a textual indication of important audio cues.

The actual puzzle here is about noticing that the loudness of the footstep might be important, and then figuring out in what way. The argument against accessibility here is that hearing players are inundated with the sound of their footsteps throughout the whole game, and so this is a subtle thing to notice. Consequently, captions would be much less subtle.

I think this is actually not a very good argument: First, it isn’t really that subtle, as the important footstep sound is unusually loud here. And secondly, if the consistency is really that important, just put captions on all of the footstep sounds in the game. Make them small and unobtrusive if you have to.

Sadly, The Witness does not support captions at all, and therefore this puzzle is impossible for deaf players without simply looking it up.

The Jungle

So, in the positive example, we have the puzzles in the Jungle. These are puzzles that are actually about sound, or more specifically: about hearing. This will be harder to explain, because the puzzles themselves are much more subtle.

First, if you need a refresher on the content of these puzzles, I actually have an analysis video on the first half of this area, in which I discuss some of the subtle details involved:

 

Now, I want to bring the attention to my specific point earlier. Why do I think that these puzzles cannot be made accessible without simply making a different set of puzzles?

In this area, the essential task that the player is doing is listening to some bird songs, and transcribing the different pitches of the notes onto the panel. One could easily imagine some sort of analogous visual cue: a series of mechanical birds which all chirp in time with the notes, and are set on branches of varying height, with the height of the branch corresponding to the pitch of the note.

This type of cue would in fact work quite well, but only for the first three puzzles in the sequence. Past that point, the puzzles begin to play with both the particular difficulty of distinguishing different notes by ear, and the way in which we focus our attention on certain sounds by filtering out others. Both to our benefit and our detriment.

Perhaps again, there could be some analogous changes to our cues here. Perhaps the birds start out in linear order as they would be on the panel, but they begin to be shuffled up, and the player must watch the order in which they chirp. Perhaps the branches which the birds are situated on begin to blow in the breeze, and so it is more difficult to tell which bird is supposed to be higher. Perhaps there is a branch which is broken and the bird has fallen onto the ground. Perhaps there are birds which are not situated on branches at all, and they are irrelevant. These could be interesting ways to evolve the sequence, but what I am trying to argue is not that the puzzles cannot be made accessible, but that by doing so, they are now puzzles which are fundamentally about something different.

They are no longer puzzles about sound, and now are puzzles about spatial relationships between moving objects.

Does this make these accessible puzzles bad? No. But it does make them different puzzles.

31. How Do We Pace Puzzle Games?

A few weeks ago I read this tweet:

The linked image is reproduced below, in case the twitter embed breaks at some point:

Obviously, pacing in games is very important, but this comparison got me wondering about how exactly you pace games which focus on puzzle-solving. It’s not that I haven’t thought about this at a subconscious and abstract level. In designing Taiji, I care very deeply about making sure that the game is as interesting as possible, and this often means addressing pacing, at least in some hand-wavy way. But, I’ve mostly been playing by ear, and I haven’t put forth my approach as a formal set of design guidelines.

I did give a semi-formal lecture on puzzle game design a couple of years back, intending it to be a bit of an elaboration on a lecture given by Marc Ten Bosch and Jonathan Blow in 2011, taking those ideas and expressing them as a practical set of rules. I thought it might become a useful resource for beginning puzzle game designers, which seems to have been validated by some of the responses I’ve received. Watching it is useful in understanding the design approach that I use, but because it is so high level, there is still a large amount of intuition required when the proverbial puzzle design rubber meets the road. And furthermore, it doesn’t really say what pacing in a puzzle game is, much less how you should go about improving it.

So, where do I go from there?

I actually spent around six months working on a video essay about The Witness, wherein the explicit goal was to use that game as a case study for exploring the details of good puzzle game design. In particular, I find The Witness to be a standout example of non-verbal communication of ideas through gameplay (Conveyance), both when it comes to teaching the player new mechanics through play, and more remarkably, the way in which the mechanics combine to express something fundamental about humanity’s search for Truth. However, in my writing and editing process, I got lost in the details of individual puzzles and didn’t manage to connect them to the larger picture of the game.

Although I don’t consider that effort completely abandoned, I still feel I lack the proper words to really express my thoughts about the subtle things happening in that game’s design.

I sometimes get questions or suggestions from viewers on my twitch streams. Sometimes these are good ideas, but when they are bad ones, I tend to take an inordinately large amount of time attempting to explain why I think the idea is bad. I mostly do this because I do not wish to appear as though I am simply a diva who is following his own whims, and is insulted by anyone’s criticism of his creative vision. Rather, I want people to understand that I am working within a design framework, and there are certain decisions which very clearly step outside the framework.

The issue is that it’s only very clear when viewed from inside my own mind, but outside of my mind, the framework is ill-defined because I oftentimes lack the words to really express what I am even talking about fully.

I tend to detest jargon, but sometimes creating new terminology is useful, or even necessary to logically explain the choices that one is making, or to make those choices more effectively.

A good example of useful jargon is the entire field of music theory. Without the abstract concept of playing “in the key of A minor”, we would instead have to rely on everyone’s instruments being tuned into the same key by one person who understands the harmony on an intuitive level, or we would require an inordinate amount of communication and correction among the musicians who wish to play together as a group.

So, the main question that I have been turning over in my mind, is how exactly do I define what I think of as the “ideal” pacing or flow for a sequence. What follows is a first attempt to address that question.

•   •   •

One of the great things about the structure of Taiji is that the entire game is open from the beginning. The player can choose from several major areas and tackle them in any order that they wish. Sometimes progress in an optional area will be halted because of lack of knowledge of a puzzle mechanic from another area, but there are no artificial barriers to progress. This allows the player to set their own pace, to a certain extent. If they get bored with the nature of a puzzle that they are stuck on, they are free to travel to one of the other areas and think about a different type of problem.

This design could easily be used as an excuse to not care about the internal structure of each area, but I still try to do my best to pace and structure the areas properly from within.

Still, until I read that tweet, I didn’t even think to use the word “pacing” to describe the concern. Other genres have been thinking about pacing for decades, but the primary historical difference between puzzle-solving games and those other genres, is that other genres have core gameplay, and puzzle games do not.

In a shooter, for example, the core gameplay is running around, jumping, and shooting enemies to make progress. As the game goes on, you might get new and more powerful guns, the enemies might become more varied or have different movement patterns, but the core of the game is always that same core gameplay. In fact, the core gameplay is why we even call these games “shooters”.

So, one might ask, why would puzzle games not have core gameplay?

The answer is that some do and some do not.

Disregarding the variety of action puzzle games (such as Tetris) which clearly do have core gameplay, and focusing on the category of puzzle-solving games, there are two main categories: games with systemic puzzles, and those which feature one-off, ad-hoc puzzles. The main difference is that the systemic ones have core gameplay, and the non-systemic ones do not.

Without core gameplay, it is next to impossible to pace a game. This is why adventure games have such a reputation for ridiculous difficulty spikes and poor pacing. Some puzzles can be solved in a few minutes, whereas others will hold up players for hours or even days. Although the primary gameplay in an adventure game is “solving puzzles”, because each puzzle is entirely different, nothing the player learns in one puzzle can be carried forward to help them solve the next. (There are some exceptions, but this is more due to rigor on the part of the designer than any true systemic underpinnings.)

The great thing about systemic puzzle games (games like Taiji, or The Witness, or Portal) is that, because all of the puzzles are built atop an underlying system, every puzzle has the potential to teach the player something about how that system behaves, and the player can use this information to better solve new puzzles.

When there is a sequence of puzzles which are specifically designed to build up an understanding of a systemic idea in the player’s mind, this functions very similarly to a paragraph of text. (In the lecture I gave a couple years back, I referred to this concept formally as a capital-S Sequence, but whatever you call it, it’s just a smart way of structuring a series of puzzles.) In the same way that a paragraph is a series of sentences elaborating on one shared idea, a sequence is a series of puzzles with a shared idea at their core.

So, back to pacing for systemic puzzle games. How exactly do we go about it, and can we at least draw a broad outline of what constitutes good pacing?

I would say that the ideal structure for puzzle sequences follows a dramatic arc. The beginning is marked by a slow and logical build-up, the player works through puzzles step by step, and then there is a climactic moment to cap off the sequence. This climax to the sequence can take on a lot of forms, but the first two parts are usually very similar across all sequences.

This structure shares an obvious similarity to the Three Act structure for storytelling. So, let’s just borrow that terminology in order to define it formally:

The 3-Act Structure of Puzzle Sequences

INTRO    >    BUILD-UP  >   CLIMAX

Every puzzle sequence has a main idea at its core: either a system or something that all of the puzzles in the sequence are “about”. For purposes of discussion, we’ll just call this The Main Idea.

INTRO: In this act, The Main Idea is introduced in a relatively opaque, but simple way. The player solves a puzzle, or a short series of puzzles, and then they understand (or believe they understand) The Main Idea well enough to begin deductive reasoning about puzzles in the next act.

Because the player may not fully understand The Main Idea even by the end of this act, the puzzles here should be very simple and able to be solved quickly in order to not lose momentum early.

The player will be solving most of the puzzles in this act using intuition, so it is important to keep their assumptions in check and make sure they don’t get lost in the weeds. This is a particularly good time to use Reprises (when the structures of two puzzles are very nearly identical, in order to emphasize the differences) to create counterpoints to certain wrongheaded assumptions and build understanding.

BUILD-UP: In the second act of the sequence, The Main Idea is explored more deeply, with various puzzles gently increasing the challenge or layering on complexity.

Puzzles in this section are less intuitive, and require the player to use a more top-down approach, deducing the solutions based on what they have learned in the first act. Momentum is ideally constant or very gradually decreased during this part, with each puzzle taking about the same amount of time as the one before it.

CLIMAX: In the final act, we finish exploring The Main Idea, and the puzzles reach their highest point of challenge as the sequence ends.

Sometimes the most satisfying way to conclude the sequence is by introducing a twist. This is a puzzle which must be solved through lateral thinking rather than deduction. There is something that the player does not know which is required to understand and solve the puzzle. Optionally, this knowledge can redefine or re-contextualize The Main Idea. If this happens, when we proceed to the next sequence, it will be with the same Main Idea, but with the new context.

Pacing in this final act is less important, and the puzzles can be very difficult and time-consuming here. As a rule though, the final puzzle in the sequence should take longer than the ones that came before it.

(Additionally, it is useful to remember that a larger area which explores one Main Idea can be built up out of multiple sequences. To reprise the literature metaphor, the larger area would be similar to a chapter in a book, where the sequences would be paragraphs, and the individual puzzles, the sentences. In this case, each sequence will usually explore a small aspect of the overall Main Idea, and the structure of the full area will have an arc of increasing difficulty that is somewhat similar to the individual sequence, although not as rigid.)

The best sequences are generally the ones which hit the main points of each of the acts very clearly and without losing player momentum at the wrong times. Ideally the overall pace should start quickly and slow as the sequence progresses, with the easiest puzzles being in the Intro stage, and the hardest and most time-consuming puzzles being the Climax to the sequence.

Ideally we want to think about pacing as controlling the difficulty of the puzzles over time, but evaluating the difficulty of any given puzzle is not straightforward. In fact, it is one of the more subjective things in the entire field of game design. Although the skills developed by playing one action game oftentimes carry over to the next, this is rarely the case with puzzle games, and the perceived challenge of any given puzzle can vary quite dramatically between individual players. There is some correlation between higher IQ and lower perceived difficulty of puzzles, but there is still enough variation in the ways in which people think that some difficulty spikes are probably unavoidable.

I think this is a big reason why, perhaps more so than for other genres, it is important to have a large number of testers on the game, and to take each individual test with a grain of salt. One has to have a strong vision for what the player’s experience should be like, and also have a developed sense for when the deviations in a particular player’s experience are acceptable and when they are not.

This is more or less where my theory starts to unravel a bit. There may be a way to develop a good method for objectively evaluating puzzle sequences, but I see this as one of the areas where it is still more art than science. Even with infinite time and good luck, when making puzzle games, one must always accept that a certain number of players will have a bad experience. Really the best you can hope for is that a greater number of players will have an experience that is closer to the ideal than those that do not.

•   •   •

I’d like to close with a quote from Italo Calvino’s Invisible Cities. I didn’t really find a way to work it into the essay naturally, but it resonates with me in relation to the type of puzzle game design that I do. It comes from a chapter in the book which is sort of acting as a meta-commentary for the whole book. For context: the structure of Invisible Cities is a series of descriptions of the ways in which a city might be constructed. However, there are some descriptions which are more valid than others:

“…from the number of imaginable cities we must exclude those whose elements are assembled without a connecting thread, an inner rule, a perspective, a discourse. With cities, it is as with dreams: everything imaginable can be dreamed, but even the most unexpected dream is a rebus that conceals a desire or, its reverse, a fear. Cities, like dreams, are made of desires and fears, even if the thread of their discourse is secret, their rules are absurd, their perspectives deceitful, and everything conceals something else.”

“Invisible Cities” – Italo Calvino

16. More Art Style Testing (COLORS!)

In the last post, I mentioned a bit about getting back to looking at doing color in the game, because believe it or not, I’m not making an entirely black and white game. I very much want to do an art style which is mostly monochrome but has splashes of bright colors that pop against the rest of the image.

Here’s my first test of that:

spritingtest4_2

Again, it’s not a proper mock-up, more of a doodle, but I am very happy with where things are going and will try to do a proper mock-up/concept in the next week.

I’ve gone back and forth about whether I should talk about spoilery things on this blog or not. In general I feel like it’s okay because I haven’t really planned on making the blog public, but as time goes on, I keep thinking maybe I should make this public anyway. I just worry about talking so openly about big reveals in the game that I would like to keep under wraps. Or especially as may be the case here, big reveals that haven’t fully formed yet and so talking about the “ideas” is somewhat irrelevant.

I think for now I’ll just talk about what I’m thinking anyway and worry about it down the road, and perhaps it will make this blog stay more of a pure thing anyway by doing that, because I’ll have reason to not publish anything for spoilers.

(SO, SPOILERS, also some spoilers for The Witness here)

So, what I keep thinking about and what’s stumped me for like 6 months is the interface issue, as well as my desire to have some sort of equivalent of The Witness’s layer 2. This is mainly because I feel like my game takes obvious inspiration in terms of the panel puzzles from that game, but I don’t have any equivalent of the environmental puzzles, nor did I ever even really conceive of such a thing. Since, in the case of The Witness’s design, the environmental type hidden object thing came before the rest of the idea, I am kinda at a disadvantage and maybe working backwards.

It’s certainly possible that I should throw in the towel here and just not have anything of the sort in my own game, but I feel like that would probably be a disappointment to anyone who came to my game based on its similarity to The Witness and was hoping for a similar type of experience.

However, this is a real tall order, because I can’t simply do the same thing that The Witness did. For example, one could imagine a first-person version of Taiji where nothing in the world is square, with organic architecture with lots of flowing curves, and hidden squares in the environment which the player needs to line up their perspective to see. It would perhaps be a cool thing in a way, but it would nonetheless be entirely derivative, and for me, not that satisfying to design.

Essentially it would be the same game, only with a different symbology. The circle would be traded out for the square:

circlesquare
The circle is the one on the right.

So, I would rather take a cue from The Witness on a higher level than that, and instead think about how I can have a similar level of secret. How I can have a game that builds towards a startling revelation but does not have the same revelation, so that it would be just as satisfying to a player who had already played The Witness and might go in with a certain mindset and expecting a certain thing.

This more or less means that I have to work backwards, but I am hopeful that I will come up with something that is at least interesting and different, even if it doesn’t end up being profound.

So, the real spoilery thing, if I actually succeed, is what exactly my thinking is on that secret.

I’ve thought a lot about what it should be, and my current best guess, as well as the only idea that I’ve taken far enough to visually prototype, is this idea of selective color.

Over a year ago, when I first started the project in its conceptual phase with Martin Cohen, I was thinking about a Zelda-type game with a color theme, but I was not exactly sure how that would manifest. I kind of had an idea that you might be able to add or remove objects which were a certain color. Now, I am coming back to that idea, but I am planning on taking a somewhat different approach to executing on it, which is mostly in having the primarily black and white world, so that color objects will stand out.

My hope is that if I give the player explicit ability to interact with colored puzzle panels, they will perhaps not put together the link that they can also interact with objects of the same color in order to solve puzzles. We will have to see how that turns out in practice, but essentially the idea is just that color will be used as a marker for something in the environment that can be interacted with and perhaps this can be done in a subtle and revelatory way.

The other thing is thinking about generosity in terms of game design, because I have had a primary worry about this “layer 2” nonsense since forever ago. Namely that it will feel too restrictive and therefore won’t be very satisfying. It will feel more like, oh I just need to find the red things and then click on them. So, that’s the main thing that I haven’t really figured out yet, but part of me has wondered if I shouldn’t just opt for a game design where the entire world is toggle-able on and off. I’m just not even sure how a game holds up when you can do that type of thing. I certainly don’t want to go halfway and then put limits on what you can and can’t toggle in a way that feels arbitrary. So instead, I must simply come up with rules about what is interactable that feel good but also feel surprising.

sokoban
An example of an explicitly tile-based game, perhaps you could toggle tiles on and off?

I suppose that’s all a bit rambly, but my main thinking is that the color limitation will make it not feel arbitrary what is interact-able or not. The only concern then becomes whether or not it makes it feel too restrictive or unsurprising. I may need to think about it for a while longer before coming to the proper conclusions about how exactly to implement the idea.

3. Damn the Interface, Full Speed Ahead!

This week I didn’t really get a chance to work on the game very much. Was busy with some other things (podcasts, getting stuff done for Patreon backers), and I had a major depressive episode. However, I did work on the game a bit today and streamed some designing of a new mechanic. We’ll get to that in a bit, but first I wanted to talk about some of the things I’ve been thinking about through the week whilst not “actually working.”

Interface Options

The interface problem continues to haunt me, so I’ve been thinking about that a lot and I’ve really narrowed it down to three main options:

Option 1

Keep the current interface and just press onwards with all of its problems. I don’t really like this option for obvious reasons. The current interface is super modal and it just feels kinda bad to interact with a discrete tile-based movement system in 2016 (or whenever the game is done). However, all of the game’s design has been done so far using this interaction method, so it naturally requires the least amount of redesigning work to proceed with.

Option 2

Replace everything in the interface with a “player as cursor” type model, where you’re always walking around on the panels to solve them. The primary issue with this method is that it is a bit incongruous with the way some of the puzzles already work (some of which I rather like), and reconciling this problem either requires some kind of ghost player to walk around on panels that the player cannot access and/or requires introducing the ability for the player to walk up on walls. Or both. So, although this is a promising approach, it entails introduction of some new mechanics that I am not sure that I can really capitalize on in any other major ways besides just using it as an interaction method for panels. The walking on walls thing is certainly cool, and plays off the ambiguity of a top-down perspective in a very Escherian way, but again: I’m not really sure I know how to capitalize on that in a good way. I feel like I’m likely to fall into the pit of bad level design that early-era FEZ did. On the other hand, it is an interesting and somewhat potentially mind-bending mechanic, if I can solve those problems.

Option 3

Giving the player a pointer type cursor which can be used at any time. This is a somewhat nice solution because it can probably be done without any sort of modality, just mapping the cursor to the mouse or to the right stick. However, it again is pretty weird in that it feels like you’re controlling two things and you need to mentally switch modes. But it doesn’t have the problems that the always walking on panels method has, in that you don’t always have to be walking on panels, and you don’t have a player shaped cursor that doesn’t really act like the cursor. The cursor looks like a cursor and the player looks like a player. It also has some interesting gameplay implications in that the cursor can be used on any panel that the player can see, even if they cannot get to it. It does however raise the question of the layer 2 type puzzles, as the player will probably expect to be able to interact with tile looking things in the environment. (Admittedly a problem with the other methods as well, but perhaps to a lesser extent?)

So, I’m not sure which of these approaches I really want to choose right now, as I haven’t totally fallen into that sweet spot where everything just clicks and it’s obvious which is the most elegant way to proceed. It may just be that I have to adopt some sort of hybrid approach. In fact, let’s call it:

Option 4

Take the pointer type cursor aspect of Option 3 (We can maybe implement it as a fairy?), and combine it with the walking around aspect of Option 2, so that the player can solve panels that they are walking around on just by walking around and pressing the buttons, without using the cursor. But they do have the option to move the cursor around and activate a panel which they cannot reach. Perhaps the fairy is the thing that bonks its head on the panels to make them do things? Fairies are kinda magical, so maybe this also ties into the storyline aspect of the protagonist having some sort of magical ability to interact with the panels. Also the “oh you have a fairy” sort of calls back to Zelda 64, which is maybe cool. STEAL IT!

Orthogowhatnow?

So, I decided to put together a bit of a prioritized todo list (well, actually I already had one, but I finally took a look at it and struck some of the completed tasks off the list and added more new ones). Although I think that the interface problem is really the most pressing thing, since it affects so many other decisions, I decided to forgo doing any actual work on that and instead implement a new mechanic idea that I had a couple days ago when I was driving into work.

Jonathan Blow and Marc ten Bosch gave a talk together at IndieCade several years ago called Designing to Reveal the Nature of the Universe (excellent talk, I recommend checking it out), in which they set out a list of aesthetics for game design. I won’t recount the whole list here, but the one that I was thinking about that led me to think of a new mechanic was “Orthogonality.”

If you don’t know what orthogonality even means (which why should you, it’s a long word and ugh math), it technically refers to the general property of two lines forming a right angle, across 2 or more dimensions. However, for the purposes of this discussion, it basically means design concepts which do not overlap with each other.

A Tale of Two Mechanics

Some mechanical spoilers below.

So, I was trying to think about some other types of symbols to put on panels that might introduce constraints and create more puzzles and interact well with the existing symbols (just the dice faces at the moment). Earlier, I had thought about odd and even, but the issue there, I realized, is that there is significant overlap with the dice face puzzles. They are both about the total area of a shape. As an example, suppose you had a panel with 4 symbols on it, each on a different tile: a 4 dice face, a 3 dice face, an odd symbol, and an even symbol. The solution would probably be: oh I need to put the odd symbol with the odd dice face and the even symbol with the even dice face. This observation about the single solution quickly turns into a generality about the puzzles.

This might seem fine, but to further illustrate my point of why this is a bad choice of mechanic, I will posit the following question:

“Are there many situations in which this new mechanic (the odd and even tile constraints) could equivalently be expressed using an existing mechanic (the dice face constraints)?”

I would posit that the answer in this case is yes, which generally means “don’t do it.” I could go into a lot of specific examples of how they are equivalent mechanics, but I will just give one and move on.

If we were to take the following panel, wherein we say that the 0 is required to be part of a lowlit area which is odd and the 1 is required to be part of a lowlit area that is even:

image

The possible solutions would be as follows:

image

Although we cannot entirely duplicate the same panel with the same set of solutions, we can cover the bases and achieve very similar results across 3 different dice face panels:

image

Whose complete solutions are as follows (note that in addition to covering the solutions of the odd/even one, the leftmost panel has two additional solutions due to the way in which dice faces combine with each other):

image

“But Matthew!” you say, “look at how many solutions there were for that one odd/even panel, and you’re trying to say that a mechanic which cannot duplicate that solution space without three separate panels is BETTER?”

To which I would say, “yes.” The reason is that sometimes having a tighter solution space is actually a much better thing. And even if you were to disagree with me on that (a fair choice), the argument is really about whether or not to add the odd/even mechanic in addition to the dice faces.

Clearly, I would say, the answer is no.
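
To make the overlap a bit more concrete, here is a rough sketch in Python (not the game’s actual code) of both constraints written as checks on the same quantity: the area of the region a tile sits in. A few things here are assumptions for illustration: that a “region” means orthogonally connected tiles in the same lit/unlit state, that the grid is a list of lists of booleans, and that the dice rule can be boiled down to “the region’s area equals the face value”, which deliberately ignores how multiple dice faces combine. All of the function names are made up.

```python
from typing import List

Grid = List[List[bool]]  # assumed representation: True = lit, False = unlit


def region_area(grid: Grid, row: int, col: int) -> int:
    """Count the tiles orthogonally connected to (row, col) that share its state."""
    target = grid[row][col]
    seen = {(row, col)}
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < len(grid) and 0 <= nc < len(grid[nr])
            if in_bounds and (nr, nc) not in seen and grid[nr][nc] == target:
                seen.add((nr, nc))
                stack.append((nr, nc))
    return len(seen)


def satisfies_parity(grid: Grid, row: int, col: int, want_odd: bool) -> bool:
    """Hypothetical odd/even symbol: only the parity of the region's area matters."""
    return (region_area(grid, row, col) % 2 == 1) == want_odd


def satisfies_dice(grid: Grid, row: int, col: int, face: int) -> bool:
    """Simplified dice symbol (an assumption): the region's area must equal the face value."""
    return region_area(grid, row, col) == face
```

Both checks bottom out in the same region_area call, and under this simplified rule any region that satisfies a dice face automatically satisfies the odd or even symbol of matching parity. That is the overlap in a nutshell: the odd/even symbol is just a looser version of a question the dice faces already ask.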

The New Mechanic

So, while looking for orthogonal concepts, I happened across the idea of having a tile care about the state of its immediate neighbors. This is nice because it has nothing to do with area, and only has to do with the fact that the tiles are on a grid and each tile has up to four neighbors.

So my initial implementation of the mechanic was for each tile to care about how many neighbors it has that are lit up. This is perhaps fine, but it seemed to have two problems.

First, that it was too prescriptive. I tend to avoid puzzles and mechanics that basically feel like I’ve envisioned an exact solution and merely laid down some tiles to enforce that you draw that line or whatever. This has a lot to do with my desire for bottom-up puzzles and things that I have discovered rather than just pulled out of my ass.

Second, I felt like the “is it lit up or not” aspect was an additional point of overlap with the dice faces, which already care about being lit up. It may come to a point where I want to separate out that aspect of the dice faces themselves, but at this point, I’m letting them care about what’s lit up or not, so it seemed like the new mechanic shouldn’t care.

So, what I decided to change the mechanic to is instead an equality check: simply enough, each tile cares about how many neighbors it has that are the same as itself. This has the nice property of opening up the solution space a bit more (even though sometimes it’s better to have a narrower solution space, I tend to feel like there is a sweet spot in the middle that I’m aiming for), as well as making the concept fully orthogonal to the dice faces.
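
For the curious, the difference between the two versions of the mechanic can be sketched in a few lines of Python. This is only an illustration and not the game’s actual implementation: the grid representation (a list of lists of booleans, True meaning lit) and the function names are assumptions.

```python
from typing import Iterator, List

Grid = List[List[bool]]  # assumed representation: True = lit, False = unlit


def neighbor_states(grid: Grid, row: int, col: int) -> Iterator[bool]:
    """Yield the states of the up-to-four orthogonal neighbors of a tile."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < len(grid) and 0 <= c < len(grid[r]):
            yield grid[r][c]


def count_lit_neighbors(grid: Grid, row: int, col: int) -> int:
    """First version of the mechanic: how many neighbors are lit up."""
    return sum(1 for state in neighbor_states(grid, row, col) if state)


def count_same_neighbors(grid: Grid, row: int, col: int) -> int:
    """Revised version: how many neighbors are in the same state as this tile."""
    me = grid[row][col]
    return sum(1 for state in neighbor_states(grid, row, col) if state == me)


example = [
    [True, False, True],
    [False, False, True],
    [True, False, True],
]
# The unlit center tile has one lit neighbor, but three neighbors that match it.
print(count_lit_neighbors(example, 1, 1), count_same_neighbors(example, 1, 1))
```

One thing that falls out of the equality version is that inverting every tile on the panel leaves every count unchanged, which is a decent way of seeing that it no longer cares about “lit or not” at all.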

Here’s an example of one of the puzzles that we discovered on stream which I particularly like. (I’ll leave the solution as an exercise for the reader. Keep in mind, yellow means “the same as me” and blue means “not the same”.):

image