The original idea for this algorithm comes from Paul Merrell’s 2007 i3D paper as well as his much longer PhD thesis. Maxim Gumin (alias mxgmn) would take this original idea and slightly modify it as well as give it the now iconic “wave function collapse” name.

Comparison to model synthesis

Merrell wrote a comparison of the two algorithms. He does a fine job of comparing the two but honestly it comes off as hater behavior simply because it was written after WFC was released and it’s hilarious how it became 10× more popular/well known than his “model synthesis.”

  • Which tile to collapse next is different. Model synthesis “sweeps through the grid in scanline order” while WFC selects the tile with the lowest entropy (i.e. least number of possible states).
    • This greatly affects runtime. Model synthesis is able to handle larger inputs better while WFC may become slower or not finish at all.
  • Model synthesis focuses more on generating 3D models than 2D textures (although WFC can also be generalized to any number of dimensions, including apparently the 4th dimension, time).

Technical implementation

mxgmn‘s implementation extracts a generic size tile from a sample image and uses this to generate the output. For my use case this is overkill as we have predefined tiles to work with. So we can skip this step entirely.

Details

  • Tile sprites are grouped together in a ScriptableObject called a Palette. A PaletteSet allows us to store a list of Palettes and represents all possible tiles in a level.
    • This gives us an extremely modular and flexible system. Palettes can be reused across PaletteSets, because I’m sure different levels will reuse some tile palettes.
    • Keeps the WFC algorithm separate from placing the actual Unity assets/tilemap.
    • Plug and chug, switch out a ScriptableObject for another so that CelesteWFC uses a different set of tiles with different rules
  • Keep track of possible symmetry types via CanNotRotate, X, T, I, and L enums. These are used to generate all possible unique orientations of a tile sprite.
    • We consider each orientation of a tile sprite to be a different and unique state.

However, we need to be careful to only consider unique tile sprites when randomly choosing a state from a set of possible ones. Otherwise, tiles with more orientations would unfairly have a higher probability of being picked.

Sockets and keeping track of state

Socket structs are used to tell us how each side of a tile can connect with other tiles. Use the symmetry type of a tile to generate copies of the tile with different socket values on its four sides, which simulates rotating/flipping the tile.

Use number IDs (ints are fine) so that we essentially have an infinite number of possible. Originally I used a bool with the thinking that a true value meant this side could be connected to, and a false value meant no connections were allowed. However, this was too simple.

  • Store all of these as separate State structs in each grid cell’s list of currently possible states.
  • Initialize this list for each grid cell at the beginning during setup when we “parse” the PaletteSet
  • Optimization: write into a JSON when we parse the ScriptableObject for the first time, and read from that instead next time?

Algorithm

// C# pseudocode
public class WFC {
    public void Iterate() {
        var nextCellToCollapse = PickLowestEntropyCell();
        Collapse(nextCellToCollapse);
        Propagate(nextCellToCollapse);
    }
 
    public void Run() {
        // Or, we enable the user to step through each iteration via UI
        while (!AllCellsCollapsed()) {
            Iterate();
        }
    }
}

Propagate(): use a stack to keep track of which tiles we still have to process. Before adding tiles to the stack, check that they’re not already in it.

For PickLowestEntropyCell(), the simplest implementation is to uniformly pick at random a state from the set of possible states in a grid cell. Of course, we can make it more interesting by assigning weights to each cell, considering its neighbors first, etc.

Unity

Tilemaps

Coordinate systems, row-major arrays

Using a multi-dimensional C# array like int[,] grid means that we index into the array with the y-coord first, then x-coord like grid[y, x]. Furthermore, the row index increases going downwards, meaning the first row has index 0, second has index 1, and so on.

This becomes important later on because when we want to draw the tiles to the screen, drawing the first row requires us to position it at the very top of the grid, not the bottom like a conventional XY coordinate system.

Tilemap tilemap;
 
void Paint() {
    var width = grid.GetLength(1);
    var height = grid.GetLength(0);
 
    for (var y = 0; y < height; ++y) {
        for (var x = 0; x < width; ++x) {
            // Start drawing at the top of the grid. Subtract `height`
            // by 1 because the grid is zero-indexed
            tilemap.SetTile(new Vector3Int(x, height - 1 - y), someCoolTile);
        }
    }
}

When we finish generating the output, we have to transfer the WFC grid to a Unity tilemap.

  • The only components necessary for stuff to work are the Grid and Tilemap component.
  • For dynamically placing tiles on the tilemap, the Tile Palette functionality is not necessary. It’s for in-editor tile painting use only. It’s simply a Grid and Tilemap prefab containing some tile assets.

Creating regular tile assets

For some stupid fucking reason Unity has removed the ability to create regular tile assets from the right-click menu. This screenshot is from Unity 6:

There are options to create fancier tiles like animated tiles, rule tiles, etc. but no regular tiles. Turns out that you have to do this really backwards method of dragging a sprite/spritesheet into the Tile Palette, which will then automatically generate regular tile assets for you.

See this discussion post on the Unity forums.

C#

All variables are passed by value in functions. However, there is nuance:

  • For a value type, this means a copy of the value is made (int, struct)
  • For a reference type, a copy of the reference is made (class). This means that the object itself is not copied. For example:
class Person
{
    public int number;
}
 
class Test
{
    private void Manipulate(Person person) {
        person.number = 5;
    }
 
    private bool Start() {
        Person p = new Person();
        p.number = 1;
 
        // This modifies `p` because a copy of the *reference* is passed in
        // but no true deep clone of the object was made
        Manipulate(p);
  
        // Returns `true`
        return p.number == 5;
    }
}