As the end of my studies is nigh, I’m looking into possible topics for my master thesis. Alas, I’ve grown a bit tired of writing game engines and renderers lately — who’d have thought.

Luckily, this quest for new shores (or rather, new topics) was solved quickly. I’ve also developed quite an interest in procedural generation and have always had a bit of a fascination with fantasy worlds and maps. So, I’ve decided to work on procedural world generation.

This is obviously going to be a fun project with much potential for interesting failures.[1] Therefore, I have the best intentions of documenting my plans, progress and interesting problems here. By “best intentions” I do, of course, mean “force myself”, and by “documenting” I mean “dump my thoughts here and try to make sense of it all”.

Without further ado: Welcome to this unwise cocktail of a forever project and a master thesis.

There are already quite a lot of inspiring projects that procedurally generate worlds and maps with impressive results, some of which I will briefly mention here. I’ve also uncovered some scientific papers that I might use as a basis for my project. Unfortunately, most of these projects are closed source.

Dwarf Fortress is arguably the first thing that comes to mind when people think about procedural generation of game worlds. Sadly, definitive information about its generation algorithms is a bit sparse. What is known and documented is that the generation starts with layers of noise-based fractal maps for terrain/precipitation/..., which form the initial map and are then refined by simulation steps. The main focus (and most impressive part) is, of course, the history generation and simulation of the actual game world. The terrain generation itself is, in contrast, relatively unimpressive.
Amit Patel's Mapgen4 (as well as its previous iterations) is also quite interesting, especially since it's thoroughly documented on his blog. The fundamental data structure used by Mapgen4 is a relaxed Voronoi diagram together with its dual (a Delaunay triangulation). This allows it to generate interesting maps without the obvious artifacts often seen with uniform grids. Elevation and generation parameters can be edited quite intuitively in Mapgen4, with rivers, precipitation, and biomes generated procedurally. However, the most interesting part for me personally is the rendering: it uses a traditional rendering pipeline with an oblique projection and more artistic lighting to achieve a style that is quite close to a hand-drawn look.
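
As a tiny aside on that duality (this is just my own illustration in C++, not Mapgen4's code, which is written in JavaScript): each triangle of a Delaunay triangulation corresponds to exactly one Voronoi vertex, namely its circumcenter, which is how a generator can switch between the two views of the same point set.

    #include <cstdio>

    struct Vec2 { double x, y; };

    // The circumcenter of a Delaunay triangle is the Voronoi vertex dual to it;
    // connecting the circumcenters of neighbouring triangles yields the edges of
    // the Voronoi cells.
    Vec2 circumcenter(Vec2 a, Vec2 b, Vec2 c) {
        double d = 2.0 * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
        double ux = ((a.x * a.x + a.y * a.y) * (b.y - c.y)
                   + (b.x * b.x + b.y * b.y) * (c.y - a.y)
                   + (c.x * c.x + c.y * c.y) * (a.y - b.y)) / d;
        double uy = ((a.x * a.x + a.y * a.y) * (c.x - b.x)
                   + (b.x * b.x + b.y * b.y) * (a.x - c.x)
                   + (c.x * c.x + c.y * c.y) * (b.x - a.x)) / d;
        return {ux, uy};
    }

    int main() {
        Vec2 v = circumcenter({0.0, 0.0}, {1.0, 0.0}, {0.0, 1.0});
        std::printf("Voronoi vertex: (%.2f, %.2f)\n", v.x, v.y);  // prints (0.50, 0.50)
    }
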
Scott Turner's Dragons Abound procedurally generates and draws fantasy maps, with quite beautiful and adaptable results that are often already near-indistinguishable from hand-drawn maps. Although the main focus seems to be on drawing the maps, which informed some of the design decisions, there are also simulation-based approaches to determine climate zones, precipitation and rivers. There are some technical similarities to Amit Patel's approach (insofar as both use Delaunay triangulations as their fundamental data structure), but Dragons Abound seems to use a much larger number of vertices, which allows it to support smaller features and produce more natural shapes.
Azgaar's Fantasy Map Generator is another fascinating project, as it not only provides a working web application to interactively generate detailed fantasy maps but is also open source. Additionally, there is even a blog with interesting details about the generator's design and inner workings, which is sadly inactive at the moment. Similar to the previous two projects, this generator uses a Voronoi diagram, but the focus seems to lie much more on the generation itself. The most impressive part of this project is probably the sheer number of customization and editing options for the generated maps.
JonathanCR's Undiscovered Worlds is a procedural world generation project inspired by Dragons Abound. It focuses more on generating large, complete worlds instead of drawing maps at a relatively small, local scale, which is especially interesting as it's close to my current plans.
Wonderdraft is a quite powerful and easy-to-use editor for manually creating fantasy maps. It doesn't really provide much in terms of procedural generation, but it might be an interesting reference as far as user editing of the generated data is concerned. There might also be interesting potential in integrating a more powerful procedural generation toolkit with it, but in the absence of any API or documented file format, that seems unrealistic for now.
Miguel Cepero's Voxel Farm is another interesting project, and his blog is one of the reasons I originally became interested in 3D graphics and procedural generation. In contrast to most projects on this list, it generates not just a map of a world but an actual interactive 3D world. This is obviously far outside any reasonable scope for my project, but some of the more abstract ideas might still be interesting.
On the academic side of things, Cortial et al.'s 2019 paper Procedural Tectonic Planets proposes a model to procedurally generate planets using a simplified simulation of plate tectonics. Although the authors sadly didn't provide their source code or an executable to reproduce their results, the provided images and video look quite promising.
Another study worth mentioning here is Large Scale Terrain Generation from Tectonic Uplift and Fluvial Erosion by Cordonnier et al. from 2017. It combines tectonic uplift with fluvial erosion to generate plausible terrains. Although the scale is more local than my current plans, this might also be an interesting approach to simulate erosion and river formation at a relatively high level.
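
For context, the core of that class of models (as I understand it; this is a generic textbook-style formulation, not the paper's actual implementation) is an update in which elevation changes by tectonic uplift minus a stream-power erosion term that depends on drainage area and slope. A rough per-cell sketch, with purely illustrative names and constants:

    #include <cmath>
    #include <vector>

    // Generic stream-power style update: elevation grows with tectonic uplift
    // and shrinks with fluvial erosion k * A^m * S^n, where A is the upstream
    // drainage area and S the local slope. All names and constants here are
    // illustrative only.
    struct Cell {
        double elevation;      // [m]
        double uplift;         // tectonic uplift rate [m/yr]
        double drainage_area;  // upstream contributing area [m^2]
        double slope;          // slope towards the downstream neighbour
    };

    void erosion_step(std::vector<Cell>& cells, double dt,
                      double k = 2e-6, double m = 0.5, double n = 1.0) {
        for (Cell& c : cells) {
            double erosion = k * std::pow(c.drainage_area, m) * std::pow(c.slope, n);
            c.elevation += (c.uplift - erosion) * dt;
        }
    }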

Project Aims

The attentive reader might wonder: what exactly is it that I’m trying to achieve here?

The main focus of my project is pretty similar to that of Undiscovered Worlds. I’m interested in generating plausible worlds that can be explored, used and processed further by others. Of course, I’ll also have to visualize my results in the form of maps and provide some form of user input, but that probably won’t be anything too fancy.

Apart from that, I have three main aims for this project guiding my design decisions.

Simulation Instead of Noise

As I’m mostly interested in the world generation aspect, I want to reduce the amount of random input and noise to a minimum and rely on simulations instead. A model based on a large amount of intricately layered noise might generate interesting worlds, but that is entirely thanks to how that noise is modulated rather than any meaningful capability of the system itself. As a consequence, the generative space of such models is inherently limited to the small subset for which the parameters have been hand-tuned. As John von Neumann famously said:

With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.

My goal is to create a more generalized system that can generate a wide variety of interesting worlds.

“Interesting”, of course, is subjective and means different things to different people. Personally, I think one intriguing way to measure “interesting” in this context is in terms of the entropy of the generated world, i.e. its information density, uncertainty or “surprisingness”[2]. On one end of the spectrum, with an entropy near zero, we have an entirely flat, featureless plane. Even if beige is your favorite color and your favorite ice cream flavor is “cold”, we can probably agree that this world wouldn’t be terribly “interesting”. On the other end, with a very high entropy, we end up with a comprehensively incomprehensible world, where everything appears to be random noise because nothing is correlated.
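
To make that a bit more concrete, here is a crude sketch of what such a measure could look like (my own illustration, not a metric I’m committed to): estimate how “surprising” each elevation sample is given its neighbour by computing the Shannon entropy of the binned differences along a transect. A flat plane scores near zero bits, while uncorrelated noise approaches the maximum of log2(bins). The bin count and maximum step size are arbitrary illustration values.

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Crude "surprise" estimate: bin the elevation differences between
    // neighbouring samples along a transect and compute the Shannon entropy of
    // that histogram (in bits).
    double transect_surprise_bits(const std::vector<double>& elevations,
                                  double max_step = 100.0, int bins = 64) {
        if (elevations.size() < 2) return 0.0;

        std::vector<double> histogram(bins, 0.0);
        for (std::size_t i = 1; i < elevations.size(); ++i) {
            double diff = elevations[i] - elevations[i - 1];
            double t = (diff + max_step) / (2.0 * max_step);  // map to [0, 1]
            int bin = static_cast<int>(t * (bins - 1));
            histogram[std::clamp(bin, 0, bins - 1)] += 1.0;
        }

        double total = static_cast<double>(elevations.size() - 1);
        double entropy = 0.0;
        for (double count : histogram) {
            if (count == 0.0) continue;
            double p = count / total;
            entropy -= p * std::log2(p);
        }
        return entropy;
    }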

Noise-based terrain (while often beautiful) would probably lie mostly on the low-entropy end (think rolling hills, uniform mountains and general predictability), while our real world would be closer to the high end. The sweet spot I’m aiming for lies somewhere between these two: less predictable than Perlin noise, but not as opaque and complex as the real world.

In other words, what I want to achieve here is a comprehensible amount of meaningful information in the generated worlds. That means every visible feature should have a cause that can be discovered by the user, e.g. matching coasts where continents split apart, canyons where rivers used to flow, and settlements where they actually make sense, with names based on their surroundings and history. Aside from the general technical difficulty, striking the right balance will also be a problem here: an unbroken chain of cause and effect is utterly meaningless if it’s too complex to be easily comprehensible. In that case, the generated world becomes no more interesting than one based purely on random noise.

Realistic Scale

My second central aim for this project concerns the scale at which I want to generate worlds. To support physically meaningful parameters and allow for an easier comparison with the real world, I’m planning to generate spherical worlds at roughly the same scale as the Earth, i.e. with a radius of about 6’000 km.

To support such a large scale with any amount of local detail, I’m going to use a triangle mesh to store the elevation and other information. This should allow for a wide range of resolutions based on local requirements (i.e., fewer vertices in the ocean and more near coasts and mountains). It should also solve most of the problems and artifacts usually encountered when one tries to use uniform rectangular grids like bitmaps on a spherical surface.
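
As a rough illustration of what I mean (the names and layout here are just a sketch of the idea, not the project's actual data model), such a mesh could store each vertex as a direction on the unit sphere plus per-vertex data like elevation, with triangles indexing into the vertex list. Local resolution then simply follows from where vertices are placed.

    #include <array>
    #include <cstdint>
    #include <vector>

    struct Vec3 { float x, y, z; };

    struct Vertex {
        Vec3  unit_position;   // direction from the planet centre (unit length)
        float elevation;       // offset from the reference radius [m]
        // ... further per-vertex layers: precipitation, temperature, plate id, ...
    };

    struct Triangle {
        std::array<std::uint32_t, 3> vertices;  // indices into World::vertices
    };

    struct World {
        float radius = 6'000'000.0f;  // reference radius [m], roughly Earth-sized
        std::vector<Vertex>   vertices;
        std::vector<Triangle> triangles;
    };

    // World-space position of a vertex, combining the reference radius with the
    // locally stored elevation.
    inline Vec3 world_position(const World& w, const Vertex& v) {
        float r = w.radius + v.elevation;
        return {v.unit_position.x * r, v.unit_position.y * r, v.unit_position.z * r};
    }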

I’m currently planning to develop my generator in a top-down fashion: starting with the largest terrain features (like tectonic plates and mountains) and moving on to smaller-scale details like rivers, caves and settlements from there. Although the dynamic resolution of a triangle mesh should lend itself well to such a generation approach, I might come to a point in the future where I need to split the system into multiple generators for different scales (e.g., erosion at the scale of a continent or mountain range vs. the scale of a local river). However, that should also mesh quite well with the triangle mesh approach, as the small-scale generator could take the vertices from the high-level generator, refine them further and generate new information matching the constraints determined by the existing vertices. There will, of course, also be many interesting problems there, like the usual discontinuities at the borders between cells. But that is far enough in the future that I probably shouldn’t concern myself with it just now. That is clearly future-me’s problem, and future me is much more experienced than me, anyway.
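
As a purely illustrative sketch of that refinement idea: a finer-scale pass could insert a midpoint vertex on a coarse edge, treat the interpolated coarse elevation as its constraint and only add smaller-scale detail on top of it. The small_scale_detail function below is a hypothetical stand-in for whatever that finer generator ends up computing.

    #include <cmath>

    struct Vec3 { double x, y, z; };

    // Hypothetical stand-in for the finer-scale generator; a real pass would
    // compute local detail here instead of returning zero.
    double small_scale_detail(const Vec3& /*unit_position*/) { return 0.0; }

    // Insert a midpoint vertex on a coarse edge: the coarse elevations act as
    // constraints and the finer generator only adds detail around their
    // interpolated value, keeping the result consistent with the coarse map.
    void refine_edge(const Vec3& pa, double elev_a,
                     const Vec3& pb, double elev_b,
                     Vec3& mid_out, double& elev_out) {
        Vec3 m{0.5 * (pa.x + pb.x), 0.5 * (pa.y + pb.y), 0.5 * (pa.z + pb.z)};
        double len = std::sqrt(m.x * m.x + m.y * m.y + m.z * m.z);
        mid_out = {m.x / len, m.y / len, m.z / len};  // project back onto the unit sphere
        elev_out = 0.5 * (elev_a + elev_b) + small_scale_detail(mid_out);
    }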

Reusability

My final aim for this project stems from an inkling that I might have bitten off more than I can chew here. Because of that, I want to build the project in such a way that, even in an incomplete state, parts of it are still useful or at least interesting to others. To achieve this, I’ve not just open-sourced most of the project but also plan to structure it as modularly as possible.

I’ve already laid the foundation for this with a basic C API for the world model and the generation passes. While I’ll initially just work on a C++ API wrapper to use in my own code, the C core should also allow for future interoperability with other languages like Python or C#.
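
To give an idea of the shape this takes (every name here, like world_create or world_vertex_count, is a hypothetical illustration rather than the project's actual API), the core would expose opaque handles through plain C functions, and the C++ wrapper just adds RAII on top:

    #include <cstdint>
    #include <memory>

    extern "C" {
        // Opaque handle; the C side never exposes its internal representation.
        typedef struct world world;

        world*   world_create(uint32_t initial_vertex_count);
        void     world_destroy(world*);
        uint32_t world_vertex_count(const world*);
    }

    // Thin RAII wrapper, so the C++ side never has to manage the raw handle.
    class World {
    public:
        explicit World(uint32_t initial_vertex_count)
            : handle_(world_create(initial_vertex_count), &world_destroy) {}

        uint32_t vertex_count() const { return world_vertex_count(handle_.get()); }

    private:
        std::unique_ptr<world, void (*)(world*)> handle_;
    };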

Based on this, I plan to develop a simple graphical editor as a debugging tool and to construct all the actual procedural generation passes as self-contained, reusable modules that can be loaded as plugins (.dll/.so) at runtime. So others should (at least in theory) be able to modify (or salvage) any interesting parts, extend the system with new functionality or use it as a starting point for their own projects.
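
A minimal sketch of how that plugin loading could work on Linux, using dlopen/dlsym (on Windows this would be LoadLibrary/GetProcAddress instead). The plugin file name, the run_generation_pass entry point and its signature are all hypothetical illustrations, not the project's actual convention:

    #include <cstdio>
    #include <dlfcn.h>

    struct world;  // opaque handle from the core C API

    // Hypothetical convention: every generation-pass plugin exports a single
    // C entry point with this signature.
    using generation_pass_fn = void (*)(world*);

    int main() {
        void* plugin = dlopen("./libtectonics_pass.so", RTLD_NOW);
        if (!plugin) {
            std::fprintf(stderr, "failed to load plugin: %s\n", dlerror());
            return 1;
        }

        auto run_pass = reinterpret_cast<generation_pass_fn>(
            dlsym(plugin, "run_generation_pass"));
        if (run_pass) {
            run_pass(nullptr);  // a real host would pass an actual world handle
        }

        dlclose(plugin);
        return 0;
    }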

Outlook

With the goal in mind and the course set: what’s next?

While I would really like to dive directly into exploring ideas for procgen algorithms and interesting simulation approaches, I think I should document and discuss some of the groundwork first. To do so, the next couple of posts will mostly focus on how the world is modeled, stored, and modified in the code[3].

Having that down, there are some (hopefully a bit lighter) topics I would like to explore:

  • Simulating plate tectonics to generate realistic high-level features (probably based on Procedural Tectonic Planets by Cortial et al.)
  • Extending and refining the current API
  • Creating a renderer to display the generated information in a compact way that is easy to decipher
  • Creating easy-to-use tools to modify the generated worlds, both for debugging and artistic input
  • Simulating large-scale weather patterns to determine precipitation, average temperatures, and biomes
  • Modeling drainage basins and high-level water flow to generate river systems
  • Generating detail maps for smaller areas, based on the high-level features from the coarser large-scale map
  • Simulating coarse erosion based on this (maybe similar to Large Scale Terrain Generation from Tectonic Uplift and Fluvial Erosion by Cordonnier et al.)
  • Simulating large-scale changes in the world’s climate (i.e., ice ages and glaciation) to reproduce some of the more interesting terrain features we have on Earth: fjords, sunken landmasses, interesting features on continental shelves, …