This project develops an artificial intelligence-based approach to generating game worlds from Minecraft data. Traditional world generation relies primarily on randomness, which limits control over the final outcome. This work proposes a neural network-based model that enables more precise world generation and lets users influence elements such as terrain shape and cave density. The techniques applied, including Generative Adversarial Networks and attention mechanisms, are discussed in the context of generating Minecraft worlds. The training data was prepared by extracting and formatting information from the game while reducing data redundancy. The quality of the generated structures was evaluated with metrics commonly applied to such problems: Tile Pattern Kullback-Leibler Divergence, Levenshtein distance, block histograms, and Mean Gradient Magnitude Difference. The results indicate that artificial intelligence can be an effective tool for generating worlds in computer games. The proposed model allows for the creation of personalized and controlled environments, opening up new possibilities in game design.
Fig. 1. U-Net architecture diagram for the generator with skip connections and Cave Attention.
Fig. 2. Illustration of the processing flow in the conditional Cave Map layer.
Fig. 3. U-Net architecture diagram for the discriminator.
The models were trained on the Google Colab platform, which provides access to high-performance cloud GPUs, in this case the NVIDIA A100. This GPU delivers substantial computational power for training large neural networks. Each version of the model was trained for four epochs, with each training step processing a subset (batch) of 8 samples to balance memory constraints against learning speed.
The "A100 GPU" execution environment was used, which is backed by Google Cloud machines of type "a2-highgpu-1g" [38]. Machine specifications:
- GPU: NVIDIA A100 40 GB
- Memory bandwidth: 1555 GB/s
- Tensor Float 32 (TF32): 156 TFLOPS (models were trained at this precision)
- VM memory: 85 GB
- vCPU count: 12
This section presents the evaluation results for the final model, which used one-hot encoding for the input data and was trained without the cave_loss term. The model was trained on 8,192 samples over eight epochs, with each epoch covering the entire dataset. Although training showed no significant progress after the initial stages, the evaluation results provide valuable insight into the model's strengths and weaknesses.
Fig. 4. An example of a generated world using the developed model (Model 4). The world consists of 6x6 chunks. On the left: the raw world, which is the direct output of the model. On the right: the "decorated" world, filled with water up to sea level and covered with grass.
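The "decoration" step described in the caption can be sketched as a simple post-processing pass. This is only an illustration under stated assumptions: the volume layout `(X, Y, Z)` with Y as the vertical axis, the function name `decorate`, and all block IDs are hypothetical and are not taken from the model's actual pipeline.

```python
import numpy as np

def decorate(vol, sea_level, air=0, dirt=1, water=5, grass=6):
    """Fill air below sea level with water and turn exposed dirt into grass.

    vol is an integer array of block IDs with shape (X, Y, Z); Y is vertical.
    All block IDs here are placeholder values chosen for this sketch.
    """
    out = vol.copy()
    # Basic slicing returns a view, so this assignment edits `out` in place.
    below = out[:, :sea_level, :]
    below[below == air] = water
    # A block counts as exposed when the block above it is air
    # (the top layer of the volume is treated as exposed).
    above_is_air = np.ones(out.shape, dtype=bool)
    above_is_air[:, :-1, :] = out[:, 1:, :] == air
    out[(out == dirt) & above_is_air] = grass
    return out
```

Dirt that ends up under water keeps its type in this sketch, because the block above it is no longer air once the water has been placed.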
Tile Pattern Kullback-Leibler Divergence (TPKL-Div) was used to compare the generated samples with real Minecraft data in terms of local block patterns. This metric evaluates how accurately the model replicates the distribution of patterns in real Minecraft chunks. The model was tested using patterns of size 5×5×5 and 10×10×10.
Model | Average TPKL-Div (for the "plains" biome) |
---|---|
Our model | 3.70 |
World-GAN | 23.05 |
TOAD-GAN | 22.79 |
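To make the metric concrete, the following is a minimal sketch of a tile-pattern KL divergence over 3D block volumes. It assumes both volumes are NumPy arrays of block IDs with the same dtype, uses a sliding window with stride 1, and applies simple additive smoothing with a hypothetical `eps`; the exact windowing, smoothing, and weighting used in the evaluation above may differ.

```python
from collections import Counter
import numpy as np

def pattern_counts(vol, k):
    """Count every k*k*k block pattern seen by a stride-1 sliding window."""
    counts = Counter()
    X, Y, Z = vol.shape
    for x in range(X - k + 1):
        for y in range(Y - k + 1):
            for z in range(Z - k + 1):
                # The raw bytes of the window serve as a hashable pattern key.
                counts[vol[x:x + k, y:y + k, z:z + k].tobytes()] += 1
    return counts

def tpkl_div(real, generated, k, eps=1e-6):
    """KL(P_real || P_generated) over the union of observed patterns."""
    p = pattern_counts(real, k)
    q = pattern_counts(generated, k)
    patterns = set(p) | set(q)
    # Additive smoothing keeps both distributions strictly positive.
    p_total = sum(p.values()) + eps * len(patterns)
    q_total = sum(q.values()) + eps * len(patterns)
    kl = 0.0
    for pat in patterns:
        pp = (p[pat] + eps) / p_total
        qq = (q[pat] + eps) / q_total
        kl += pp * np.log(pp / qq)
    return float(kl)
```

A call such as `tpkl_div(real_chunk, generated_chunk, k=5)` would correspond to the 5×5×5 pattern size; averaging over several chunk pairs is left to the caller.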
The Levenshtein distance was used to measure the variability and differences between the generated chunks by comparing them across multiple samples. The average Levenshtein distance between each generated sample and all other samples was calculated.
Model | Average Levenshtein distance (for the "plains" biome) |
---|---|
Our model | 137.66 |
World-GAN | 5314.43 |
TOAD-GAN | 3895.96 |
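The averaging described above can be sketched as follows. The sketch assumes each chunk is flattened into a one-dimensional sequence of block IDs before comparison; the source does not state how chunks were serialized, so `mean_pairwise_distance` and the flattening order are illustrative assumptions. The dynamic-programming edit distance is quadratic in sequence length, so it is slow for full-size chunks.

```python
import numpy as np

def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def mean_pairwise_distance(samples):
    """Average, over samples, of the mean distance to all other samples.

    Requires at least two samples.
    """
    seqs = [np.asarray(s).ravel().tolist() for s in samples]
    per_sample = []
    for i, s in enumerate(seqs):
        dists = [levenshtein(s, t) for j, t in enumerate(seqs) if j != i]
        per_sample.append(sum(dists) / len(dists))
    return sum(per_sample) / len(per_sample)
```

A value near zero means the generated samples are almost identical to one another, which is how a low score in the table above should be read.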
Block histograms represent the number of each block type generated by the model compared to the real Minecraft data.
Fig. 5. The block histogram in real Minecraft chunks across 16 chunks.
Fig. 6. The block histogram of chunks generated by the developed model across 16 chunks, with the error shown relative to the histogram in Fig. 5.
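A block histogram and its error against the reference data can be computed along the following lines. This is a sketch that assumes chunks are arrays of non-negative integer block IDs in the range `0..num_block_types-1`; the function names are hypothetical, and the error is taken here as a plain count difference, which may not match how the error in Fig. 6 was defined.

```python
import numpy as np

def block_histogram(chunks, num_block_types):
    """Count how often each block ID occurs across a list of chunks."""
    ids = np.concatenate([np.asarray(c).ravel() for c in chunks])
    return np.bincount(ids.astype(np.int64), minlength=num_block_types)

def histogram_error(real_chunks, generated_chunks, num_block_types):
    """Per-block count difference: generated minus real."""
    real = block_histogram(real_chunks, num_block_types)
    gen = block_histogram(generated_chunks, num_block_types)
    return gen - real
```

A block type whose generated count is zero while its real count is positive, as reported for Cave air, shows up as a negative entry equal to the full real count.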
The Mean Gradient Magnitude Difference (MGD) metric was used to assess how well the model replicates terrain shapes in generated chunks compared to real Minecraft data. MGD measures the difference in slopes (gradients) of the heightmaps along the X and Z axes, allowing for a more accurate evaluation of the terrain shape similarity.
Mean Gradient Magnitude Difference (MGD) | Value |
---|---|
Overall Gradient Difference | 0.1978 |
Variance | 0.0287 |
Average similarity | 99.94% |
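One plausible way to compute a gradient-based heightmap comparison is sketched below. It assumes a volume layout of `(X, Y, Z)` with Y vertical and a hypothetical air block ID of 0, derives the heightmap as the height of the topmost non-air block in each column, and compares gradient magnitudes with `np.gradient`. The exact MGD formula behind the table above, including how the similarity percentage is derived, is not given in the text, so this should be read as an approximation of the idea rather than the evaluated metric.

```python
import numpy as np

def heightmap(vol, air_id=0):
    """Height of the topmost non-air block per (x, z) column; 0 if empty."""
    solid = vol != air_id
    y_size = vol.shape[1]
    # argmax on the vertically flipped mask finds the first solid block from the top.
    top = y_size - np.argmax(solid[:, ::-1, :], axis=1)
    return np.where(solid.any(axis=1), top, 0).astype(float)

def mean_gradient_difference(h_real, h_gen):
    """Mean and variance of the absolute difference in slope magnitude."""
    gx_r, gz_r = np.gradient(h_real)  # slopes along X and Z
    gx_g, gz_g = np.gradient(h_gen)
    diff = np.abs(np.hypot(gx_r, gz_r) - np.hypot(gx_g, gz_g))
    return float(diff.mean()), float(diff.var())
```

Both heightmaps must have the same shape and at least two cells along each axis for `np.gradient` to work.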
The block distribution generated by the developed model largely aligns with the real data, accurately reproducing Air, Dirt, Stone, and Bedrock blocks. Sand is the least frequent block type in the real data, yet the model still generated it, albeit in smaller quantities. The model did not, however, generate any Cave air blocks, which points to a problem with cave generation. Because the rarest block type was still produced, low block frequency alone does not explain why Cave air is missing.
The absence of caves indicates a problem with the current approach, which may stem from insufficient cave data or an incorrect method of modeling this feature. Nonetheless, the model performs excellently in generating surface terrain and the overall world structure.
The results show that the developed model excels in replicating surface structures of the Minecraft world, especially the "plains" biome, which is reflected in the low TPKL-Div results. However, the model struggles with generating diverse structures, as evidenced by the low Levenshtein distance and the lack of caves. Future research should focus on improving cave generation and increasing the variability of generated chunks by adjusting the discriminator's strength or introducing more noise.