I have a mixed relationship with Video-games: on one hand, they are a welcome stress vent and have inspired me to learn and become a programmer; on the other one, they are a black hole sinking huge amounts of my spare time and energies.
JYDGE by developer 10tons is my latest addiction: it could be superficially categorized as a twin-stick shooter, but it’s a rather new blend of action and stealth with puzzle elements.
One thing buggers me, though: in order to reach a specific secret level, you have to beat a challenge that requires you to score more than 60k points in a tiles-matching gem game called “Gembine”, and it’s rather hard.
You can see a short game here:
it’s from another 10ton’s game – Crimsonland, they embedded it in there too – but the differences are only aesthetic.
The hallmark of the best projects
I love 10tons’ games, yet I don’t share their love for gem-matching.
If I wanted to play gem matching games, I would have probably downloaded a puzzle game in the first place, which is why spending my time mastering Gembine is not an option.
What I would love to do, is to create a program to solve the game for me!
Due to its simplicity, Gembine is an ideal problem for learning a number of topics but, more than anything, it has the most desirable trait of all for a learning project: it’s a fun, silly thing, and when learning is involved nothing beats a task that’s nothing about time pressure and destination, and all about the journey and the fun of doing it.
If you have to learn something, fun is the best way, and if you are not enjoying yourself, either be sure to have a very strong motivation or don’t even bother trying.
What I want
I want a program that interfaces with Gembine and beats it in a reasonable time. Even some hours or a night of automated play are OK.
I watch a movie or go to sleep, and the next day the secret level is there! I need to top 60k points for that.
Just for comparison, my highest score has been about 53k points: I made them on the first play-through, at 2 am – completely groggy – and I never managed to get even near to that ever again.
My average game is between 15k and 25k. A long way to go to 60k.
The game rules
The game rules are simple: you start with a 4×4 board with some gems, and your objective is to move the gems on the board, as a whole, in a specific direction, to overlap gems of the same type. You can check the gameplay video embedded above to get the idea.
When two gems of the same type overlap the gem you moved disappears, the other one gets promoted to the next level, and you score some points.
There is a catch, however: each time you make a move, the computer will add a random gem on the side of the board you move from, which means the player should match on average a gem per move to keep up.
As soon as there are no available moves, the game ends.
Tools of the trade
As a language, I pick Clojure; it follows my gradient of maximum fun, but it’s also a natural fit: it’s JVM based, and has a REPL.
There is a nice class in AWT called Robot that allows you to emulate key presses: let’s take it for a spin!
$ lein repl [text blurb] $ (import [java.awt.event KeyEvent]) ⏎ $ (doto (java.awt.Robot.) (.keyPress KeyEvent/VK_UP) (.keyRelease KeyEvent/VK_UP)) ⏎ $ (.keyRelease KeyEvent/VK_UP))
It works! The REPL has a history, so when the first “doto” is executed, the program taps the up key, and I get rewarded with the last executed line (the keyRelease one).
A second test, with an initial delay, allows me to switch application, and ensure that the key event is really sent to whichever window has the current focus.
Not bad for 5′ of work, isn’t it?
Monkey Business
Gembine is not picky about trying an illegal move, so I have enough ammo to create a blind random player.
I create a new Leiningen project for that, cram together a function that taps a random arrow key with some delay and, voilà! The blind random player scores 11K after about 150 moves.
The score is almost in the range of my average game, which means I’m about as good at Gembine as a random moves generator.
This hurts.
Let’s put pride aside for a second and analyze the result: what I have is the algorithmic equivalent of putting a monkey in front of a typewriter to write a Shakespeare’s drama, and just as inefficient (if less smelly).
Theoretically, this could solve the game in a finite – albeit very very long – time.
What I really need is a smart program that can play strategically, and for a smart program to be useful, it has to see what’s going on, to see the board.
I don’t need to google for the tool to use for this: OpenCV for the win!
OpenCV
I will not bother you with the fine-grained details of setting up OpenCV on my system.
Let’s pretend it’s a cooking TV show: the host mixes flour, milk, baking powder, and sugar, puts the content in an oven, and a second later, voilà: a beautiful cake!
Too bad that the cake was cooked 4 times, that some people got poisoned and 2 of them died: what happens in the backstage remains in the backstage.
Just a couple of random remarks, since compiling and installing OpenCV properly and turning into a couple of JARs has been a huge pain in the ass:
- don’t be smart: follow the OpenCV instructions for java compilation closely, there is even an official Clojure/OpenCV tutorial. Do read it.
- if you are on Linux, make sure to run execstack -c on the java .so file if java complains even slightly.
- don’t mind what you read around: just because you are able to
(clojure.lang.RT/loadLibrary org.opencv.core.Core/NATIVE_LIBRARY_NAME))
doesn’t mean shit.
- automatize as much as you can because, unless you are a luckier bastard than I am, you are going to work on this stuff a lot.
Long story short: I have two custom made JARs in my local MAVEN repository – one for the java bindings and one for the native library – and by adding to my project.clj
:dependencies [[org.clojure/clojure "1.8.0"] [commons-io/commons-io "2.6"] [opencv/opencv "3.4.1"] [opencv/opencv-native "3.4.1"]]
and after loading the native OpenCV library in my REPL
> (clojure.lang.RT/loadLibrary org.opencv.core.Core/NATIVE_LIBRARY_NAME))
I can – for instance – load an image with something like:
> (org.opencv.imgcodecs.Imgcodecs/imread "resources/mocks/gembine-1.png" Imgcodecs/IMREAD_COLOR)
This last test is important: you’ll read around that when you are able to load the native library you are all set, but that’s BS.
In some occasions, my installation was good enough to load the libraries and create a simple instance (like Point) but failed to load an image.
Finding images in other images: template matching
I need to understand which gem – if any – is present at each board square.
This is a representative Gembine’s screenshot:
There are two different gem colors (red and green) and five different shapes (small sphere, normal sphere, triangle, square, and pentagon).
They can appear on each square of a fixed 4×4 board, or where the “Game Over” text is.
What I need is template matching: searching an image into another.
If you want to delve deeper you can start from OpenCV’s official documentation and possibly continue with this chapter on convolution from Steven W. Smith’s excellent “Engineer’s Guide to Digital Signal Processing”.
I need to experiment a bit, and while often REPL programming is enough, it’s a good time to put a testing infrastructure in place.
Clojure has an excellent support for TDD and UT, I decide for a data representation for the board, like:
[[:rB :rt :rB :rb] [:gb :rB :rs :rB] [:rb :gB :rp :gt] [:rB :rb :rs :rb]]
(it represents the screenshot above), then with:
(ns clojure-gembine.board-test (:require [clojure.test :refer :all] [clojure-gembine.board :refer :all] [clojure-gembine.test-utils :as utils])) (deftest test-read-board (testing "result on mock 1" (is (= [[:rt :rB :rt :rs] [:rB :rt :gb :rB] [:rb :gB :rb :rs] [:rs :rt :gB :rb]] (read-board (utils/load-mock-image 1))))) (testing "result on mock 2" …) …)
I can tweak the search algorithm to my heart’s content.
When I finish, my program has a pair of eyes but still needs a brain.
Minimax
First I have to simulate the behavior of the board.
I have left my iPad Pro with its pencil in the Ferrari, so I have to make do with a piece of napkin.
I can use Gembine’s symmetry to reduce the game to a single move – moving the tiles from right to left – while using rotations to simulate the remaining three.
Armed with this knowledge and a solid TDD approach, it’s just a matter of shedding a tear for the poor sods who tackle this with plain Java, and I’m ready for the actual algorithm.
I settle for a simple Minmax implementation – or more precisely a Negamax – that looks ahead one move, and see how it fares.
Minimax needs a score function.
I want one that embeds some knowledge about the game and its strategy without biasing Minmax into playing a self-defeating game: in this way, the single depth minimax should perform a little better than could be reasonably expected.
At least, this is the official reason.
A less noble reason is that I don’t have an exact representation of the score (and I have no intention to put elbow oil in it), so this crackpot theory serves me well.
My score has two components: the first correlates to the score proper, and it’s a weighted sum of the empty spaces after a move; the second count the number of adjacent gems of the same type.
First Results
Minimax level 1 implementation beats my 54k score at the second or third attempt and reaches the secret area with 82k points in the first handful of games.
You can still follow the moves logic, roughly speaking, and I confess that – while I’m watching the game – I feel proud.
Yet I’m not satisfied: I want to know what happens after two green pentagon gems are combined!
Pen, paper and a bit of logic suggest that there should be another color, and I want to see it.
Thanos and his glove can eat my shorts
Level 2 minimax seems to work better than level 1 (it should, in absolute terms), but it’s hard to tell.
I create a Gembine engine in Clojure and use it to score my solvers.
My ersatz Gembine (I looked it up on Wikipedia weeks ago and I couldn’t wait to use it: it means “inferior copy or replacement”) doesn’t have a concept of “score”, but I can do statistics counting the moves before game-over for each algorithm.
Minimax level 2 clearly outperforms level 1, and my spiked score function seems statistically better than a more conventional one.
My games end too often in the secret level now, so I teach my code to detect it and restart Gembine, and I make it run unattended for a whole night.
And this is the result:
Putting the awesome score aside, with an unforeseen stroke of luck I have been able to take a screenshot of a blue gem!
In one of the games, my program got stuck just before a game over: it couldn’t recognize the blue gem correctly – I never taught it to do that – so it tried to merge it with a green one
The game was almost lost, but I managed to scavenge a score of 120.100. Quite impressive, if I’m allowed to say it myself.
As a last experiment, I generalize the minimax algorithm to take the depth search as a parameter.
Going as deep as four moves proved to be impractical for performance reasons: it took a whole night to play a game and a half and, while I could probably optimize the code a lot to take advantage of all my CPU cores, it would still be slow.
Regardless, Minimax level 3 alone delivered me some satisfaction:
You know what? I think I can settle with that.
The code
You can find the source on GitHub
It has some instructions of sorts but it’s a prototype, so keep your expectations very very low.
Running the program from the command line you will be bound to the level 1 minimax, and only the truly worthy will be able to master the code’s secrets and tap from all minimax depths…
…or the ones that will copy this in a Leiningen REPL:
> (utils/sleep 10000) (execute-moves move-with-logic (move-logic/create-deep-minimax-solver 1024) true)
That’s a minimax of depth 1024 for you, baby!
See you in 2300AD, and you’ll tell me how your game went.