Tic-tac-toe meets Lean 4

28 Aug, 2025

This week I reached a milestone in my most useless side project so far. I finished writing a tic-tac-toe game in Lean 4, along with proofs to guarantee that the game behaves correctly! It “only” took me 20 hours, 1000 lines of code and endless suffering… Totally worth it, as you might expect.

Chances are you haven’t heard about Lean before, so I’ll share more details below. But first, let’s have an overview of this article’s contents, since it ended up quite long:

Formalizing tic-tac-toe
The journey: how did we end up here?
A taste of Lean

1. Formalizing tic-tac-toe

What does that even mean, tic-tac-toe in Lean? It means, dear reader, that you can finally play tic-tac-toe with the assurance that the game has no bugs whatsoever¹. Haven’t you ever wished that, given the monstrous complexity of tic-tac-toe, you’d have access to a bulletproof implementation? Then wish no more! That bright future has arrived.

But let’s take one step back. Tic-tac-toe is a classic children’s game, which you might have played on a paper sheet back in your school years. If the name “tic-tac-toe” is unfamiliar to you, I bet you can quickly recognize the game at a glance on its Wikipedia entry (it has pictures). Fun fact: the name for tic-tac-toe in other languages is somewhat random, such as boter, kaas en eieren in Dutch (literally butter, cheese and eggs) and el gato in Chilean Spanish (literally the cat).

Enough about names now, we need to move on!

Imagine you are playing a software version of the game against another person. Here are some “formal properties” you’d expect the game to uphold:

Players alternate after each move.
A cell gets marked after each move (with the corresponding player’s mark).
There is always a correct distribution of marks (e.g. you can’t ever arrive at a board with two X marks and zero O marks on it). This follows from the two properties above.
If the game is in progress, that means no player has managed to form a line of three similar marks yet.
If the game has a winner, only that player has formed a line of three similar marks.
If the game ended in a draw, the board is full and there are no winning lines on it.

Implementing this game in a mainstream programming language is straightforward. You create data types for the board and the player’s marks, you write functions to manipulate the board as the game makes progress, and you might even write unit tests to convince yourself that the properties listed above are upheld.

But… what if you wanted even more certainty about your implementation’s correctness? Tests can only get you so far: they are able to prove the presence of bugs, thereby raising your confidence when no bugs are found, but they don’t guarantee the absence of bugs. For that, they’d have to check all possible input and output combinations².

That’s where Lean comes in! It allows you to mathematically prove properties about your functions, instead of merely testing that they hold in some specific cases. If the code compiles, that means your proofs are correct! I admit it’s overkill for tic-tac-toe, but it’s lots of (at times masochistic) fun.

Thus, by the power of Lean, we can finally play the game with the correctness guarantees we deserve.

2. The journey

How did I end up spending so much time on a useless tic-tac-toe implementation? I’m afraid my curiosity carried me farther than I expected. Let me explain.

The case of the curious university student

This adventure started about 10 years ago, when I was pursuing my computer science degree. At Universiteit Utrecht, we spent quite some time playing with functional programming languages and related technologies. That’s how I came in contact with precursors of Lean 4, such as Agda, Idris and Coq (now renamed to Rocq). My mind was blown when I saw programming languages with type systems so strong that you could express complex mathematical proofs in them. I wanted to know more!

One of my professors, Wouter Swierstra, was deep into these topics. I kept asking him questions and he was delighted to mentor me. Under his guidance I started playing more seriously with Agda, which almost became the focus of my master’s thesis later on³.

My dream goal was to write a tic-tac-toe implementation with an unbeatable computer opponent (easy peasy) that was also mathematically proven to be unbeatable (much more complex, unless you brute-force the proof by enumerating all possible boards). Life, however, got in the way as usual. I graduated, started working as a software engineer and forgot about it… until now.

Lean enters the scene

It seems to me that Lean has been getting more attention lately, with high-profile projects such as Terence Tao’s Lean companion to Analysis I or the quest for a formalized proof of Fermat’s last theorem. Since Lean started to appear regularly on my radar, that got me thinking… Could I use it to resurrect my unbeatable tic-tac-toe dream?

I wasn’t too optimistic at the beginning, though. In the past, I’ve had the questionable pleasure of using niche software with lots of UX problems: installation hurdles, lack of documentation, cryptic error messages, etc. Did I really want to do that to myself again? But Lean turned out to be different! To my surprise, Lean is more polished than anything of the sort I’ve tried before: installation was smooth, editor support is top-notch, the CLI works well, error messages are helpful, etc. Folks have clearly been putting effort into the language’s UX.

Another pleasant surprise was that searching for information is now much easier than back in my university years, thanks to LLMs. As you can imagine, working with tools like Lean exposes you to a ton of new concepts. Having an LLM at hand is quite a boon when you need some context on a topic you don’t even know how to translate into search terms.

The first milestone and next steps

So it comes that I have finally formalized two-player tic-tac-toe. I haven’t yet reached the end goal of creating an unbeatable computer opponent with a Lean proof of unbeatable-ness, but the current progress gives me confidence that it’s a realistic objective. Will I find enough time and patience to actually do it? We’ll see.

Short term, I think it makes sense to revisit my proofs and refactor them. My lack of knowledge has probably resulted in lots of repetition and unnecessary complexity. It would be interesting to see how short we can get the proofs, since a thousand lines of code feels excessive for something as simple as tic-tac-toe.

Once the project is more polished, it might even make sense to grow it into a learning resource for people who want a nontrivial Lean challenge that doesn’t focus on mathematics. The code is already open source and permissively licensed, so feel free to contribute!

3. A taste of Lean

This article wouldn’t be complete without some snippets to show what the resulting code actually looks like, so here we go!

Main data types

Let’s start with the definition of tic-tac-toe’s data types, with accompanying comments to clarify things along the way.

-- A player is either `X` or `O`
inductive Player where
  | X : Player
  | O : Player

-- A cell is either `Empty` or `Occupied`
--
-- If `Occupied`, it contains a `Player`
inductive Cell where
  | Empty : Cell
  | Occupied : Player -> Cell

-- A board is a sequence of exactly 9 cells
--
-- Note: the length of the `Vector` is part of the type!
def Board := Vector Cell 9

-- A position is an index in the range [0, 9)
--
-- That means it is guaranteed to be within bounds
-- when used to obtain cells from the board through
-- indexing
def Position := Fin 9

-- The game status is either `InProgress`, `Won` or `Draw`
--
-- If `Won`, it contains the winning `Player`
inductive GameStatus where
  | InProgress : GameStatus
  | Won : Player -> GameStatus
  | Draw : GameStatus

-- The game state has a board and a status
structure TicTacToeState where
  board : Board
  status : GameStatus

The wellFormed property

Most of the code in the repository is related to proving that a game is well-formed. The well-formed predicate is a collection of properties regarding the state of the game, defined as a good old function:

def wellFormedGame (state : TicTacToeState) : Bool :=
  match state.status with
  | GameStatus.InProgress => -- omitted for brevity
  | GameStatus.Won winner => -- omitted for brevity
  | GameStatus.Draw =>
      isBoardFull state.board
      && !checkWin state.board Player.X
      && !checkWin state.board Player.O
      && countMarkedCells state.board Player.O == 4
      && countMarkedCells state.board Player.X == 5

Let’s translate this code to plain English. Leaving aside the InProgress and Won case, which are omitted for brevity, a well-formed game at a draw state is guaranteed to: have a fully marked board, have no winning players and have the right distribution of player marks (player X has 5, because they always have the starting move, whereas player O has 4).

Note that this function, in and of itself, guarantees nothing. It really is just a function. Things only start getting interesting once you have a state gameState and a proof wellFormedGame gameState = true. Then the wellFormed predicate is guaranteed to hold for that specific state.

Proofs

Which brings us to proofs! We want to prove our wellFormed function returns true when applied to the boards in our game. That involves writing two proofs:

The initial board is always well-formed (this proof is called initialGameStateIsWellFormed in the code).
If the board is well-formed, it will also be well-formed after a player move (this proof is called makeMovePreservesWellFormedness in the code).

Both proofs are unfortunately too complicated to bring up here, but we can at least look at the definition of the first one:

theorem initialGameStateIsWellFormed
	(state : TicTacToeState)
	(h : state = initialGameState)
	: wellFormedGame state = true :=
	-- body omitted for brevity

If you squint a little, you’ll see that this is actually a function! It takes two parameters: the state (state) and a proof that the state is equivalent to initialGameState. It returns an object of type wellFormedGame state = true.

Now comes the fascinating part: Lean’s type system will only let you construct an object of the wellFormedGame state = true type if that logically follows from your premises. In other words, the existence of an object of type wellFormedGame state = true is also a mathematical proof of the logical fact that wellFormedGame state = true holds. Isn’t that mind-blowing⁴?

I cannot but finish with my favorite Fred Brooks quote:

The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by the exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures….

Yet the program construct, unlike the poet’s words, is real in the sense that it moves and works, producing visible outputs separate from the construct itself. It prints results, draws pictures, produces sounds, moves arms. The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.

Unless my tic-tac-toe specification contains mistakes, or Lean itself generates the wrong machine code, or a rogue cosmic ray decides to flip some bits when you least expect it… You got ECC memory, right? ↩︎
This kind of comprehensive testing is actually possible in the case of tic-tac-toe. The number of valid games is at most 362880, which is well within range of a computer to “brute force”. But that would be no fun! ↩︎
I chose a less “risky” topic in the end to ensure I’d finish my degree on time. I was studying on a grant and would face serious financial trouble in case of a delay in my graduation. ↩︎
I think this is exactly what the Curry-Howard correspondence is about, but don’t quote me on that. ↩︎