0x44:$

Sudoku Part 1

Yesterday I was sitting in my office working through Ryan Levick's Gameboy emulation in Rust book. But I was tired and getting frustrated so I took a break from the new-and-weird-to-me world of hardware abstracted as software and thought "I should finally make a Sudoku solver". But then I thought more about that idea and it occurred to me that everyone has done that at this point in the history of computer science, and my solver would probably suck (spoilers: it does). But I got to work on it anyway. My first step was to just get a Python program up and running that takes a 2d array as input and spits out a Sudoku puzzle. I then discovered that a 81 character string of numbers is a ~the most~ common format for Sudoku so I thought it was a good idea to make my pretty-print method a bit more flexible by sticking variables in the place of the numbers. But then I noticed something.

board = [
[a,b,c, d,e,f, g,h,i],
[d,e,f, g,h,i, a,b,c],
[g,h,i, a,b,c, d,e,f],

[b,c,d, e,f,g, h,i,a],
[e,f,g, h,i,a, b,c,d],
[h,i,a, b,c,d, e,f,g],

[c,d,e, f,g,h, i,a,b],
[f,g,h, i,a,b, c,d,e],
[i,a,b, c,d,e, f,g,h]]

Do you see the pattern?

The first row is abcdefghi. And the second row is defghiabc. Rows are merely shifting over three, the next is ghiabcdef, see?! OK, but then we skip down and if we shifted three we'd be back at the 0th row so that won't do. Instead we shift 1 at the start of every 3rd row. So now it's bcdefghia. And then it's back to shifting three for two rows. Then a shift one. And then back to shift three for the last two.

I used this list to throw up a few quick and dirty Sudoku puzzles on my console.

+-----------------------+ +-----------------------+ +-----------------------+
| 6 5 7 | 2 4 9 | 1 8 3 | | 8 2 5 | 3 1 4 | 6 9 7 | | 6 2 8 | 4 3 7 | 9 1 5 |
| 2 4 9 | 1 8 3 | 6 5 7 | | 3 1 4 | 6 9 7 | 8 2 5 | | 4 3 7 | 9 1 5 | 6 2 8 |
| 1 8 3 | 6 5 7 | 2 4 9 | | 6 9 7 | 8 2 5 | 3 1 4 | | 9 1 5 | 6 2 8 | 4 3 7 |
|-------+-------+-------| |-------+-------+-------| |-------+-------+-------|
| 5 7 2 | 4 9 1 | 8 3 6 | | 2 5 3 | 1 4 6 | 9 7 8 | | 2 8 4 | 3 7 9 | 1 5 6 |
| 4 9 1 | 8 3 6 | 5 7 2 | | 1 4 6 | 9 7 8 | 2 5 3 | | 3 7 9 | 1 5 6 | 2 8 4 |
| 8 3 6 | 5 7 2 | 4 9 1 | | 9 7 8 | 2 5 3 | 1 4 6 | | 1 5 6 | 2 8 4 | 3 7 9 |
|-------+-------+-------| |-------+-------+-------| |-------+-------+-------|
| 7 2 4 | 9 1 8 | 3 6 5 | | 5 3 1 | 4 6 9 | 7 8 2 | | 8 4 3 | 7 9 1 | 5 6 2 |
| 9 1 8 | 3 6 5 | 7 2 4 | | 4 6 9 | 7 8 2 | 5 3 1 | | 7 9 1 | 5 6 2 | 8 4 3 |
| 3 6 5 | 7 2 4 | 9 1 8 | | 7 8 2 | 5 3 1 | 4 6 9 | | 5 6 2 | 8 4 3 | 7 9 1 |
+-----------------------+ +-----------------------+ +-----------------------+
+-----------------------+ +-----------------------+ +-----------------------+
| 2 8 3 | 9 1 7 | 4 6 5 | | 1 3 5 | 7 2 4 | 9 8 6 | | 5 9 4 | 6 7 2 | 1 3 8 |
| 9 1 7 | 4 6 5 | 2 8 3 | | 7 2 4 | 9 8 6 | 1 3 5 | | 6 7 2 | 1 3 8 | 5 9 4 |
| 4 6 5 | 2 8 3 | 9 1 7 | | 9 8 6 | 1 3 5 | 7 2 4 | | 1 3 8 | 5 9 4 | 6 7 2 |
|-------+-------+-------| |-------+-------+-------| |-------+-------+-------|
| 8 3 9 | 1 7 4 | 6 5 2 | | 3 5 7 | 2 4 9 | 8 6 1 | | 9 4 6 | 7 2 1 | 3 8 5 |
| 1 7 4 | 6 5 2 | 8 3 9 | | 2 4 9 | 8 6 1 | 3 5 7 | | 7 2 1 | 3 8 5 | 9 4 6 |
| 6 5 2 | 8 3 9 | 1 7 4 | | 8 6 1 | 3 5 7 | 2 4 9 | | 3 8 5 | 9 4 6 | 7 2 1 |
|-------+-------+-------| |-------+-------+-------| |-------+-------+-------|
| 3 9 1 | 7 4 6 | 5 2 8 | | 5 7 2 | 4 9 8 | 6 1 3 | | 4 6 7 | 2 1 3 | 8 5 9 |
| 7 4 6 | 5 2 8 | 3 9 1 | | 4 9 8 | 6 1 3 | 5 7 2 | | 2 1 3 | 8 5 9 | 4 6 7 |
| 5 2 8 | 3 9 1 | 7 4 6 | | 6 1 3 | 5 7 2 | 4 9 8 | | 8 5 9 | 4 6 7 | 2 1 3 |
+-----------------------+ +-----------------------+ +-----------------------+
+-----------------------+ +-----------------------+ +-----------------------+
| 3 2 9 | 8 7 5 | 1 4 6 | | 2 6 7 | 8 5 3 | 4 1 9 | | 9 3 6 | 1 8 4 | 7 2 5 |
| 8 7 5 | 1 4 6 | 3 2 9 | | 8 5 3 | 4 1 9 | 2 6 7 | | 1 8 4 | 7 2 5 | 9 3 6 |
| 1 4 6 | 3 2 9 | 8 7 5 | | 4 1 9 | 2 6 7 | 8 5 3 | | 7 2 5 | 9 3 6 | 1 8 4 |
|-------+-------+-------| |-------+-------+-------| |-------+-------+-------|
| 2 9 8 | 7 5 1 | 4 6 3 | | 6 7 8 | 5 3 4 | 1 9 2 | | 3 6 1 | 8 4 7 | 2 5 9 |
| 7 5 1 | 4 6 3 | 2 9 8 | | 5 3 4 | 1 9 2 | 6 7 8 | | 8 4 7 | 2 5 9 | 3 6 1 |
| 4 6 3 | 2 9 8 | 7 5 1 | | 1 9 2 | 6 7 8 | 5 3 4 | | 2 5 9 | 3 6 1 | 8 4 7 |
|-------+-------+-------| |-------+-------+-------| |-------+-------+-------|
| 9 8 7 | 5 1 4 | 6 3 2 | | 7 8 5 | 3 4 1 | 9 2 6 | | 6 1 8 | 4 7 2 | 5 9 3 |
| 5 1 4 | 6 3 2 | 9 8 7 | | 3 4 1 | 9 2 6 | 7 8 5 | | 4 7 2 | 5 9 3 | 6 1 8 |
| 6 3 2 | 9 8 7 | 5 1 4 | | 9 2 6 | 7 8 5 | 3 4 1 | | 5 9 3 | 6 1 8 | 4 7 2 |
+-----------------------+ +-----------------------+ +-----------------------+

The boards actually look like legitimate Sudoku puzzles, albeit presolved. So I started to wonder if a pattern like this exists in all Sudoku puzzles. I went ahead and put the super_awesome_sudoku_solver.py on the back burner and decided to chase this rabbit trail.

Next step, convert solved Sudoku strings into the "abc..." format I showed previously. First I took my board generator and had it spit out strings instead of the prettified boards (Sidenote: Sudoku strings have a file format, .sdm). A few LOC later I had:

362715849715849362849362715627158493158493627493627158271584936584936271936271584
514789263789263514263514789147892635892635147635147892478926351926351478351478926
195873426873426195426195873958734261734261958261958734587342619342619587619587342
368172945172945368945368172681729453729453681453681729817294536294536817536817294
738942156942156738156738942389421567421567389567389421894215673215673894673894215
834125967125967834967834125341259678259678341678341259412596783596783412783412596
189756432756432189432189756897564321564321897321897564975643218643218975218975643
249587613587613249613249587495876132876132495132495876958761324761324958324958761
318574692574692318692318574185746923746923185923185746857469231469231857231857469
578936421936421578421578936789364215364215789215789364893642157642157893157893642

I used these as my test cases to create a utility to convert .sdm files into what I began calling "metaboards":

with open('biglib.sdm', 'r') as f:
	boards = f.readlines()
boards = [x.strip() for x in boards]
metaboards = []
for board in boards:
	#num_to_alpha = {
	#	board[0]:'a',
	#	board[1]:'b',
	#	board[2]:'c',
	#	board[3]:'d',
	#	board[4]:'e',
	#	board[5]:'f',
	#	board[6]:'g',
	#	board[7]:'h',
	#	board[8]:'i'
	#}
	num_to_alpha = {
		1:'a',
		2:'b',
		3:'c',
		4:'d',
		5:'e',
		6:'f',
		7:'g',
		8:'h',
		9:'i'
	}
	metaboard = board
	for n,c in num_to_alpha.items():
		metaboard = metaboard.replace(str(n), c, -1)
	metaboards.append(metaboard)

for metaboard in metaboards:
	print(metaboard)

It output abcdefghi over and over, so I knew it probably worked.

Next I needed some test cases. But finding solved Sudoku puzzles isn't as easy as it sounds. People tend to want unsolved, not solved Sudokus. But I did find almost 50,000 unsolved puzzles in sdm format so I grabbed them and thought, "Hey, I will just write a solver! How hard can it be?!"

My current ~solver~ attempt doesn't return the right answer even when only a single digit is missing... but I also only spent five minutes writing it before deciding to start writing a blog post lambasting my attempt to write a solver.

Stay tuned for part two of our adventure! Will I write a Sudoku solver? Will I cheat and use someone else's? Will I just stumble upon a horde of pre-solved puzzles?