I set myself a challenge this year to solve the 3x3 Rubik's cube blindfolded. I've been able to solve one since I was 12, but there's something different about doing it blindfolded.
After some initial research I started learning the Old Pochmann method - a blindsolving method where you turn the cube into a series of letter sequences, memorise them, then execute the solve with a small set of algorithms. If you're interested, this guide proved invaluable.
It's been quite fun. I got comfortable with the sequencing and algorithms fairly quickly, but the memorisation part was slower to nail down. This is where I needed the most practice.
So I split the problem in two - sequencing and memorisation - and built a small tool to let me focus on the memorisation part.

The idea was simple: generate scrambles and show me the corresponding letter sequences so I could practise the recall loop without constantly resetting the cube. Definitely overkill, but it felt like something I could knock out pretty quickly.
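As a rough illustration of the scramble half, here is a minimal Python sketch of a random-move scramble generator. It is not the tool's actual code: the function name `random_scramble`, the default length of 20 and the "no two consecutive turns of the same face" rule are all assumptions made for the example, and it produces random-move scrambles rather than the random-state scrambles used in competition.

```python
import random

FACES = "UDLRFB"           # the six face turns in standard notation
SUFFIXES = ["", "'", "2"]  # clockwise, anticlockwise, half turn


def random_scramble(length=20, rng=None):
    """Return a space-separated random-move scramble (hypothetical helper)."""
    rng = rng or random.Random()
    moves = []
    last_face = None
    while len(moves) < length:
        face = rng.choice(FACES)
        if face == last_face:
            # Skip immediate repeats such as "U U'", which would cancel or merge.
            continue
        moves.append(face + rng.choice(SUFFIXES))
        last_face = face
    return " ".join(moves)


print(random_scramble())
```

Passing a seeded `random.Random` makes a scramble repeatable, which is handy for drilling the same memo more than once.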
To build it, I tried a slightly different approach. I've recently been reminded of Jeff Bezos's "working backwards" product development approach, where teams start by writing a press release for a product that doesn't exist yet, as if it has already launched successfully. This forces clarity on the problem being solved, the core requirements and the value delivered before any building begins. I thought: why not try a variation of this with an agent?
I didn't replicate this exactly but I did start by writing a (far too) detailed spec of the desired tool with GPT-5.4, then fed that into Opus 4.6 in Claude Code to see how far it would get. The idea was to have a single reference point and let the agent work from there.
Some parts worked immediately. Cube rendering, scrambling and the basic UI were all pretty straightforward thanks to existing libraries like cube.js.
The Old Pochmann logic for the sequences, however, was not.
The sequences were often incorrect and sometimes wildly off. There are just enough small rules in Old Pochmann - buffer pieces, orientation quirks and cycle breaks - that the approach the agent took was just too brittle. This took quite a bit of iteration, but eventually I got the agent there.
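For anyone curious what those rules look like in code, below is a minimal Python sketch of the tracing step for one piece type. It is not the tool's implementation, and it is heavily simplified: it ignores orientation and parity entirely, and the position letters, `trace_memo` and `apply_memo` are hypothetical names chosen for the example. The assumed input is a mapping from each position to the home position of the piece currently sitting there.

```python
def trace_memo(state, buffer):
    """Trace a letter sequence from a permutation, Old Pochmann style.

    state[pos] is the home position of the piece currently at pos.
    Orientation and parity are deliberately ignored in this sketch.
    """
    visited = {buffer}
    # Pieces that are already home never need a letter.
    visited.update(pos for pos, home in state.items() if pos == home)
    memo = []

    def follow(start):
        # Keep sending the current piece home until the cycle returns to start.
        target = state[start]
        while target != start:
            memo.append(target)
            visited.add(target)
            target = state[target]

    follow(buffer)  # the first cycle always starts at the buffer
    for start in sorted(state):
        if start in visited:
            continue
        # Cycle break: shoot to an unsolved position, trace its cycle,
        # then name that position again to close the cycle.
        memo.append(start)
        visited.add(start)
        follow(start)
        memo.append(start)
    return memo


def apply_memo(state, buffer, memo):
    """Execute a memo by swapping the buffer with each target in turn."""
    result = dict(state)
    for target in memo:
        result[buffer], result[target] = result[target], result[buffer]
    return result


# Toy example: A is the buffer, B is solved, A/C are swapped, D-E-F form a 3-cycle.
state = {"A": "C", "B": "B", "C": "A", "D": "E", "E": "F", "F": "D"}
memo = trace_memo(state, "A")
print(memo)  # ['C', 'D', 'E', 'F', 'D']
```

Running `apply_memo` over the result and checking that every piece ends up home is a cheap way to catch a wrong sequence, even in a toy model like this one.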
Over a couple of evenings I ended up with something that worked well for practice and I've been using it since.
This mini-project turned into a neat little test of two things I wanted to explore: whether a spec-first approach works well for agentic engineering and how well agents handle logic that's quite spatial. In summary:
- The spec-first approach was useful, but my spec was far too heavy. I tried to lock in too much detail upfront. A tighter, more high-level version would probably have been better at guiding the agent. Will try this again.
- Opus 4.6 was not great at interpreting spatial logic. When I solve the cube it inherently feels quite visual, even blindfolded - I imagine the cube state and the effect of each move in my head. This feels like the kind of thing a language model might struggle to interpret - and Old Pochmann turns out to be exactly that sort of logic, especially as there are likely few well-defined, structured rules for it in its training data. As a result it didn't work quite so well, but it does mean that it's quite a nice test case for future models!
If you want to try the tool, you can find it here.