A reinforcement learning agent reads the board, picks safe moves, and solves the puzzle step by step.
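The read, pick, act loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual agent: the `ToyBoard` puzzle, its `safe_moves` mask, the reward values, and the tabular Q-learning update in `run_episode` are all hypothetical stand-ins chosen for brevity.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyBoard:
    """Hypothetical stand-in puzzle: reveal every safe cell, never a hazard."""
    size: int = 6
    hazards: frozenset = frozenset({2})
    revealed: set = field(default_factory=set)

    def observe(self):
        # The board as the agent reads it: 1 = revealed, 0 = hidden.
        return tuple(1 if i in self.revealed else 0 for i in range(self.size))

    def safe_moves(self):
        # Assumption: the environment exposes a mask of hidden, non-hazard cells.
        return [i for i in range(self.size)
                if i not in self.revealed and i not in self.hazards]

    def step(self, action):
        # Returns (reward, done). Revealing a hazard ends the episode with a penalty.
        if action in self.hazards:
            return -1.0, True
        self.revealed.add(action)
        done = not self.safe_moves()
        return (1.0 if done else 0.0), done


def run_episode(board, q, epsilon=0.1, alpha=0.5, gamma=0.9, rng=None):
    """Play one episode, updating the Q-table `q` in place; returns steps taken."""
    rng = rng or random.Random(0)
    steps = 0
    done = not board.safe_moves()
    while not done:
        state = board.observe()            # read the board
        moves = board.safe_moves()         # restrict the choice to safe moves
        if rng.random() < epsilon:
            action = rng.choice(moves)     # explore
        else:
            action = max(moves, key=lambda a: q.get((state, a), 0.0))  # exploit
        reward, done = board.step(action)  # advance the puzzle by one step
        next_state = board.observe()
        best_next = max(
            (q.get((next_state, a), 0.0) for a in board.safe_moves()),
            default=0.0,
        )
        old = q.get((state, action), 0.0)
        # Standard one-step Q-learning update.
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        steps += 1
    return steps


if __name__ == "__main__":
    q_table = {}
    print(run_episode(ToyBoard(), q_table))
```

On the default toy board an episode takes five steps, one per safe cell, because the mask keeps the hazard out of the candidate moves. A real agent would more likely have to infer which moves are safe from the observation than receive a mask directly.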