Wednesday, April 8, 2026 - 12:00pm
Event Calendar Category
Other LIDS Events
Speaker Name
Yorgos Pantis
Affiliation
National and Kapodistrian University of Athens / Archimedes, Athena Research Center
“Teaching Transformers to Solve Combinatorial Problems through Efficient Trial & Error”
Despite their proficiency in various language tasks, Large Language Models (LLMs) struggle with combinatorial problems like Satisfiability, the Traveling Salesman Problem, or even basic arithmetic. We address this gap through a novel trial & error approach for solving problems in the class NP, where candidate solutions are iteratively generated and efficiently validated using verifiers. We focus on the paradigmatic task of Sudoku and achieve state-of-the-art accuracy (99%) compared to prior neuro-symbolic approaches. Unlike prior work that used custom architectures, our method employs a vanilla decoder-only Transformer (GPT-2) without external tools or function calling. Our method integrates imitation learning of simple Sudoku rules with an explicit Depth-First Search (DFS) exploration strategy involving informed guessing and backtracking. Moving beyond imitation learning, we seek to minimize the number of guesses until reaching a solution. This is achieved using depth-1 guessing: we show empirically that almost all Sudoku puzzles can be solved using the puzzle's rules with at most one guess. We provide a rigorous analysis of this setup, formalizing its connection to a contextual variant of Min-Sum Set Cover, a well-studied problem in algorithms and stochastic optimization.
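The trial & error loop described in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's Transformer-based method: a hand-written "naked single" rule stands in for the learned Sudoku rules, and the function names (candidates, propagate, verify, solve_depth1) as well as the fewest-candidates-first guessing order are assumptions made purely for illustration.

```python
from typing import Optional

Grid = list[list[int]]  # 9x9 grid, 0 marks an empty cell


def candidates(grid: Grid, r: int, c: int) -> list[int]:
    """Digits that do not clash with the row, column, or 3x3 box of (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def propagate(grid: Grid) -> bool:
    """Apply the simple rule "fill any cell with exactly one candidate" until stuck.

    Mutates grid in place. Returns False on a contradiction
    (an empty cell with no candidates left), True otherwise.
    """
    progress = True
    while progress:
        progress = False
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    cands = candidates(grid, r, c)
                    if not cands:
                        return False
                    if len(cands) == 1:
                        grid[r][c] = cands[0]
                        progress = True
    return True


def verify(grid: Grid) -> bool:
    """Polynomial-time verifier: every row, column, and box holds exactly 1..9."""
    full = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [{grid[r][c] for r in range(9)} for c in range(9)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(s == full for s in rows + cols + boxes)


def solve_depth1(puzzle: Grid) -> tuple[Optional[Grid], int]:
    """Solve with rules plus at most one guess; return (solution, guesses tried).

    The solution is None when the puzzle is contradictory or when a single
    guess followed by rule propagation is not enough.
    """
    grid = [row[:] for row in puzzle]  # leave the caller's puzzle untouched
    if not propagate(grid):
        return None, 0
    if verify(grid):
        return grid, 0

    # Assumed heuristic: guess in the most constrained cells first.
    empties = sorted(
        ((r, c) for r in range(9) for c in range(9) if grid[r][c] == 0),
        key=lambda rc: len(candidates(grid, *rc)),
    )
    guesses = 0
    for r, c in empties:
        for d in candidates(grid, r, c):
            guesses += 1
            trial = [row[:] for row in grid]
            trial[r][c] = d  # the single guess
            if propagate(trial) and verify(trial):
                return trial, guesses
            # Otherwise discard the trial grid (backtrack) and try the next guess.
    return None, guesses
```

Calling solve_depth1(puzzle) on a 9x9 list of lists returns the solved grid together with the number of guesses spent, which is the quantity the abstract seeks to minimize, or None when one guess is not enough.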
Yorgos Pantis is a second-year Ph.D. student in Machine Learning at the National and Kapodistrian University of Athens and Athena Research Center. His research focuses on theory-driven approaches to modern machine learning, with an emphasis on Transformers and neural networks. He is supervised by Christos Tzamos. Prior to his doctoral studies, he held research positions at several institutions, including the Technical University of Denmark, Czech Technical University, Athena Research Center, and the Max Planck Institute for Mathematics. His work during this time centered on machine learning, optimization, and natural language processing. Yorgos has also completed graduate studies in statistics at the University of Oxford and undergraduate studies in mathematics at the University of Patras.

