Why Representation Might Matter for Symbolic PDE Reasoning

1 minute read

In the PDEBench-Lang project, the main question is surprisingly simple: does the format of a partial differential equation change how well a language model can reason about it? We often focus on model architecture and training data size, but representation format can matter just as much when the task is symbolic.

For example, the same PDE can be written in raw LaTeX, postfix notation, prefix notation, or plain natural language. From a human perspective, these are all equivalent. From a model's perspective, they may not be equivalent at all. Language models are heavily exposed to natural language and LaTeX-style mathematics during pretraining, but see far less of formal encodings such as prefix or postfix expressions.
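To make the equivalence concrete, here is a minimal sketch (hypothetical code, not the PDEBench-Lang implementation) that serializes one PDE, the heat equation u_t = α u_xx, from a shared expression tree into prefix and postfix strings. The node labels (`dt`, `dxx`, `alpha`) are illustrative choices, not a format the project defines:

```python
from dataclasses import dataclass

@dataclass
class Node:
    op: str              # operator or symbol, e.g. "=", "*", "dt", "u"
    args: tuple = ()     # child nodes; empty for leaves

def prefix(n: Node) -> str:
    # Operator first, then its arguments, recursively.
    return " ".join([n.op] + [prefix(a) for a in n.args])

def postfix(n: Node) -> str:
    # Arguments first, operator last, recursively.
    return " ".join([postfix(a) for a in n.args] + [n.op])

# Heat equation: u_t = alpha * u_xx
heat = Node("=", (
    Node("dt", (Node("u"),)),
    Node("*", (Node("alpha"), Node("dxx", (Node("u"),)))),
))

print(prefix(heat))   # = dt u * alpha dxx u
print(postfix(heat))  # u dt alpha u dxx * =
```

The point of the sketch is that all three renderings, the LaTeX-like infix form, the prefix string, and the postfix string, come from the same tree. A model that has mostly seen the infix form during pretraining has to learn the other mappings from far less evidence.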

That mismatch is what makes the problem interesting. A format that is computationally convenient for parsing is not automatically the best format for neural reasoning. In our benchmark, we are trying to measure whether some representations make operator prediction and reasoning chains more faithful to the true symbolic structure of the PDE.

One lesson from the first experiments is that high accuracy alone is not enough. When a synthetic dataset is too templated, a model can score extremely well by exploiting surface patterns while still avoiding the deeper structural reasoning we actually care about. That is why the reasoning-fidelity side of the benchmark is just as important as the headline classification score.

I think this project is a good reminder that representation is not a cosmetic choice. In symbolic tasks, it can shape what the model learns to notice, what shortcuts it uses, and whether its explanations are genuinely informative.