Poster Presentation: PDEBench-Lang: Does notation format shape neural reasoning about PDEs?

Date:

Presented a research poster titled “PDEBench-Lang: Does notation format shape neural reasoning about PDEs?”, investigating how the symbolic representation of PDEs affects language model reasoning.

Summary

This work studies whether the notation used to represent partial differential equations (PDEs) affects how well language models understand them and generalize. While prior work showed that LLMs can accelerate PDE solving, the choice of notation (e.g., Postfix versus LaTeX) had not been rigorously evaluated. We systematically test whether the representation format influences reasoning performance.

Approach

  • Constructed a synthetic dataset of 12,000 PDE instances across 6 families
  • Each PDE encoded in 4 symbolic dialects (see the encoding sketch after this list):
    • LaTeX
    • Prefix
    • Postfix
    • Natural Language
  • Applied data augmentations (sketched in code after this list):
    • k-scaling
    • directional shuffling
    • positional shuffling
  • Trained BART-base models (one per dialect)
  • Evaluated in:
    • In-dialect setting (train/test on same format)
    • Cross-dialect setting (train/test across formats, full 4×4 matrix)
  • Metrics (see the evaluation sketch after Key Results):
    • Family Accuracy
    • Operator F1
    • Trash Score (invalid/meaningless outputs)
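
As a concrete illustration of the four dialects, here is a minimal sketch of how a single PDE (the 1-D heat equation, u_t = k·u_xx) might be rendered in each format. The token conventions below are my assumptions, not necessarily the poster's exact encodings.

```python
# Hypothetical renderings of the 1-D heat equation u_t = k * u_xx in the
# four dialects; the exact token conventions on the poster may differ.
HEAT_EQUATION = {
    "latex":   r"\partial_t u = k \partial_{xx} u",
    "prefix":  "= dt u * k dxx u",    # operators before their operands
    "postfix": "u dt k u dxx * =",    # operands before their operators
    "natural": "the time derivative of u equals k times "
               "the second spatial derivative of u",
}

for dialect, encoding in HEAT_EQUATION.items():
    print(f"{dialect:>8}: {encoding}")
```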

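The three augmentations could look roughly like the sketch below, under my reading of the bullet names: k-scaling rescales the coefficient k, directional shuffling permutes the spatial axes, and positional shuffling reorders commutative terms. The function names and exact transforms are assumptions, not the poster's implementation.

```python
import random

def k_scale(tokens, factor):
    """Assumed k-scaling: rescale the coefficient token 'k' by a scalar."""
    return [f"{factor}*k" if t == "k" else t for t in tokens]

def directional_shuffle(tokens):
    """Assumed directional shuffling: swap the spatial axes x and y."""
    swap = {"dx": "dy", "dy": "dx", "dxx": "dyy", "dyy": "dxx"}
    return [swap.get(t, t) for t in tokens]

def positional_shuffle(tokens, rng=random):
    """Assumed positional shuffling: reorder '+'-separated terms of an
    infix-style token stream, which preserves meaning because addition
    commutes."""
    terms, term = [], []
    for t in tokens:
        if t == "+":
            terms.append(term)
            term = []
        else:
            term.append(t)
    terms.append(term)
    rng.shuffle(terms)
    out = []
    for i, trm in enumerate(terms):
        if i:
            out.append("+")
        out.extend(trm)
    return out

# Example: reorder the two additive terms of a hypothetical encoding.
print(positional_shuffle("k * u_xx + c * u_x".split()))
```
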
Key Results

  • Models achieve 100% accuracy in-dialect, but cross-dialect generalization collapses to 13–35%
  • Postfix notation generalizes best, maintaining stable performance across formats
  • Natural language captures operator structure (~80% Operator F1) even when family classification fails
  • Mismatched formats produce high trash scores (up to 77%), indicating a breakdown in reasoning rather than simple errors
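
For concreteness, here is a minimal sketch of the 4×4 cross-dialect evaluation and the three metrics. It assumes one fine-tuned BART checkpoint per dialect and a target format of a family label followed by operator tokens; the checkpoint paths, family names, operator vocabulary, and the trash heuristic are all my assumptions, not the poster's exact protocol.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

DIALECTS = ["latex", "prefix", "postfix", "natural"]
# Hypothetical label and operator vocabularies.
FAMILIES = {"heat", "wave", "burgers", "advection", "laplace", "kdv"}
OPERATORS = {"dt", "dtt", "dx", "dxx", "*", "+", "="}

def operator_f1(pred_ops, gold_ops):
    """Set-based F1 over the operators recovered from a prediction."""
    if not pred_ops or not gold_ops:
        return float(pred_ops == gold_ops)
    tp = len(pred_ops & gold_ops)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred_ops), tp / len(gold_ops)
    return 2 * precision * recall / (precision + recall)

def evaluate(train_dialect, test_set):
    """Score one dialect's model on a test set (in any dialect).
    test_set items are (source string, gold family, gold operator set)."""
    path = f"checkpoints/bart-{train_dialect}"  # hypothetical checkpoint path
    tok = AutoTokenizer.from_pretrained(path)
    model = AutoModelForSeq2SeqLM.from_pretrained(path)
    acc = f1 = trash = 0.0
    for src, family, gold_ops in test_set:
        ids = tok(src, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=64)
        pred = tok.decode(out[0], skip_special_tokens=True).split()
        pred_family = pred[0] if pred else ""
        acc += pred_family == family                   # Family Accuracy
        f1 += operator_f1({t for t in pred if t in OPERATORS}, gold_ops)  # Operator F1
        trash += pred_family not in FAMILIES           # Trash Score: unparseable output
    n = len(test_set)
    return acc / n, f1 / n, trash / n

# Full 4x4 matrix, train dialect x test dialect (test_data is hypothetical):
# results = {(a, b): evaluate(a, test_data[b]) for a in DIALECTS for b in DIALECTS}
```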

Key Insights

  • Representation format shapes how models reason, not just how well they perform
  • Models tend to learn format-specific patterns instead of underlying PDE structure
  • Postfix is more transferable (see the stack-evaluator sketch after this list) because it:
    • Eliminates parentheses and precedence tracking
    • Enforces consistent left-to-right computation
    • Produces simpler tokenization
  • Natural language provides semantic signals but weaker structural consistency
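
To make the parentheses/precedence point concrete, here is a generic stack evaluator (my illustration, not code from the poster): postfix is consumed in one strict left-to-right pass with no bracket matching or precedence rules.

```python
def eval_postfix(tokens, env):
    """Stack evaluation of a postfix expression. Each operator pops its
    operands as soon as it is read, so no parentheses or precedence
    tracking is ever needed: one strict left-to-right pass suffices."""
    ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
    stack = []
    for t in tokens:
        if t in ops:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[t](a, b))
        else:
            stack.append(env[t] if t in env else float(t))
    return stack.pop()

# In infix, "k * u_xx + c" and "(k + u_xx) * c" need precedence rules and
# parentheses to disambiguate; in postfix the same two expressions differ
# only in token order, with no extra syntax:
print(eval_postfix("k u_xx * c +".split(), {"k": 2.0, "u_xx": 3.0, "c": 1.0}))  # 7.0
print(eval_postfix("k u_xx + c *".split(), {"k": 2.0, "u_xx": 3.0, "c": 1.0}))  # 5.0
```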

Baseline

  • Zero-shot BART achieves ~16.7% accuracy across all formats, which is chance level for a 6-way family classification (1/6 ≈ 16.7%), confirming the task requires learning and is not trivially solved
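
A minimal sketch of what this baseline measures, assuming the off-the-shelf `facebook/bart-base` checkpoint is asked to emit a family label with no fine-tuning; the input is the hypothetical postfix encoding from earlier.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Zero-shot sketch: the pretrained model, never fine-tuned on PDE data,
# generates a label directly. With 6 families, random guessing gives
# 1/6 ~= 16.7% accuracy, which is what the baseline reportedly hits.
tok = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

src = "u dt k u dxx * ="   # hypothetical postfix encoding of the heat equation
ids = tok(src, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=8)
print(tok.decode(out[0], skip_special_tokens=True))
```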

Contributions

  • Introduced a novel PDE symbolic reasoning benchmark (12K samples, 4 dialects)
  • Provided the first systematic study of notation effects on neural reasoning
  • Demonstrated that Postfix representation yields the strongest cross-format generalization

Impact

This work highlights that representation design is a critical factor in machine reasoning, with implications for:

  • Scientific machine learning
  • Symbolic reasoning systems
  • LLM-based equation solving

Poster PDFs

📄 Presentation Report 1

📄 Presentation Report 2

Poster Photos

Here are photos from the poster presentation:

Poster Presentation (Group)

Poster Presentation (Individual)