Gradual Ascent

Here, I share both initial explorations and published findings of my research.

How is bidirectional information retrieved and generated in masked diffusion language models?

Understanding Bidirectional Information Retrieval in MDLMs with ROME

19 min read · July 14, 2025

2025 · masked-discrete-diffusion mech-interp rome

Characterizing arithmetic length generalization performance in large language models

An initial exploration toward a mechanistic understanding of arithmetic performance (and how that performance scales) in large language models.

17 min read · February 03, 2025

2025 · llms autoregressive length-generalization