The Ant Open Source group has launched LLaDA 2.1, a 100-billion-parameter discrete diffusion LLM built around a draft-then-edit decoding approach that lets it correct errors as it generates text. This addresses a core limitation of autoregressive models, which cannot revise previously generated tokens, so early mistakes propagate through the rest of the output. LLaDA 2.1 reaches a peak throughput of 892 tokens per second on complex coding tasks, and its open-source release positions it as a non-consensus alternative to mainstream autoregressive models. It also ships with day-one compatibility with lmsysorg's SGLang, making it production-ready immediately.
Ant Open Source unveils LLaDA 2.1, a 100B diffusion model with real-time token editing
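The draft-then-edit behavior described above is characteristic of masked-diffusion decoding: the model proposes tokens for every position in parallel, then re-masks and re-predicts the positions it is least confident about, so early mistakes can still be revised on later passes. The exact schedule LLaDA 2.1 uses is not described in the announcement; the following is a minimal illustrative sketch of the control flow only, with a dummy scorer (`toy_predict`), and assumed values for `num_steps` and the remask ratio standing in for the real model.

```python
import random

MASK = "<mask>"
VOCAB = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "+"]

def toy_predict(tokens):
    """Stand-in for the diffusion model: propose a token and a confidence
    score for every masked position (random here, just to show the loop)."""
    return {i: (random.choice(VOCAB), random.random())
            for i, t in enumerate(tokens) if t == MASK}

def draft_then_edit(length=8, num_steps=4, remask_ratio=0.5):
    # Start from a fully masked draft.
    tokens = [MASK] * length
    confidence = [0.0] * length

    for step in range(num_steps):
        # 1) Draft: fill every masked slot in parallel.
        for i, (tok, conf) in toy_predict(tokens).items():
            tokens[i], confidence[i] = tok, conf

        if step == num_steps - 1:
            break

        # 2) Edit: re-mask the least confident slots so they can be revised
        #    on the next pass, unlike an autoregressive decoder, which is
        #    committed to every token it has already emitted.
        k = max(1, int(length * remask_ratio * (1 - step / num_steps)))
        for i in sorted(range(length), key=lambda i: confidence[i])[:k]:
            tokens[i], confidence[i] = MASK, 0.0

    return tokens

print(draft_then_edit())
```

In the real model the confidence scores come from the network's own token probabilities, and generating many positions per pass (rather than one token at a time) is what makes throughput figures like the quoted 892 tokens per second plausible.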

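Because the release is advertised as SGLang-compatible, serving it should follow the standard SGLang workflow: launch an OpenAI-compatible server, then query it with any OpenAI-style client. This is a sketch of that generic pattern, not a verified recipe: the checkpoint name below is a placeholder, and any diffusion-specific launch flags the model may require are not covered here.

```python
# 1) Launch an SGLang server (shell):
#    python -m sglang.launch_server --model-path <org>/LLaDA-2.1 --port 30000
#    ("<org>/LLaDA-2.1" is a placeholder; use the actual published checkpoint.)

# 2) Query the OpenAI-compatible endpoint it exposes:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="default",  # SGLang typically accepts "default" (or the model path) for the loaded model
    messages=[{"role": "user",
               "content": "Write a Python function that reverses a string."}],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```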