Taalas has launched its new Hardcore ASIC (HC1) chip, which delivers up to 17,000 tokens per second, roughly ten times faster than the leading Cerebras wafer-scale engine. The speed advantage comes from hardwiring a specific model, Meta’s Llama3.1-8B, directly into the chip, trading general programmability for efficiency. The HC1 makes AI inference cheaper, at around $0.75 per million tokens, but it complicates data center operations: because each chip is tied to a single model, operators must manage multiple generations of chips to cover different models and their updates.
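To put the quoted figures in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers in the summary above and assumes, purely for illustration, that 17,000 tokens per second is a sustained rate and that the $0.75 price applies uniformly to every token. The implied Cerebras rate is simply the quoted ratio applied to the HC1 figure, not a measured number, and the function names are hypothetical.

```python
# Figures quoted in the summary; everything derived from them is an estimate.
HC1_TOKENS_PER_SECOND = 17_000       # "up to" figure, assumed sustained here
SPEEDUP_OVER_CEREBRAS = 10           # "roughly ten times faster"
PRICE_PER_MILLION_TOKENS_USD = 0.75  # assumed to apply uniformly


def seconds_per_million_tokens(tokens_per_second: float) -> float:
    """Time needed to generate one million tokens at a constant rate."""
    return 1_000_000 / tokens_per_second


def tokens_per_hour(tokens_per_second: float) -> float:
    """Tokens generated in one hour at a constant rate."""
    return tokens_per_second * 3600


def cost_usd(tokens: float, price_per_million: float) -> float:
    """Cost of generating `tokens` tokens at a flat per-million price."""
    return tokens / 1_000_000 * price_per_million


# Rate implied by the "ten times" comparison (an inference, not a benchmark).
implied_cerebras_tps = HC1_TOKENS_PER_SECOND / SPEEDUP_OVER_CEREBRAS

hc1_seconds = seconds_per_million_tokens(HC1_TOKENS_PER_SECOND)
hourly_tokens = tokens_per_hour(HC1_TOKENS_PER_SECOND)
hourly_cost = cost_usd(hourly_tokens, PRICE_PER_MILLION_TOKENS_USD)

print(f"HC1: {hc1_seconds:.1f} s per million tokens")
print(f"Implied Cerebras rate: {implied_cerebras_tps:.0f} tokens/s")
print(f"One hour at full rate: {hourly_tokens:,.0f} tokens, about ${hourly_cost:.2f}")
```

Under these assumptions, a million tokens takes just under a minute, and an hour of continuous generation produces about 61 million tokens, priced at roughly $46.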
Taalas unveils HC1 chip, achieving 17,000 tokens per second for Llama3.1-8B
