Running LLMs Under 5W

Looking beyond current deployments, DeepX is now developing its next-generation chip, the DX-M2 – an on-device AI processor designed to run LLMs under 5W. As large language model technology evolves, the field is beginning to split into two directions: one track continues to scale up LLMs in cloud data centers in pursuit of AGI; the other, more practical path focuses on lightweight, efficient models optimized for local inference – such as DeepSeek and Meta’s LLAMA 4. The DX-M2 is purpose-built for this second future. With ultra-low power consumption, high performance, and a silicon architecture tailored for real-world deployment, it will support LLMs like DeepSeek and LLAMA 4 directly at the edge, with no cloud dependency required. Most notably, the processor is being developed to become the first AI inference chip built on the leading-edge 2nm process – marking a new era of performance-per-watt leadership. In short, DX-M2 isn’t just about running LLMs efficiently; it’s about unlocking the next stage of intelligent devices, fully autonomous and truly local.
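
To give a feel for what makes sub-5W LLM inference plausible, the sketch below works through the standard napkin math for autoregressive decoding, where token rate is bounded by how fast the (quantized) weights can be streamed from memory. The model size, bit width, and bandwidth figures are illustrative assumptions only, not published DX-M2, DeepSeek, or LLAMA 4 specifications.

```python
# Napkin estimate: decode throughput of a weight-streaming-bound LLM on an edge NPU.
# All numbers below are illustrative assumptions, not published DX-M2 specs.

def decode_tokens_per_sec(params_billion: float,
                          bits_per_weight: int,
                          mem_bandwidth_gb_s: float) -> float:
    """Autoregressive decoding reads roughly every weight once per generated token,
    so the token rate is capped by bandwidth divided by model size in bytes."""
    model_bytes = params_billion * 1e9 * bits_per_weight / 8
    return mem_bandwidth_gb_s * 1e9 / model_bytes

# Hypothetical edge scenario: a 3B-parameter model quantized to 4 bits,
# with ~60 GB/s of effective memory bandwidth available to the accelerator.
rate = decode_tokens_per_sec(params_billion=3, bits_per_weight=4,
                             mem_bandwidth_gb_s=60)
print(f"~{rate:.0f} tokens/s upper bound")  # ~40 tokens/s under these assumptions
```

The takeaway from this kind of estimate is that low-bit quantization and efficient memory access, far more than raw compute, determine whether a given model fits into an edge power envelope.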

Memory-Centric AI Architectures

One key innovation in AI semiconductor technology is the shift towards memory-centric computing architectures. Traditional computing paradigms prioritize raw processing speed; however, future AI models demand rapid access to vast amounts of data. Advanced memory solutions, such as High-Bandwidth Memory (HBM4) and emerging technologies like MRAM and ReRAM, are now becoming essential for managing the heavy data flows required by complex AI workloads. Additionally, revolutionary interconnect standards like Compute Express Link (CXL 3.0) and Universal Chiplet Interconnect Express (UCIe) facilitate faster, more efficient data exchanges between computing components, further enhancing the performance and scalability of multi-modal AI applications.
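
A simple way to see why memory, not compute, dominates these workloads is the classic roofline comparison between a kernel’s arithmetic intensity and the hardware’s compute-to-bandwidth ratio. The sketch below applies it to the matrix-vector products that dominate LLM decoding; the peak-compute and bandwidth figures are generic placeholders, not numbers for any specific chip.

```python
# Roofline-style check: is a GEMV (matrix-vector multiply) compute- or memory-bound?
# Hardware figures are illustrative placeholders, not specs for any particular chip.

def gemv_arithmetic_intensity(rows: int, cols: int, bytes_per_weight: float) -> float:
    """FLOPs per byte moved for y = W @ x, assuming the weight matrix W
    dominates memory traffic (activations are comparatively tiny)."""
    flops = 2 * rows * cols                       # one multiply + one add per weight
    bytes_moved = rows * cols * bytes_per_weight  # every weight is read once
    return flops / bytes_moved

peak_tflops = 25.0       # assumed peak compute, TFLOP/s
bandwidth_gb_s = 100.0   # assumed memory bandwidth, GB/s
machine_balance = peak_tflops * 1e12 / (bandwidth_gb_s * 1e9)  # FLOPs per byte

ai = gemv_arithmetic_intensity(rows=4096, cols=4096, bytes_per_weight=2)  # FP16 weights
print(f"arithmetic intensity: {ai:.1f} FLOP/byte, machine balance: {machine_balance:.0f} FLOP/byte")
# At ~1 FLOP/byte versus a balance point in the hundreds, the kernel is firmly
# memory-bound, which is why bandwidth (HBM4, CXL, near-memory designs) matters
# more here than peak FLOPs.
```

Under these assumptions the accelerator spends most of its time waiting on memory, which is exactly the gap that high-bandwidth memory, emerging non-volatile memories, and fast interconnects like CXL 3.0 and UCIe aim to close.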

Symbiotic Relationship: Software-Hardware Co-Design

The seamless integration of hardware and software design is crucial for harnessing the full potential of AI. Strategic partnerships between major AI software innovators and semiconductor manufacturers – such as the OpenAI–Microsoft alliance or Meta’s LLAMA initiative – highlight the essential need for software-hardware co-design. Companies like DeepX recognize this and are actively fostering collaborations that result in customized semiconductor architectures specifically optimized for advanced multi-modal AI tasks.
