News
ByteDance's Doubao AI team has open-sourced COMET, a Mixture of Experts (MoE) optimization framework that improves large language model (LLM) training efficiency while reducing costs. Already ...
Compared to DeepSeek R1, Llama-3.1-Nemotron-Ultra-253B shows competitive results despite having fewer than half the parameters.
The latest upgrade to the Qwen family of models will include a mixture-of-experts version and one with just 600 million ...
Alibaba Cloud, a subsidiary of BABA-W (09988.HK), held its 2025 Spring Launch Event, announcing the offerings of more ...
Llama 4 Scout is a compact yet highly capable model, boasting 17 billion active parameters and 16 experts. Its standout ...
Meta has debuted the first two models in its Llama 4 family, its first to use mixture of experts tech. A Saturday post from ...
Meta presented the latest generation of its large language models (LLM) at the weekend: Llama 4 Scout and Llama 4 Maverick.