#MLPerf

Latest posts tagged with #MLPerf on Bluesky

Preview
Standardize Gen AI Service Evaluation: An API-Centric Benchmarking Approach Evaluating Gen AI services across heterogeneous environments presents significant visibility gaps for developers, as traditional hardware-centric b...

🎤 David Kanter is speaking at #GTC2026!

"If It Has an API, We Can Measure It: MLPerf Enters the Gen AI Era"

📅 March 19 · 12–12:40 PM · San Jose Convention Center
🔗 https://bit.ly/4l5p8Uh
#MLPerf #MLCommons #GenAI


Results day is coming.
MLPerf Inference v6.0 drops April 1 — cross-platform AI inference data spanning datacenter, edge & more. Follow so you don't miss it.
#MLPerf #AIInference

Post image

Just saw NVIDIA’s NVFP4 recipe slash training time and costs on Blackwell Ultra GPUs—MLPerf scores are soaring and Llama 3.1 cranks out faster than ever. Want the nitty‑gritty on how GPU acceleration is reshaping LLM training? Dive in! #NVFP4 #MLPerf #Llama3_1

🔗 aidailypost.com/news/nvidias...


1/4 🧵
First Qwen model in MLPerf.
40M products daily.
Real production data from Shopify's e-commerce infrastructure.
Submit by Feb 13, 2026 👇
#MLPerf #Shopify #VLM #MLCommons

Preview
MLCommons MLPerf Inference v6.0 Qwen3-VL Shopify Catalog text on a checkered background with mlcommons and shopify logos

🚀 NEW: MLPerf Inference v6.0 debuts Qwen3-VL + Shopify Product Catalog benchmark
40M products daily. Real production data. First Qwen model in MLPerf.
Submit by Feb 13, 2026 →
https://bit.ly/4k9F5YS
#MLPerf #VLM #Shopify #MLCommons

Preview
MLPerf Mobile - Apps on Google Play An AI benchmark for mobile devices

MLPerf Mobile app v5.0.4 is here!
The #MLPerf Mobile release now supports the Samsung Exynos 2600 and Qualcomm's newest Snapdragon lineup, from the flagship 8 Elite Gen 5 to the mid-range 6 Gen 4. That means more comprehensive, apples-to-apples #AIperformance data across devices.
play.google.com/store/apps/d...


📰 AMD Announces First MLPerf 5.1 Training Results for Instinct MI350 Series GPUs

👉 Read the full article here: ahmandonk.com/2025/11/19/amd-instinct-...

#ai-training #amd #gpu-computing #hardware #instinct-mi350 #mlperf #rocm

Preview
Wiwynn Achieves Record-Breaking MLPerf® Training Results with Llama 2 70B at YTL Malaysia Wiwynn sets a new standard in AI training performance with record-breaking MLPerf® Training v5.1 results at YTL Malaysia, enhancing infrastructure efficiency.

Wiwynn Achieves Record-Breaking MLPerf® Training Results with Llama 2 70B at YTL Malaysia #Malaysia #Johor #Wiwynn #MLPerf #YTL_AI_Cloud

Post image

Just saw NVIDIA’s Blackwell crush every MLPerf Training v5.1 benchmark using FP4 precision – even outpacing FP16 on Llama 3.1’s 405‑billion‑parameter model. The future of GPU AI is here. Dive in for the full breakdown! #NVIDIABlackwell #MLPerf #FP4

🔗 aidailypost.com/news/nvidia-...

Post image

Large language models #LLMs are growing extremely quickly, and the #hardware systems that they require can’t keep up with the pace. Each time #MLPerf introduces a new benchmark, training time increases. The data tells the story. spectrum.ieee.org/mlperf-trends


📰 NVIDIA Dominates MLPerf Training v5.1, Wins Every Benchmark

👉 Read the full article here: ahmandonk.com/2025/11/13/nvidia-menang...

#ai #training #blackwell #gpu #llama #mlperf #nvidia

Preview
NVIDIA Blackwell Ultra Secures Win Across All Seven MLPerf AI Training Benchmarks, GB300 NVL72 Sets Record 10 Minutes Training Time For Llama 405B By securing wins across all MLPerf training tests, NVIDIA boasts its Blackwell Ultra-based GB300 NVL72 platform, which delivers leading AI training performance. NVIDIA Showcases its GB300 NVL72 "Blackwell Ultra" Results in MLPerf AI Training Tests; Up To Five Times the Performance vs Hopper-Based Platform When it comes to delivering leading AI performance, NVIDIA GPUs have always been at the forefront. The Blackwell-based data center GPUs have already showcased their incredible potential several times previously, and the latest GB300 NVL72 platform is no exception. Today, NVIDIA has proudly announced that its Blackwell Ultra-powered AI GPUs have secured the first position in […]
New Results! MLPerf Training v5.1

MLPerf Training v5.1 results are live!
Record participation: 20 organizations submitted 65 unique systems featuring 12 different accelerators. Multi-node submissions increased 86% over last year, showing the industry's focus on scale.
Results: mlcommons.org/2025/11/trai...
#MLPerf
1/3

Preview
AI Model Growth Outpaces Hardware Improvements Since 2018, the consortium MLCommons has been running a sort of Olympics for AI training. The competition, called MLPerf, consists of a set of tasks for training specific AI models, on predefined datasets, to a certain accuracy. Essentially, these tasks, called benchmarks, test how well a hardware and low-level software configuration is set up to train a particular AI model. Twice a year, companies put together their submissions—usually, clusters of CPUs and GPUs and software optimized for them—and compete to see whose submission can train the models fastest. There is no question that since MLPerf’s inception, the cutting-edge hardware for AI training has improved dramatically. Over the years, Nvidia has released four new generations of GPUs that have since become the industry standard (the latest, Nvidia’s Blackwell GPU, is not yet standard but growing in popularity). The companies competing in MLPerf have also been using larger clusters of GPUs to tackle the training tasks. However, the MLPerf benchmarks have also gotten tougher. And this increased rigor is by design—the benchmarks are trying to keep pace with the industry, says David Kanter, head of MLPerf. “The benchmarks are meant to be representative,” he says. Intriguingly, the data show that the large language models and their precursors have been increasing in size faster than the hardware has kept up. So each time a new benchmark is introduced, the fastest training time gets longer. Then, hardware improvements gradually bring the execution time down, only to get thwarted again by the next benchmark. Then the cycle repeats itself.

📰 Solidigm Opens AI Central Lab with 192 SSDs Totaling 23.6 Petabytes in 16U

👉 Read the full article here: ahmandonk.com/2025/10/11/solidigm-ai-c...

#ai #d5-p5336 #d7-ps1010 #datacenter #metrum #mlperf #datastorage #solidigm #ssd

Preview
A New TinyML Streaming Benchmark for MLPerf Tiny v1.3 - MLCommons

TinyML benchmarks finally address real-world deployment with MLCommons' new streaming benchmark in MLPerf Tiny v1.3. Tests 20-minute continuous wake word detection while measuring power and duty cycle.
Technical deep dive: mlcommons.org/2025/09/mlpe... #MLPerf #TinyML #EdgeAI
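The streaming benchmark's headline quantities, power and duty cycle, reduce to simple arithmetic. A minimal sketch follows; the milliwatt figures and 2% duty cycle are made-up placeholders, not MLPerf Tiny v1.3 results.

```python
# Back-of-the-envelope sketch of duty-cycled average power, the quantity a
# streaming TinyML benchmark cares about when a device alternates between
# brief inference bursts and long idle stretches.

def average_power_mw(active_mw, idle_mw, duty_cycle):
    """Average power when the device is active for `duty_cycle` of the run."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in [0, 1]")
    return duty_cycle * active_mw + (1.0 - duty_cycle) * idle_mw

# e.g. over a 20-minute run, wake-word inference keeps the MCU active 2% of the time
avg = average_power_mw(active_mw=30.0, idle_mw=0.5, duty_cycle=0.02)
# 0.02 * 30.0 + 0.98 * 0.5 = 1.09 mW
```

The point of measuring duty cycle alongside power is visible here: a chip that burns 30 mW while awake can still average close to a milliwatt if it sleeps most of the time.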

Preview
MLPerf Introduces Largest and Smallest LLM Benchmarks Nvidia's Blackwell Ultra chip is setting new standards in AI performance. How does it achieve a nearly 50% performance gain?

This year's #MLPerf introduced three new benchmark tests (its largest yet, its smallest yet, and a new voice-to-text model), and #Nvidia's Blackwell Ultra topped the charts on the two largest benchmarks.

Preview
The Summer of MLPerf Congratulations to MLPerf Inference v5.1 for a new submission record! MLPerf Inference is the fourth benchmark release in under two months. Progress in AI is rapid, and the organization is thrilled…

BREAKING TODAY! The Summer of MLPerf -- radicaldatascience.wordpress.com/2025/09/11/t...

#AI #LLM #GenAI #MachineLearning @mlcommons.org #MLPerf

Preview
Machine Learning Tests Keep Getting Bigger The machine learning field is moving fast, and the yardsticks used to measure its progress are having to race to keep up. A case in point: MLPerf, the bi-annual machine learning competition sometimes termed “the Olympics of AI,” introduced three new benchmark tests, reflecting new directions in the field. “Lately, it has been very difficult trying to follow what happens in the field,” says Miro Hodak, AMD engineer and MLPerf Inference working group co-chair. “We see that the models are becoming progressively larger, and in the last two rounds we have introduced the largest models we’ve ever had.” The chips that tackled these new benchmarks came from the usual suspects: Nvidia, Arm, and Intel. Nvidia topped the charts, introducing its new Blackwell Ultra GPU, packaged in a GB300 rack-scale design. AMD put up a strong performance, introducing its latest MI325X GPUs. Intel proved that one can still do inference on CPUs with its Xeon submissions, but also entered the GPU game with an Intel Arc Pro submission.

## New Benchmarks

Last round, MLPerf introduced its largest benchmark yet, a large language model based on Llama3.1-405B. This round, it topped itself yet again, introducing a benchmark based on the DeepSeek R1 671B model, more than 1.5 times the parameter count of the previous largest benchmark. As a reasoning model, DeepSeek R1 goes through several steps of chain-of-thought when approaching a query. This means much more of the computation happens during inference than in normal LLM operation, making this benchmark even more challenging. Reasoning models are claimed to be the most accurate, making them the technique of choice for science, math, and complex programming queries.

In addition to the largest LLM benchmark yet, MLPerf also introduced the smallest, based on Llama3.1-8B. There is growing industry demand for low-latency yet high-accuracy reasoning, explained Taran Iyengar, MLPerf Inference task force chair. Small LLMs can supply this, and they are an excellent choice for tasks such as text summarization and edge applications. This brings the total count of LLM-based benchmarks to four: the new, smallest Llama3.1-8B benchmark; a pre-existing Llama2-70B benchmark; last round’s Llama3.1-405B benchmark; and the largest, the new DeepSeek R1 model. If nothing else, this signals LLMs are not going anywhere.

In addition to the myriad LLMs, this round of MLPerf Inference included a new voice-to-text model, based on Whisper-large-v3. This benchmark is a response to the growing number of voice-enabled applications, be they smart devices or speech-based AI interfaces. The MLPerf Inference competition has two broad categories: “closed,” which requires using the reference neural network model as-is without modifications, and “open,” where some modifications to the model are allowed. Within those, there are several subcategories related to how the tests are done and on what sort of infrastructure. We will focus on the “closed” datacenter server results for the sake of sanity.

## Nvidia leads

Surprising no one, the best performance per accelerator on each benchmark, at least in the “server” category, was achieved by an Nvidia GPU-based system. Nvidia also unveiled the Blackwell Ultra, topping the charts on the two largest benchmarks: Llama3.1-405B and DeepSeek R1 reasoning. Blackwell Ultra is a more powerful iteration of the Blackwell architecture, featuring significantly more memory capacity, double the acceleration for attention layers, 1.5x more AI compute, and faster memory and connectivity compared with the standard Blackwell. It is intended for larger AI workloads, like the two benchmarks it was tested on.

In addition to the hardware improvements, Dave Salvator, director of accelerated computing products at Nvidia, attributes the success of Blackwell Ultra to two key changes. The first is the use of Nvidia’s proprietary 4-bit floating-point number format, NVFP4. “We can deliver comparable accuracy to formats like BF16,” Salvator says, while using a lot less computing power. The second is so-called disaggregated serving. The idea behind disaggregated serving is that there are two main parts to the inference workload: prefill, where the query (“Please summarize this report.”) and its entire context window (the report) are loaded into the LLM, and generation/decoding, where the output is actually calculated. These two stages have different requirements: while prefill is compute-heavy, generation/decoding is much more dependent on memory bandwidth. Salvator says that by assigning different groups of GPUs to the two stages, Nvidia achieves a performance gain of nearly 50 percent.

## AMD close behind

AMD’s newest accelerator chip, the MI355X, launched in July. The company offered results only in the “open” category, where software modifications to the model are permitted. Like Blackwell Ultra, the MI355X features 4-bit floating-point support, as well as expanded high-bandwidth memory. The MI355X beat its predecessor, the MI325X, on the open Llama2-70B benchmark by a factor of 2.7, says Mahesh Balasubramanian, senior director of data center GPU product marketing at AMD.

AMD’s “closed” submissions included systems powered by AMD MI300X and MI325X GPUs. The more advanced MI325X systems performed similarly to those built with Nvidia H200s on the Llama2-70B, mixture-of-experts, and image generation benchmarks. This round also included the first hybrid submission, where both AMD MI300X and MI325X GPUs were used for the same inference task, the Llama2-70B benchmark. The use of hybrid GPUs is important because new GPUs are arriving at a yearly cadence, and the older models, deployed en masse, are not going anywhere. Being able to spread workloads between different kinds of GPUs is an essential step.

## Intel enters the GPU game

In the past, Intel has remained steadfast that one does not need a GPU to do machine learning. Indeed, submissions using Intel’s Xeon CPU still performed on par with the Nvidia L4 on the object detection benchmark but trailed on the recommender system benchmark. This round, for the first time, an Intel GPU also made a showing. The Intel Arc Pro was first released in 2022. The MLPerf submission featured a graphics card called the MaxSun Intel Arc Pro B60 Dual 48G Turbo, which contains two GPUs and 48 gigabytes of memory. The system performed on par with Nvidia’s L40S on the small LLM benchmark and trailed it on the Llama2-70B benchmark.
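The 4-bit floating-point idea behind NVFP4, mentioned in the article above, can be made concrete with a toy block-quantization sketch. The E2M1-style value grid, block handling, and rounding rule below are illustrative assumptions for exposition, not NVIDIA's actual recipe.

```python
# Toy sketch of 4-bit floating-point block quantization, to illustrate why
# formats like NVFP4 pair tiny 4-bit elements with a shared per-block scale.

E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # magnitudes a 4-bit E2M1 value can hold

def quantize_block(block, grid=E2M1_GRID):
    """Pick one scale so the block's max magnitude maps to the top of the
    4-bit grid, then snap every element to the nearest representable value."""
    amax = max(abs(x) for x in block)
    scale = amax / grid[-1] if amax else 1.0
    dequantized = []
    for x in block:
        mag = min(grid, key=lambda g: abs(abs(x) / scale - g))
        dequantized.append(mag * scale if x >= 0 else -mag * scale)
    return dequantized, scale

vals = [0.1, -0.4, 2.6, -6.0]
deq, scale = quantize_block(vals)
# Here amax is 6.0, so scale is 1.0 and values snap straight onto the grid:
# 0.1 -> 0.0, -0.4 -> -0.5, 2.6 -> 3.0, -6.0 -> -6.0
```

The shared scale is what lets a 4-bit grid with only eight magnitudes cover tensors of wildly different ranges; each block trades a few extra bits of scale metadata for acceptable per-element error.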
Preview
NVIDIA Blackwell Ultra: 5x Faster! All Records & Specs 2025. Shocking new MLPerf records! Learn how NVIDIA's Blackwell Ultra GPU, with a new architecture & NVFP4, is up to 5.2x faster than Hopper. All the details. 🚀

💪 NVIDIA Blackwell Ultra: 5× faster – why?

▶️ Explains NVFP4 and TTFT
▶️ Uses 288 GB of HBM3e RAM!
▶️ Separates context and generation

#ai #ki #artificialintelligence #BlackwellUltra #Nvidia #mlperf #aiinference #tech2025

💬 LIKE ❤️ SHARE 🔄 READ 📖 FOLLOW ➕

kinews24.de/nvidia-black...

Preview
MLCommons Releases New MLPerf Inference v5.1 Benchmark Results Sept. 9, 2025 — Today, MLCommons announced new results for its industry-standard MLPerf Inference v5.1 benchmark suite, tracking the relentless forward momentum of the AI community and its new capabil...

MLCommons Releases New MLPerf Inference v5.1 Benchmark Results
ow.ly/t2pv50WU8IQ #MLCommons #MLPerf #HPCwire

Preview
MLPerf v5.1 AI Inference Benchmark Showdown: NVIDIA Blackwell Ultra GB300 & AMD Instinct MI355X In The Spotlight NVIDIA's Blackwell Ultra GB300 & AMD's Instinct MI355X have finally appeared in the latest MLPerf v5.1 AI inference benchmarks. NVIDIA Blackwell Ultra GB300 & AMD Instinct MI355X Rock MLPerf v5.1 Benchmarks With Unmatched AI Performance Today, MLCommons published its latest round of MLPerf v5.1 AI Inference benchmarks, and there were a few submissions that stole the spotlight. These are NVIDIA's Blackwell Ultra GB300, AMD's Instinct MI355X, and the Intel Arc Pro B60 (we've covered more about this here). The GB300 and MI355X are the fastest AI options from each vendor; as such, there is a lot of attention focused on […]

MLPerf v5.1 AI Inference Benchmark Showdown: NVIDIA Blackwell Ultra GB300 & AMD Instinct MI355X In The Spotlight NVIDIA's Blackwell Ultra GB300 & AMD's Instinct MI355X have finally ...

#Featured #News #AMD #Instinct #MI355X #MLPerf #Blackwell #Ultra #DeepSeek

