
New AI benchmark tests speed of responses to user queries – March 27, 2024 at 4:00 pm

The AI benchmarking group MLCommons released a new set of tests and results on Wednesday to evaluate the speed at which cutting-edge hardware can run AI applications and respond to user requests.

The two new MLCommons benchmarks measure the speed at which AI chips and systems can generate answers from powerful, data-packed AI models. The results show roughly how quickly an AI application like ChatGPT can provide an answer to a user's query.
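To make the idea of "speed of responses" concrete, the minimal sketch below times how long a model takes to answer a batch of queries and reports latency and throughput. It is an illustration only, not MLPerf's methodology; `generate_answer` is a hypothetical placeholder for whatever model or inference server is being measured.

```python
# Illustrative query-latency measurement; not the MLPerf harness.
import time
import statistics


def generate_answer(prompt: str) -> str:
    # Placeholder: call the model or inference endpoint under test here.
    return "example answer to: " + prompt


def measure_latency(prompts: list[str]) -> dict:
    """Time each query and summarize latency and rough throughput."""
    latencies = []
    for prompt in prompts:
        start = time.perf_counter()
        generate_answer(prompt)
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return {
        "mean_latency_s": statistics.mean(latencies),
        "p99_latency_s": latencies[int(0.99 * (len(latencies) - 1))],
        "queries_per_second": len(latencies) / sum(latencies),
    }


if __name__ == "__main__":
    print(measure_latency(["What does the new benchmark measure?"] * 20))
```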

One of the new benchmarks measures the speed of a question-and-answer scenario for large language models. It is based on Llama 2, a model with 70 billion parameters developed by Meta Platforms.

MLCommons officials also added a second benchmark to the MLPerf suite: a text-to-image generation test based on Stability AI's Stable Diffusion XL model.

Servers with Nvidia's H100 chips, built by companies such as Google, Supermicro, and Nvidia itself, easily won both new benchmarks on raw performance. Several server manufacturers also submitted designs based on Nvidia's less powerful L40S chip.

For the image generation benchmark, server builder Krai submitted a design using a Qualcomm AI chip that draws far less power than Nvidia's cutting-edge processors.

Intel also submitted a design based on its Gaudi2 accelerator chips. The company described the results as “strong.”

Raw performance is not the only measure that matters when deploying AI applications. Advanced AI chips consume huge amounts of power, and one of the biggest challenges facing AI companies is building chips that deliver the most performance for the least power.
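As a rough illustration of why efficiency matters alongside raw speed, the sketch below divides throughput by power draw. The figures are invented for the example, not measured MLPerf results.

```python
# Performance-per-watt arithmetic with made-up example numbers.
def queries_per_joule(queries_per_second: float, watts: float) -> float:
    """Throughput divided by power draw: higher means more work per unit of energy."""
    return queries_per_second / watts


# A faster chip is not automatically the more efficient one once power is considered.
fast_chip = queries_per_joule(queries_per_second=100.0, watts=700.0)   # ~0.14
frugal_chip = queries_per_joule(queries_per_second=40.0, watts=150.0)  # ~0.27
print(fast_chip, frugal_chip)
```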


MLCommons has a separate benchmark category for measuring power consumption. (Reporting by Max A. Cherney in San Francisco; Editing by Jamie Freed)