benchmarks
-
Hacker News
[2502.06559] Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation
[Submitted on 10 Feb 2025]
-
PassMark CPU Benchmarks – Year on Year Performance
Updated 10th of February 2025. This graph counts the baselines submitted to us during these time…
-
Tech News
Beyond benchmarks: How DeepSeek-R1 and o1 perform on real-world tasks
DeepSeek-R1…
-
Tech News
Self-invoking code benchmarks help you decide which LLMs to use for your programming tasks
LLMs are good at coding simple functions. But how good are they at calling their own functions to solve complex…
-
Tech News
Small model, big impact: Patronus AI’s Glider outperforms GPT-4 in key AI benchmarks