LiveFact

Next-Generation Benchmark for LLM-Driven Fake News Detection

Evaluating the capabilities of Large Language Models in detecting and analyzing misinformation in real-time scenarios

6 Months Running
26K+ Test Cases
7 Categories

About LiveFact

A comprehensive benchmark designed to evaluate LLM performance in fake news detection

🎯

Real-Time Testing

Evaluate models on current events and emerging misinformation patterns

📊

Comprehensive Metrics

Evaluation across three temporal stages: before, during, and after an event occurs

🔄

Continuous Updates

Regular benchmark updates to reflect the evolving landscape of misinformation

🌐

Global Coverage

Testing on events from around the world, covering diverse regions and topics

Leaderboard

Current rankings of LLMs on the LiveFact benchmark

November 2025
Rank | Model | Organization | Overall Score | Before Event (CLS) | Before Event (INF) | During Event (CLS) | During Event (INF) | After Event (CLS) | After Event (INF)

Methodology

How we evaluate LLM performance on fake news detection

01

Dataset Collection

We curate a diverse dataset of verified true and false news from multiple sources, covering current global events.

02

Temporal Analysis

Each news item is tested at three temporal points: before, during, and after the event occurs, to evaluate temporal reasoning.
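For illustration, the sketch below shows one way a test item with its three temporal snapshots could be represented; the field names (claim, label, event_date, snapshots) are assumptions made for this example, not the benchmark's actual schema.

# Minimal sketch of a temporally staged test item
# (hypothetical schema, not the official LiveFact format).
from dataclasses import dataclass, field
from typing import List

@dataclass
class TemporalSnapshot:
    stage: str    # "before", "during", or "after" the event
    context: str  # evidence / news text available at that point in time
    prompt: str   # the prompt presented to the model at this stage

@dataclass
class TestItem:
    claim: str                     # the news statement being verified
    label: bool                    # ground truth: True = real, False = fake
    event_date: str                # ISO date of the underlying event
    snapshots: List[TemporalSnapshot] = field(default_factory=list)

# Each item is queried once per stage, so the same claim yields three
# model responses whose answers can be compared across time.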

03

Model Evaluation

Each model is tested on identical prompts and evaluated based on accuracy, confidence calibration, and reasoning quality.
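As a rough illustration of the accuracy and calibration parts of this scoring, the sketch below computes plain accuracy and a binned expected calibration error. It is an assumed implementation, not LiveFact's official evaluation code; reasoning quality, which typically requires human or model-based judging, is omitted.

# Illustrative scoring sketch (assumed, not the official LiveFact scorer):
# accuracy over binary predictions plus a simple expected calibration error.
import numpy as np

def accuracy(preds: np.ndarray, labels: np.ndarray) -> float:
    # Fraction of items where the predicted label matches the ground truth.
    return float((preds == labels).mean())

def expected_calibration_error(confidences: np.ndarray,
                               preds: np.ndarray,
                               labels: np.ndarray,
                               n_bins: int = 10) -> float:
    # Weighted average gap between stated confidence and observed accuracy,
    # computed over equal-width confidence bins.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        bin_acc = (preds[mask] == labels[mask]).mean()
        bin_conf = confidences[mask].mean()
        ece += mask.mean() * abs(bin_acc - bin_conf)
    return float(ece)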

04

Continuous Updates

The benchmark is updated regularly with new test cases to reflect emerging misinformation trends.

Get Involved

Join our community and contribute to advancing fake news detection research

Submit Your Model

Have a model you'd like to see on the leaderboard? Submit it for evaluation.

Submit Model

Access Dataset

Download the LiveFact dataset and evaluation scripts from Hugging Face for use in your research; a minimal loading sketch follows below.

Download Dataset
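The sketch below loads the dataset with the Hugging Face datasets library. The repository id used here is a placeholder assumption; the actual id should be taken from the LiveFact page on the Hub.

# Hypothetical loading sketch using the Hugging Face `datasets` library.
# "livefact/livefact" is a placeholder repository id, not the real one.
from datasets import load_dataset

dataset = load_dataset("livefact/livefact")  # replace with the actual Hub id
print(dataset)             # shows the available splits and their sizes
print(dataset["test"][0])  # inspect a single test case (assumes a "test" split)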

Contribute

Join the LiveFact maintenance team and participate in the ongoing research and development of this benchmark.

Join the Team