SFE Challenge

Advancing scientific discovery through artificial intelligence. Join researchers worldwide in pushing the boundaries of AI applications in scientific research.

Introduction

The SFE Challenge (Scientists’ First Exam Challenge) brings together the global research community to explore innovative frontiers of AI-empowered scientific discovery. This competition focuses on building and evaluating advanced multimodal large language models (MLLMs) capable of deep scientific understanding and cognitive reasoning across multiple scientific domains. By mirroring the complexity of real scientific workflows, the SFE Challenge offers a unique testbed to push the boundaries of automated scientific intelligence.

Participants are invited to develop models that not only excel within familiar scientific domains, but also demonstrate robust generalization to novel tasks and disciplines. With comprehensive resources, official baselines, and a supportive community, the SFE Challenge provides an inclusive platform for researchers and practitioners interested in AI for science to accelerate progress in automated reasoning and facilitate the next wave of breakthroughs in AI-driven scientific discovery.

Machine Learning · AI for Science · Scientific Cognition · Multi Discipline

Timeline

Warm-up Round Starts

October 20, 2025

Participants can start exploring the competition tracks and datasets

Official Round Starts

December 1, 2025

Official competition begins with full access to datasets and evaluation metrics

Official Round Ends

March 15, 2026

Submission deadline for all competition tracks

Results Announcement

April 2026

Winners will be announced

Competition Tracks

🔬
SFE Benchmark Track
This track is built on the Scientists' First Exam (SFE) benchmark and aims to comprehensively evaluate, and push forward, the scientific cognitive abilities of multimodal large language models (MLLMs) in realistic scientific workflows. The SFE benchmark comprises 66 expert-designed scientific tasks spanning five high-value scientific disciplines, with 830 rigorously vetted multimodal VQA pairs. It systematically assesses models on three interconnected levels: scientific signal perception, scientific attribute understanding, and scientific comparative reasoning.

Rule: Participants are provided with a training dataset covering 25 of the 66 SFE tasks, together with a never-before-seen test set spanning these same 25 tasks. Evaluation uses the same LLM-as-a-Judge scoring method as SFE.

We invite researchers and practitioners interested in multimodal models, scientific AI, and automated scientific reasoning to join us in advancing the frontier of AI-empowered scientific discovery!

Key Topics:

Multi Discipline · Scientific Cognition · AI for Science
Training Data
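For participants new to LLM-as-a-Judge evaluation, the scoring loop described above can be sketched as follows. This is a minimal illustration, not the official SFE evaluation code: the judge prompt wording, the `call_llm` helper, and the binary CORRECT/INCORRECT verdict format are all assumptions made for the example.

```python
# Hedged sketch of an LLM-as-a-Judge scoring loop for VQA pairs.
# The prompt template and `call_llm` helper are illustrative placeholders;
# the official SFE pipeline may use a different prompt and scoring scale.

JUDGE_PROMPT = """You are grading a scientific VQA answer.
Question: {question}
Reference answer: {reference}
Candidate answer: {candidate}
Reply with a single word: CORRECT or INCORRECT."""


def judge_answer(question: str, reference: str, candidate: str,
                 call_llm=lambda prompt: "CORRECT") -> bool:
    """Ask a judge LLM whether the candidate matches the reference.

    `call_llm` stands in for any chat-completion client; the default
    stub always answers CORRECT so the sketch runs offline.
    """
    verdict = call_llm(JUDGE_PROMPT.format(
        question=question, reference=reference, candidate=candidate))
    return verdict.strip().upper().startswith("CORRECT")


def accuracy(examples, call_llm=lambda prompt: "CORRECT") -> float:
    """Fraction of (question, reference, candidate) triples judged correct."""
    marks = [judge_answer(q, ref, cand, call_llm) for q, ref, cand in examples]
    return sum(marks) / len(marks) if marks else 0.0
```

In practice `call_llm` would wrap a real judge model, and the verdict parsing would need to tolerate whatever free-form text that model returns.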

Organizers

Yuhao Zhou

Sichuan University

Wanghan Xu

Shanghai Jiao Tong University

Bo Liu

The Hong Kong Polytechnic University

Li Kang

Shanghai Jiao Tong University

Wenzhe Li

Tongji University

Yiheng Wang

Shanghai Jiao Tong University

Xuming He

Zhejiang University

Jia Bu

Shanghai Jiao Tong University

Zhiwang Zhou

Tongji University

Yixin Chen

University of California Los Angeles

Xiang Zhuang

Zhejiang University

Fengxiang Wang

National University of Defense Technology

Advisory Committee

Wenlong Zhang

Shanghai Artificial Intelligence Laboratory

Zhenfei Yin

University of Oxford

Siqi Sun

Fudan University

Tianfan Fu

Nanjing University

Dongzhan Zhou

Shanghai Artificial Intelligence Laboratory

Fenghua Ling

Shanghai Artificial Intelligence Laboratory

Yan Lu

The Chinese University of Hong Kong

Shixiang Tang

The Chinese University of Hong Kong

Philip Torr

University of Oxford

Yinghui Zhang

Shanghai Artificial Intelligence Laboratory

Zicheng Zhang

Shanghai Artificial Intelligence Laboratory

Guangtao Zhai

Shanghai Jiao Tong University

Lei Bai

Shanghai Artificial Intelligence Laboratory

Contact

General Inquiries

For questions about registration, submission guidelines, or general information about the challenge.

sfe-challenge@googlegroups.com

WeChat Group

Join our WeChat group for real-time discussions, updates, and community support.

Discord Community

Connect with fellow researchers, get technical support, and participate in live discussions on our Discord server.

Join Discord Server