MentorEval Benchmark Leaderboard

A Multilingual Benchmark for Educational Assessment


Rank | Parameters | Overall | ASAP | ASAP 2.0 | ELLIPSE | Mohler | PT-ASAG 2018 | AR-ASAG

Dataset Information

ASAP

Language: English

Level: ISCED 2-3

Type: Essay Writing

Samples: 12,977

ASAP 2.0

Language: English

Level: ISCED 1-3

Type: Essay Writing

Samples: 24,728

ELLIPSE

Language: English

Level: ISCED 3

Type: Essay Writing

Samples: 6,482

Mohler

Language: English

Level: ISCED 6

Type: Short Answer

Samples: 1,263

PT-ASAG 2018

Language: Portuguese

Level: ISCED 2

Type: Short Answer

Samples: 9,862

AR-ASAG

Language: Arabic

Level: ISCED 7

Type: Short Answer

Samples: 2,132
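
The dataset cards above can also be kept as structured records, which makes it easy to total the samples by language or task type. The following is a minimal sketch only: the `Dataset` class, its field names, and the `samples_by` helper are hypothetical and not part of MentorEval itself. The values are copied from the cards above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """One dataset card (hypothetical structure, for illustration only)."""
    name: str
    language: str
    isced_level: str  # ISCED level or range, kept as text, e.g. "2-3"
    task_type: str
    samples: int


# Values transcribed from the Dataset Information cards.
DATASETS = [
    Dataset("ASAP", "English", "2-3", "Essay Writing", 12_977),
    Dataset("ASAP 2.0", "English", "1-3", "Essay Writing", 24_728),
    Dataset("ELLIPSE", "English", "3", "Essay Writing", 6_482),
    Dataset("Mohler", "English", "6", "Short Answer", 1_263),
    Dataset("PT-ASAG 2018", "Portuguese", "2", "Short Answer", 9_862),
    Dataset("AR-ASAG", "Arabic", "7", "Short Answer", 2_132),
]


def samples_by(field: str) -> dict[str, int]:
    """Sum sample counts grouped by one attribute of Dataset."""
    totals: dict[str, int] = defaultdict(int)
    for dataset in DATASETS:
        totals[getattr(dataset, field)] += dataset.samples
    return dict(totals)


total_samples = sum(dataset.samples for dataset in DATASETS)
by_language = samples_by("language")
by_type = samples_by("task_type")

print(f"Total samples: {total_samples:,}")
print("By language:", by_language)
print("By type:", by_type)
```

Summed this way, the six datasets contain 57,444 samples in total: 44,187 essay-writing samples and 13,257 short-answer samples, with 45,450 in English, 9,862 in Portuguese, and 2,132 in Arabic.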