The benchmark, named Hist-LLM, assesses the correctness of responses from large language models (LLMs) against the Seshat Global History Databank, a comprehensive repository of historical knowledge. The study tested three ...
Researchers recently evaluated the ability of advanced artificial intelligence (AI) models to answer questions about global history using a benchmark derived from the Seshat Global History Databank.