Hosted on MSN, 1 month ago
AI Falls Short in Advanced Historical Analysis, Study Shows
The benchmark, named Hist-LLM, assesses the correctness of LLMs’ responses based on the Seshat Global History Databank, a comprehensive repository of historical knowledge. The study tested three ...
Hosted on MSN, 1 month ago
AI models struggle with expert-level global history knowledge
Researchers recently evaluated the ability of advanced artificial intelligence (AI) models to answer questions about global history using a benchmark derived from the Seshat Global History Databank.