The following videos demonstrate agent behavior on economic data tasks. Playback is shown at 5× speed.
Category | Task Description | Start URL | Answer | Domain |
---|---|---|---|---|
Government | As published by the Office for National Statistics, what was the CPIH annual inflation rate for all items (2015=100) in the United Kingdom in March 2025? Provide only the number as a decimal with one digit after the decimal point, without percent symbols or other units. | ons.gov.uk | 3.4 | ons.gov.uk |
Energy | As reported by the U.S. Energy Information Administration, what was the average retail price of regular gasoline in California during the week of March 24, 2025, in dollars per gallon? Provide only the number as a decimal with three digits after the decimal point, without currency symbols, commas, or other units. | eia.gov | 4.418 | eia.gov |
Markets | As reported by Cox Automotive, what was the total number of unsold used vehicles in the United States as of March 31, 2025? Provide only the number as a decimal with two digits, in millions, without commas or other units. | coxautoinc.com | 2.14 | coxautoinc.com |
Banking | As reported by the Federal Reserve Bank of New York, what was the effective federal funds rate on January 10, 2025? Provide only the number as a decimal with two digits, without percent symbols or other units. | newyorkfed.org | 4.33 | newyorkfed.org |
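
Each task description above fixes the answer format, down to the number of digits after the decimal point. The snippet below is a minimal sketch of how such a format constraint could be checked. It is not the benchmark's own scoring code, which this page does not describe; the helper name `has_decimal_places` and the `EXAMPLES` list are hypothetical and introduced only for illustration.

```python
import re


def has_decimal_places(answer: str, places: int) -> bool:
    """Return True if `answer` is a bare decimal with exactly `places` fractional digits.

    Hypothetical helper: rejects percent signs, currency symbols, commas, and units.
    """
    pattern = rf"\d+\.\d{{{places}}}"
    return re.fullmatch(pattern, answer.strip()) is not None


# (answer, required fractional digits) copied from the example tasks in the table.
EXAMPLES = [("3.4", 1), ("4.418", 3), ("2.14", 2), ("4.33", 2)]

if __name__ == "__main__":
    for answer, places in EXAMPLES:
        print(answer, has_decimal_places(answer, places))
```

Under this check, strings such as `3.4%` or `$4.418` would be rejected because they carry the symbols the task descriptions ask the agent to omit.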

Category | Tasks | o4-mini | GPT-4.1 | GPT-4o | Claude-Sonnet-4 | Llama-4-Maverick |
---|---|---|---|---|---|---|
Banking | 60 | 41.7% | 23.3% | 18.3% | 38.3% | 21.7% |
Finance | 21 | 33.3% | 14.3% | 14.3% | 23.8% | 9.5% |
Government | 138 | 56.5% | 44.9% | 36.2% | 47.1% | 26.1% |
Labor | 24 | 25.0% | 0.0% | 8.3% | 12.5% | 4.2% |
Markets | 60 | 48.3% | 35.0% | 33.3% | 41.7% | 15.0% |
Other* | 57 | 38.6% | 22.8% | 21.1% | 29.8% | 12.3% |
All | 360 | 46.4% | 31.4% | 27.2% | 38.3% | 18.9% |
*Other categories: Energy, RealEstate, Trade, Education, and Health.
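
The "All" row can be cross-checked against the category rows. The sketch below assumes that each percentage is the share of tasks in that category answered correctly and that the "All" row pools all 360 tasks; both are assumptions, since the page does not state them. It recovers an integer count of solved tasks per category from the rounded percentages, then recomputes the pooled rate. The names `ROWS`, `MODELS`, and `pooled_rates` are illustrative.

```python
MODELS = ["o4-mini", "GPT-4.1", "GPT-4o", "Claude-Sonnet-4", "Llama-4-Maverick"]

# (category, number of tasks, rate in % per model), copied from the table above.
ROWS = [
    ("Banking", 60, [41.7, 23.3, 18.3, 38.3, 21.7]),
    ("Finance", 21, [33.3, 14.3, 14.3, 23.8, 9.5]),
    ("Government", 138, [56.5, 44.9, 36.2, 47.1, 26.1]),
    ("Labor", 24, [25.0, 0.0, 8.3, 12.5, 4.2]),
    ("Markets", 60, [48.3, 35.0, 33.3, 41.7, 15.0]),
    ("Other", 57, [38.6, 22.8, 21.1, 29.8, 12.3]),
]


def pooled_rates(rows):
    """Return (total task count, {model: pooled rate in %, one decimal})."""
    total = sum(n for _, n, _ in rows)
    solved = [0] * len(MODELS)
    for _, n, rates in rows:
        for i, rate in enumerate(rates):
            # Assumption: the rounded percentage maps back to a whole number of tasks.
            solved[i] += round(n * rate / 100)
    return total, {m: round(100 * s / total, 1) for m, s in zip(MODELS, solved)}


if __name__ == "__main__":
    total, rates = pooled_rates(ROWS)
    print(total)
    for model, rate in rates.items():
        print(f"{model}: {rate}%")
```

With the figures above, the category task counts sum to 360 and the recomputed pooled rates match the "All" row for every model.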
@article{liu2025econwebarena,
title={EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments},
author={Zefang Liu and Yinzhu Quan},
year={2025}
}