GIFT Eval
GIFT-Eval: A Benchmark for General Time Series Forecasting
LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering
MMPersuade: A Dataset and Evaluation Framework for Multimodal Persuasion