Generalist Foundation Models Are Not Clinical Enough for Hospital Operations Paper • 2511.13703 • Published 23 days ago • 21
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence Paper • 2511.07384 • Published about 1 month ago • 16
When Judgment Becomes Noise: How Design Failures in LLM Judge Benchmarks Silently Undermine Validity Paper • 2509.20293 • Published Sep 24 • 7
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning Paper • 2507.16746 • Published Jul 22 • 35
When Do Neural Nets Outperform Boosted Trees on Tabular Data? Paper • 2305.02997 • Published May 4, 2023
MARVIS: Modality Adaptive Reasoning over VISualizations Paper • 2507.01544 • Published Jul 2 • 13
Meta-Adaptive Prompt Distillation for Few-Shot Visual Question Answering Paper • 2506.06905 • Published Jun 7 • 2
ModEFormer: Modality-Preserving Embedding for Audio-Video Synchronization using Transformers Paper • 2303.11551 • Published Mar 21, 2023
LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published Jun 27, 2024 • 23
Decorum: A Language-Based Approach For Style-Conditioned Synthesis of Indoor 3D Scenes Paper • 2503.18155 • Published Mar 23 • 1
TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks Paper • 2402.11137 • Published Feb 17, 2024
VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification Paper • 2505.20302 • Published May 16
ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models Paper • 2310.18208 • Published Oct 27, 2023