LAUNCH Lab

university

https://launch.eecs.umich.edu/

launchnlp

Activity Feed

AI & ML interests

Factuality, reasoning, alignment, LLM applications

Recent Activity

farimafatahi

authored 3 papers 25 days ago

FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation

Paper • 2410.22257 • Published Oct 29, 2024

Logit Arithmetic Elicits Long Reasoning Capabilities Without Training

Paper • 2507.12759 • Published Jul 17

From Proof to Program: Characterizing Tool-Induced Reasoning Hallucinations in Large Language Models

Paper • 2511.10899 • Published 30 days ago • 3

xinliucs

updated a Space about 1 month ago

FactRBench

🏆

View and analyze long-form factuality leaderboard

JieRuan

updated a Space 3 months ago

ExpertLongBench

🚀

Leaderboard for ExpertLongBench

JieRuan

updated a dataset 5 months ago

launch/ExpertLongBench

Preview • Updated Jul 30 • 367 • 10

frederickxzhang

published a dataset 5 months ago

launch/CMV

Viewer • Updated Jun 26 • 133 • 30

mkhalifa

in launch/ThinkPRM-14B 6 months ago

Add link to code and library name

#2 opened 6 months ago by

nielsr

mkhalifa

in launch/thinkprm-1K-verification-cots 6 months ago

Update paper link and add Github link

#3 opened 6 months ago by

nielsr

zkjzou

updated a dataset 6 months ago

launch/ManyICLBench

Viewer • Updated Jun 26 • 66 • 639 • 1

frederickxzhang

updated a dataset 6 months ago

launch/CMV

Viewer • Updated Jun 26 • 133 • 30

mkhalifa

updated a model 6 months ago

launch/ThinkPRM-1.5B

Text Generation • 2B • Updated Jun 25 • 303 • 3

zkjzou

published a Space 6 months ago

ManyICLBench

🚀

Leaderboard for ManyICLBench

zkjzou

updated a Space 6 months ago

ManyICLBench

🚀

Leaderboard for ManyICLBench

JieRuan

in launch/ExpertLongBench 6 months ago

Change ordering and remove columns from T3

#3 opened 6 months ago by

amyliiu

Add task category and relevant tags

#2 opened 6 months ago by

nielsr

Update README.md

#1 opened 6 months ago by

amyliiu

JieRuan

in launch/ExpertLongBench 6 months ago

Update src/streamlit_app.py

#3 opened 6 months ago by

amyliiu

JieRuan

authored a paper 6 months ago

ExpertLongBench: Benchmarking Language Models on Expert-Level Long-Form Generation Tasks with Structured Checklists

Paper • 2506.01241 • Published Jun 2 • 9

xinliucs

updated a dataset 6 months ago

launch/FactRBench

Viewer • Updated Jun 9 • 1.06k • 68 • 1

Team members 16
