Haoyu Guo

ghy0324

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing

upvoted a paper 2 days ago

Thinking with Programming Vision: Towards a Unified View for Thinking with Images

upvoted a paper 3 days ago

OneThinker: All-in-one Reasoning Model for Image and Video

Organizations

upvoted 2 papers 2 days ago

PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing

Paper • 2512.02589 • Published 5 days ago • 29

Thinking with Programming Vision: Towards a Unified View for Thinking with Images

Paper • 2512.03746 • Published 3 days ago • 15

upvoted 2 papers 3 days ago

OneThinker: All-in-one Reasoning Model for Image and Video

Paper • 2512.03043 • Published 4 days ago • 25

Qwen3-VL Technical Report

Paper • 2511.21631 • Published 10 days ago • 106

upvoted a paper 4 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 5 days ago • 166

upvoted 2 papers 13 days ago

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published 16 days ago • 91

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published 16 days ago • 105

upvoted a paper 16 days ago

SAM 3D: 3Dfy Anything in Images

Paper • 2511.16624 • Published 16 days ago • 106

upvoted a paper 20 days ago

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published 23 days ago • 92

upvoted a paper 24 days ago

Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds

Paper • 2511.08892 • Published 25 days ago • 193

upvoted 3 papers 26 days ago

Cambrian-S: Towards Spatial Supersensing in Video

Paper • 2511.04670 • Published about 1 month ago • 36

V-Thinker: Interactive Thinking with Images

Paper • 2511.04460 • Published about 1 month ago • 96

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published about 1 month ago • 208

upvoted a paper 27 days ago

DeepEyesV2: Toward Agentic Multimodal Model

Paper • 2511.05271 • Published 29 days ago • 42

upvoted 4 papers about 1 month ago

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Paper • 2510.27492 • Published Oct 30 • 81

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

Paper • 2510.26802 • Published Oct 30 • 33

Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published Oct 30 • 106

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published Oct 28 • 96

liked a model about 1 month ago

deepseek-ai/DeepSeek-OCR

Image-Text-to-Text • 3B • Updated Nov 4 • 5.47M • 2.93k

upvoted a paper about 1 month ago

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published Oct 16 • 47
