Review

[논문리뷰] Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls

Stuart Shieber이 [arXiv]에 게시한 'Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs

이 [arXiv]에 게시한 'VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators

Zirui Ge이 [arXiv]에 게시한 'VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned

이 [arXiv]에 게시한 'Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] ReSWD: ReSTIR'd, not shaken. Combining Reservoir Sampling and Sliced Wasserstein Distance for Variance Reduction

이 [arXiv]에 게시한 'ReSWD: ReSTIR'd, not shaken. Combining Reservoir Sampling and Sliced Wasserstein Distance for Variance Reduction' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] PIPer: On-Device Environment Setup via Online Reinforcement Learning

이 [arXiv]에 게시한 'PIPer: On-Device Environment Setup via Online Reinforcement Learning' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] On Predictability of Reinforcement Learning Dynamics for Large Language Models

Yuqing Huang이 [arXiv]에 게시한 'On Predictability of Reinforcement Learning Dynamics for Large Language Models' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] Making, not Taking, the Best of N

이 [arXiv]에 게시한 'Making, not Taking, the Best of N' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

이 [arXiv]에 게시한 'Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] JoyAgent-JDGenie: Technical Report on the GAIA

이 [arXiv]에 게시한 'JoyAgent-JDGenie: Technical Report on the GAIA' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] Infusing Theory of Mind into Socially Intelligent LLM Agents

이 [arXiv]에 게시한 'Infusing Theory of Mind into Socially Intelligent LLM Agents' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] In-Place Feedback: A New Paradigm for Guiding LLMs in Multi-Turn Reasoning

Chaehyeon Chung이 [arXiv]에 게시한 'In-Place Feedback: A New Paradigm for Guiding LLMs in Multi-Turn Reasoning' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] Hyperdimensional Probe: Decoding LLM Representations via Vector Symbolic Architectures

Andrea Passerini이 [arXiv]에 게시한 'Hyperdimensional Probe: Decoding LLM Representations via Vector Symbolic Architectures' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness

Chien-Sheng Wu이 [arXiv]에 게시한 'GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] GEM: A Gym for Agentic LLMs

이 [arXiv]에 게시한 'GEM: A Gym for Agentic LLMs' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution

이 [arXiv]에 게시한 'Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] Eliciting Secret Knowledge from Language Models

Neel Nanda이 [arXiv]에 게시한 'Eliciting Secret Knowledge from Language Models' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

이 [arXiv]에 게시한 'DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs

Hengyi Cai이 [arXiv]에 게시한 'CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일

[논문리뷰] Code2Video: A Code-centric Paradigm for Educational Video Generation

이 [arXiv]에 게시한 'Code2Video: A Code-centric Paradigm for Educational Video Generation' 논문에 대한 자세한 리뷰입니다.

2025년 10월 2일