secrett2633's blog

[Paper Review] A Practitioner’s Guide to Multi-turn Agentic Reinforcement Learning

October 6, 2025

A detailed review of the paper ‘A Practitioner’s Guide to Multi-turn Agentic Reinforcement Learning’, published on [arXiv].

[Paper Review] Why Can’t Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls

October 2, 2025

A detailed review of the paper ‘Why Can’t Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls’, published on [arXiv] by Stuart Shieber.

[Paper Review] VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs

October 2, 2025

A detailed review of the paper ‘VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs’, published on [arXiv].

[Paper Review] VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators

October 2, 2025

A detailed review of the paper ‘VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators’, published on [arXiv] by Zirui Ge.

[Paper Review] Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned

October 2, 2025

A detailed review of the paper ‘Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned’, published on [arXiv]...
