Recent Posts

[논문리뷰] Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models

James Cheng이 [arXiv]에 게시한 'Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models

Mohit Bansal이 [arXiv]에 게시한 'SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] Reinforcing Diffusion Models by Direct Group Preference Optimization

Jing Tang이 [arXiv]에 게시한 'Reinforcing Diffusion Models by Direct Group Preference Optimization' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

Peng Cheng이 [arXiv]에 게시한 'Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] R2RGEN: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation

Zheng Zhu이 [arXiv]에 게시한 'R2RGEN: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

Baixuan Xu이 [arXiv]에 게시한 'NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

이 [arXiv]에 게시한 'NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

vanilla1116이 [arXiv]에 게시한 'MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning

이 [arXiv]에 게시한 'Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] Memory Retrieval and Consolidation in Large Language Models through Function Tokens

이 [arXiv]에 게시한 'Memory Retrieval and Consolidation in Large Language Models through Function Tokens' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] MemMamba: Rethinking Memory Patterns in State Space Model

Xiao Sun이 [arXiv]에 게시한 'MemMamba: Rethinking Memory Patterns in State Space Model' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

이 [arXiv]에 게시한 'Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] LongRM: Revealing and Unlocking the Context Boundary of Reward Modeling

이 [arXiv]에 게시한 'LongRM: Revealing and Unlocking the Context Boundary of Reward Modeling' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions

이 [arXiv]에 게시한 'LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] Learning to Route LLMs from Bandit Feedback: One Policy, Many Trade-offs

Franck Dernoncourt이 [arXiv]에 게시한 'Learning to Route LLMs from Bandit Feedback: One Policy, Many Trade-offs' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks

이 [arXiv]에 게시한 'Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency

Jintao Zhang이 [arXiv]에 게시한 'Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] InstructX: Towards Unified Visual Editing with MLLM Guidance

Xinghui Li이 [arXiv]에 게시한 'InstructX: Towards Unified Visual Editing with MLLM Guidance' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

이 [arXiv]에 게시한 'Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일

[논문리뷰] GCPO: When Contrast Fails, Go Gold

이 [arXiv]에 게시한 'GCPO: When Contrast Fails, Go Gold' 논문에 대한 자세한 리뷰입니다.

2025년 10월 10일