Review

[논문리뷰] EmbeddingGemma: Powerful and Lightweight Text Representations

Marksherwood이 [arXiv]에 게시한 'EmbeddingGemma: Powerful and Lightweight Text Representations' 논문에 대한 자세한 리뷰입니다.

2025년 9월 25일

[논문리뷰] EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning

Tianyu Wang이 [arXiv]에 게시한 'EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning' 논문에 대한 자세한 리뷰입니다.

2025년 9월 25일

[논문리뷰] Advancing Speech Understanding in Speech-Aware Language Models with GRPO

Avihu이 [arXiv]에 게시한 'Advancing Speech Understanding in Speech-Aware Language Models with GRPO' 논문에 대한 자세한 리뷰입니다.

2025년 9월 25일

[논문리뷰] Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications

Genady Beryozkin이 [arXiv]에 게시한 'Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

Anthony Hartshorn이 [arXiv]에 게시한 'What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction

Haoxiao Wang이 [arXiv]에 게시한 'VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] VIR-Bench: Evaluating Geospatial and Temporal Understanding of MLLMs via Travel Video Itinerary Reconstruction

So Fukuda이 [arXiv]에 게시한 'VIR-Bench: Evaluating Geospatial and Temporal Understanding of MLLMs via Travel Video Itinerary Reconstruction' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] Reinforcement Learning on Pre-Training Data

Evander Yang이 [arXiv]에 게시한 'Reinforcement Learning on Pre-Training Data' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] OpenGVL - Benchmarking Visual Temporal Progress for Data Curation

Viktor Petrenko이 [arXiv]에 게시한 'OpenGVL - Benchmarking Visual Temporal Progress for Data Curation' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Wenshuo Ma이 [arXiv]에 게시한 'MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] MAPO: Mixed Advantage Policy Optimization

Xuankun Rong이 [arXiv]에 게시한 'MAPO: Mixed Advantage Policy Optimization' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation

Yifeng Jiang이 [arXiv]에 게시한 'Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] Large Language Models Discriminate Against Speakers of German Dialects

Katharina von der Wense이 [arXiv]에 게시한 'Large Language Models Discriminate Against Speakers of German Dialects' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] HyRF: Hybrid Radiance Fields for Memory-efficient and High-quality Novel View Synthesis

Dan Xu이 [arXiv]에 게시한 'HyRF: Hybrid Radiance Fields for Memory-efficient and High-quality Novel View Synthesis' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation

Jianbin Zheng이 [arXiv]에 게시한 'Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction

Jin Zheng이 [arXiv]에 게시한 'GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] Do You Need Proprioceptive States in Visuomotor Policies?

Yushen Liang이 [arXiv]에 게시한 'Do You Need Proprioceptive States in Visuomotor Policies?' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching

Rui Qian이 [arXiv]에 게시한 'CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR

Zeina Aldallal이 [arXiv]에 게시한 'Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR' 논문에 대한 자세한 리뷰입니다.

2025년 9월 24일

[논문리뷰] When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs

Anand Mishra이 [arXiv]에 게시한 'When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs' 논문에 대한 자세한 리뷰입니다.

2025년 9월 23일