Auto-Weighted Group Relative Preference Optimization for Multi-Objective Text Generation Tasks (To appear) Yuki Ichihara, Yuu Jinnai EMNLP Industry Track 2025
MO-GRPO: Mitigating Reward Hacking of Group Relative Policy Optimization on Multi-Objective Problems (Paper) Yuki Ichihara, Yuu Jinnai, Tetsuro Morimura, Mitsuki Sakamoto, Ryota Mitsuhashi, Eiji Uchibe Preprint
About
My core interests are reinforcement learning in general and foundational research in natural language processing.
Education
Apr 2025 — Present: Nara Institute of Science and Technology — Ph.D. Program (Engineering)
Mar 2025: Nara Institute of Science and Technology — M.S. in Engineering
Mar 2023: Tokyo University of Science — B.S. in Electrical Engineering