Yuki Ichihara

Starting in Aug 2026, I will join MBZUAI as a Ph.D. student, supervised by Junpei Komiyama.

My interests include Best-of-N and Minimum Bayes Risk decoding, language model alignment, and RL-based fine-tuning methods.

News

Aug 2026 I will join MBZUAI as a Ph.D. student, supervised by Junpei Komiyama.

Jul 2026 MO-GRPO was accepted to TACL.

Reliable Chain-of-Thought via Prefix Consistency. N. Iwase, Y. Ichihara, M. A. Quamar, J. Komiyama. arXiv preprint, 2026.
CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency. H. Ota, N. Iwase, Y. Ichihara, J. Komiyama, M. Imaizumi. arXiv preprint, 2026.
Consensus Group Relative Policy Optimization for Text Generation. Y. Ichihara, Y. Jinnai, K. Ariu, E. Uchibe. Workshop at ACL, 2026.
MO-GRPO: Mitigating Reward Hacking of Group Relative Policy Optimization on Multi-Objective Problems. Y. Ichihara, Y. Jinnai, T. Morimura, M. Sakamoto, R. Mitsuhashi, E. Uchibe. TACL, 2026.
Auto-Weighted Group Relative Preference Optimization for Multi-Objective Text Generation Tasks. Y. Ichihara, Y. Jinnai. EMNLP Industry Track, 2025.
Theoretical Guarantees for Minimum Bayes Risk Decoding. Y. Ichihara, Y. Jinnai, K. Ariu, T. Morimura, E. Uchibe. ACL, 2025.
Evaluation of Best-of-N Sampling Strategies for Language Model Alignment. Y. Ichihara, Y. Jinnai, T. Morimura, K. Abe, K. Ariu, M. Sakamoto, E. Uchibe. TMLR, 2025.
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees. T. Kitamura, T. Kozuno, M. Kato, Y. Ichihara, S. Nishimori, A. Sannai, S. Sonoda, W. Kumagai, Y. Matsuo. RLSW, 2024.

Aug 2026 - MBZUAI, Ph.D. (incoming), supervised by Junpei Komiyama

2025 - 2026 Nara Institute of Science and Technology, Ph.D. program in Engineering (withdrew to enroll at MBZUAI)

2025 Nara Institute of Science and Technology, M.S. in Engineering

2023 Tokyo University of Science, B.S. in Electrical Engineering

2025 - 2026 JST SPRING Fellowship, Doctoral fellowship

2024 - 2026 CyberAgent AI Lab, Research Intern

2023 - 2026 ATR, Research Assistant

2023 AIST, Research Assistant

2026 Reviewer, ARR; ICML (nominated as Gold Reviewer)

For research discussions or collaboration, email is the best way to reach me.