Guian Fang

I am a Ph.D. student at Show Lab, National University of Singapore, advised by Prof. Mike Zheng Shou.

Previously, I received my B.Eng. in Intelligent Science and Technology from the School of Intelligent Systems Engineering, Sun Yat-sen University, where I was advised by Xiaodan Liang (梁小丹) and co-supervised by Shengcai Liao.

My research centers on generative video models and world models for long-form visual generation, with recent work on video diffusion distillation and embodied intelligence.

Beyond research, I build agentic visual-generation pipelines and runtime infrastructure for coding agents, and I am open to collaboration.

News

  • 2026.05 Open-sourced AnyFlow — any-step video diffusion via flow map distillation.
  • 2026.05 Released Claw Orchestrator — a unified runtime for Claude Code, Codex and other coding CLIs.
  • 2026.03 Launched PAI at Utopai Studios — long-form video generation for cinematic storytelling.
  • 2025.07 RealignDiff accepted to IEEE TNNLS — coarse-to-fine semantic re-alignment for diffusion.
  • 2025.06 Launched MikoAI — an ACG (anime / comic / manga) creation tool, powered by FramePrompt.

Publications

My research focuses on generative models for vision (particularly long-form video generation, video diffusion, and video world models for video understanding and generation) and on embodied intelligence. Representative works are highlighted. * denotes equal contribution.

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

Yuchao Gu, Guian Fang, Yuxin Jiang, Weijia Mao, Song Han, Han Cai, Mike Zheng Shou

arXiv preprint, 2026

PAI-Studio: Cinematic Video Background Replacement with Camera-Aware Motion

Heyuan Gao*, Bangxun Tang*, Yiren Song*, Guian Fang, Zijian He, Jie Yang, Mike Zheng Shou

arXiv preprint, 2026

HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance

Guian Fang*, Wenbiao Yan*, Yuanfan Guo*, Jianhua Han, Zutao Jiang, Hang Xu, Shengcai Liao, Xiaodan Liang

ECCV, 2024

ChartThinker: A Contextual Chain-of-Thought Approach to Optimized Chart Summarization

Mengsha Liu, Daoyuan Chen, Yaliang Li, Guian Fang, Ying Shen

LREC-COLING, 2024

RealignDiff: Boosting Text-to-Image Diffusion Model with Coarse-to-fine Semantic Re-alignment

Guian Fang*, Zutao Jiang*, Jianhua Han, Guansong Lu, Hang Xu, Shengcai Liao, Xiaojun Chang, Xiaodan Liang

IEEE TNNLS, 2023

Products & Open Source

Deployed products and open-source systems I've led or co-built — productized research rather than peer-reviewed papers.

Honors & Awards

Scholarships

Competitions

Activities & Services

Conference Reviewer

  • CV: ECCV, CVPR, ICCV
  • ML: NeurIPS, ICLR, ICML
  • AI: AAAI, AISTATS
  • NLP: ACL Rolling Review (ACL, EMNLP, NAACL, EACL)

Workshop Organizer

  • LOVEU Workshop @ CVPR 2024: Long-form Video Understanding Towards Multimodal AI Assistant and Copilot

Teaching Assistant

  • EE3703: Machine Learning with Applications
  • EE4309: Robot Perception
  • EE5106: Advanced Robotics

Acknowledgements

Grateful to the mentors and teams I've worked with along the way.