Guian Fang

I am a Ph.D. student at Show Lab, National University of Singapore, advised by Prof. Mike Zheng Shou.

Previously, I received my B.Eng. in Artificial Intelligence from the School of Intelligent Systems Engineering, Sun Yat-sen University, where I was a member of HCPLab, advised by Xiaodan Liang (梁小丹) and co-supervised by Shengcai Liao.

I'm currently building agentic visual generation pipelines and runtime infrastructure for coding agents. I'm open to collaborations on long-form video, world models, embodied intelligence, and agent orchestration; feel free to reach out.

News

  • 2026.05 Released Claw Orchestrator — a unified runtime for Claude Code, Codex and other coding CLIs.
  • 2026.03 Launched PAI at Utopai Studios — long-form video generation for cinematic storytelling.
  • 2025.07 RealignDiff accepted to IEEE TNNLS — coarse-to-fine semantic re-alignment for diffusion.
  • 2025.06 Launched MikoAI, an ACG (anime, comics, and games) creation tool powered by FramePrompt.
  • 2024.08 Started Ph.D. at Show Lab, NUS, advised by Prof. Mike Zheng Shou.
  • 2024.07 HumanRefiner accepted to ECCV 2024.
  • 2024.06 Co-organized LOVEU workshop at CVPR 2024 — long-form video understanding & AI copilots.
  • 2024.05 ChartThinker presented at LREC-COLING 2024 in Torino.
  • 2023.07 Released LLaMA2-Accessory with OpenGVLab, Shanghai AI Lab — open-source LLM toolkit.

Research & Projects

My research focuses on Generative Models for Vision — particularly long-form video generation, video diffusion, and Video World Models for video understanding and generation — as well as Embodied Intelligence. Recently I've been exploring agentic visual generation, building creative pipelines from concept to final frame, and runtime infrastructure for coding agents. Representative works are highlighted. * denotes equal contribution.

Claw Orchestrator: A Unified Runtime for Coding Agent CLIs

Guian Fang

Open-source, 2026

A TypeScript runtime that turns coding CLIs (Claude Code, Codex, Gemini, Cursor Agent, OpenCode, and custom engines) into persistent, programmable agents — with sessions, multi-engine routing, and multi-agent councils behind one API. Runs standalone, with first-class OpenClaw plugin support.

ACG Creation Made Easy for Everyone.

MikoAI*, Guian Fang*

Productized at MikoAI, 2025

FramePrompt is the character-animation engine behind MikoAI. By packing reference images, skeleton motion, and target clips into one visual sequence, it turns animation into conditional future prediction — driving pre-trained video diffusion transformers with no guider modules or multi-stage pipelines.

Generate 3D Worlds in Production with AI

Cybever*, Guian Fang*

Productized at Cybever, 2024

An image-to-3D pipeline for 3D world generation in production. Cybever handles the heavy lifting — geometry, layout, materials — so 3D professionals can focus on storytelling.

Honors & Awards

Scholarships

Competitions

Activities & Services

Conference Reviewer

  • CV: ECCV, CVPR, ICCV
  • ML: NeurIPS, ICLR, ICML
  • AI: AAAI, AISTATS
  • NLP: ACL Rolling Review (ACL, EMNLP, NAACL, EACL)

Workshop Organizer

  • LOVEU Workshop @ CVPR 2024: Long-form Video Understanding Towards Multimodal AI Assistant and Copilot

Teaching Assistant

  • EE3703: Machine Learning with Applications
  • EE4309: Robot Perception
  • EE5106: Advanced Robotics

Acknowledgements

Grateful to the mentors and teams I've worked with along the way.