ACG Creation Made Easy for Everyone.
I am a Ph.D. student at Show Lab, National University of Singapore, advised by Prof. Mike Zheng Shou.
Previously, I received my B.Eng. in Artificial Intelligence from the School of Intelligent Systems Engineering, Sun Yat-sen University, where I worked in HCPLab advised by Prof. Xiaodan Liang (梁小丹) and co-supervised by Prof. Shengcai Liao.
Beyond research, I'm passionate about gaming and collect credit cards as a hobby. I believe in fostering open communication within the research community. Whether you'd like to chat about academic pursuits, share experiences, or explore collaborative opportunities, I'm always happy to connect.
My research focuses on Generative Models for Vision, particularly Video World Models for video understanding and generation. Representative papers are highlighted. * denotes equal contribution.
FramePrompt: In-context Controllable Animation with Zero Structural Changes
Image-to-3D: Let AI do the heavy lifting so 3D professionals can do the storytelling
ECCV, 2024
A new large-scale benchmark (AbHuman) for anatomical anomalies in humans, and a plug-and-play method (HumanRefiner) for refining abnormal human generations with pose-reversible guidance.
LREC-Coling, 2024
ChartThinker leverages chain-of-thought reasoning and context retrieval to generate accurate and coherent chart summaries, outperforming previous methods on a diverse benchmark.
An open-source toolkit supporting pretraining, finetuning, and deployment of large language and multimodal models, making LLM development more accessible and efficient.
TNNLS, 2023
RealignDiff introduces a two-stage semantic re-alignment strategy to significantly improve the consistency between generated images and text prompts in diffusion models.
I feel incredibly fortunate to have collaborated with so many remarkable individuals, who have generously offered me their mentorship.