2025
We believe that grounding can be seen as a chain-of-thought for spatial reasoning. Building on this, we achieve new SOTA performance on VSI-Bench.
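A minimal sketch of the idea, assuming a generic VLM wrapper with a hypothetical `vlm.generate(image, prompt)` call: the model first grounds the objects the question depends on, then answers conditioned on those boxes, so the grounding step plays the role of a chain of thought.

```python
# Minimal sketch (hypothetical API): grounding as a chain-of-thought step
# for spatial question answering.

def grounded_spatial_qa(vlm, image, question):
    # Step 1 (grounding as CoT): ask the model which objects the question
    # refers to and where they are in the image.
    grounding_prompt = (
        f"Question: {question}\n"
        "Before answering, list the relevant objects and their "
        "bounding boxes as `name: [x1, y1, x2, y2]`."
    )
    grounding = vlm.generate(image, grounding_prompt)

    # Step 2: answer the question conditioned on the grounded evidence.
    answer_prompt = (
        f"Grounded objects:\n{grounding}\n"
        f"Using these locations, answer: {question}"
    )
    return vlm.generate(image, answer_prompt)
```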
Grounded in cognitive psychology, we introduce a comprehensive and challenging spatial reasoning benchmark with 50 fine-grained categories and 1.5K manually labeled QA pairs.
We present MM-Nav, a multi-view VLA system with 360° perception. The model is trained on large-scale expert navigation data collected from multiple reinforcement learning agents.
We introduce the concept of semantic orientation, representing object orientation conditioned on open-vocabulary language.
We recast the vision–language–action model as a perception–prediction–action model that explicitly predicts a compact set of dynamic, spatial, and high-level semantic information.
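A minimal PyTorch sketch of this recasting, with assumed layer sizes and a hypothetical backbone (not the released model): compact dynamic, spatial, and semantic predictions are made explicit, fed into the action head, and each can be supervised with its own loss.

```python
import torch
import torch.nn as nn

# Minimal sketch (assumed architecture): a perception-prediction-action policy
# whose backbone features are decoded into explicit intermediate predictions
# before the action head.

class PerceptionPredictionAction(nn.Module):
    def __init__(self, backbone, feat_dim=768, action_dim=7):
        super().__init__()
        self.backbone = backbone                      # any vision-language encoder
        self.dynamics_head = nn.Linear(feat_dim, 64)  # compact dynamic state
        self.spatial_head = nn.Linear(feat_dim, 64)   # compact spatial layout
        self.semantic_head = nn.Linear(feat_dim, 64)  # high-level semantics
        self.action_head = nn.Linear(feat_dim + 3 * 64, action_dim)

    def forward(self, obs, instruction):
        feat = self.backbone(obs, instruction)        # [B, feat_dim]
        dyn = self.dynamics_head(feat)
        spa = self.spatial_head(feat)
        sem = self.semantic_head(feat)
        # The action is conditioned on the explicitly predicted intermediates.
        action = self.action_head(torch.cat([feat, dyn, spa, sem], dim=-1))
        return action, (dyn, spa, sem)                # intermediates get their own losses
```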
We introduce DexVLG, a vision-language-grasp model trained on a 170M-pose, 174k-object dataset that generates instruction-aligned dexterous grasp poses.
We introduce Hybrid-depth, a self-supervised method that aligns hybrid semantics via language-guided fusion, achieving SOTA accuracy on KITTI.
We collect diverse images and prompts, and utilize GPT-4o for automated evaluation aligned with human preference.
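A minimal sketch of the GPT-4o-as-judge setup, assuming the standard OpenAI Python client and a hypothetical 1-5 rubric prompt (the actual evaluation protocol may differ).

```python
import base64
from openai import OpenAI

# Minimal sketch (assumed rubric): GPT-4o scores a model response to an
# image-grounded prompt on a 1-5 scale as a proxy for human preference.

client = OpenAI()

def judge(image_path, prompt, response):
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    rubric = (
        "You are grading a model's answer about the given image.\n"
        f"Prompt: {prompt}\nAnswer: {response}\n"
        "Return only an integer score from 1 (poor) to 5 (excellent)."
    )
    out = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": rubric},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return int(out.choices[0].message.content.strip())
```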
2024
We present ShapeLLM, the first 3D Multimodal Large Language Model designed for embodied interaction.
We present DreamLLM, a learning framework that, for the first time, achieves versatile Multimodal Large Language Models.
2023
We achieve rapid, multi-category 3D conditional generation by combining the merits of different representations. VPP can generate 3D shapes in less than 0.2s.
We propose contrastive learning guided by reconstruction to mitigate the pattern differences between the two self-supervised paradigms.
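A minimal sketch of the joint objective, under the assumption that a reconstruction branch and a contrastive branch share a student encoder and the cross-modal teacher embeddings are precomputed; where exactly the guidance (e.g., stop-gradient) is applied is the method's design choice and is not spelled out here.

```python
import torch
import torch.nn.functional as F

# Minimal sketch (assumed formulation): combine a reconstruction objective
# with a contrastive objective so the reconstruction branch guides contrast.

def recon_guided_contrast_loss(recon_pred, recon_target,
                               student_emb, teacher_emb, tau=0.07):
    # Reconstruction branch: e.g., masked-point/pixel reconstruction.
    loss_recon = F.smooth_l1_loss(recon_pred, recon_target)

    # Contrastive branch: student embeddings (derived from the reconstruction
    # branch in the guided setup) are matched to cross-modal teacher embeddings.
    q = F.normalize(student_emb, dim=-1)
    k = F.normalize(teacher_emb, dim=-1)
    logits = q @ k.t() / tau                           # [B, B] similarities
    labels = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    loss_contrast = F.cross_entropy(logits, labels)

    return loss_recon + loss_contrast
```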
We propose to use autoencoders as cross-modal teachers to transfer dark knowledge into 3D representation learning.
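A minimal sketch of the distillation step, assuming the cross-modal autoencoder teacher is frozen and its token features have already been projected to the 3D student's feature dimension.

```python
import torch.nn.functional as F

# Minimal sketch (assumed formulation): a pretrained autoencoder from another
# modality (e.g., 2D or language) serves as a frozen teacher whose latent
# features are distilled into a masked 3D student.

def cross_modal_distill_loss(student_feat, teacher_feat):
    # "Dark knowledge" transfer as feature regression; the teacher is frozen,
    # so only the 3D student receives gradients.
    return F.smooth_l1_loss(student_feat, teacher_feat.detach())
```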