Shaochen Zhang*, Zekun Qi*, Runpei Dong, Xiuxiu Bai, Xing Wei arXiv preprint, 2024 [arXiv] [Code] We rethink the role of positional encoding in 3D representation learning and propose Positional Prompt Tuning, a simple yet efficient method for transfer learning.
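
For intuition, here is a minimal PyTorch sketch of positional-prompt-style transfer learning under assumed interfaces: the pre-trained point-cloud backbone is frozen, and only a small positional MLP and the task head are trained. The module names, dimensions, and prompt design are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of positional-prompt-style tuning (PyTorch);
# interfaces and shapes are assumptions, not the paper's code.
import torch
import torch.nn as nn

class PositionalPromptTuner(nn.Module):
    """Freeze a pre-trained encoder; train only positional prompts and the head."""
    def __init__(self, backbone: nn.Module, embed_dim: int = 384, num_classes: int = 40):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():      # keep pre-trained weights frozen
            p.requires_grad = False
        # Lightweight positional prompt: maps patch centers to embedding offsets.
        self.pos_prompt = nn.Sequential(
            nn.Linear(3, 128), nn.GELU(), nn.Linear(128, embed_dim)
        )
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, tokens: torch.Tensor, centers: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, C) patch embeddings; centers: (B, N, 3) patch coordinates.
        x = tokens + self.pos_prompt(centers)     # inject learnable positional prompts
        feats = self.backbone(x)                  # frozen transformer encoder, (B, N, C)
        return self.head(feats.mean(dim=1))       # mean pooling + linear classifier
```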
|
Yuang Peng*, Yuxin Cui*, Haomiao Tang*, Zekun Qi, Runpei Dong, Jing Bai, Chunrui Han, Zheng Ge, Xiangyu Zhang, Shu-Tao Xia arXiv preprint, 2024 [arXiv] [Project] [Code] We collect diverse images and prompts, and employ GPT-4o for automated evaluation aligned with human preferences.
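
Below is a minimal sketch of how GPT-4o-based automated scoring can be wired up with the OpenAI Python SDK; the rubric and prompt wording are placeholders, not the benchmark's actual evaluation protocol.

```python
# Hypothetical sketch of GPT-4o-based automated scoring (OpenAI Python SDK).
# The rubric and prompt wording are placeholders, not the benchmark's protocol.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def score_generation(prompt: str, image_url: str) -> str:
    """Ask GPT-4o to rate how well an image follows a personalization prompt."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Rate from 1 to 5 how faithfully this image follows the prompt: '{prompt}'. "
                         "Answer with the score and a one-sentence justification."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return response.choices[0].message.content
```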
|
Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, Li Yi, Kaisheng Ma European Conference on Computer Vision (ECCV), 2024 [arXiv] [Project] [Code] [Huggingface] We present ShapeLLM, the first 3D Multimodal Large Language Model designed for embodied interaction, exploring universal 3D object understanding with 3D point clouds and language.

Runpei Dong*, Chunrui Han*, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, Hongyu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi International Conference on Learning Representations (ICLR), 2024, Spotlight [arXiv] [Project] [Code] [Huggingface] We present DreamLLM, the first learning framework that achieves versatile Multimodal Large Language Models empowered by the frequently overlooked synergy between multimodal comprehension and creation.
|
Zekun Qi*, Muzhou Yu*, Runpei Dong, Kaisheng Ma Conference on Neural Information Processing Systems (NeurIPS), 2023 [arXiv] [Code] [OpenReview] We achieve rapid, multi-category 3D conditional generation by combining the merits of different 3D representations. VPP can generate a 3D shape in less than 0.2 seconds on a single RTX 2080 Ti.
|
Guofan Fan, Zekun Qi, Wenkai Shi, Kaisheng Ma ACM International Conference on Multimedia (ACM MM), 2024 [arXiv] [Code] We better exploit color information to improve self-supervised learning on 3D scenes.
|
Zekun Qi*, Runpei Dong*, Guofan Fan, Zheng Ge, Xiangyu Zhang, Kaisheng Ma, Li Yi International Conference on Machine Learning (ICML), 2023 [arXiv] [Code] [OpenReview] We propose contrastive learning guided by reconstruction to mitigate the pattern differences between the two self-supervised paradigms.
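
As a rough illustration, this PyTorch sketch combines a reconstruction loss with a contrastive loss whose gradient is kept from flowing back into the encoder via a stop-gradient, so reconstruction guides contrast; the interfaces and the exact guidance mechanism are assumptions, not the paper's architecture.

```python
# Hypothetical sketch of reconstruction-guided contrast (PyTorch);
# the stop-gradient placement is an assumption inspired by the idea.
import torch
import torch.nn.functional as F

def recon_guided_loss(recon_pred, recon_target, enc_feat, proj_head, teacher_feat, tau=0.07):
    # Generative branch: reconstruct masked point patches.
    loss_recon = F.mse_loss(recon_pred, recon_target)

    # Contrastive branch: built on *detached* encoder features, so its gradient
    # trains only the projection head -- reconstruction guides the encoder.
    z = F.normalize(proj_head(enc_feat.detach()), dim=-1)   # (B, D) student embedding
    t = F.normalize(teacher_feat, dim=-1)                   # (B, D) cross-modal teacher
    logits = z @ t.T / tau                                  # InfoNCE over the batch
    labels = torch.arange(z.size(0), device=z.device)
    loss_contrast = F.cross_entropy(logits, labels)

    return loss_recon + loss_contrast
```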
Runpei Dong, Zekun Qi, Linfeng Zhang, Junbo Zhang, Jianjian Sun, Zheng Ge, Li Yi, Kaisheng Ma International Conference on Learning Representations (ICLR), 2023 [arXiv] [Code] [OpenReview] We propose to use autoencoders as cross-modal teachers to transfer dark knowledge into 3D representation learning. |
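
A toy sketch of the idea under assumed shapes: a frozen pre-trained autoencoder acts as the teacher, and the 3D student is trained to regress the teacher's latent "dark knowledge" at masked positions. Names and the loss choice are illustrative, not the released code.

```python
# Hypothetical sketch of cross-modal distillation with an autoencoder teacher;
# module names, shapes, and the loss are assumptions, not the ACT code.
import torch
import torch.nn.functional as F

def distill_step(student, teacher, tokens, mask):
    # tokens: (B, N, C) point-patch embeddings; mask: (B, N) boolean mask.
    with torch.no_grad():                  # the autoencoder teacher stays frozen
        latent_targets = teacher(tokens)   # latent "dark knowledge", (B, N, D)
    pred = student(tokens, mask)           # student predicts latents per position
    # Supervise only the masked positions, as in masked modeling.
    return F.smooth_l1_loss(pred[mask], latent_targets[mask])
```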