Title | Authors | Venue | Citations | Year |
CogView: Mastering text-to-image generation via transformers | M Ding, Z Yang, W Hong, W Zheng, C Zhou, D Yin, J Lin, X Zou, Z Shao, ... | Advances in Neural Information Processing Systems 34, 19822-19835, 2021 | 560 | 2021 |
CogVideo: Large-scale pretraining for text-to-video generation via transformers | W Hong, M Ding, W Zheng, X Liu, J Tang | arXiv preprint arXiv:2205.15868, 2022 | 245 | 2022 |
CogView2: Faster and better text-to-image generation via hierarchical transformers | M Ding, W Zheng, W Hong, J Tang | Advances in Neural Information Processing Systems 35, 16890-16902, 2022 | 206 | 2022 |
CogVLM: Visual expert for pretrained language models | W Wang, Q Lv, W Yu, W Hong, J Qi, Y Wang, J Ji, Z Yang, L Zhao, X Song, ... | arXiv preprint arXiv:2311.03079, 2023 | 116 | 2023 |
CogAgent: A visual language model for GUI agents | W Hong, W Wang, Q Lv, J Xu, W Yu, J Ji, Y Wang, Z Wang, Y Dong, ... | arXiv preprint arXiv:2312.08914, 2023 | 38 | 2023 |
Relay Diffusion: Unifying diffusion process across resolutions for image synthesis | J Teng, W Zheng, M Ding, W Hong, J Wangni, Z Yang, J Tang | arXiv preprint arXiv:2309.03350, 2023 | 4 | 2023 |
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations | J Qi, M Ding, W Wang, Y Bai, Q Lv, W Hong, B Xu, L Hou, J Li, Y Dong, ... | arXiv preprint arXiv:2402.04236, 2024 | 3 | 2024 |
Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer | Z Yang, H Jiang, W Hong, J Teng, W Zheng, Y Dong, M Ding, J Tang | arXiv preprint arXiv:2405.04312, 2024 | | 2024 |