Zijia Zhao
Institute of Automation, Chinese Academy of Sciences (CASIA)
Verified email at ia.ac.cn
Title
Cited by
Year
VAST: A vision-audio-subtitle-text omni-modality foundation model and dataset
S Chen, H Li, Q Wang, Z Zhao, M Sun, X Zhu, J Liu
Advances in Neural Information Processing Systems 36, 2024
Cited by: 42; Year: 2024
OPT: Omni-perception pre-trainer for cross-modal understanding and generation
J Liu, X Zhu, F Liu, L Guo, Z Zhao, M Sun, W Wang, H Lu, S Zhou, J Zhang, ...
arXiv preprint arXiv:2107.00249, 2021
Cited by: 37; Year: 2021
ChatBridge: Bridging modalities with large language model as a language catalyst
Z Zhao, L Guo, T Yue, S Chen, S Shao, X Zhu, Z Yuan, J Liu
arXiv preprint arXiv:2305.16103, 2023
Cited by: 30; Year: 2023
VL-Mamba: Exploring state space models for multimodal learning
Y Qiao, Z Yu, L Guo, S Chen, Z Zhao, M Sun, Q Wu, J Liu
arXiv preprint arXiv:2403.13600, 2024
Cited by: 15; Year: 2024
MAMO: Masked multimodal modeling for fine-grained vision-language representation learning
Z Zhao, L Guo, X He, S Shao, Z Yuan, J Liu
arXiv preprint arXiv:2210.04183, 2022
Cited by: 5; Year: 2022
MM21 pre-training for video understanding challenge: Video captioning with pretraining techniques
S Chen, X Zhu, D Hao, W Liu, J Liu, Z Zhao, L Guo, J Liu
Proceedings of the 29th ACM International Conference on Multimedia, 4853-4857, 2021
Cited by: 5; Year: 2021
MAMO: Fine-Grained Vision-Language Representations Learning with Masked Multimodal Modeling
Z Zhao, L Guo, X He, S Shao, Z Yuan, J Liu
Proceedings of the 46th International ACM SIGIR Conference on Research and …, 2023
Cited by: 4; Year: 2023
SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models
T Yue, J Cheng, L Guo, X Dai, Z Zhao, X He, G Xiong, Y Lv, J Liu
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Year: 2024