BLOOM: A 176B-parameter open-access multilingual language model TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... arXiv preprint arXiv:2211.05100, 2022 | 1459 | 2022 |
StarCoder: may the source be with you! R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ... Transactions on Machine Learning Research, 2835-8856, 2023 | 614* | 2023 |
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset H Laurençon, L Saulnier, T Wang, C Akiki, AV del Moral, T Le Scao, ... Thirty-sixth Conference on Neural Information Processing Systems Datasets …, 2022 | 145 | 2022 |
Overview of the CLPsych 2022 shared task: Capturing moments of change in longitudinal user posts A Tsakalidis, J Chim, IM Bilal, A Zirikly, D Atzil-Slonim, F Nanni, P Resnik, ... Proceedings of the Eighth Workshop on Computational Linguistics and Clinical …, 2022 | 59 | 2022 |
BigBio: A Framework for Data-Centric Biomedical Natural Language Processing JA Fries, L Weber, N Seelam, G Altay, D Datta, S Garda, M Kang, R Su, ... Thirty-sixth Conference on Neural Information Processing Systems Datasets …, 2022 | 43 | 2022 |
Identifying Moments of Change from Longitudinal User Text A Tsakalidis, F Nanni, A Hills, J Chim, J Song, M Liakata Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 32 | 2022 |
GEMv2: Multilingual NLG benchmarking in a single line of code S Gehrmann, A Bhattacharjee, A Mahendiran, A Wang, A Papangelis, ... Conference on Empirical Methods in Natural Language Processing (EMNLP) Demo …, 2022 | 15 | 2022 |
Overview of the CLPsych 2024 shared task: Leveraging large language models to identify evidence of suicidality risk in online posts J Chim, A Tsakalidis, D Gkoumas, D Atzil-Slonim, Y Ophir, A Zirikly, ... Proceedings of the 9th Workshop on Computational Linguistics and Clinical …, 2024 | 13 | 2024 |
BigCodeBench: Benchmarking code generation with diverse function calls and complex instructions TY Zhuo, MC Vu, J Chim, H Hu, W Yu, R Widyasari, INB Yusuf, H Zhan, ... arXiv preprint arXiv:2406.15877, 2024 | 3 | 2024 |
Combining Hierarchical VAEs with LLMs for clinically meaningful timeline summarisation in social media J Song, J Chim, A Tsakalidis, J Ive, D Atzil-Slonim, M Liakata Findings of the Association for Computational Linguistics ACL 2024, 14651–14672, 2024 | | 2024 |
Data Contamination Report from the 2024 CONDA Shared Task O Sainz, I García-Ferrero, A Jacovi, JA Campos, Y Elazar, E Agirre, ... Proceedings of the 1st Workshop on Data Contamination (CONDA), 41–56, 2024 | | 2024 |