Welfare Diplomacy: Benchmarking Language Model Cooperation | G Mukobi, H Erlebach, N Lauffer, L Hammond, A Chan, J Clifton | arXiv preprint arXiv:2310.08901, 2023 | 8 | 2023 |
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning | N Li, A Pan, A Gopal, S Yue, D Berrios, A Gatti, JD Li, AK Dombrowski, ... | arXiv preprint arXiv:2403.03218, 2024 | 4 | 2024 |
Escalation Risks from Language Models in Military and Diplomatic Decision-Making | JP Rivera, G Mukobi, A Reuel, M Lamparth, C Smith, J Schneider | arXiv preprint arXiv:2401.03408, 2024 | 3 | 2024 |
Assessing Risks of Using Autonomous Language Models in Military and Diplomatic Planning | G Mukobi, AK Reuel, JP Rivera, C Smith | Multi-Agent Security Workshop @ NeurIPS'23, 2023 | | 2023 |
SuperHF: Supervised Iterative Learning from Human Feedback | G Mukobi, P Chatain, S Fong, R Windesheim, G Kutyniok, K Bhatia, ... | arXiv preprint arXiv:2310.16763, 2023 | | 2023 |
CNNs for Photorealistic Computer-Generated Imagery Detection | G Mukobi, S Zhang | | | 2019 |
Opportunities in Physics Education: Low-Cost Position Tracking for Use in Kinematics Labs | PR DeStefano, C Siebert, R Perez-Franco, T Allen, G Mukobi, ... | | | 2018 |