Catherine Olsson

Cited by

	All	Since 2019
Citations	15594	11929
h-index	27	26
i10-index	28	27

Citations per year
Year	Citations
2014	73
2015	153
2016	780
2017	1066
2018	1306
2019	1439
2020	1614
2021	1803
2022	1943
2023	3613
2024	1500

Public access

Available	1 article
Not available	0 articles

Based on funding mandates

Co-authors

Tuan-Hung Vu	Research scientist, valeo.ai	Verified email at valeo.com
Ivan Laptev	Visiting professor at MBZUAI, on leave from INRIA	Verified email at inria.fr
Josef Sivic	Czech Technical University, CIIRC, ELLIS Unit Prague	Verified email at cvut.cz
Aude Oliva	Senior Research Scientist, CSAIL, MIT Director MIT-IBM Lab, MIT College Director Industry	Verified email at mit.edu

Catherine Olsson

Anthropic

Verified email at mit.edu

Machine Learning


Title	Authors	Venue	Cited by	Year
Estimating the reproducibility of psychological science	Open Science Collaboration	Science 349 (6251), aac4716, 2015	9180	2015
Dota 2 with large scale deep reinforcement learning	C Berner, G Brockman, B Chan, V Cheung, P Dębiak, C Dennison, ...	arXiv preprint arXiv:1912.06680, 2019	1672	2019
An open, large-scale, collaborative effort to estimate the reproducibility of psychological science	Open Science Collaboration	Perspectives on Psychological Science 7, 657-660, 2012	727	2012
Training a helpful and harmless assistant with reinforcement learning from human feedback	Y Bai, A Jones, K Ndousse, A Askell, A Chen, N DasSarma, D Drain, ...	arXiv preprint arXiv:2204.05862, 2022	656	2022
Constitutional AI: Harmlessness from AI feedback	Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ...	arXiv preprint arXiv:2212.08073, 2022	561	2022
TensorFuzz: Debugging neural networks with coverage-guided fuzzing	A Odena, C Olsson, D Andersen, I Goodfellow	International Conference on Machine Learning, 4901-4911, 2019	338	2019
Language models (mostly) know what they know	S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ...	arXiv preprint arXiv:2207.05221, 2022	218	2022
Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned	D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ...	arXiv preprint arXiv:2209.07858, 2022	209	2022
A general language assistant as a laboratory for alignment	A Askell, Y Bai, A Chen, D Drain, D Ganguli, T Henighan, A Jones, ...	arXiv preprint arXiv:2112.00861, 2021	207	2021
In-context learning and induction heads	C Olsson, N Elhage, N Nanda, N Joseph, N DasSarma, T Henighan, ...	arXiv preprint arXiv:2209.11895, 2022	183	2022
Predictability and surprise in large generative models	D Ganguli, D Hernandez, L Lovitt, A Askell, Y Bai, A Chen, T Conerly, ...	Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022	167	2022
Discriminator rejection sampling	S Azadi, C Olsson, T Darrell, I Goodfellow, A Odena	arXiv preprint arXiv:1810.06758, 2018	147	2018
A mathematical framework for transformer circuits	N Elhage, N Nanda, C Olsson, T Henighan, N Joseph, B Mann, A Askell, ...	Transformer Circuits Thread 1, 1, 2021	145	2021
Toy models of superposition	N Elhage, T Hume, C Olsson, N Schiefer, T Henighan, S Kravec, ...	arXiv preprint arXiv:2209.10652, 2022	136	2022
Is generator conditioning causally related to GAN performance?	A Odena, J Buckman, C Olsson, T Brown, C Olah, C Raffel, I Goodfellow	International Conference on Machine Learning, 3849-3858, 2018	135	2018
Discovering language model behaviors with model-written evaluations	E Perez, S Ringer, K Lukošiūtė, K Nguyen, E Chen, S Heiner, C Pettit, ...	arXiv preprint arXiv:2212.09251, 2022	120	2022
Dawn Drain	N Elhage, N Nanda, C Olsson, T Henighan, N Joseph, B Mann, A Askell, ...	Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Andy Jones, Jackson …, 2021	114	2021
Dota 2 with large scale deep reinforcement learning	CB OpenAI, G Brockman, B Chan, V Cheung, P Debiak, C Dennison, ...	arXiv preprint arXiv:1912.06680 2, 2019	104	2019
Dawn Drain	C Olsson, N Elhage, NJ Neel Nanda, N DasSarma, T Henighan, B Mann, ...	Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott Johnston, Andy …, 2022	101	2022
Unrestricted adversarial examples	TB Brown, N Carlini, C Zhang, C Olsson, P Christiano, I Goodfellow	arXiv preprint arXiv:1809.08352, 2018	94	2018
