Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario I Medennikov, M Korenevsky, T Prisyach, Y Khokhlov, M Korenevskaya, ... Interspeech 2020, 274--278, 2020 | 180 | 2020 |
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation A Laptev, R Korostik, A Svischev, A Andrusenko, I Medennikov, S Rybin CISP-BMEI 2020, 439--444, 2020 | 67 | 2020 |
The STC system for the CHiME-6 challenge I Medennikov, M Korenevsky, T Prisyach, Y Khokhlov, M Korenevskaya, ... CHiME 2020 Workshop on Speech Processing in Everyday Environments, 2020 | 60 | 2020 |
Towards a competitive end-to-end speech recognition for CHiME-6 dinner party transcription A Andrusenko, A Laptev, I Medennikov Interspeech 2020, 319--323, 2020 | 20 | 2020 |
CTC variations through new WFST topologies A Laptev, S Majumdar, B Ginsburg Interspeech 2022, 1041--1045, 2022 | 19 | 2022 |
Dynamic acoustic unit augmentation with BPE-dropout for low-resource end-to-end speech recognition A Laptev, A Andrusenko, I Podluzhny, A Mitrofanov, I Medennikov, ... Sensors 21 (9), 3063, 2021 | 16 | 2021 |
Exploration of end-to-end ASR for OpenSTT – Russian open speech-to-text dataset A Andrusenko, A Laptev, I Medennikov Speech and Computer: 22nd International Conference, SPECOM 2020, St …, 2020 | 9 | 2020 |
Fast entropy-based methods of word-level confidence estimation for end-to-end automatic speech recognition A Laptev, B Ginsburg 2022 IEEE Spoken Language Technology Workshop (SLT), 152--159, 2023 | 6 | 2023 |
LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring A Mitrofanov, M Korenevskaya, I Podluzhny, Y Khokhlov, A Laptev, ... Interspeech 2021, 4039--4043, 2021 | 2 | 2021 |
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System TJ Park, H Huang, A Jukic, K Dhawan, KC Puvvada, N Koluguri, N Karpov, ... arXiv preprint arXiv:2310.12378, 2023 | 1 | 2023 |
Confidence-based ensembles of end-to-end speech recognition models I Gitman, V Lavrukhin, A Laptev, B Ginsburg arXiv preprint arXiv:2306.15824, 2023 | 1 | 2023 |
Powerful and Extensible WFST Framework for RNN-Transducer Losses A Laptev, V Bataev, I Gitman, B Ginsburg ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 1 | 2023 |
Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems N Malkovsky, V Bataev, D Sviridkin, N Kizhaeva, A Laptev, I Valiev, ... arXiv preprint arXiv:2003.09024, 2020 | 1 | 2020 |