About Me
I am currently working on my doctoral thesis. My main research interests include speech synthesis, speech translation, speaker identification, and emotion recognition.
In September 2021, I started a PhD thesis on expressive speech translation at the University of Avignon and the Laboratoire Informatique d’Avignon (LIA), under the supervision of Prof. Yannick Estève and Assoc. Prof. Titouan Parcollet.
The thesis is part of the SELMA consortium, funded by the European Union’s Horizon 2020 research and innovation programme.
Education
PhD in computer science [2021]
Thesis: Expressive speech translation
Advisors: Yannick Estève and Titouan Parcollet
Avignon Université, France
Research Master's in computer science [2019-2021]
Specializations: Artificial Intelligence, Cyber Security
Avignon Université, France
Bachelor in computer science [2016-2019]
Avignon Université, France
Experience
Scientific Researcher [Sep.2021-Mar.2024]
Research and development of TTS, ASR, and S2ST models for a multilingual open-source platform that can process large volumes of content.
SELMA, EU
Visiting Researcher [Jul.2023-Oct.2023]
Design of efficient Text-to-Speech systems, model quantization, and inference time optimization.
NXP, France
Visiting PhD Student [Jun.2022-Aug.2022]
Research on end-to-end models for speech-to-speech translation.
Johns Hopkins University, USA
Research Apprenticeship [Sep.2019-Aug.2021]
Research on speaker identification and verification.
Avignon University, France
Research Intern [May.2019-Jul.2021]
Development of speaker verification systems.
Avignon University, France
Research Intern [May.2018-Jul.2018]
Research on sentiment analysis.
Avignon University, France
Projects
SpeechBrain [2022-2023]
Implementation of speech synthesis, speech translation, and speech-to-speech translation models in the SpeechBrain project.
LIAvignon AI Challenge [2021-2022]
The aim is to identify which character of a series spoke a short line, using the text transcript of the line, its audio recording, or both. Identifying the series, and not just the character, is also part of the challenge.
STKLIA [2019-2021]
Collaboration on and release of STKLIA, a PyTorch-Kaldi toolkit for speaker verification.
Publications
Enhancing expressivity transfer in textless speech-to-speech translation
J Duret, T Parcollet, Y Estève
ASRU 2023
Learning multilingual expressive speech representation for prosody prediction without parallel data
J Duret, T Parcollet, Y Estève
INTERSPEECH 2023
Direct Text to Speech Translation System Using Acoustic Units
V Mingote, P Gimeno, L Vicente, S Khurana, A Laurent, J Duret
IEEE Signal Processing Letters
Multi-lingual Speech to Speech Translation for Under-Resourced Languages
A Larcher, Y Estève, M Rouvier, N Tomashenko, J Duret, G Laperriere, ...
HAL 2022
End-to-end model for named entity recognition from speech without paired training data
S Mdhaffar, J Duret, T Parcollet, Y Estève
INTERSPEECH 2022
Study on the temporal pooling used in deep neural networks for speaker verification
M Rouvier, PM Bousquet, J Duret
EUSIPCO 2021
Language adaptation for speaker recognition systems using contrastive learning
V Brignatz, J Duret, D Matrouf, M Rouvier
SPECOM 2021
Influence of speaker pre-training on character voice representation
M Quillot, J Duret, R Dufour, M Rouvier, JF Bonastre
SPECOM 2021
Teaching
Year 2022/2023
Teaching | Lectures | Practical Work | Total |
---|---|---|---|
Advanced Programming Project | 42h | 42h | |
Assembly Programming | 24h | 24h | |
Total | | | 66h |
Year 2021/2022
Teaching | Lectures | Practical Work | Total |
---|---|---|---|
HTML & CSS | 24h | 24h | |
Assembly Programming | 24h | 24h | |
Algorithms and Programming | 18h | 18h | |
Total | | | 66h |