About Me

I am currently working on my doctoral thesis. My main research interests include speech synthesis, speech translation, speaker identification, and emotion recognition.

In September 2021, I started a PhD thesis on expressive speech translation at the University of Avignon and the Laboratoire Informatique d’Avignon (LIA), under the supervision of Prof. Yannick Estève and Assoc. Prof. Titouan Parcollet. The thesis is part of the SELMA consortium, funded by the European Union’s Horizon 2020 research and innovation programme.

Education

PhD in computer science [2021]
Thesis: Expressive speech translation
Advisors: Yannick Estève and Titouan Parcollet
Avignon Université, France

Research Master's in computer science [2019-2021]
Specializations: Artificial Intelligence, Cyber Security
Avignon Université, France

Bachelor's in computer science [2016-2019]
Avignon Université, France

Experience

Scientific Researcher [Sep.2021-Mar.2024]
Research and development of TTS, ASR, and S2ST models for a multilingual open-source platform capable of processing large volumes of content.
SELMA, EU

Visiting Researcher [Jul.2023-Oct.2023]
Design of efficient Text-to-Speech systems, model quantization, and inference time optimization.
NXP, France

Visiting PhD Student [Jun.2022-Aug.2022]
Research on end-to-end models for speech-to-speech translation.
Johns Hopkins University, USA

Research Apprenticeship [Sep.2019-Aug.2021]
Research on speaker identification and verification.
Avignon University, France

Research Intern [May.2019-Jul.2021]
Development of speaker verification systems.
Avignon University, France

Research Intern [May.2018-Jul.2018]
Research on sentiment analysis.
Avignon University, France

Projects

SpeechBrain [2022-2023]
Implementation of speech synthesis, speech translation, and speech-to-speech translation models in the SpeechBrain project.

LIAvignon AI Challenge [2021-2022]
The aim is to recognize which character of a series spoke a short line, based on the written transcription of the line, its audio recording, or both. Identifying the series, and not just the character, is also part of the challenge.

STKLIA [2019-2021]
Collaborative development and release of STKLIA, a PyTorch-Kaldi toolkit for speaker verification.

Publications

Enhancing expressivity transfer in textless speech-to-speech translation
J Duret, T Parcollet, Y Estève
ASRU 2023

Learning multilingual expressive speech representation for prosody prediction without parallel data
J Duret, T Parcollet, Y Estève
INTERSPEECH 2023

Direct Text to Speech Translation System Using Acoustic Units
V Mingote, P Gimeno, L Vicente, S Khurana, A Laurent, J Duret
IEEE Signal Processing Letters

Multi-lingual Speech to Speech Translation for Under-Resourced Languages
A Larcher, Y Estève, M Rouvier, N Tomashenko, J Duret, G Laperriere, ...
HAL 2022

End-to-end model for named entity recognition from speech without paired training data
S Mdhaffar, J Duret, T Parcollet, Y Estève
INTERSPEECH 2022

Study on the temporal pooling used in deep neural networks for speaker verification
M Rouvier, PM Bousquet, J Duret
EUSIPCO 2021

Language adaptation for speaker recognition systems using contrastive learning
V Brignatz, J Duret, D Matrouf, M Rouvier
SPECOM 2021

Influence of speaker pre-training on character voice representation
M Quillot, J Duret, R Dufour, M Rouvier, JF Bonastre
SPECOM 2021

Teaching

Year 2022/2023

Teaching	Practical Work	Total
Advanced Programming Project	42h	42h
Assembly Programming	24h	24h
Total		66h

Year 2021/2022

Teaching	Practical Work	Total
HTML & CSS	24h	24h
Assembly Programming	24h	24h
Algorithms and programming	18h	18h
Total		66h