
Audio Data Engineer – Speech Cleaning & Pipeline Automation (TTS)

Hippocratic AI
Hippocratic AI is seeking an Audio Data Engineer to clean and enhance speech datasets for Text-to-Speech (TTS) systems and to automate their processing, contributing to the development of a safety-focused Large Language Model for healthcare. The role involves cleaning audio data, building automation pipelines, and collaborating with ML researchers to ensure high-quality voice models.
Qualifications
- Strong experience with speech/audio cleaning using tools such as iZotope RX, Audacity, Adobe Audition, or SoX.
- Proficiency in Python and audio-related scripting for automation and batch processing.
- Familiarity with digital audio principles, including sample rates, bit depth, frequency bands, and compression artifacts.
Responsibilities
- Clean, denoise, and enhance large volumes of recorded speech data for use in TTS and voice synthesis pipelines.
- Build and maintain automated audio preprocessing pipelines using scripting tools and open-source libraries.
- Apply techniques such as background noise removal, silence trimming, gain normalization, and sample rate conversion.
- Integrate tools such as ffmpeg, SoX, or Python-based libraries (pydub, torchaudio, librosa) into scalable workflows.
- Collaborate with ML researchers and speech scientists to deliver high-quality, ready-to-train datasets.
- Evaluate audio quality using perceptual and quantitative metrics, and maintain audio QA checklists.




