Member of technical staff (Inference)

hcompany, Paris
Full-time, hybrid, Python

H is an innovative AI startup focused on developing agentic AI to automate complex tasks and enhance human potential. The Inference team is seeking a technical staff member to optimize inference pipelines and model performance, contributing to cutting-edge AI technology.

Qualifications

  • MS or PhD in Computer Science, Machine Learning or related fields
  • Proficient in Python, Rust or C/C++
  • Experience in GPU programming with CUDA, OpenAI Triton, or Metal
  • Experience in model compression and quantization techniques
  • Strong communication and presentation skills
  • Collaborative mindset, thriving in dynamic teams

Responsibilities

  • Develop scalable, low-latency, and cost-effective inference pipelines
  • Optimize model performance across memory usage, throughput, and latency using advanced techniques
  • Develop specialized GPU kernels for performance-critical tasks
  • Collaborate with research teams on model architectures
  • Review state-of-the-art papers to improve inference techniques
  • Prioritize and implement state-of-the-art inference techniques
