
Research Manager, Interpretability

Anthropic
The Research Manager for Interpretability at Anthropic will lead efforts to reverse engineer neural networks and advance AI safety through mechanistic interpretability. The role involves managing a dedicated team focused on understanding how AI models work internally and on ensuring their reliability and trustworthiness for users and society.
Qualifications
- PhD in a relevant field (e.g., computer science, machine learning, neuroscience)
- Strong background in machine learning and neural networks
- Experience with interpretability research or related fields
- Proven track record of publishing research in top-tier conferences
- Excellent communication and collaboration skills
Responsibilities
- Lead research initiatives on mechanistic interpretability of neural networks
- Develop methodologies to reverse engineer AI models
- Collaborate with researchers and engineers to enhance AI safety
- Publish findings and contribute to the scientific community
- Mentor and guide team members in interpretability research