
Research Scientist, Interpretability
Anthropic
Anthropic is seeking a Research Scientist for its Interpretability team, which works to understand and improve the safety of AI systems through mechanistic interpretability. The role involves reverse-engineering trained neural networks to build trust in and improve the reliability of AI systems.
Qualifications
- PhD in a relevant field (e.g., computer science, machine learning, neuroscience)
- Strong background in machine learning and neural networks
- Experience with interpretability research or related areas
- Proficiency in programming languages such as Python
- Ability to work collaboratively in a research team
Responsibilities
- Conduct research on mechanistic interpretability of neural networks
- Develop methodologies to reverse-engineer trained models
- Collaborate with a team of researchers and engineers
- Publish findings in relevant scientific forums
- Contribute to the design and implementation of interpretability tools
