in collaboration with Scale AI
About Outlier
Outlier is committed to improving the intelligence and safety of AI models. Owned and operated by Scale AI, we were recently featured in Forbes for partnering experts with top AI labs to provide high-quality data for LLMs.
About the role
Our mission is to challenge AI with Humanity’s Last Exam, a rigorous new way to test AI’s intelligence and reasoning. To do so, we are recruiting a group of exceptional minds to craft PhD- and Master's-level problems that current AI models cannot solve correctly.
IN THE NEWS
As artificial intelligence becomes ever more capable of mimicking human reasoning, it's increasingly requiring more highly skilled humans, often with specific areas of expertise, to hone its models.
- Forbes, "A Growing Side Hustle For American College Grads: Fixing AI’s Wrong Answers"
What you will be doing
Create Challenging Prompts for AI
Write a valid question that AIs and average humans cannot answer.
Subject: Physics
Question: Provide a challenging prompt that will stump the AI model or cause it to "break".
Provide Benchmark Solutions
Provide a definitive, high-quality solution that serves as the benchmark for evaluating AI responses.
Question: Difficult
Answer/Solution: Provide a high-quality, thorough response that directly matches the intent of the original prompt.
Participate in Peer Review
Work closely with fellow experts to evaluate and refine each other's solutions. Provide feedback to ensure high accuracy.
Contributor A’s response: Edit or Approve
What we’re looking for
PhD or Master's degree in Physics or a related field (e.g., Theoretical Physics, Astrophysics, Quantum Mechanics). Currently enrolled students are eligible.
Attendance at, or a degree from, a top-200 accredited university or institution, such as Harvard, MIT, Stanford, or Oxford.
Deep subject matter expertise with the ability to create complex, graduate-level problems that challenge AI reasoning.
Strong analytical and problem-solving skills, with experience in crafting rigorous, high-quality questions and solutions.
Attention to detail to accurately assess AI capabilities and evaluate peer submissions.
Fluency or high proficiency in English.
What you will get from contributing
Hear from other experts
Process
Here’s what to expect when you apply