Challenge AI with Domain-Specific Visual Reasoning

Join a new expert-led project where you'll craft prompts using structured images, such as graphics, tables, charts, and schematics, to expose the reasoning limits of top AI models.

Outlier is powered by Scale AI to accelerate AI development at leading enterprises

Earn an average $20 - $50/hr, paid out weekly

Rates vary based on your
area of expertise

Remote work,
flexible hours

Work from
anywhere, anytime

Define
AI's future

Leverage your expertise
and train frontier AI models

About Outlier

Outlier is committed to improving the intelligence and safety of AI models. Owned and operated by Scale AI, we’ve recently been featured in Forbes for partnering experts with top AI labs to provide high-quality data for LLMs.

About the role
Our mission is to challenge AI with Humanity’s Last Exam, a rigorous new way to test AI’s intelligence and reasoning. To do so, we are recruiting a group of exceptional minds to craft PhD- and Master's-level problems that current AI models cannot solve correctly.

"We partnered with Scale AI to work with Enterprises to adopt Llama and train custom models with their own data. We are excited to collectively make Llama the industry standard and bring the benefits of AI to everyone."

Mark Zuckerberg

Founder and CEO, Meta

What you will be doing

Create Challenging Prompts for AI

Write a valid question that AIs and average humans cannot answer.

Subject

Medicine

Question

Provide a challenging prompt that will stump the AI model or cause it to "break".

Provide Benchmark Solutions

Provide a definitive, high-quality solution that serves as the benchmark for evaluating AI responses.

Question

Difficult

Answer/Solution

Provide a high-quality, thorough response that directly matches the intent of the original prompt.

Participate in Peer Review

Work closely with fellow experts to evaluate and refine each other's solutions. Provide feedback to ensure high accuracy.

Contributor A’s response

Edit

Approve

What we’re looking for

  • Pursuing or holding an Undergraduate, Master’s, or PhD degree in your field of expertise or a related field.

  • Attendance at, or a degree from, a top accredited university or institution such as Harvard, MIT, Stanford, or Oxford.

  • 2+ years of demonstrated expertise in your domain.

  • Deep subject matter expertise with the ability to create complex, graduate-level problems that challenge AI reasoning.

  • Strong analytical and problem-solving skills, with experience in crafting rigorous, high-quality questions and solutions.

  • Attention to detail to accurately assess AI capabilities and evaluate peer submissions.

  • Fluency or high proficiency in English.

What you will get from contributing

Earn $20-50 USD/hr, based on your area of expertise

Rates vary based on quality, accuracy, and time spent. Paid out weekly via PayPal and AirTM.

Join an exclusive network of experts in the top 1% of their domains

Collaborate with leading experts at an all-expenses-covered event

Flexible schedule and time commitment

No contracts, no 9-to-5. You control your schedule. (Most experts spend 5-10 hours/week, up to 40 hours, working from home.)

Combine domain-specific images and prompts to challenge AI models

Combine meaningful domain-specific images with carefully crafted prompts to challenge state-of-the-art language models

Choose your expertise

Hear from other experts

Read about how MIT PhD Daniel Z earns extra income on Outlier

"Daniel has enjoyed the combination of earning extra cash while contributing to AI development and has made over $4,000 on Outlier".

Being able to discuss the nuances of intricate genetic systems with people from around the world that I've never met before is super cool… and of course the money is a good bonus!

Josh Cole

Biologist, United States

As an engineer, I often use AI models to get the right answers, but this experience is the opposite: I need to make the models generate wrong answers. It is strange, exciting, and unforgettable.

Cuong N.

Computer Scientist, Vietnam

Application and onboarding process

STEP 1

Create an account
on Outlier

STEP 2

Verify your identity and complete a video interview

STEP 3

Contribute your expertise to outsmart models and earn

Ready to use your brilliance to outsmart AI?

in collaboration with Scale AI

© 2025 Outlier. All rights reserved.
