Conversational AI Evaluator - Remote Position
About the Role
We are seeking a skilled Conversational AI Evaluator to join our team in a fully remote capacity. In this role, you will partner with leading AI teams to enhance the quality, usefulness, and reliability of conversational AI systems. These systems are used in a wide range of everyday and professional scenarios, and your expertise will help ensure they respond accurately and helpfully to user inquiries.
What You'll Do
- Evaluate LLM-generated responses to coding and software engineering queries for accuracy, reasoning, clarity, and completeness.
- Conduct fact-checking using trusted public sources and authoritative references.
- Perform accuracy testing by executing code and validating outputs using appropriate tools.
- Annotate model responses by identifying strengths, areas for improvement, and factual or conceptual inaccuracies.
- Assess code quality, readability, algorithmic soundness, and explanation quality.
- Ensure model responses align with expected conversational behavior and system guidelines.
- Apply consistent evaluation standards by following clear taxonomies, benchmarks, and detailed evaluation guidelines.
Requirements
- A BS, MS, or PhD in Computer Science or a closely related field.
- 5+ years of real-world experience in software engineering or related technical roles.
- Expertise in at least two relevant programming languages (e.g., Python, Java, C++, JavaScript).
- Ability to independently solve HackerRank or LeetCode Medium- and Hard-level problems.
- Experience contributing to well-known open-source projects, including merged pull requests.
- Significant experience using LLMs while coding and understanding their strengths and failure modes.
- Strong attention to detail and comfort evaluating complex technical reasoning.
Nice to Have
- Prior experience with RLHF, model evaluation, or data annotation work.
- Track record in competitive programming.
- Experience reviewing code in production environments.
- Familiarity with multiple programming paradigms or ecosystems.
- Experience explaining complex technical concepts to non-expert audiences.
What We Offer
- Fully remote role with flexible scheduling.
- Weekly payments via Stripe or Wise based on services rendered.
- Opportunity to work on impactful AI projects.
- Potential for project extensions based on performance.
- Engagement as an independent contractor.
Join us as a Conversational AI Evaluator and contribute to the advancement of AI technologies that assist with real-world coding tasks. Your feedback will improve the correctness, robustness, and clarity of AI coding outputs, making a tangible impact in the field.