Arabic & English AI Trainer - Remote
Indeed
Full-time
Onsite
No experience limit
No degree limit
Spain
Description

Summary: This role involves partnering with leading AI teams to evaluate and improve general chat behavior in large language models by providing high-quality human feedback and ensuring accuracy.

Highlights:

1. Partner with leading AI teams to improve conversational AI systems
2. Evaluate and improve general chat behavior in LLMs
3. Apply structured analytical thinking to AI model evaluation

**Work Mode:** Remote

**Engagement Type:** Independent Contractor

**Schedule:** Full-Time or Part-Time Contract

**Role:** Partners with leading AI teams to improve the quality, usefulness, and reliability of general-purpose conversational AI systems. These systems are used across a wide range of everyday and professional scenarios, and their effectiveness depends on how clearly, accurately, and helpfully they respond to real user questions.

This project focuses on evaluating and **improving general chat behavior** in large language models (LLMs). You will assess model-generated responses across diverse topics, provide high-quality human feedback, and help ensure AI systems communicate in ways that are accurate, well-reasoned, and aligned with human expectations.

**What You’ll Do**

* Evaluate LLM-generated responses on their ability to effectively answer user queries
* Conduct fact-checking using trusted public sources and external tools
* Generate high-quality human evaluation data by annotating response strengths, areas for improvement, and factual inaccuracies
* Assess reasoning quality, clarity, tone, and completeness of responses
* Ensure model responses align with expected conversational behavior and system guidelines
* Apply consistent annotations by following clear taxonomies, benchmarks, and detailed evaluation guidelines

**Who You Are**

* You hold a **Bachelor’s degree**
* You are a **native speaker** or have **ILR 5/primary fluency (C2 on the CEFR scale)** in **Arabic**
* You have **significant experience using large language models** (LLMs) and understand how and why people use them
* You have **excellent writing skills** and can clearly articulate nuanced feedback
* You have **strong attention to detail** and consistently notice subtle issues others may overlook
* You are **adaptable** and comfortable moving across topics, domains, and customer requirements
* You have a background or experience in domains requiring **structured analytical thinking** (e.g., research, policy, analytics, linguistics, engineering)
* You have **excellent college-level mathematics skills**

**Nice-to-Have Specialties**

* Prior experience with **RLHF, model evaluation, or data annotation work**
* Experience writing or editing **high-quality written content**
* Experience comparing multiple outputs and making **fine-grained qualitative judgments**
* **Familiarity with evaluation rubrics**, benchmarks, or quality scoring systems

**What Success Looks Like**

* You identify factual inaccuracies, reasoning errors, and communication gaps in model responses
* You produce clear, consistent, and reproducible evaluation artifacts
* Your feedback leads to measurable improvements in response quality and user experience

**Contract and Payment Terms**

* You will be engaged as an independent contractor.
* This is a fully remote role that can be completed on your own schedule.
* Projects can be extended, shortened, or concluded early depending on needs and performance.
* Payments are weekly on Stripe or Wise based on services rendered.

Source: Indeed
David Muñoz
Indeed · HR

Company

Indeed
© 2025 Servanan International Pte. Ltd.