Mathematics Model Prompt Evaluator Job at SaidGig, Remote

  • SaidGig
  • Remote

Job Description

Role Overview

Expert mathematicians are invited to author and verify high-quality open-ended prompts for AI model evaluation. In this role, you will craft and review challenging, unambiguous mathematical problems across core subdomains, assessing AI reasoning quality and helping establish rigorous evaluation standards for frontier language models.

Task Types

You will be assigned one of two task types:

  • Authoring Task: Create 5 original, open-ended prompts from your assigned subdomain at varying difficulty levels (undergraduate, advanced undergraduate, or graduate/professional). Prompts should require human judgment to evaluate the quality of the AI's response, such as chain-of-thought reasoning or proof construction.
  • Verification Task: Review 5 authored prompts for clarity, scope alignment, difficulty accuracy, and uniqueness. Edit prompts and difficulty ratings where needed.

Mathematics Subdomains Covered

Probability & Statistics, Algebra (including Linear Algebra), Ordinary/Partial Differential Equations & Dynamical Systems, Geometry, Graph Theory, Number Theory.

Key Responsibilities
  • Author clear, unambiguous, open-ended mathematical prompts that elicit evaluable AI responses.
  • Verify prompts are within the scope of the assigned subdomain and correctly rated for difficulty.
  • Ensure all 5 prompts in a task are sufficiently distinct from one another and span varying difficulty levels.
  • Apply expert judgment to assess the depth and quality of mathematical reasoning required.
  • Edit prompts and difficulty assignments where standards are not met.

Ideal Qualifications
  • Master's degree or higher in Mathematics, Applied Mathematics, Statistics, or a closely related field.
  • 2–6 years of professional or research experience in a quantitative field.
  • Strong command of graduate-level mathematical concepts including proof writing, analysis, and formal reasoning.
  • Experience in academic research, mathematical competition design, or quantitative industry roles is a plus.
  • Excellent written English and ability to craft precise, well-scoped technical questions.

Work Terms

Expected commitment: 10+ hours/week. Asynchronous, fully remote work.

Job Tags

Remote job
