AI Trainer: Code Generation

Embedding VC
Full-time
On-site
Palo Alto, California, United States

Overview

We are building a focused group of engineers to improve how large language models reason through real-world code. This initiative centers on evaluating and refining multi-step reasoning trajectories derived from real GitHub repositories, with the goal of producing higher-quality, more reliable code generation outputs.

This is a long-term project that requires strong engineering judgment rather than surface-level labeling. Contributors will work directly with complex code paths and reasoning flows across multiple platforms.

What You Will Do

You will analyze and refine multi-step code reasoning trajectories generated from real production repositories.

This includes:

  • Reviewing model-generated reasoning sequences

  • Identifying logical inconsistencies or weak reasoning steps

  • Improving trajectory structure to produce stronger, production-grade outputs

  • Evaluating reasoning quality across different programming environments

The work is closer to debugging model logic and reasoning systems than to traditional annotation.

What We Are Looking For

We are looking for engineers with strong hands-on development experience and deep familiarity with real codebases.

You should:

  • Be proficient in at least two mainstream programming languages, such as Python, C++, Java, TypeScript, or JavaScript

  • Have real-world development experience in areas such as backend systems, frontend applications, algorithms, testing, or infrastructure

  • Be comfortable reading and reasoning through large GitHub repositories

  • Have strong written communication skills

Experience contributing to high-visibility or highly starred GitHub repositories is a strong plus.

Additional Details

We expect to onboard approximately 10 to 20 engineers for this long-term initiative. A short qualification exercise may be required before joining.