Member of Technical Staff - Post-Training and RL

Palo Alto, CA·Model·other

<div class=&quot;content-intro&quot;><h3><strong><span style=&quot;font-family: arial, helvetica, sans-serif;&quot;>ABOUT xAI</span></strong></h3> <p><span style=&quot;font-family: arial, helvetica, sans-serif;&quot;>xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. </span><span style=&quot;font-family: arial, helvetica, sans-serif;&quot;>Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. </span><span style=&quot;font-family: arial, helvetica, sans-serif;&quot;>We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. </span><span style=&quot;font-family: arial, helvetica, sans-serif;&quot;>All employees are expected to have strong communication skills. 
They should be able to concisely and accurately share knowledge with their teammates.</span></p></div><h3 data-pm-slice=&quot;1 1 []&quot;><span style=&quot;font-family: arial, helvetica, sans-serif;&quot;>ABOUT THE ROLE:</span></h3> <ul> <li>You will work on the most critical post-training and reinforcement learning challenges at any given time, including reward modeling, preference optimization (RLHF/DPO), and RL for improving reasoning, truthfulness, and real-world capabilities.</li> <li>You will get clarity on your first project before an offer.</li> </ul> <h3><span style=&quot;font-family: arial, helvetica, sans-serif;&quot;>BASIC QUALIFICATIONS:</span></h3> <ul> <li>You believe truth-seeking AI is the most important and challenging problem.</li> <li>You are obsessed with building incredibly useful models through post-training and RL techniques.</li> <li>You are a power user of AI models and eager to push the boundaries of what’s possible with reinforcement learning and alignment methods.</li> <li>Previous work on post-training, RLHF, or models used by millions of people is a big plus, but relevant experience is not required.</li> <li>You take pride in your work and thrive in meritocratic environments.</li> </ul> <h3><span style=&quot;font-family: arial, helvetica, sans-serif;&quot;><strong>COMPENSATION AND BENEFITS:</strong></span></h3> <p>$180,000 - $600,000 USD</p> <p>Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and various other discounts and perks.</p><div class=&quot;content-conclusion&quot;><p><em>xAI is an equal opportunity employer. For details on data processing, view our&nbsp;</em><em><a href=&quot;https://x.ai/legal/recruitment-privacy-notice&quot; target=&quot;_blank&quot;>Recruitment Privacy Notice</a>.</em></p></div>
