Software Engineer, Machine Learning Infrastructure

3 Months ago • 4 Years + • Artificial Intelligence

Job Summary

Job Description

Character.AI seeks a seasoned ML Infrastructure engineer to design, build, and maintain training and serving infrastructure for ML research and product development. Responsibilities include providing infrastructure support for ML research, building tooling for diagnosing cluster issues and hardware failures, monitoring deployments, managing experiments, and maximizing GPU allocation and utilization. The ideal candidate possesses 4+ years of experience supporting ML infrastructure, developing diagnostic tools, and working with cloud platforms like Compute Engine, Kubernetes, and Cloud Storage. Experience with GPUs is essential.
Must have:
  • 4+ years supporting ML infrastructure
  • Experience developing diagnostic tools for ML infrastructure
  • Experience with cloud platforms (Compute Engine, Kubernetes, Cloud Storage)
  • GPU experience
Good to have:
  • Large GPU clusters and high-performance computing/networking
  • Large language model training support
  • ML frameworks (PyTorch/TensorFlow/JAX)
  • GPU kernel development

Job Details

About the role

We're looking for seasoned ML Infrastructure engineers with experience designing, building, and maintaining training and serving infrastructure for ML research.

Responsibilities:

  • Provide infrastructure support to our ML research and product

  • Build tooling to diagnose cluster issues and hardware failures

  • Monitor deployments, manage experiments, and generally support our research

  • Maximize GPU allocation and utilization for both serving and training

Requirements:

  • 4+ years of experience supporting the infrastructure within an ML environment

  • Experience in developing tools used to diagnose ML infrastructure problems and failures

  • Experience with cloud platforms (e.g., Compute Engine, Kubernetes, Cloud Storage)

  • Experience working with GPUs

Nice to have:

  • Experience with large GPU clusters and high-performance computing/networking

  • Experience with supporting large language model training

  • Experience with ML frameworks like PyTorch/TensorFlow/JAX

  • Experience with GPU kernel development

About Character.AI

Founded in 2021, Character is a leading AI company offering personalized experiences through customizable AI 'Characters.' As one of the most widely used AI platforms worldwide, Character enables users to interact with AI tailored to their unique needs and preferences.

In just two years, we achieved unicorn status and were named Google Play's AI App of the Year – a testament to our groundbreaking technology and vision.

Ready to shape the future of Consumer AI? 🚀

At Character, we value diversity and welcome applicants from all backgrounds. As an equal opportunity employer, we firmly uphold a policy of non-discrimination on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, or disability. Your unique perspectives are vital to our success.

Compensation Range: $150K - $350K

About The Company

Character is one of the world's leading personal AI platforms. Founded in 2021 by AI pioneers Noam Shazeer and Daniel De Freitas, Character is a full-stack AI company with a globally scaled direct-to-consumer platform. 
