Principal Software Engineer - GPU Performance

51 Minutes ago • 8-10 Years • Artificial Intelligence

About the job

Job Description

Microsoft's AI Platform organization seeks a Principal Software Engineer to focus on GPU performance analysis and optimization for large-scale AI model training and inference. The role involves collaborating with hardware teams, ML developers, and OpenAI to build and optimize software stacks for next-generation AI supercomputers and accelerators (like Maia-100). Responsibilities include software development (C/C++, Python, CUDA, ROCm, Triton), performance analysis, identifying requirements, and collaborating with various teams to deliver robust solutions for state-of-the-art AI models. This is a hands-on technical role demanding expertise in GPU programming and optimization techniques.
Must have:
  • 8+ years experience
  • 4+ years C/C++ experience
  • 4+ years GPU application experience
  • GPU kernel optimization
  • Collaboration skills
Good to have:
  • Advanced degree
  • Low-level programming expertise
  • Profiling tool proficiency (NVIDIA tools)
  • Deep learning workload experience
  • CUDA, ROCm, or Triton experience
Perks:
  • Industry leading healthcare
  • Educational resources
  • Product and service discounts
  • Savings and investments
  • Maternity/paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Overview

Microsoft is a company where passionate innovators come to collaborate, envision what can be, and take their careers further. This is a world of more possibilities, more innovation, more openness, and sky's-the-limit thinking in a cloud-enabled world.

 

The Artificial Intelligence (AI) Platform organization at Microsoft builds the end-to-end Azure AI stack/Platform as a Service (PaaS) and is core to Azure’s innovation and differentiation, as well as all of Microsoft’s flagship products, from Office to Teams, to Xbox. We are the team building Azure OpenAI, Azure Machine Learning (ML), Cognitive Services, and the global Azure AI infrastructure for running the largest AI workloads on the planet.

 

We do not just value differences or different perspectives. We seek them out and invite them in so we can tap into the collective power of everyone in the company. As a result, our customers are better served.

The Artificial Intelligence (AI) Frameworks team at Microsoft develops the AI software used to train and deploy the world’s most advanced AI models. We collaborate with our hardware teams and partners to build the software stacks for Microsoft’s next-generation supercomputers and the new Maia-100 AI accelerator.  We work closely with ML researchers and developers to optimize and scale out model training and inference.  We work directly with OpenAI on the models hosted on the Azure OpenAI service.

We are hiring a Principal Software Engineer to work on graphics processing unit (GPU) performance analysis and optimization.  As a member of this team, you will have the opportunity to work on the fundamental abstractions, programming models, runtimes, libraries and application programming interfaces (APIs) to enable large scale training and inferencing of models on novel AI hardware.

 

This is a technical role: it requires hands-on software design and development skills. We’re looking for someone who has a demonstrated history of solving hard technical problems and is motivated to tackle the hardest problems in building a full end-to-end AI stack. An entrepreneurial approach and the ability to take initiative and move fast are essential.

 

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required/Minimum Qualifications 

  • Bachelor's Degree in Computer Science, or related technical discipline AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

    • OR equivalent experience.

  • 4+ years' experience with C/C++
  • 4+ years’ practical experience working on real-world applications that use GPUs, including experience optimizing GPU kernels for performance

 

Other Requirements:

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

 

Preferred/Additional Qualifications 

  • Bachelor's Degree or advanced degree in computer engineering, computer science, or related fields, and 10+ years of software development experience
  • Experience analyzing low-level program behavior, including performance and memory usage, and proficiency with profiling tools such as NVIDIA Visual Profiler, nvprof, and NVIDIA Nsight Compute
  • Technical background and foundation in software engineering principles, architecture design, and performance analysis
  • Intellectual curiosity and a passion for learning new technologies
  • Exposure to state-of-the-art Deep Neural Network training and inference workloads, including research techniques
  • Great cross-team collaboration skills and the desire to collaborate in a team of researchers and developers
Software Engineering IC6 - The typical base pay range for this role across the U.S. is USD $161,600 - $286,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $209,600 - $314,400 per year.
  
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
   
Microsoft will accept applications and process offers for these roles on an ongoing basis.

 

 

Responsibilities

  • Collaborate broadly across multiple disciplines from hardware designers to ML developers
  • Engage with key partners to understand and implement robust performance analysis and optimization for state-of-the-art large language models (LLMs) and other models.
  • Perform software development in C/C++, Python, and GPU development in languages such as CUDA, ROCm, or Triton.
  • Identify requirements, scope solutions, estimate work, schedule deliverables.
  • Embody our culture and values
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
$161.6K - $314.4K/yr (Outscal est.)
$238.0K/yr avg.
Redmond, Washington, United States

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
