Data & Applied Scientist II

58 Minutes ago • 1-4 Years • Data Analyst

About the job

Job Description

The Bing Ads Understanding team at Microsoft seeks a Data & Applied Scientist II to optimize ad selection, maximizing revenue, user experience, and advertiser ROI. Responsibilities involve building and maintaining production-level machine learning models using LLMs and cutting-edge techniques. This role requires expertise in NLP, multi-modal modeling, and experience with frameworks like PyTorch and TensorFlow. The candidate will derive insights from massive datasets, design experiments, and communicate findings to stakeholders. The position demands proficiency in Python, managing petabyte-scale datasets, and collaborating with data engineering teams.
Must have:
  • Doctorate or Master's/Bachelor's with experience in relevant field
  • Experience with LLMs and NLP
  • Multi-modal modeling expertise
  • Proficiency in Python, PyTorch, TensorFlow
  • Building and maintaining production ML models
Good to have:
  • Experience with ViT, CLIP, LLaVA
  • Transfer learning, domain adaptation, prompt engineering
  • Experience with large datasets
  • Understanding of model deployment and scaling
Perks:
  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Overview

The online advertising industry is experiencing rapid growth, delivering hundreds of millions of ad impressions daily and generating terabytes of user event data. This expansion presents incredible opportunities alongside complex technical challenges that require advanced computational intelligence. The Bing Ads Understanding team is at the forefront of this dynamic field, tackling these challenges through cutting-edge technologies, including data mining, statistical analysis, machine learning, deep learning, natural language processing, large language modeling, multi-lingual and multi-modality modeling. Our team is looking for a Data & Applied Scientist II to join us in our mission.  


Our mission centers on solving the core problem of computational advertising: selecting an optimized slate of relevant ads that maximizes a comprehensive utility function encompassing expected revenue, user experience, and advertiser return on investment.

As a world-class R&D team of passionate scientists and engineers, we are dedicated to addressing these challenges with innovative ideas and turning them into high-quality products and impactful solutions. We empower hundreds of millions of users to find what they need while enabling advertisers to reach their ideal audiences, creating a seamless marketplace experience that drives success across the board.

 

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required Qualifications:

  • Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field

    o OR Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 1+ year(s) data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results) or consulting experience

    o OR Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 2+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results)

    o OR equivalent experience.

 

 

Preferred Qualifications:

  • Experience with Large Language Models: Demonstrated experience working with LLMs, such as GPT, BERT, or similar models, including knowledge of their strengths, limitations, and capabilities.
  • Understanding of NLP: In-depth knowledge of natural language processing (NLP) techniques and concepts, including tokenization, embeddings, semantic analysis, and their integration into machine learning pipelines.
  • Understanding and Experience with Multi-Modal Modeling: Familiarity and hands-on experience with multi-modal models such as ViT (Vision Transformer), CLIP, and LLaVA. Ability to apply these models in scenarios involving the integration of text and visual data for tasks such as cross-modal understanding, retrieval, relevance, and ranking.
  • Proven ability to work independently within a team to deliver innovative solutions to challenging business and technical problems, from high-level vision and architecture down to quality design and implementation. Self-motivated and self-directed, and able to work constructively with a wide variety of people and teams amid changing business priorities.
  • Understanding of state-of-the-art machine learning and deep learning technologies, in particular hands-on experience with deep learning models (DNN, CNN, RNN, Attention, Transformer) and frameworks (TensorFlow, PyTorch, Keras, etc.).

 

 

Applied Sciences IC3 - The typical base pay range for this role across the U.S. is USD $98,300 - $193,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $127,200 - $208,800 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Microsoft will accept applications for the role until January 7, 2025.

 

 

Responsibilities

Response and Resolution:

  • Leverages understanding of data science and business to examine a project and consider factors that can influence final outcomes within a technical area. Evaluates project plan for resources, risks, contingencies, requirements, assumptions, and constraints. Documents key business objectives. Effectively communicates business goals in analytical and technical terms. Consistently shares insights with stakeholders.
  • Build and maintain production-level machine learning models to assess and predict the relevance between ads and diverse user contexts, such as search queries or conversational interactions. Employ cutting-edge techniques, including large language models (LLMs) and state-of-the-art innovations from academia and industry, to enhance relevance modeling and drive impactful outcomes. Utilize Python, PyTorch, and open-source libraries to train and fine-tune large language models. Apply advanced techniques like transfer learning, domain adaptation, and prompt engineering to tailor pre-trained LLMs to specific advertising scenarios. Build efficient training and inference pipelines for offline and online serving in production environments.

 

Readiness:

  • Understands where to acquire data necessary for successful completion of the project plan. Utilizes querying, visualization, and reporting techniques to describe acquired data, including format, quantity, identities, and other surface properties. Explores data for key attributes and contributes to the development of data quality report describing results of the task, initial findings, and impact on the project. Collaborates with others to perform data-science experiments using established methodologies, statistics, optimization, and probability theory for general purpose software and statistical packages. Assesses different tools and techniques and selects the appropriate one. Serves as an effective partner in data preparation efforts to Solution Architects, Consultants, and Data Engineers. Adheres to Microsoft's privacy policy related to collecting and preparing data. Identifies data integrity problems.
  • Derive meaningful insights and generate hypotheses from massive datasets using a variety of advanced techniques such as machine learning, feature engineering, statistical modeling, and data mining. Leverage methods like regression, classification, natural language processing (NLP), optimization, and p-value analysis to solve complex problems effectively.

 

Product/Process Improvement:

  • Leverages knowledge of machine learning solutions (e.g., classification, regression, clustering, forecasting, natural language processing [NLP], image recognition) and individual algorithms (e.g., linear and logistic regression, k-means, gradient boosting, autoregressive integrated moving average [ARIMA], recurrent neural networks [RNN], long short-term memory [LSTM] networks) to identify the best approach to complete objectives. Understands modeling techniques (e.g., dimensionality reduction, cross-validation, regularization, encoding, ensembling, activation functions) and selects the correct approach to prepare data, train and optimize the model, and evaluate the output for statistical and business significance. Understands the risks of data leakage, the bias/variance tradeoff, methodological limitations, etc. Writes all necessary scripts in the appropriate language: T-SQL, U-SQL, KQL, Python, R, etc. Constructs hypotheses, designs controlled experiments, analyzes results using statistical tests, and communicates findings to business stakeholders. Effectively communicates with diverse audiences on data-quality issues and initiatives. Understands operational considerations of model deployment, such as performance, scalability, monitoring, maintenance, stability, and integration into engineering production systems. Develops operational models that run at scale through partnership with data engineering teams.

 

 

Business Integration:

  • Leverages understanding of data science and business to examine projects through a customer-oriented focus. Manages customer expectations regarding project/product progress and timeline. Takes responsibility to enhance customer excellence. Assists and learns from senior team members. Interprets results, develops insights, and communicates results to customers. Possesses a basic understanding of how model accuracy depends on data quality and is able to articulate it in customer discussions.
  • Manage and manipulate petabyte-scale datasets using a combination of open-source and proprietary tools. Proficiency in programming languages like Python, R, C#, C++, Java, and SQL is highly valued to implement scalable data workflows and pipelines.

 

Other · Embody our

 

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
$98.3K - $208.8K/yr (Outscal est.)
$153.6K/yr avg.
Redmond, Washington, United States


About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.

