Nintendo Technology Development
The Machine Learning Operations Engineer for the Nintendo Technology Development Inc. (NTD) organization will be responsible for building and maintaining infrastructure and highly available services for ML research and development. This role will work closely with IT, Engineering, and Data Science teams.
- Design and implement the engineering infrastructure and the data pipelines to support machine learning systems for our engineering and data science teams.
- Take the offline models that data scientists build and turn them into production machine learning systems.
- Develop and deploy scalable tools and services to handle machine learning training and inference.
- Identify and evaluate new technologies to improve performance, maintainability, and reliability of our machine learning systems.
- Apply software engineering rigor and best practices to machine learning, including CI/CD, automation, etc.
- Support model development, with an emphasis on auditability, versioning, and data security.
- Facilitate the development and deployment of proof-of-concept machine learning systems.
- Communicate with engineering and data science teams to build requirements and track progress.
- Up to 10% travel, domestic and international.
- 3-5 years’ experience building end-to-end systems as a Platform Engineer, ML Ops Engineer, or Data Engineer (or equivalent).
- Strong software engineering skills in complex, multi-language systems.
- Fluency in Python.
- Experience with Linux system administration.
- Experience deploying and maintaining container orchestration platforms like Kubernetes.
- Experience developing containers in cloud computing environments.
- Experience working with cloud computing and database systems.
- Experience building custom integrations between cloud-based systems using APIs.
- Experience developing and maintaining ML systems built with open-source tools.
- Familiarity with data-oriented workflow orchestration frameworks (Kubeflow, Airflow, Argo, etc.).
- Strong understanding of software testing, benchmarking, and continuous integration.
- Exposure to machine learning methodology and best practices.
- Exposure to deep learning approaches and modeling frameworks (PyTorch, TensorFlow, Keras, etc.).
- Ability to translate business needs to technical requirements.
- Fluency in Japanese is a plus.
- BS or MS in engineering, computer science, a related field, or equivalent combination of education and experience.
- Valid passport may be required.
This position is onsite in Redmond, WA, and is not open to remote work at this time.