Site Reliability Engineering Manager

3 Days ago • 1-2 Years • DevOps

Job Summary

Job Description

Wildlife seeks a Site Reliability Engineering Manager to lead a cross-functional team responsible for providing highly available, user-friendly systems. Responsibilities include managing infrastructure clusters (Kubernetes, NATS, etc.), optimizing costs, improving monitoring and observability, leading incident management, automating infrastructure provisioning (Terraform, etc.), and partnering with teams to architect and scale applications using cloud-native best practices. The role requires strong technical skills (Go/Python), experience managing teams, and a passion for automation and high availability.
Must have:
  • Manage SRE team
  • Optimize infrastructure clusters
  • Improve monitoring & observability
  • Lead incident management
  • Automate infrastructure provisioning
  • Experience with Kubernetes & cloud
  • Go/Python coding experience
  • Strong communication skills

Job Details

We're looking for a talented and passionate Site Reliability Engineering Manager to join Wildlife's Cloud Platform team.

As an SRE Manager, your goal is to provide easy-to-use, highly available systems to all the engineers in the company. As the SRE Manager of the Compute team, your main goal is to enable your team to improve the infrastructure services by using and refining our existing automations. You will also contribute to technical and business decisions for new services that support the scalability and usability of the company's infrastructure services, and you will foster the team's career growth, engagement, and retention.

We know that the work we do has a high impact on our company's success and culture. The right person for this position is curious by nature, proactive, loves solving problems, and can thrive in a fast-paced and growing business.

What you'll do

  • Be the manager of a cross-functional team, contributing to the team roadmap and to the growth of its individual contributors;
  • Develop, maintain, and optimize infrastructure clusters (Kubernetes, NATS, etcd, Postgres, MongoDB, Redis, Elasticsearch) and the APIs and automations we use to manage them (Kubernetes operators, Infrastructure as Code, pipelines, CLIs);
  • Analyze costs of infrastructure services and help define and optimize the budget of our infrastructure and game teams;
  • Contribute to improvements in monitoring and observability patterns for infrastructure services;
  • Troubleshoot, manage, and lead incidents in production;
  • Automate and improve infrastructure provisioning by augmenting our Infrastructure as Code or implementing new features and infrastructure services in our internal tools;
  • Help partner teams architect and scale their applications and infrastructure with cloud-native best practices.

What you'll need

We expect our managers to be technical, dedicating around 50% of their time to working alongside the ICs in their day-to-day work and taking an active, participating role in the team's technical roadmap.

  • Experience managing small teams with an infrastructure background;
  • Some level of leadership skills, including people management, communication, project management, talent development, performance management, team effectiveness, agility, hiring, decision making, planning, budgeting, and collaboration;
  • Coding experience in at least one programming language. We work mostly with Go and Python;
  • University degree in a computing-related course such as Computer Engineering, Computer Science, Information Systems, or Systems Analysis and Development, or equivalent market experience;
  • Solid understanding of computing concepts (operating systems, networking, concurrency, memory management, and algorithm analysis);
  • Experience with cloud computing services such as Amazon AWS, Google Cloud, or Microsoft Azure;
  • Experience with Infrastructure as Code tools such as Terraform, Packer, Ansible, and Crossplane;
  • Experience managing Kubernetes clusters and developing Kubernetes operators;
  • Experience automating routine tasks, such as deployments and monitoring setup;
  • Experience with incident management and being on call for production systems and workloads;
  • Strong written and spoken communication skills in English;
  • Experience with complex, large-scale, and highly available systems;
  • Experience with monitoring and telemetry in applications and infrastructure;
  • History of technical leadership and ownership of critical projects, including the mentoring of junior team members.

More about you

  • Player focused. We are player-oriented and infrastructure has a great impact on their experience. You have empathy with our players and focus on ensuring they have an amazing experience. You aim for a top-level infrastructure, guaranteeing the highest availability possible.
  • Automation is key to scaling. We look for engineers who have a history of designing and executing automation projects in order to get rid of manual and repetitive tasks.
  • Calm and pragmatic. When everything seems to be falling apart around you, you have a plan and keep calm.
  • Bleeding edge. You are curious and like to study new technologies, test new solutions, and measure the impact brought by changes. We want to ensure we are using the best stack possible.
  • Metrics oriented. We make decisions based on data and metrics. We measure the results of our tasks against the expected outcome. And we ensure our work has delivered the correct impact on our customers. We believe in ownership and in shipping features end to end.
  • Bar raiser. You want to elevate your team's skills and raise the bar by mentoring your peers, spreading knowledge, being proactive, and acting as a tech lead.

About Wildlife

Wildlife is one of the leading mobile game developers and publishers in the world. We have released more than 60 titles, reaching billions of people around the globe. Here, we create games that will excite, intrigue, and engage our players for years to come!

Equal Opportunity

Wildlife is proud to be an Equal Opportunity and Affirmative Action employer. We do not discriminate based upon race, color, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state, and local law.

We're committed to providing accommodations for candidates with disabilities in our recruiting process.




