Senior Site Reliability Engineer

1 Month ago • 5 Years +

About the job

Summary by Outscal

Senior SRE with 5+ years of experience in cloud and on-prem SRE design and implementation, with expertise in monitoring, configuration management, and automation.

Responsibilities:

About Tencent Overseas IT:
Tencent Overseas IT has the mission to empower Tencent’s rapid global growth with future-ready, global IT platforms, applications, and services. We are chartered to lead the Overseas IT strategy, architecture, roadmap, and execution. Satisfying our internal/external customers and becoming a world-class global IT team are our top aspirations.


We are seeking a Sr. Site Reliability Engineer with extensive cloud and on-prem SRE design and implementation experience.

Duties and Responsibilities:
This senior role will work closely with our internal IT and cloud providers to design the best global SRE architecture and solution in the cloud. This role will also support the studio’s infrastructure, the game publishing infrastructure, and their evolution to the cloud. Our customers include internal or acquired gaming studios, game publishing services, innovative offices/workplaces, various business groups, and external customers. The work scope will include understanding the internal customers’ business requirements, collecting the technical requirements, developing reference architecture and prototypes based on industry best practices, leading implementation and deployment for global locations, and troubleshooting issues when necessary.

For this SRE job, you will:
• Design, implement, and support the operations and reliability of large-scale, cloud-enabled studios, with a focus on performance at scale and real-time monitoring, logging, analysis, and alerting.
• Maintain services once they go live by measuring and monitoring availability, latency, and overall system health.
• Design and develop robust and scalable products and tools to enhance operational efficiency.
• Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
• Participate in incident response and troubleshooting efforts to minimize downtime and ensure system reliability.
• Maintain project and product documentation and knowledge.
• Be part of an on-call rotation to support production systems (if needed).


Based in Shanghai, China, this person will work closely with the global IT team and HQ teams.

Whom we are looking for:

  • A quick learner
  • A positive, self-motivated, and passionate person
  • Independent, insistent, and open-minded
  • A great team player who is both dependable and autonomous
  • Customer-oriented and able to work at a very fast pace

Requirements:


  • 5+ years of experience with infrastructure automation and distributed systems design, including designing and developing tools for running large-scale private or public cloud systems in production
  • In-depth knowledge and understanding of monitoring concepts, alert mechanisms, log monitoring, anomaly detection, and the creation and setup of dashboards
  • In-depth knowledge of and experience with Elasticsearch and Prometheus
  • Expertise in configuration management with frameworks such as Ansible, Terraform, or Helm
  • Proficiency with programming languages such as Python and Golang, as well as shell scripting, to automate tasks
  • Passion for infrastructure and monitoring as code
  • Bachelor’s degree (or higher) in Computer Science, Mathematics, or a related science or engineering major
  • Solid understanding of cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes)
  • Good understanding of and hands-on experience with networking is a plus
  • Bilingual preferred (English, Chinese)

About The Company

Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life of people around the world.


Founded in 1998 with its headquarters in Shenzhen, China, Tencent's guiding principle is to use technology for good. Our communication and social services connect more than one billion people around the world, helping them to keep in touch with friends and family, access transportation, pay for daily necessities, and even be entertained.


Tencent also publishes some of the world's most popular video games and other high-quality digital content, enriching interactive entertainment experiences for people around the globe.


Tencent also offers a range of services such as cloud computing, advertising, FinTech, and other enterprise services to support our clients' digital transformation and business growth.


Tencent has been listed on the Stock Exchange of Hong Kong since 2004.
