Senior Site Reliability Engineer

4 Months ago • 5-10 Years • DevOps

Job Summary

Job Description

Join our SRE team and design, implement, and maintain highly scalable and reliable systems in a cloud-native environment. Candidates must have experience with AWS, Kubernetes, and observability tools such as Prometheus and Grafana. Experience with automation tools such as Terraform and Jenkins is essential.
Must have:
  • AWS Experience
  • Kubernetes Expertise
  • Observability Tools
  • Automation Tools
Good to have:
  • Database Technologies
  • Performance Tuning
  • Incident Management
  • Capacity Planning
Perks:
  • Hybrid Work Model
  • Dynamic Team

Job Details

About the job

Why Lytx:

Join our dynamic and passionate team of driven, low-ego engineers who are at the forefront of designing and supporting cutting-edge IoT infrastructure. As we rapidly grow and transition to the cloud, we're diving into the exciting realms of "Operations as Code," "Infrastructure as Code," and innovative infrastructure automation.

Our Site Reliability Engineering (SRE) team is pivotal in ensuring the availability, reliability, observability, and resilience of Lytx's services, both on-premises and in the cloud. We're not just keeping the lights on; we're engineering the future of our business's continuity.

If you're energized by crafting transformative solutions and excel at designing robust, detailed cloud infrastructure with a focus on continuous improvement, this could be the perfect role for you!

Responsibilities:

  • System Design and Architecture: Design, implement, and maintain scalable and reliable systems, ensuring they can handle both current and future demands.
  • Incident Management: Lead incident response efforts, diagnose root causes, and implement long-term solutions to prevent recurrence. Ensure effective communication during outages.
  • Monitoring and Observability: Develop and maintain comprehensive monitoring and alerting systems to proactively identify and address issues before they impact users.
  • Automation and Efficiency: Automate repetitive tasks and processes to improve operational efficiency and reduce manual intervention.
  • Performance Tuning: Continuously optimize system performance, including fine-tuning applications, databases, and infrastructure to meet service level objectives (SLOs).
  • Capacity Planning: Forecast future system requirements based on growth trends and current usage, and plan capacity upgrades to ensure system reliability.
  • Collaboration and Mentoring: Work closely with development teams to integrate reliability into the software development lifecycle. Mentor junior SREs and share best practices.
  • Documentation and Knowledge Sharing: Create and maintain detailed documentation on system design, incident response procedures, and operational practices to ensure knowledge is preserved and accessible.

Requirements:

  • 5+ years of experience as an SRE within AWS environments at medium to large-scale organizations.
  • 3+ years of hands-on experience implementing and managing observability tools, such as Prometheus, New Relic, Grafana, or similar.
  • Advanced programming skills in Python, Groovy, and Bash.
  • Strong understanding of database technologies, including both SQL and NoSQL systems.
  • 3+ years of experience developing and managing infrastructure deployment pipelines using Git, Terraform, Helm, Jenkins/Jenkins X/ArgoCD, or similar tools.
  • Proven expertise in designing, evaluating, and supporting production environments in AWS, including VPCs, EKS, IAM, AMIs, EC2, CloudWatch, CloudTrail, Control Tower, GuardDuty, MSK, S3, Glacier, Gateways, Direct Connect, Route 53, RDS, ALBs, Auto Scaling, and more.
  • Hands-on experience with Linux systems and protocols and technologies such as HTTP, REST, TCP/IP, SSL, DNS, SMTP, SSH, NTP, Load Balancing, SQL/NoSQL, Message Brokers, Nginx, Vault, etc.
  • Extensive experience with Kubernetes and various container and cloud-native technologies.
  • Significant experience in managing 24/7 on-call rotations, creating runbooks, establishing support procedures, and proactively monitoring systems across multiple geographic locations.
  • Ability to thrive under pressure and excel in a technically challenging environment.

About The Company

Lytx is the global leader in fleet management technologies. Our solutions harness the power of video to empower drivers and fleets to be safer and more efficient, productive, and profitable so they can thrive in today’s competitive environment. Through the Lytx platform, direct and reseller clients access our customizable services and programs spanning driver safety, risk detection, fleet tracking, compliance, preventative maintenance, and fuel management. Using the world’s largest driving database of its kind, along with proprietary machine vision and artificial intelligence technology, we help protect and connect thousands of fleets and 1.6 million drivers in more than 60 countries worldwide. Lytx is privately held and headquartered in San Diego, California. For more information, visit us at Lytx.com.


The Surfsight™ solution is Lytx's indirect market offering, available in North America and internationally. Strategic partners and resellers can use Surfsight's open API platform to easily add video to their telematics stack or utilize our stand-alone Surfsight Cloud dashboard to allow fleet managers to track vehicles, view risky and distracted driving events, retrieve videos from the field, and view and analyze data. The innovative technology in the Surfsight dash cam, powered by Lytx, uses robust machine vision and artificial intelligence to proactively detect and mitigate risk. It provides detailed analytics and real-time visibility into overall fleet performance, giving companies valuable data to help increase safety and savings through better fleet management. The solution offers an accessible entry point into video telematics without compromising on features, functionality, and configuration options. For more information, visit https://www.lytx.com/en-us/surfsight.



Bengaluru, Karnataka, India (Hybrid)
