Platform Engineer

4 Months ago • 6-10 Years • DevOps

Job Summary

Job Description

Cosm seeks a Platform Engineer to design, implement, and monitor its operations center infrastructure. Expertise in Grafana, Prometheus, Loki, and Tempo is crucial, along with strong knowledge of cloud platforms such as Azure and AWS. Experience with virtualization and containerization technologies such as Docker and Kubernetes is essential.
Must have:
  • Grafana, Prometheus
  • Azure, AWS
  • Docker, Kubernetes
  • Platform Engineer
Good to have:
  • Hyper-V, VMware
  • Pulumi, Terraform
  • Ansible, Puppet
  • Windows Server
Perks:
  • Hybrid Work
  • Global Company

Job Details

About the job

Cosm is a global technology company that brings experiences to life in immersive environments. We help our partners create spaces and content that blur the lines of real and virtual across three primary markets: Sports and Entertainment, Science and Education, and Parks and Attractions. Cosm was born from the fusion of some of the greatest innovators in the history of technology. Evans & Sutherland, Spitz, Inc., and Cosm Immersive combined forces to power the immersive experiences of the future as Cosm. Innovation is in our DNA.

Summary

As a Platform Engineer, you will play a pivotal role in designing, implementing, automating, and maintaining the technology infrastructure that supports our organization's operations center. You will be responsible for designing robust, scalable, and resilient platforms that facilitate real-time monitoring, analysis, and decision-making processes critical to our business and product operations.

You will liaise with product and engineering teams to ensure applications and microservices support telemetry ingestion for actionable alerting and historical data graphing, thus building a continuous feedback loop for platform and product reliability.

The ideal candidate is a solutions-oriented person who can learn new technologies quickly and who can become competent with all layers of the development platform. They should be willing to roll up their sleeves and be familiar with various technologies but know how to choose the best technology for the job. Ideally, they are familiar with SaaS, live entertainment and broadcast as well as digital, tech, and streaming media. If you think you have the skills and are up for the challenge, consider this your calling.

Responsibilities

  • Monitoring and Alerting: Design and automate robust monitoring and alerting mechanisms to ensure the health, performance, and availability of the operations center platform, products and associated infrastructure components.
  • Application Monitoring: Work with software engineering and product teams to best understand how to monitor their applications and microservices.
  • Infrastructure Deployment: Collaborate with infrastructure teams to deploy and configure the necessary hardware and software components to support the operations center platform, including servers, networks, databases, and monitoring tools.
  • Documentation and Training: Create comprehensive documentation, diagrams, and guides to facilitate system understanding, troubleshooting, and knowledge transfer. Provide training and support to operations center staff on platform usage and best practices.
  • Collaboration and Stakeholder Management: Collaborate closely with cross-functional teams, including product, operations, IT, security, and business units, to understand requirements, gather feedback, and align observability platform architecture with organizational goals and priorities.
  • Incident Management: Participate in an on-call rotation to troubleshoot and resolve incidents, working closely with the support team to ensure prompt resolution.
  • Continuous Learning: Stay informed about industry trends and emerging technologies related to Windows Server, on-premises infrastructure, and Azure and AWS cloud platforms.
  • Leadership: Provide technical guidance and mentorship to junior team members as needed.
  • Communication: Exemplify excellent written and verbal communication skills and the ability to deftly tailor technical communications to any audience.
  • Be Audacious: Push the limits, try new technologies, take calculated risks, swing for the fences, and proactively search for the best solutions and ideas in the marketplace.

Experience

  • Bachelor's or Master's degree in Computer Science, Information Technology, or a related field, or relevant work experience.
  • 6+ years of proven experience as a platform engineer, site reliability engineer, systems engineer or a similar role, with a focus on designing, implementing and monitoring the health of complex, distributed systems.
  • Expert-level knowledge of Grafana, Prometheus, Loki, and Tempo.
  • Familiarity with scripting languages for automation and configuration management. PowerShell and Bash are paramount.
  • Strong understanding of cloud computing concepts and hands-on experience with Azure and/or AWS.
  • Experience with virtualization and containerization technologies such as Hyper-V, VMware, Amazon EC2, Docker, and Kubernetes.
  • Experience using Pulumi, Terraform and/or other IaC tools.
  • In-depth knowledge of Windows Server operating systems (2016/2019/2022), including installation, configuration, and troubleshooting.
  • Familiarity with Linux automation with tools such as Ansible or Puppet is a plus.
  • Expertise in data retrieval technologies, including constructing efficient PromQL, GraphQL & LogQL queries.
  • Solid understanding of networking principles and protocols.
  • Excellent problem-solving and troubleshooting skills, with keen attention to detail.
  • Strong communication and interpersonal skills, with the ability to collaborate effectively with clients and team members.
  • Driven to automate your processes, test continually, and document your work.
  • You’re not afraid of an open, candid, and respectful work environment.
  • Experience working with a cross-functional, distributed team from concept through completion and future iterations, including agile methodologies.
  • Excellent time management skills.

Preferred Qualifications

  • Certifications in cloud platforms (e.g., AWS Certified Solutions Architect, Azure Solutions Architect) or similar.

Cosm is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
