PySpark Engineer

2 Months ago • All levels • Data Analyst

Job Summary

Job Description

This role involves designing, developing, and maintaining ETL pipelines using PySpark, optimizing for performance and scalability. The PySpark Engineer will work with large structured and unstructured datasets, transforming data to meet business needs and integrating data from multiple sources. Collaboration with cross-functional teams is key to understanding data requirements and translating them into efficient workflows. Responsibilities include implementing data governance best practices, debugging pipelines, improving performance and reliability, and providing documentation and training. The project focuses on a high-impact data engineering initiative, delivering data-driven insights for business decisions.
Must have:
  • Proficiency in PySpark
  • Strong SQL knowledge
  • Data Warehousing Concepts
  • Cloud Platform experience (AWS, GCP, Azure)
  • Big Data Technologies (Hadoop, Spark)
  • Data Modeling experience
  • Strong Python skills
Good to have:
  • Airflow or other orchestration tools
  • Apache Kafka knowledge
  • Data visualization tools (Tableau, Power BI)
  • Machine learning familiarity
  • Agile methodology experience
  • Data governance and compliance knowledge

Job Details

Project description

We are looking for skilled PySpark Engineers to join our team, working on a high-impact data engineering project. The project involves processing large datasets, optimizing ETL pipelines, and building scalable solutions to manage complex data workflows. The ideal candidate will collaborate closely with data scientists, data analysts, and software engineers to drive robust, data-driven insights for business decisions.

Responsibilities

Design, develop, and maintain ETL pipelines using PySpark, optimizing for performance and scalability.

Work with large volumes of structured and unstructured data, transforming data to meet business needs.

Integrate data from multiple sources into the data platform, ensuring data integrity and quality.

Collaborate with cross-functional teams to understand data requirements and translate them into efficient data workflows.

Implement best practices for data governance, monitoring, and data security.

Debug and troubleshoot issues across ETL pipelines and data workflows.

Continuously improve performance, scalability, and reliability of existing data pipelines.

Provide documentation and training for data workflows and processes.

Skills

Must have

Proficiency in PySpark: In-depth experience with PySpark for data processing and transformation tasks.

SQL Knowledge: Strong command of SQL for querying and processing data.

Data Warehousing Concepts: Familiarity with data warehousing, data lakes, and data integration principles.

Cloud Platforms: Experience with cloud environments like AWS, GCP, or Azure for data storage and processing.

Big Data Technologies: Hands-on experience with the Hadoop and Spark ecosystems (Spark SQL, Spark Streaming).

Data Modeling: Experience in designing and implementing efficient data models.

Python Programming: Strong Python skills, particularly in data manipulation and analysis.

Nice to have

Experience with Airflow or Other Orchestration Tools: Knowledge of workflow orchestration tools for scheduling and monitoring data pipelines.

Knowledge of Apache Kafka: Understanding of Kafka for real-time data streaming and integration.

Familiarity with Data Visualization Tools: Knowledge of visualization tools like Tableau, Power BI, or similar.

Machine Learning Exposure: Familiarity with machine learning concepts, particularly integrating ML models into data workflows.

Agile Methodology: Experience working in Agile/Scrum environments.

Data Governance and Compliance Knowledge: Understanding of data governance frameworks and compliance standards, such as GDPR.

Other

Languages

English: C1 Advanced

Seniority

Senior

About The Company

Luxoft, a DXC Technology Company (NYSE: DXC), is a digital strategy and software engineering firm providing bespoke technology solutions that drive business change for customers the world over. Acquired by U.S. company DXC Technology in 2019, Luxoft is a global operation in 44 cities and 21 countries with an international, agile workforce of nearly 18,000 people. It combines a unique blend of engineering excellence and deep industry expertise, helping over 425 global clients innovate in the areas of automotive, financial services, travel and hospitality, healthcare, life sciences, media and telecommunications.

DXC Technology is a leading Fortune 500 IT services company that helps global companies run their mission-critical systems. Together, DXC and Luxoft offer a differentiated customer-value proposition for digital transformation by combining Luxoft’s front-end digital capabilities with DXC’s expertise in IT modernization and integration. Follow our profile for regular updates and insights into technology and business needs.
