Data Engineer - Python & Databricks


Project description

As a Data Engineer, you will design, develop, and maintain data pipelines using Python and Databricks to process large-scale datasets. You will collaborate with data scientists, analysts, and business stakeholders to gather data requirements and build efficient, scalable solutions that enable advanced analytics and reporting.

Responsibilities

Data Pipeline Development: Design, develop, and implement scalable data pipelines using Python and Databricks for batch and real-time data processing.

ETL Processes: Build and maintain ETL (Extract, Transform, Load) processes to gather, transform, and store data from multiple sources.
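The extract-transform-load pattern above can be sketched in miniature with pure Python. The source rows, field names, and the SQLite sink are hypothetical stand-ins; a production pipeline on Databricks would read from real sources into Spark DataFrames and write to Delta tables.

```python
# Minimal ETL sketch in pure Python; all names and fields are illustrative.
import sqlite3

def extract():
    # Stand-in for reading from an upstream source (files, an API, Kafka, ...).
    return [
        {"id": 1, "amount": "10.50", "region": " EU "},
        {"id": 2, "amount": "7.25", "region": "US"},
    ]

def transform(rows):
    # Normalize types and trim stray whitespace before loading.
    return [(r["id"], float(r["amount"]), r["region"].strip()) for r in rows]

def load(rows, conn):
    conn.execute("CREATE TABLE IF NOT EXISTS sales (id INTEGER, amount REAL, region TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone())  # (2, 17.75)
```

Keeping extract, transform, and load as separate functions makes each stage testable in isolation, which is the same decomposition a Spark pipeline would use.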

Data Integration: Integrate structured and unstructured data from various internal and external sources into data lakes or warehouses, ensuring data accuracy and quality.

Collaboration: Work closely with data scientists, analysts, and business teams to understand data needs and deliver efficient solutions.

Performance Optimization: Optimize the performance of data pipelines and workflows to ensure efficient processing of large data sets.

Data Validation: Implement data validation and monitoring mechanisms to ensure data quality, consistency, and reliability.
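A minimal sketch of row-level validation with quarantining; the `id`/`amount` rules are hypothetical. On Databricks this kind of check is typically expressed as Delta Live Tables expectations or via a dedicated framework such as Great Expectations.

```python
# Minimal row-level validation sketch; the field rules are hypothetical.
def validate(rows, required=("id", "amount")):
    """Split rows into valid records and quarantined errors
    instead of silently dropping bad data."""
    valid, errors = [], []
    for i, row in enumerate(rows):
        missing = [f for f in required if row.get(f) is None]
        if missing:
            errors.append((i, f"missing fields: {missing}"))
        elif row["amount"] < 0:
            errors.append((i, "negative amount"))
        else:
            valid.append(row)
    return valid, errors

rows = [
    {"id": 1, "amount": 5.0},
    {"id": 2, "amount": -3.0},    # fails the non-negative rule
    {"id": None, "amount": 1.0},  # fails the required-field rule
]
valid, errors = validate(rows)
print(len(valid), len(errors))  # 1 2
```

Returning the failures alongside the valid rows lets the pipeline route bad records to a quarantine table for monitoring rather than losing them.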

Cloud Integration: Work with cloud platforms like AWS, Azure, or Google Cloud to build and maintain data storage and processing infrastructure.

Automation & Scheduling: Automate data pipelines and implement scheduling mechanisms to ensure timely and reliable data delivery.
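Reliable scheduling usually pairs a trigger (a Databricks Job schedule, Airflow, or cron) with retry logic for transient failures. A minimal retry-with-backoff sketch, where `flaky_pipeline` is a hypothetical stand-in for a pipeline run:

```python
# Retry-with-backoff sketch for an automated pipeline run; in production
# the trigger would be a Databricks Job schedule or an orchestrator.
import time

def run_with_retries(task, max_attempts=3, base_delay=1.0):
    """Run `task`, retrying on failure with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

calls = {"n": 0}
def flaky_pipeline():
    # Hypothetical task that succeeds on the third attempt.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "loaded"

print(run_with_retries(flaky_pipeline, base_delay=0.01))  # loaded
```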

Documentation: Maintain comprehensive documentation for data pipelines, processes, and best practices.

Skills

Must have

5+ years of experience as a Data Engineer with strong expertise in Python.

Bachelor's degree in Computer Science, Data Engineering, or a related field (or equivalent experience).

Hands-on experience with Databricks or similar big data platforms.

Strong understanding of data pipelines, ETL processes, and data integration techniques.

Experience with cloud platforms such as AWS, Azure, or Google Cloud, particularly with data lake storage services such as Amazon S3 or Azure Blob Storage.

Proficiency in SQL and experience with relational and non-relational databases.

Familiarity with big data technologies like Apache Spark, Kafka, or Hadoop.

Strong understanding of data modeling, data warehousing, and database design principles.

Ability to work with large, complex datasets, ensuring data integrity and performance optimization.

Experience with version control tools like Git and CI/CD pipelines for data engineering.

Excellent problem-solving skills, attention to detail, and the ability to work in a collaborative environment.

Nice to have

Experience with Delta Lake, Lakehouse architecture, or other modern data storage solutions.

Familiarity with machine learning and data science workflows.

Experience with DevOps or DataOps practices.

Knowledge of Terraform, Docker, or Kubernetes for cloud infrastructure automation.

Familiarity with data governance, data privacy regulations (e.g., GDPR, CCPA), and data security best practices.

Other

Languages

English: B2 Upper Intermediate

Seniority

Regular


About The Company

Luxoft, a DXC Technology Company (NYSE: DXC), is a digital strategy and software engineering firm providing bespoke technology solutions that drive business change for customers the world over. Acquired by U.S. company DXC Technology in 2019, Luxoft is a global operation in 44 cities and 21 countries with an international, agile workforce of nearly 18,000 people. It combines a unique blend of engineering excellence and deep industry expertise, helping over 425 global clients innovate in the areas of automotive, financial services, travel and hospitality, healthcare, life sciences, media and telecommunications.

DXC Technology is a leading Fortune 500 IT services company that helps global companies run their mission-critical systems. Together, DXC and Luxoft offer a differentiated customer-value proposition for digital transformation, combining Luxoft's front-end digital capabilities with DXC's expertise in IT modernization and integration.

