About the job
Summary
This role involves designing, developing, and optimizing data pipelines using PySpark within an AWS ecosystem for a US-based B2B marketplace company. Responsibilities include leveraging AWS services (S3, Glue, EMR, Lambda, Redshift) to build scalable data solutions, optimizing PySpark workflows for performance and cost-efficiency, collaborating with data scientists and analysts, ensuring data quality and integrity, implementing data governance and security best practices, and documenting technical processes. The ideal candidate will have 4+ years of experience in data engineering with a focus on PySpark and AWS, proficiency in Python, a solid understanding of distributed computing, and excellent communication skills.
Flexible working format (remote, office, or hybrid)
Competitive salary and benefits package
Personalized career growth
Professional development opportunities
Education reimbursement
Corporate events and team-building activities
Role Overview: As a Data Engineer on our Development Team, you will design, develop, and optimize data pipelines within an AWS ecosystem for a US-based B2B marketplace company. Your expertise in PySpark will be instrumental in processing large-scale datasets, ensuring the reliability and performance of our data systems. You will collaborate with cross-functional teams, including data scientists and analysts, to deliver high-impact solutions that support business objectives.
Key Responsibilities:
Design, develop, and implement data pipelines using PySpark within AWS environments (see the illustrative sketch after this list)
Leverage AWS services such as S3, Glue, EMR, Lambda, and Redshift for building scalable data solutions
Optimize PySpark workflows for performance, reliability, and cost-efficiency
Collaborate with stakeholders to understand data requirements and translate them into technical solutions
Ensure data quality and integrity through robust testing and monitoring processes
Implement data governance, security, and compliance best practices in all development activities
Document technical designs, processes, and workflows to support ongoing maintenance and team knowledge sharing
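For illustration only, here is a minimal sketch of the kind of PySpark pipeline these responsibilities describe, assuming a simple batch job that reads raw order data from S3, applies a basic data-quality filter and aggregation, and writes partitioned output back to S3 for downstream loading into Redshift. All bucket names, paths, and column names below are hypothetical placeholders, not details taken from the posting.

# Minimal sketch of a PySpark batch pipeline on AWS (hypothetical names throughout)
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders_daily_aggregation")  # hypothetical job name
    .getOrCreate()
)

# Read raw order events from S3 (placeholder path; the s3:// scheme assumes EMR/Glue)
orders = spark.read.parquet("s3://example-raw-bucket/orders/")

# Basic data-quality filter and a daily aggregation per seller
daily_totals = (
    orders
    .filter(F.col("order_total").isNotNull() & (F.col("order_total") > 0))
    .withColumn("order_date", F.to_date("created_at"))
    .groupBy("seller_id", "order_date")
    .agg(
        F.count("*").alias("order_count"),
        F.sum("order_total").alias("gross_revenue"),
    )
)

# Write partitioned results back to S3 for downstream loading into Redshift
(
    daily_totals
    .write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-curated-bucket/daily_seller_totals/")
)

spark.stop()

In practice a job like this would typically run on AWS Glue or EMR, with the curated S3 output loaded into Redshift via a COPY statement or a Glue connection.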
Requirements:
Bachelor’s degree in Computer Science, Engineering, or a related field
4+ years of experience in data engineering, with a focus on building and optimizing data pipelines using PySpark
Strong experience with AWS services, including S3, Glue, Lambda, EMR, and Redshift
Proficiency in Python programming and familiarity with related frameworks and libraries
Solid understanding of distributed computing and experience with Apache Spark
Hands-on experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation) is a plus
Strong analytical and problem-solving skills, with attention to detail and a proactive approach to troubleshooting
Excellent communication and collaboration skills, with the ability to work in a dynamic, team-oriented environment
Upper-Intermediate level of English
Advanced or higher level of Ukrainian
We offer:
Flexible working format: remote, office-based, or hybrid
Competitive salary and compensation package
Personalized career growth
Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
Active tech communities with regular knowledge sharing