Principal Data Software Engineer - Databricks (Remote)
About the Role
We are looking for a Principal Data Software Engineer to join our team remotely. This role offers the opportunity to become a pivotal member of our expert team, making significant contributions in a collaborative environment. In this remote role, you will bring deep expertise in Databricks and a proactive, open-minded approach to your work.
What You'll Do
- Design, develop, and maintain scalable data pipelines and robust data architectures.
- Optimize data models and ETL processes using Databricks and complementary technologies.
- Implement data quality checks and monitoring systems to maintain high data integrity.
- Stay current with emerging trends and technologies in data engineering, advocating for the integration of new tools.
- Troubleshoot and resolve data-related issues efficiently.
- Conduct code reviews to ensure high standards of code quality are maintained.
- Drive technology initiatives, including design work, proofs of concept, and research and development.
- Streamline project processes to enhance data engineering practices.
- Foster transparent and effective communication with team members and clients to justify and discuss technical solutions.
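To make the data quality responsibility above concrete, here is a minimal sketch in plain Python of the kind of check a pipeline monitor might run; the column names, sample records, and null-rate threshold are illustrative assumptions, not part of the role description:

```python
# Minimal data quality check: flag a dataset when too many records are
# missing a required column. Threshold and field names are hypothetical.

def null_rate(records, column):
    """Fraction of records where `column` is missing or None."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(column) is None)
    return missing / len(records)

def check_quality(records, column, max_null_rate=0.05):
    """Return (passed, rate) so a monitoring system can alert on failures."""
    rate = null_rate(records, column)
    return rate <= max_null_rate, rate

# Illustrative usage with made-up order records:
orders = [
    {"order_id": 1, "customer_id": "a1"},
    {"order_id": 2, "customer_id": None},
    {"order_id": 3, "customer_id": "c3"},
    {"order_id": 4, "customer_id": "d4"},
]
passed, rate = check_quality(orders, "customer_id", max_null_rate=0.05)
print(passed, rate)  # one of four records lacks customer_id, so the check fails
```

In practice such checks run inside the pipeline itself (for example as Spark jobs or Databricks expectations), but the shape is the same: compute a metric, compare it to a threshold, and surface failures to monitoring.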
Requirements
- Proven expertise in Spark, using either Scala or PySpark.
- Strong background in data architecture, data modeling, and building ETL pipelines using Databricks.
- Experience with multiple SDLC phases and technical leadership over complex implementations.
- Proficiency in cloud-native technologies and software engineering best practices including unit testing and linting.
- Engineering background in at least one major cloud platform: AWS, Azure, or GCP.
- Skillful in performance optimization for data-intensive applications.
- Keen on technological advancements and modernizing legacy systems.
- Proven ability to present and advocate for technical solutions to stakeholders.
- Independent problem-solving skills and comfort with ambiguity.
- Highly proactive, with demonstrated client-facing experience.
- Fluent English communication skills, minimum B2+ level.
Nice to Have
- Experience with additional data processing frameworks.
- Familiarity with machine learning concepts.
- Knowledge of data governance and compliance standards.
What We Offer
- Private health insurance.
- EPAM Employees Stock Purchase Plan.
- 100% paid sick leave.
- Referral Program.
- Professional certification opportunities.
- Language courses.
- Flexible work options with 24 working days of annual leave and paid time off for numerous public holidays.
- Continuous learning culture with internal training, mentorship, and sponsored certifications.
This Principal Data Software Engineer role at EPAM offers a unique opportunity to work remotely while optimizing data pipelines and architectures using Databricks. Enjoy a supportive work culture with excellent benefits.
Who Will Succeed Here
- Proficiency in Databricks and Apache Spark, with hands-on experience building ETL processes in Scala and PySpark and optimizing data workflows for large-scale processing.
- A self-motivated work ethic suited to a remote environment, with a track record of collaborating effectively with cross-functional teams using tools such as Git and JIRA.
- A deep understanding of cloud data architecture across AWS, Azure, and GCP, with experience implementing best practices for cloud resource management and data security, and a strategic mindset for scalable solutions.