




**Job Summary:** We are seeking a Senior Data Engineer specializing in Spark to optimize production pipelines, refactor code, and design efficient large-scale distributed processing.

**Key Highlights:**

1. Solid production experience with Scala + Apache Spark
2. Deep understanding of Spark architecture and optimization
3. Stable, high-impact project

**Senior Data Engineer (Scala & Apache Spark – Production)**
-------------------------------------------------------------

Excelia is a leading multinational firm in Consulting, Technology, and Professional Services, with over 25 years of experience and a presence in more than 50 countries.

**We are looking for a Senior Data Engineer specialized in Spark.**

**Requirements (Mandatory)**
----------------------------

* Solid production experience with **Scala + Apache Spark**
* Deep understanding of **Spark architecture and optimization**
* Ability to analyze performance and refactor code
* High technical autonomy

**Key Experience**
---------------------

* **Advanced Spark Optimization**
    + Catalyst optimizer (logical and physical plans)
    + DAG reading and bottleneck detection
    + Tungsten (efficient memory and serialization)
* **Performance Tuning**
    + Cache/persist, checkpointing, and memory management
    + Shuffle and join elimination (broadcast, co-partitioning)
    + Advanced partitioning, salting, and data skew correction
* **Complex Distributed Processing**
    + Advanced mapPartitions
    + Hierarchical structure processing (trees, forests)
    + Distributed traversal without costly iterations
* **Advanced Snowflake**
    + Clustering, micro-partitioning, pruning, and pushdown
    + Spark-Snowflake connector optimization

**What You Will Do**
---------------

* Optimize Spark pipelines in production
* Refactor code suffering from performance issues
* Design efficient large-scale distributed processing
* Reduce costs, latency, and resource consumption

**We Offer**
-------------

* Stable, high-impact project
* Competitive salary
* Challenging technical environment
* Senior team


