




Data Engineer

Job Summary

We are seeking a Data Engineer with experience in Big Data ecosystems to join an international infrastructure migration project, moving from Hadoop to Kubernetes-based cloud environments. You will join a data engineering team responsible for designing, developing, and automating data pipelines, working with technologies such as Spark, Scala, Airflow, and CI/CD tools within an agile environment.

Key Highlights

1. International project in a Big Data environment
2. Participation in data platform modernization projects
3. Modern technological environment

Responsibilities

- Migrate Hadoop infrastructure to the cloud using Kubernetes Engine, COS, Spark as a Service, and Airflow as a Service.
- Develop data transformation and data quality processes to ensure consistency and accuracy.
- Implement data pipelines using Scala, SQL, and Apache Spark.
- Automate processes using Airflow and other orchestration tools.
- Create and maintain CI/CD pipelines for automated deployment and testing.
- Develop unit tests and validate data processes.
- Produce technical and operational documentation.
- Collaborate with business and technology teams to design scalable data solutions.
Technical Requirements

- Experience with Apache Spark and Scala
- Experience with the Hadoop ecosystem
- Proficiency in SQL and NoSQL databases
- Experience with Apache Airflow
- Experience with HDFS
- Experience with CI/CD (GitLab, Jenkins, or similar)
- Knowledge of S3 / COS Storage
- Experience working with Parquet and ORC

Additional Desirable Skills

- Kubernetes / containerization
- Oozie
- Shell scripting
- Dremio
- Elasticsearch / Kibana
- Kafka or streaming processing

We Offer

- International project in a Big Data environment
- Hybrid work model in Madrid (1 day onsite)
- Participation in data platform modernization projects
- Modern technological environment

Keywords: Spark, Scala, SQL, NoSQL, CI/CD, S3, COS Storage


