Data Integration Engineer
Indeed
Full-time
Onsite
No experience requirement
No degree requirement
C. del Marfull, 11, 08197 Sant Cugat del Vallès, Barcelona, Spain
Description

**Summary:** Join Boehringer Ingelheim's IT Research Development and Medicine – Computational Innovation team to enhance data infrastructure, optimize data flows, and ensure data quality for AI/ML initiatives.

**Highlights:**

1. Shape the future of data-driven decision-making in healthcare.
2. Collaborate with researchers, data scientists, and analysts on AI/ML.
3. Utilize cutting-edge cloud technologies like AWS, Snowflake, and Databricks.

**Data Integration Engineer**

At Boehringer Ingelheim, we believe that Data & AI have the power to transform healthcare and improve the lives of millions of patients and animals. As a key member of the IT Research Development and Medicine – Computational Innovation team, you will join a passionate group of like-minded people dedicated to fostering a strong data and AI culture, delivering key transformation initiatives, and shaping the future of data-driven decision-making across our global organization. Your work will empower our researchers to achieve breakthrough therapies for our patients.

We are seeking a skilled and motivated Data Engineer to join the IT RDM CI Data Excellence team. In this role, you will play a pivotal part in enhancing our data infrastructure, optimizing data flows, and ensuring the availability and quality of strategically critical data assets from both internal and external providers. You will enable fast and reliable data delivery to the right environments, supporting cutting-edge use cases in Artificial Intelligence (AI) and Machine Learning (ML). Collaboration is key: you'll work closely with researchers, data scientists, and analysts to build a consistent and scalable data ecosystem across multiple analytics and AI initiatives.

**Tasks and responsibilities**

* Design, develop, and maintain scalable data pipelines and ETL processes to support data integration and analytics.
* Collaborate with data architects, modelers, and IT team members to help define and evolve the overall cloud-based data architecture strategy, including data warehousing, data lakes, streaming analytics, and data governance frameworks.
* Collaborate with integration engineers, analysts, and other business stakeholders to understand data requirements and deliver solutions.
* Optimize and manage data storage solutions and data integrations (e.g., S3, Snowflake, dbt, SnapLogic), ensuring data quality, integrity, security, and accessibility.
* Leverage Databricks for scalable data processing, analytics, and advanced transformations.
* Implement data quality and validation processes to ensure data accuracy and reliability.
* Develop and maintain documentation for data processes, architecture, and workflows.
* Participate in code reviews and contribute to best practices for data engineering.
* Monitor and troubleshoot data pipeline performance and resolve issues promptly.
* Consulting and analysis: meet with defined stakeholders to understand and analyze their processes and needs, and determine requirements in order to present possible solutions or improvements.
* Technology evaluation: stay up to date with the latest trends in data engineering, cloud technologies, and big data platforms.
* Expert communities: engage actively in internal expert groups to exchange knowledge, mentor junior colleagues, and contribute to improving inefficient processes.
* Cloud-based data solutions: utilize AWS cloud services (e.g., S3, Lambda, Step Functions, KMS, …) to support data engineering workflows, and develop infrastructure as code for data pipelines using tools like Jenkins and AWS CloudFormation.
* Performance optimization: monitor and optimize data pipelines for performance, scalability, and cost efficiency, and troubleshoot and resolve data-related issues in a timely manner.
**Requirements**

* Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
* Proficiency with the Apache ecosystem (Parquet, Iceberg, Spark, Kafka, Airflow).
* Strong hands-on experience with AWS data services (Kinesis, Glue, AppFlow, Lambda, S3).
* Demonstrated experience with Snowflake and dbt (dbt Labs) for building and modeling data pipelines.
* Strong analytical skills working with unstructured datasets.
* Experience with relational SQL and NoSQL databases, preferably Snowflake and/or Databricks.
* Familiarity with data pipeline and workflow orchestration tools.
* Strong project management and organizational skills.
* Excellent English written and verbal communication skills.
* SnapLogic knowledge is a plus.

**Preferred skills**

* Proficiency in scripting languages such as Python or Scala.
* Familiarity with data visualization tools (e.g., Tableau, Power BI, QuickSight).
* AWS Cloud Practitioner, Architecture, Big Data, or Data Analytics certification.

**Nice-to-have qualifications:** AWS Certified Big Data or AWS Certified Solutions Architect certification, experience with Databricks and Snowflake, knowledge of containerization technologies like Docker and Kubernetes, and experience with CI/CD pipelines and DevOps best practices.

\#IamBoehringerIngelheim because… We are continuously working to design the best experience for you. Here are some examples of how we will take care of you:

* Flexible working conditions
* Life and accident insurance
* Health insurance at a competitive price
* Investment in your learning and development
* Gym membership discounts

If you have read this far, what are you waiting for? Apply! We want to know more about you!

Source: Indeed
David Muñoz
Indeed · HR
