Senior Data Engineer Research Data Platform

Updated: 3 days ago
Deadline: 08 May 2024

The goal of the Research Data Infrastructure Lab (RDI Lab) is to support our university's academics throughout the research lifecycle, with the means and expertise they need to design and carry out their research using modern data platforms and tooling. As research matures, the complexity of the technology stack supporting the work also increases. The RDI Lab aims to offer the right technical abstraction for various research profiles by leveraging both industry-hardened and experimental solutions.

The RDI Lab covers a broad landscape in which it is futile to 'do it all'. We are therefore looking for people who can navigate the domain smartly and are capable of discussing the reasoning behind different approaches with our researchers, to support them in their decision-making and to inform our platform design. The RDI Lab is a young team with a lot of room for personal development and growth. We are looking for people who are not afraid to take initiative and are willing to take on a pioneering role in the development of our services. The RDI Lab covers the entire research data lifecycle, from the planning phase through data collection, analysis, and publication to the sharing of data with the FAIR principles in mind. Over the past years, the team has focused on establishing foundational components such as version control systems and CI/CD (GitLab, GitHub), infrastructure for deploying applications (Azure), lab support systems (ClusterMarket), data platforms (Databricks, Microsoft Fabric), Atlassian, and other essential tooling for researchers.

We are starting to develop a Research Data Lakehouse to ingest and make available data from various sources. We also want to offer a Trusted Research Environment to process sensitive data. We are looking for a Data Engineer with a proactive mindset who can collaborate with our multidisciplinary team and contribute to the development and acquisition of these and other systems together with our researchers, institutes, and engineers within the team.

What does the job entail?

  • As a Data Engineer, you will co-create solutions for the Research Data Lakehouse and the Trusted Research Environment.
  • You will collect, load, prepare, and clean data, and provide access to it for a wide range of research stakeholders.
  • You will ensure that the right infrastructure and practices are in place to guarantee data quality and security.
  • You will become an essential part of our RDI Lab team, developing and further operationalizing new tooling and services for our university.

What will you be doing?

  • Working on a data lakehouse implementation based on Azure Databricks, together with our team and occasionally with third-party vendors.
  • Designing, implementing, and maintaining data pipelines for data ingestion, processing, and transformation, for both real-time and batch data.
  • Implementing the required cloud infrastructure with Terraform and CI/CD.
  • Proposing and implementing a cloud-based strategy focused on reducing maintenance and costs.
  • Working with our cloud platform team to harden security.

