Internship Opportunities

Types of Internships

Bachelor Internship (B.Sc. thesis)
– Focused on one method or one tool
– Richer in coding

Master Internship (M.Sc. thesis)
– Focused on one scientific question
– Less centred on coding (i.e. not the goal of the project), but some statistics required


A minimal amount of bioinformatics knowledge is required before any intern can begin her/his internship project.

The knowledge required varies depending on the project, and it might include:
– Bash
– Python
– R
– Nextflow

The intern will then learn, under supervision, to work in either HPC or Cloud environments.

An onboarding plan has been prepared: it serves to identify any knowledge gaps and to suggest a study plan to follow before the internship begins.


At the moment, the following datasets are available to carry out different laboratory projects:

Disease                          Total Samples   Cases   Controls
Macular Degeneration             3,321           2,185   1,155
Myocardial Infarction            7,298           4,000   3,298
Coronary Artery Disease          1,968           984     984
Inflammatory Bowel Disease       3,798           3,318   480
Amyotrophic Lateral Sclerosis    434             354     80

Additional datasets or phenotypes might be available, depending on the ongoing collaborations in which the intern might be involved.

Project focus

Bachelor project types

Package Development

In a project of this type, we will develop a tool to carry out a specific analysis of a dataset type.

The tool could be:

  • a Python tool (a conda recipe, a small piece of software to be released, or a repository of scripts)
  • an R package
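To give a rough idea of the scale of such a tool, here is a minimal sketch of a small Python command-line program. Everything in it is an assumption made for illustration: the program name `case-fraction`, the functions `case_fraction`, `build_parser` and `main`, and the choice of analysis (the fraction of samples that are cases) are hypothetical and do not describe an existing lab tool.

```python
import argparse


def case_fraction(cases, controls):
    """Return the fraction of samples that are cases."""
    total = cases + controls
    if total == 0:
        raise ValueError("cases + controls must be greater than zero")
    return cases / total


def build_parser():
    """Build the command-line interface (hypothetical option names)."""
    parser = argparse.ArgumentParser(
        prog="case-fraction",
        description="Report the fraction of cases in a case/control dataset.",
    )
    parser.add_argument("--cases", type=int, required=True)
    parser.add_argument("--controls", type=int, required=True)
    return parser


def main(argv=None):
    """Parse arguments, print the case fraction, and return an exit code."""
    args = build_parser().parse_args(argv)
    print(f"{case_fraction(args.cases, args.controls):.3f}")
    return 0


# Example call with an explicit argument list (984 cases and 984 controls).
main(["--cases", "984", "--controls", "984"])
```

Keeping the analysis in a plain function (`case_fraction`) and the argument parsing in `main` makes the same code usable both as an importable module and as a command-line tool, which is one common way to lay out a small package.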

Workflow Development

In this project type, we will assemble an analysis pipeline to carry out a series of tasks, and combine several tools into a single workflow.

For this purpose, we will use the workflow engine Nextflow and release the pipeline in a GitHub repository.

Master project types

Epistasis / Variant interactions

Our lab is interested in the impact that variant combinations might have on human phenotypes, particularly when no single variant can be associated with a disease or complex trait on its own.

We will focus on methods to capture the role of these variants, and apply those methods to one of the datasets we have available.
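The idea of variants that show no association individually, yet matter in combination, can be illustrated with a toy example. The sketch below uses purely synthetic counts (they are invented for illustration and are not taken from any of the lab datasets) and two hypothetical variants, A and B. It compares the case rate per variant with the case rate per variant combination; it is not meant to represent any specific epistasis method.

```python
from collections import defaultdict

# Synthetic counts, invented for illustration only:
# (carrier of variant A, carrier of variant B) -> (cases, controls)
counts = {
    (0, 0): (20, 80),
    (0, 1): (80, 20),
    (1, 0): (80, 20),
    (1, 1): (20, 80),
}


def case_rate(cases, controls):
    """Return the fraction of individuals that are cases."""
    return cases / (cases + controls)


def marginal_case_rates(counts, variant_index):
    """Case rate by carrier status of one variant, ignoring the other."""
    totals = defaultdict(lambda: [0, 0])
    for genotype, (cases, controls) in counts.items():
        status = genotype[variant_index]
        totals[status][0] += cases
        totals[status][1] += controls
    return {status: case_rate(c, n) for status, (c, n) in totals.items()}


marginal_a = marginal_case_rates(counts, 0)
marginal_b = marginal_case_rates(counts, 1)
joint = {genotype: case_rate(*cc) for genotype, cc in counts.items()}

print("Variant A alone:", marginal_a)
print("Variant B alone:", marginal_b)
print("A and B jointly:", joint)
```

In this constructed example the case rate is identical for carriers and non-carriers of each variant taken alone, while the joint table differs strongly between combinations. Real analyses would additionally need a statistical test and correction for the number of variant pairs examined.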

Networks & Data Integration

We are interested in employing and developing methods for cross-phenotype network analysis (multi-layered networks and data integration techniques) to capture pervasive physiological responses, such as inflammation, which might play a crucial role in multiple phenotypes.

By combining diverse traits and analysing them together, we aim to capture players that are yet to be unmasked and that are important for disease evolution and for the penetrance of the underlying genetics.
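As a rough illustration of what analysing several phenotypes together can mean, the sketch below treats each phenotype as one layer of a multi-layered network and looks for nodes that recur across layers. The layer names, the placeholder gene names and the helper functions `nodes_per_layer` and `shared_nodes` are all hypothetical; this is a simplified stand-in and not one of the methods used in the lab.

```python
from collections import Counter

# Hypothetical layers: one edge list per phenotype, with placeholder gene names.
layers = {
    "phenotype_1": [("gene_a", "gene_b"), ("gene_b", "gene_c")],
    "phenotype_2": [("gene_b", "gene_d"), ("gene_d", "gene_e")],
    "phenotype_3": [("gene_b", "gene_e"), ("gene_f", "gene_e")],
}


def nodes_per_layer(layers):
    """Return the set of nodes that appear in each layer."""
    return {
        name: {node for edge in edges for node in edge}
        for name, edges in layers.items()
    }


def shared_nodes(layers, min_layers=2):
    """Return nodes found in at least `min_layers` layers, with their layer count."""
    layer_count = Counter()
    for nodes in nodes_per_layer(layers).values():
        layer_count.update(nodes)
    return {node: n for node, n in layer_count.items() if n >= min_layers}


shared = shared_nodes(layers)
print(sorted(shared.items()))
```

Nodes that appear in many layers are candidates for the kind of shared players described above, although an actual analysis would weigh edges, account for network size and assess significance.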