Applications
If you wish to apply for an internship, we kindly ask you to fill in this form first, then send an email to francesco dot lescai at unipv dot it to arrange a meeting.
Please read this page to better understand the characteristics of the projects and the requirements.
The topics described on this page are only meant as examples of potential projects: we work on a diverse set of data, and details of the available opportunities will be discussed in our first meeting.
Type of Internships
Bachelor Internship (B.Sc. thesis) | Master Internship (M.Sc. thesis) |
---|---|
Focused on one method or one tool; richer in coding | Focused on one scientific question; less centred on coding (i.e. coding is not the goal of the project), but some statistics required |
Requirements
A minimal amount of bioinformatics knowledge should be acquired before an intern can begin their internship project.
The knowledge required varies depending on the project, and might include:
– Bash
– Python
– R
– Nextflow
The intern will then learn, under supervision, to work in either HPC or Cloud environments.
An on-boarding study plan has been prepared: it serves to identify any knowledge gaps and to suggest what should be studied before commencing the internship activity.
Datasets
At the moment, the following datasets are available to carry out different laboratory projects:
Disease | Total Samples | Cases | Controls |
---|---|---|---|
Macular Degeneration | 3,321 | 2,185 | 1,155 |
Psoriasis | 4,844 | 2,913 | 1,930 |
Atherosclerosis | 3,592 | 1,800 | 1,792 |
Myocardial Infarction | 7,298 | 4,000 | 3,298 |
Coronary Artery Disease | 1,968 | 984 | 984 |
Inflammatory Bowel Disease | 3,798 | 3,318 | 480 |
Amyotrophic Lateral Sclerosis | 434 | 354 | 80 |
TOTAL | 25,255 | 15,554 | 9,719 |
Additional datasets / phenotypes might be available, depending on ongoing collaborations in which the intern might be involved.
Project focus
Bachelor project types

Package Development
In a project of this type, we will develop a tool to carry out a specific analysis of a dataset type.
The tool could be:
- a Python tool (a conda recipe, a small piece of software to be released, a repository of scripts)
- an R package

Workflow Development
In this project type, we will assemble an analysis pipeline to carry out a series of tasks, and combine several tools into a single workflow.
For this purpose, we will use the workflow engine Nextflow and release the pipeline in a GitHub repository.
Question-focused analysis
In a project of this type, we will focus on analysing a single dataset with a specific question in mind.
The analysis will make use of computational tools and existing pipelines based on Nextflow, and data will usually be analysed in R. Examples include an RNA-seq differential expression analysis, a differential splicing analysis, or the assembly of a non-human organism.
Master project types

Epistasis / Variant interactions
Our Lab is interested in the impact that combinations of variants might have on human phenotypes, particularly when each variant cannot, on its own, be associated with a disease or complex trait.
We will focus on methods to capture the role of these variants, and apply those methods to one of the datasets we have available.

Networks & Data Integration
We are interested in employing and developing methods for cross-phenotype network analysis (multi-layered networks and data integration techniques), in order to capture pervasive physiological responses (inflammation) which might play a crucial role in multiple phenotypes.
By combining diverse traits and analysing them together, we aim to capture players that are yet to be unmasked but are important for disease evolution and for the penetrance of the underlying genetics.
Genetic determinants of diseases
Depending on the ongoing collaborations, a more focused analysis based on exome or whole-genome sequencing data could be carried out, to study the presence and contribution of different types of variants to human pathologies.
We are particularly interested in the contribution of rare variants to phenotypes.