VALIDATE Bioinformatics Support

Bioinformatics

VALIDATE Bioinformatician

Mirvat Surakhy is VALIDATE’s resident Bioinformatician and Computational Biologist. Mirvat ensures that VALIDATE maximises the outputs and lessons learned from the immunology and vaccine-related data across the Network to advance vaccine research and development for our focus pathogens.

VALIDATE Members can contact Mirvat for assistance with:

  1. Bulk RNA-seq analysis (QC, mapping to genome, counting, differential gene expression and pathway analysis)
  2. Microarray analysis
  3. Single-cell RNA-seq analysis (10x Genomics platform)
  4. Genotyping analysis (association with age of onset)
  5. Survival analysis (with genotype and RNA expression)
  6. Cox proportional hazards models (survival, genotype and the effect of immune subtype)
  7. Effect of polymorphisms (SNPs) on proteins
  8. Overlaying information from different datasets of interest to give more insight into your data

Contact Mirvat for bioinformatics support. 

VALIDATE Data Portal

To accelerate and facilitate vaccine research and development for our focus pathogens, VALIDATE hosts a data-sharing portal for our members. The portal enables sharing of knowledge and comparison of similar data sets, both published and unpublished, across pathogens, species, countries, research groups and trials, so that new lessons can be learned from the synergies and differences in the data.

Published Data

Published data in the catalogue is open to all and can be accessed here. You can search by pathogen, organism or experiment. There is also a resources page where members can find useful websites and databases related to their work. 

 

Confidential Data

Unpublished data can only be accessed by VALIDATE members who have signed the VALIDATE Confidentiality Agreement, and only at the discretion of the data owner. Members are reminded that, as part of their VALIDATE membership terms and conditions, they have agreed to fair publication policies and to keeping unpublished data confidential.

VALIDATE Members will be able to access the confidential data soon.

Training: R programming language


R is a licence-free programming language for statistical computing and data science, and has been used to visualise everything from market trends to vaccine efficacy. It is widely used among statisticians and data miners for developing statistical software and for data analysis.

Mirvat has selected training that will help you enhance your bioinformatic skills:

PDFs:

Online Courses:

Training: Analysis of single cell RNA-seq data

This course discusses some of the questions that can be addressed using scRNA-seq, as well as the computational and statistical methods available. The course is taught through the University of Cambridge Bioinformatics training unit, but the material on these pages is intended for anyone interested in learning about computational analysis of scRNA-seq data. The course is taught twice per year and the material is updated prior to each event.

Find out more online.

Training: Analyzing RNA-seq data with DESeq2

A basic task in the analysis of count data from RNA-seq is the detection of differentially expressed genes. The count data are presented as a table which reports, for each sample, the number of sequence fragments that have been assigned to each gene. Analogous data also arise for other assay types, including comparative ChIP-Seq, HiC, shRNA screening, and mass spectrometry. An important analysis question is the quantification and statistical inference of systematic changes between conditions, as compared to within-condition variability. The package DESeq2 provides methods to test for differential expression by use of negative binomial generalized linear models; the estimates of dispersion and logarithmic fold changes incorporate data-driven prior distributions. This vignette explains the use of the package and demonstrates typical workflows. An RNA-seq workflow on the Bioconductor website covers similar material to this vignette but at a slower pace, including the generation of count matrices from FASTQ files. DESeq2 package version: 1.30.0

Michael I. Love, Simon Anders, and Wolfgang Huber

Find the full course online.
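
For readers new to count data, the count table and the negative binomial variance mentioned in the DESeq2 description can be sketched in a few lines of Python. This is only an illustration and is not the DESeq2 method or its API: the gene names, counts, condition labels, pseudocount and the helper functions `log2_fold_change` and `nb_variance` are all hypothetical, the samples are assumed to have comparable sequencing depth (no normalisation is applied), and no statistical test is performed. The variance formula uses the common negative binomial parameterisation in which the variance equals the mean plus the dispersion times the squared mean.

```python
import math

# Hypothetical toy count table: one row per gene, one column per sample.
# Each value is the number of sequence fragments assigned to that gene.
counts = {
    "geneA": [100, 120, 400, 440],
    "geneB": [50, 60, 55, 65],
    "geneC": [10, 12, 11, 13],
}
# Hypothetical condition label for each sample column.
conditions = ["control", "control", "treated", "treated"]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def log2_fold_change(gene_counts, conditions,
                     numerator="treated", denominator="control",
                     pseudocount=1.0):
    """Naive log2 ratio of mean counts between two conditions.

    Assumption: samples have similar sequencing depth, so raw counts
    are compared directly. The pseudocount avoids division by zero.
    """
    num = [c for c, cond in zip(gene_counts, conditions) if cond == numerator]
    den = [c for c, cond in zip(gene_counts, conditions) if cond == denominator]
    return math.log2((mean(num) + pseudocount) / (mean(den) + pseudocount))


def nb_variance(mu, dispersion):
    """Negative binomial variance for mean mu: mu + dispersion * mu^2."""
    return mu + dispersion * mu ** 2


# Naive per-gene fold changes for the toy table.
lfc = {gene: log2_fold_change(row, conditions) for gene, row in counts.items()}

for gene, value in lfc.items():
    print(f"{gene}: log2 fold change = {value:.3f}")
```

A dispersion of zero reduces the variance to the mean, which is the Poisson case; larger dispersions describe the extra within-condition variability that a real analysis has to estimate before judging whether a fold change is systematic.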