Data Resources
Welcome to our Data Resources page. Below you will find a list of relevant DNA databases as well as some links to free online training.
You can find a list of available data sets from VALIDATE projects on our Data Sharing page.
Databases
Plasmid repository that archives and distributes plasmids for scientists, while also providing free molecular biology resources.
Basic Local Alignment Search Tool
BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches.
Microbial genome web portal that combines thousands of genomes. It provides an extensive range of query tools, visualization services and analysis software.
Website that shows the expression of genes of interest, looked up by Ensembl gene ID or gene symbol, and plots it by tissue type.
Center for Genomic Epidemiology
Website for the analysis of bacterial genomes.
ChIP-Atlas is an integrative and comprehensive database for visualizing and making use of public ChIP-seq data. ChIP-Atlas covers almost all public ChIP-seq data submitted to the SRA (Sequence Read Archive) in NCBI, DDBJ, or ENA, and is based on over 144,000 experiments.
A web-based DNA vaccine database and analysis system that curates, stores, and analyzes DNA vaccines and DNA vaccine plasmid vectors. DNAVaxDB includes only those DNA vaccines that have been verified to induce protection in at least a laboratory animal model.
Interactive and Collaborative Gene List Enrichment Analysis Tool.
Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation. Ensembl annotates genes, computes multiple alignments, predicts regulatory function and collects disease data. Ensembl tools include BLAST, BLAT, BioMart and the Variant Effect Predictor (VEP) for all supported species.
Web portal for accessing genomic-scale datasets associated with diverse eukaryotic microbes.
A gene-centric data integrator with web UI and API services.
Gene ATLAS is a large database of associations between hundreds of traits and millions of variants using the UK Biobank cohort.
Gene Ontology Enrichment analysis and visualization tool.
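Enrichment tools of this kind generally ask whether a gene list contains more genes from a given category than chance alone would predict. As a minimal sketch of one common approach, the one-sided hypergeometric test can be written as below. The function name and the numbers in the usage lines are hypothetical, and the tool listed above may well use a different statistic.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int,
                       list_size: int, overlap: int) -> float:
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: number of genes in the background set
    annotated:  background genes carrying the annotation (e.g. a GO term)
    list_size:  number of genes in the list being tested
    overlap:    genes in the list that carry the annotation
    """
    if not (0 <= annotated <= population and 0 <= list_size <= population):
        raise ValueError("annotated and list_size must lie within the population")

    # The overlap cannot exceed either the annotated set or the gene list.
    max_overlap = min(annotated, list_size)
    start = max(overlap, 0)

    total = comb(population, list_size)
    # Sum the probability of every outcome at least as extreme as the observed one.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, list_size - k)
        for k in range(start, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 background genes, 200 with the annotation,
    # a list of 100 genes of which 8 carry the annotation.
    print(enrichment_p_value(20000, 200, 100, 8))
```

In practice such p-values are computed for many terms at once, so a multiple-testing correction is normally applied afterwards.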
Huvax: Licensed Human Vaccines
A web-based human licensed vaccine database. Huvax collects, annotates and analyses licensed human vaccines around the world. Currently it contains all licensed human vaccines in the US and Canada, and many licensed human vaccines from other countries. Huvax provides a user-friendly web interface for you to search, compare, and analyze different vaccines.
Website in support of the NIH mission to share data with the public.
Immune Epitope Database and Analysis Resources
IEDB catalogs experimental data on antibody and T cell epitopes studied in different species in the context of infectious disease, allergy, autoimmunity and transplantation. IEDB also provides tools to assist in the prediction and analysis of epitopes.
Publicly available database of the genes, proteins, experimentally-verified interactions and signaling pathways involved in the innate immune response of humans, mice and bovines to microbial infection. The database captures an improved coverage of the innate immunity interactome by integrating known interactions and pathways from major public databases together with manually-curated data into a centralised resource.
Tool to design qPCR primers.
Metascape is a free gene annotation and analysis resource that helps biologists make sense of one or multiple gene lists.
Primer analysis software. It analyzes secondary structure and melting temperature, and identifies the best primer pairs for given experimental conditions.
Server predicts CTL epitopes in protein sequences.
The NetMHCIIpan-4.0 server predicts peptide binding to any MHC II molecule of known sequence using Artificial Neural Networks.
Genome database for the genus Plasmodium.
PolyPhen-2 (Polymorphism Phenotyping v2) is a tool which predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations.
TBDB is an integrated database of genome sequence, expression data and literature for tuberculosis. It contains genome sequence data for Mycobacterium tuberculosis strains and other sequenced Mycobacteria. It offers a collection of tools for visualization, analysis and data download.
This Atlas contains information regarding the expression profiles of human genes at both the mRNA and protein level. The protein expression data from 44 normal human tissue types is derived from antibody-based protein profiling using immunohistochemistry. The protein data covers 15313 genes (78%) for which there are available antibodies. The mRNA expression data is derived from deep sequencing of RNA (RNA-seq) from 37 different normal tissue types. It also contains information about the expression and spatio-temporal distribution of proteins within human cells.
This tool calculates the Tm of primers and estimates an appropriate annealing temperature when using different DNA polymerases.
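For a rough sense of what a Tm calculation involves, the sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, a textbook approximation for short primers. It is not the method used by the tool above, which is not described here. The 5-degree offset for the annealing temperature is a common rule of thumb and an assumption of this example, as are the function names and primer sequences.

```python
def wallace_tm(primer: str) -> int:
    """Approximate melting temperature (deg C) of a short primer via the Wallace rule."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G and T")

    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    # Each A/T pair contributes about 2 deg C, each G/C pair about 4 deg C.
    return 2 * at_count + 4 * gc_count


def rough_annealing_temp(forward: str, reverse: str, offset: int = 5) -> int:
    """Rule-of-thumb annealing temperature: lower primer Tm minus an offset.

    The default offset of 5 deg C is an assumption; polymerase-specific
    recommendations can differ substantially.
    """
    return min(wallace_tm(forward), wallace_tm(reverse)) - offset


if __name__ == "__main__":
    # Hypothetical primer pair, for illustration only.
    fwd = "ATGCATGCATGCATGC"
    rev = "GGCCGGCCAATTAATT"
    print(wallace_tm(fwd), wallace_tm(rev), rough_annealing_temp(fwd, rev))
```

Dedicated calculators typically rely on nearest-neighbor thermodynamics together with salt and primer concentrations, so their results will differ from this approximation.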
Vaxign (Vaccine Design) is a vaccine target prediction and analysis system based on the principle of reverse vaccinology.
Server for alignment-independent prediction of protective antigens and subunit vaccines.
A program to analyze vaccine adjuvants used in the vaccines collected in the VIOLIN vaccine database.
Vevax: Licensed Veterinary Vaccines
A web-based licensed veterinary vaccine database. Vevax collects, annotates and analyses licensed veterinary vaccines around the world. Currently, Vevax focuses on US-licensed veterinary vaccines and contains all veterinary vaccines licensed in the US. Vevax provides a user-friendly web interface for you to search, compare, and analyze different vaccines.
A database that collects and analyzes vaccine vectors used in vaccine development and research for diseases of public health importance.
A Database of Virulent Genes used for Development of Live Attenuated Vaccines.
Public Access Training
A basic task in the analysis of count data from RNA-seq is the detection of differentially expressed genes. The count data are presented as a table which reports, for each sample, the number of sequence fragments that have been assigned to each gene. Analogous data also arise for other assay types, including comparative ChIP-Seq, HiC, shRNA screening, and mass spectrometry. An important analysis question is the quantification and statistical inference of systematic changes between conditions, as compared to within-condition variability. The package DESeq2 provides methods to test for differential expression by use of negative binomial generalized linear models; the estimates of dispersion and logarithmic fold changes incorporate data-driven prior distributions. This vignette explains the use of the package and demonstrates typical workflows. An RNA-seq workflow on the Bioconductor website covers similar material to this vignette but at a slower pace, including the generation of count matrices from FASTQ files. DESeq2 package version: 1.30.0
Michael I. Love, Simon Anders, and Wolfgang Huber
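The negative binomial model mentioned in the vignette description ties a gene's variance to its mean through a dispersion parameter: Var = mu + alpha * mu^2. The sketch below illustrates that relationship with a crude method-of-moments dispersion estimate from replicate counts. It is written in Python for illustration only, the function names and counts are hypothetical, and it is not the estimator DESeq2 uses, which relies on data-driven priors as described above.

```python
from statistics import mean, variance


def nb_variance(mu: float, alpha: float) -> float:
    """Variance of a negative binomial count with mean mu and dispersion alpha."""
    return mu + alpha * mu ** 2


def moment_dispersion(counts: list[int]) -> float:
    """Crude method-of-moments dispersion estimate from replicate counts.

    Solves var = mu + alpha * mu^2 for alpha and clips at zero, because
    sample variance below the mean would otherwise give a negative value.
    """
    if len(counts) < 2:
        raise ValueError("need at least two replicates")
    mu = mean(counts)
    if mu == 0:
        return 0.0
    sample_var = variance(counts)  # unbiased sample variance
    return max((sample_var - mu) / mu ** 2, 0.0)


if __name__ == "__main__":
    # Hypothetical counts for one gene across three replicates of one condition.
    replicate_counts = [10, 20, 30]
    alpha = moment_dispersion(replicate_counts)
    print(alpha, nb_variance(mean(replicate_counts), alpha))
```

With only a handful of replicates, per-gene estimates like this are very noisy, which is the motivation for sharing information across genes.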
In this course you will discuss some of the questions that can be addressed using scRNA-seq as well as the available computational and statistical methods. The course is taught through the University of Cambridge Bioinformatics training unit, but the material found on these pages is meant to be used by anyone interested in learning about computational analysis of scRNA-seq data. The course is taught twice per year and the material here is updated prior to each event.
R is a free, open-source programming language used for statistical computing and data science, and it has been used to visualise everything from market trends to vaccine efficacy. The R language is widely used among statisticians and data miners for developing statistical software and analysing data.
Mirvat has selected training that will help you enhance your bioinformatics skills:
PDFs:
Online Courses: