The Bioinformatics and Biostatistics Core at Joslin Diabetes Center offers support for data-driven projects related to basic, clinical, and translational research, with a particular emphasis on diabetes. The Core aims to ensure that researchers take advantage of the most modern and robust methods available in the fields of bioinformatics and biostatistics.
Core services are available to investigators at Joslin Diabetes Center, Harvard Medical School, the Longwood Medical Area communities, and outside institutions and companies for study design, data analysis, and method write-ups for manuscripts, grant applications, conference abstracts, and other projects.
We offer analysis from a growing list of high-throughput data types, including:
- Single-cell/bulk RNA/DNA sequencing for gene expression, single nuclei RNA-seq, ribometh-seq, ribosome and polysome profiling, methyl-seq, ChIP-seq, ATAC-seq, metagenomics, variant analysis
- Mass spectrometry for proteomics including post-translational modifications (e.g. phosphoproteomics, ubiquitomics), metabolomics, lipidomics
- Microarrays for gene expression, SNPs, protein arrays
For sequencing data, we first trim adapters, align reads, and quantify expression. Once we have an abundance table, we typically normalize, perform quality control, account for missing values (for mass spectrometry data), assess clustering with principal component analysis (PCA) or t-distributed stochastic neighbor embedding (t-SNE), test association with phenotype or differential abundance between groups for analytes (e.g. genes) and pathways, and produce visualizations such as heatmaps, volcano plots, violin plots, and interactive plots. We then produce a report describing our methods and results.
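As a rough illustration of the abundance-table steps described above, the following Python sketch normalizes a toy count table to log2 counts per million and computes a per-gene log2 fold change and Welch t statistic between two groups. The counts, gene names, group labels, and function names are all invented for illustration, and this is not the Core's actual pipeline. A real analysis would rely on established packages and would also cover quality control, p-values, and multiple-testing adjustment.

```python
import math
from statistics import mean, variance

# Toy count table (hypothetical): gene -> raw counts for six samples.
counts = {
    "geneA": [100, 120, 110, 400, 420, 380],
    "geneB": [500, 480, 520, 510, 490, 505],
    "geneC": [30, 25, 35, 10, 12, 8],
}
groups = ["control"] * 3 + ["treated"] * 3  # hypothetical group labels


def log2_cpm(table, pseudocount=1.0):
    """Scale each sample to counts per million, then log2-transform."""
    n_samples = len(next(iter(table.values())))
    lib_sizes = [sum(row[j] for row in table.values()) for j in range(n_samples)]
    return {
        gene: [math.log2(c / lib_sizes[j] * 1e6 + pseudocount)
               for j, c in enumerate(row)]
        for gene, row in table.items()
    }


def welch_t(x, y):
    """Welch's t statistic (mean of y minus mean of x), unequal variances allowed."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se


def differential_abundance(table, labels, reference="control"):
    """Return gene -> (log2 fold change, Welch t) for non-reference vs reference."""
    results = {}
    for gene, values in log2_cpm(table).items():
        ref = [v for v, g in zip(values, labels) if g == reference]
        other = [v for v, g in zip(values, labels) if g != reference]
        results[gene] = (mean(other) - mean(ref), welch_t(ref, other))
    return results


results = differential_abundance(counts, groups)
for gene, (lfc, t) in results.items():
    print(f"{gene}: log2FC={lfc:+.2f}, t={t:+.2f}")
```

The log2 fold change and a significance measure per analyte are the two quantities a volcano plot displays, so a table like `results` is the natural input to that visualization.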
This pipeline takes approximately 5 hours if we are given a normalized data set, 10 hours for common raw data sets (such as bulk RNA-seq and metabolomics), and 20 hours for single-cell data. These costs are insensitive to the data set’s sample size.
We also offer:
- Sample size and power calculations for high-throughput studies
- Clustering and classification
- State-of-the-art reproducible workflows
- Analysis and meta-analysis of public data
- Network analysis
- Integration of multiple data types, including clinical covariates
- Causal inference testing (AKA mediation analysis)
- Metabolic flux inference, e.g. from Seahorse assays
- (For Joslin investigators) an in-house searchable gene expression database with >75 studies, and seminars on the free R software
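To give a sense of what a sample size calculation for a high-throughput study involves, here is a minimal Python sketch using the normal approximation for a two-sided, two-sample comparison of means. The function name and the choice of a Bonferroni adjustment across tests are assumptions made here for simplicity; the Core may well use other methods (for example, ones based on false discovery rate control or on the t distribution).

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8, n_tests=1):
    """Approximate per-group sample size for a two-sample comparison of means.

    Uses n = 2 * (z_{1 - a/2} + z_{power})^2 / d^2, where d is the
    standardized effect size and a = alpha / n_tests (Bonferroni
    adjustment, an illustrative assumption).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    adjusted_alpha = alpha / n_tests
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - adjusted_alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# A single test with a medium standardized effect (d = 0.5).
print(samples_per_group(0.5))  # 63

# The same calculation when testing a hypothetical 20,000 genes at once.
print(samples_per_group(0.5, n_tests=20_000))
```

Testing many analytes at once tightens the per-test significance threshold, which is why high-throughput studies generally need more samples than a single-outcome study with the same effect size.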
We offer analysis of data from clinical, basic, translational, and epidemiologic research, often including sample size and power calculations, comparisons of group means (e.g. t-test, ANOVA, non-parametric tests), measures of association (e.g. correlation, regression), time-to-event analyses (e.g. survival analysis, Cox regression), and mixed models/repeated measures approaches. We also provide modern approaches to dealing with missing data, network analyses, simulation, and machine learning.
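As one small example of a non-parametric comparison of group means, the Python sketch below runs an exact two-sided permutation test on made-up measurements. The data and function name are hypothetical, and the permutation test is chosen here only because it is easy to show in full; it is not necessarily the test the Core would recommend for a given study.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_test(a, b):
    """Two-sided exact permutation p-value for a difference in group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes, so it is only practical for small samples.
    """
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    total = sum(pooled)
    n_a, n_b = len(a), len(b)
    extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            extreme += 1
        n_splits += 1
    return extreme / n_splits


# Hypothetical measurements for two small groups.
control = [1, 2, 3, 4]
treated = [5, 6, 7, 8]
p_value = exact_permutation_test(control, treated)
print(f"p = {p_value:.4f}")  # 2 of 70 splits are as extreme: p = 0.0286
```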
Initial brief consultations are provided at no charge through the support of the Joslin Core and/or Harvard Catalyst. Please contact us for more details.
Subsequently, services are charged at $100.00 per hour for Joslin internal and adjunct users and $135.00 per hour for non-Joslin users. Services should be ordered from the iLab Application Suite’s Bioinformatics & Biostatistics Core page.
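For a rough budget, the hourly rates can be combined with the approximate pipeline durations quoted earlier on this page (5, 10, and 20 hours). The Python sketch below does that arithmetic; the dictionary keys and function name are invented for illustration, and the actual charge for any project depends on the hours it really takes.

```python
# Hourly rates (USD) and approximate pipeline hours as quoted on this page.
HOURLY_RATE_USD = {"joslin": 100.00, "non_joslin": 135.00}
PIPELINE_HOURS = {"normalized": 5, "common_raw": 10, "single_cell": 20}


def estimate_cost(data_kind, user_type):
    """Approximate charge in USD: pipeline hours times the hourly rate."""
    return PIPELINE_HOURS[data_kind] * HOURLY_RATE_USD[user_type]


for kind in PIPELINE_HOURS:
    for user in HOURLY_RATE_USD:
        print(f"{kind:>11} / {user:<10}: ${estimate_cost(kind, user):,.2f}")
```

For example, a common raw data set such as bulk RNA-seq works out to about $1,000 for a Joslin user and about $1,350 for a non-Joslin user.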
Core members will make every effort to accommodate all requests. The average turnaround time is approximately one week.
For consultations and questions about our Core, please email Core Director Dr. Jonathan Dreyfuss (jonathan.dreyfuss [at] joslin.harvard.edu) and Senior Bioinformatician Dr. Hui Pan (hui.pan [at] joslin.harvard.edu).