Bioinformatics & Biostatistics Core

Learn how you can collaborate with the Bioinformatics & Biostatistics Core in advancing diabetes research and care

About the Bioinformatics and Biostatistics Core

The Bioinformatics and Biostatistics Core at Joslin Diabetes Center offers support for data-driven projects related to basic, clinical, and translational research, with a particular emphasis on diabetes. The Core aims to ensure that researchers take advantage of the most modern and robust methods available in the fields of bioinformatics and biostatistics.

Core services are available to investigators at Joslin Diabetes Center, Harvard Medical School, the Longwood Medical Area communities, and outside institutions and companies for study design, data analysis, and method write-ups for manuscripts, grant applications, conference abstracts, and other projects.

Services Provided

Bioinformatics Services

We offer analysis from a growing list of high-throughput data types, including:

Single-cell/bulk RNA/DNA sequencing for gene expression, single nuclei RNA-seq, ribometh-seq, ribosome and polysome profiling, methyl-seq, ChIP-seq, ATAC-seq, metagenomics, variant analysis
Mass spectrometry for proteomics including post-translational modifications (e.g. phosphoproteomics, ubiquitomics), metabolomics, lipidomics
Microarrays for gene expression, SNPs, protein arrays

For sequencing data, we first trim adapters, align reads, and quantify expression. Once we have an abundance table, we typically normalize, perform quality control, account for missing values (for mass spectrometry data), assess clustering with principal component analysis (PCA) or t-distributed stochastic neighbor embedding (t-SNE), test association with phenotype or differential abundance between groups for analytes (e.g. genes) and pathways, and produce visualizations such as heatmaps, volcano plots, violin plots, and interactive plots. We then produce a report describing our methods and results.
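
As a rough illustration of two of the steps above (normalization and a differential abundance test), here is a minimal Python sketch. It is not the Core's actual pipeline, which this page does not specify: the counts-per-million normalization, the Welch t statistic, the function names, and the toy count table are all assumptions chosen for illustration.

```python
import math
from statistics import mean, variance

def log_cpm(sample_counts):
    """Normalize one sample's raw counts to log2(counts per million + 1)."""
    total = sum(sample_counts)
    return [math.log2(c / total * 1e6 + 1.0) for c in sample_counts]

def welch_t(a, b):
    """Welch's t statistic comparing the means of two groups."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def differential_abundance(genes, counts, groups):
    """Return {gene: (log2 fold change, Welch t)} for case vs. control."""
    normalized = {s: log_cpm(c) for s, c in counts.items()}
    results = {}
    for i, gene in enumerate(genes):
        case = [normalized[s][i] for s in counts if groups[s] == "case"]
        ctrl = [normalized[s][i] for s in counts if groups[s] == "control"]
        results[gene] = (mean(case) - mean(ctrl), welch_t(case, ctrl))
    return results

# Made-up toy abundance table: 3 genes, 3 control and 3 case samples.
genes = ["geneA", "geneB", "geneC"]
counts = {
    "ctrl1": [100, 500, 400],
    "ctrl2": [120, 480, 400],
    "ctrl3": [80, 510, 410],
    "case1": [300, 400, 300],
    "case2": [320, 390, 290],
    "case3": [280, 420, 300],
}
groups = {s: ("case" if s.startswith("case") else "control") for s in counts}

results = differential_abundance(genes, counts, groups)
for gene, (lfc, t) in results.items():
    print(f"{gene}: log2FC={lfc:+.2f}, t={t:+.1f}")
```

A real analysis would also involve quality control, p-values with multiple-testing correction, and pathway-level tests, all of which are omitted here to keep the sketch short.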

This pipeline takes approximately 5 hours if we are given a normalized data set, 10 hours for common raw data sets (such as bulk RNA-seq and metabolomics), and 20 hours for single-cell data. Because services are billed hourly, the resulting costs are insensitive to the data set’s sample size.

Additional features:

Sample size and power calculations for high-throughput studies
Clustering and classification
State-of-the-art reproducible workflows
Analysis and meta-analysis of public data
Network analysis
Integration of multiple data types, including clinical covariates
Causal inference testing (AKA mediation analysis)
Metabolic flux inference, e.g. from Seahorse assays
(For Joslin investigators) an in-house searchable gene expression database with >75 studies, and seminars on the free R software

Biostatistics Services

We offer analysis of data from clinical, basic, translational, and epidemiologic research, often including sample size and power calculations, comparisons of group means (e.g. t-test, ANOVA, non-parametric tests), measures of association (e.g. correlation, regression), time-to-event analyses (e.g. survival analysis, Cox regression), and mixed models/repeated measures approaches. We also provide modern approaches to dealing with missing data, network analyses, simulation, and machine learning.
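
To give a sense of what a sample size calculation involves, the sketch below uses the standard normal-approximation formula for comparing two group means, n = 2 * ((z_(1-alpha/2) + z_(power)) * sd / delta)^2 per group. The function name and the example numbers are hypothetical, and an exact t-test-based calculation would typically give a slightly larger n.

```python
import math
from statistics import NormalDist

def samples_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    delta: smallest difference in means worth detecting
    sd:    assumed common standard deviation in each group
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired power
    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2
    return math.ceil(n)

# Hypothetical example: detect a 0.5-unit difference when the SD is 1.0.
n = samples_per_group(delta=0.5, sd=1.0)
print(f"About {n} participants per group")
```

Halving the detectable difference roughly quadruples the required sample size, which is why the effect size assumption deserves discussion early in study design.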

Ordering

Initial brief consultations are provided at no charge through the support of the Joslin Core and/or Harvard Catalyst. Please contact us for more details.

Subsequently, services are charged at $100.00 per hour for Joslin internal and adjunct users and $135.00 per hour for non-Joslin users. Services should be ordered through the iLab Application Suite’s Bioinformatics & Biostatistics Core page.
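
For budgeting purposes, the approximate pipeline hours listed under Bioinformatics Services can be combined with these hourly rates. The short Python sketch below does that arithmetic; the dictionary keys and function name are hypothetical, and the result is only a rough estimate, since actual hours depend on the project.

```python
# Approximate pipeline hours and hourly rates as stated on this page.
PIPELINE_HOURS = {"normalized": 5, "raw_common": 10, "single_cell": 20}
HOURLY_RATE_USD = {"joslin": 100.00, "non_joslin": 135.00}

def estimate_cost(data_kind, user_kind):
    """Rough cost estimate in USD: approximate hours times the hourly rate."""
    return PIPELINE_HOURS[data_kind] * HOURLY_RATE_USD[user_kind]

for data_kind in PIPELINE_HOURS:
    for user_kind in HOURLY_RATE_USD:
        cost = estimate_cost(data_kind, user_kind)
        print(f"{data_kind:12s} {user_kind:11s} ~${cost:,.2f}")
```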

Core members will make every effort to accommodate all requests. The average turnaround time is approximately one week.

Contact Information for Core Staff

For consultations and questions about our Core, please email Core Director Dr. Jonathan Dreyfuss and Senior Bioinformatician Dr. Hui Pan.