The aim of the statistical bioinformatics seminar is to provide a forum for people working within the broad area of computation and statistics and their application to various aspects of biology to present their work and showcase their ongoing projects. It is intended to foster the exchange of ideas and build potential collaborations across multiple disciplines.

Monday November 20, 2017 (PLEASE NOTE: Special location - Level 5 Large Meeting Room, Usual time: 1pm - 2pm)

Speaker: Elizabeth Mason (The University of Melbourne)

Title: Modelling transcriptional variability in single cell RNA-seq data during human embryogenesis captures changes in the regulation of critical developmental genes

Abstract: Human development is a temporally and spatially ordered series of events that occur with remarkable precision; the same DNA blueprint gives rise to more than 250 sharply defined cell phenotypes. At the functional phenotype level, embryogenesis appears predictable because we observe the average behaviour of many individual cells, even as the number of cells, the range of phenotypes and transcriptional complexity increase during the course of development. It is only when we evaluate single molecules and transcripts, for example in single cell RNA-seq (scRNA-seq) experiments, that the stochastic nature of gene expression is revealed. Current methods reduce scRNA-seq data to a well-defined trajectory based on the abundance of key regulators of phenotype, and differential abundance between cells in a given phenotype is used to identify sub-populations. Here we present an alternative approach, in which measuring transcriptional variability at the gene level informs the level of regulation imposed on that gene, reflecting an intrinsic property of development that is often overlooked. While linear models have been a successful framework for characterizing differences in average abundance between phenotypes, they do not account for the stochastic differences captured by scRNA-seq experiments.
Accurately determining abundance and variability is further complicated by the sparseness of non-zero expression values. To address these challenges and evaluate gene expression during human pre-implantation embryogenesis, we applied a statistical mixture model to scRNA-seq data. Fitting the model on a gene-by-gene basis allowed us to evaluate shifts in the proportion of cells expressing a given gene (λ), as well as the mean (μ) and standard deviation (σ) of expression. From here, a correlation-based analysis evaluated whether abundance (μ) and variability (σ) capture different aspects of transcriptional regulation. While each metric largely identified the same genes, the number and nature of relationships between them differed. Indeed, genes sharing correlated patterns of variability during development were enriched for motifs associated with developmental transcription factors (e.g. HIC2, PPARG, E2F4 and ZNF692). Variability was more effective than abundance at specifically detecting regulatory relationships during development, and with less redundancy. Our approach provides a gene-centric platform to evaluate population-based parameters of gene expression, while preserving the complexity of scRNA-seq data.

About the speaker: Lizzi began her career in human genomics as a laboratory manager and laboratory technician with Professor Greg Gibson (Centre for Integrative Genomics, Georgia Tech University). She conducted two investigations in Australia which identified maternal influences on development of the neonate immune system, and uncovered population structure of the leukocyte transcriptome. Together with scientists at Emory University, Greg and Lizzi initiated the CIG's involvement in the WHOLE (Wellness and Health Omics Linked to the Environment) study of Predictive Health Genomics in Atlanta (USA), which is currently in its 6th year.
Lizzi recently completed a PhD in systems biology of human stem cells at the Australian Institute for Bioengineering and Nanotechnology at the University of Queensland. Her PhD project formed an international collaboration with Professor Christine Wells (University of Melbourne, AUS), stem cell biologists Professors Martin Pera (Jackson Laboratory, USA) and Ernst Wolvetang (University of Queensland, AUS), biostatistician Assistant Professor Jessica Mar (Albert Einstein College of Medicine, USA) and computational biologist Professor John Quackenbush (Harvard University, USA). Her primary focus is evaluating whether molecular variability in stem cell populations represents an important but until now hidden predictor of cellular behaviour and phenotype. Phenotypic heterogeneity in clonally derived cell populations is ubiquitous, and biologically relevant information is often masked when population-averaging techniques are used instead of measurements on individual cells. She has developed new network approaches which incorporate gene expression variance, with the goal of identifying genetic elements which stabilize a cell phenotype and those which push a cell to transition between phenotypes.

During her PhD, Lizzi was invited to present her work in departmental seminars at the Harvard Stem Cell Institute, the Lieber Brain Institute at Johns Hopkins University, and the Black Family Stem Cell Institute at Mt Sinai Hospital, New York. She was also one of 12 international scientists invited to participate in the Radcliffe Exploratory Workshop for Variation at Harvard University in 2011. She is currently based with Professor Christine Wells in the Centre for Stem Cell Systems at the University of Melbourne, where she is working on applied statistical methods to evaluate molecular variability in single cell RNA-seq data.