The aim of the statistical bioinformatics seminar is to provide a forum for people working in the broad area of computation and statistics, and their application to various aspects of biology, to present their work and showcase their ongoing projects. It is intended to foster the exchange of ideas and build potential collaborations across multiple disciplines. The seminars will be held at 1:00 pm on Monday in the Charles Perkins Centre Seminar Room (Level 3, large meeting room). The format of the talk is 30 to 45 minutes plus questions.

Speaker: Beth Signal (Garvan Institute of Medical Research)

Title: Machine learning annotation of branchpoints and in silico modelling of functional splicing events

Abstract: RNA splicing is a key component of mature RNA transcript formation, required for the removal of intronic regions and the subsequent ligation of exonic regions. This process also allows alternative splicing to occur, where different exonic regions are ligated together to produce alternative RNA products. The branchpoint element is one of the splicing sequence elements and is required for the first, lariat-forming reaction in splicing. However, current catalogues of human branchpoints remain incomplete because these elements are difficult to identify experimentally. To address this limitation, we have developed a machine-learning algorithm, branchpointer, to identify branchpoint elements solely from gene annotations and genomic sequence. Using branchpointer, we annotate branchpoint elements in 85% of human gene introns with a sensitivity of 61.8% and a specificity of 97.8%. In addition to annotation, branchpointer can evaluate the impact of SNPs on branchpoint architecture to inform functional interpretation of genetic variants. Branchpointer identifies all published deleterious branchpoint mutations annotated in clinical variant databases, and finds thousands of additional clinical and common genetic variants with similar predicted effects.
While alternative splicing can produce alternative RNA products, a large proportion of these have little functional impact on open reading frames or transcript stability. To address this limitation in the functional interpretation of differential splicing analyses, we have developed software to model events in silico and interpret their functional impact.

About the speaker: Beth is a PhD student in the Clinical Genome Informatics group at the Garvan Institute. Her current research focuses on developing bioinformatics methods to understand how transcript splicing and expression are controlled. She has a particular interest in using machine learning techniques to study transcriptomic behaviour.