Abstract
Splice site selection in alternative splicing is inherently competitive, and the probability that a given splice site is used depends strongly on the strength of neighbouring sites. Here we present a new model, the Competitive Splice Site Model (COSSMO), which improves on the state of the art in predicting splice site selection by explicitly modelling these competitive effects. We model an alternative splicing event as the choice of a 3’ acceptor site conditional on a fixed upstream 5’ donor site, or the choice of a 5’ donor site conditional on a fixed 3’ acceptor site. Our model is a custom architecture that uses convolutional layers, communication layers, LSTMs, and residual networks to learn relevant motifs from sequence alone. COSSMO predicts the most frequently used splice site with an accuracy of 70% on unseen test data, compared with only around 35% for MaxEntScan.
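
As an illustration of the competitive effect described above, the following minimal sketch normalises per-site scores across all candidate sites of one event, so that the predicted usage of a site depends on its neighbours as well as on its own strength. The softmax normalisation, the function name `competitive_usage`, and the example scores are assumptions made for illustration only; they are not taken from the COSSMO implementation.

```python
import math


def competitive_usage(scores):
    """Turn per-site strength scores into usage probabilities.

    Hypothetical helper: applies a softmax across all candidate splice
    sites of one event, so every site competes with the others.
    """
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# The first site has the same score (2.0) in both events;
# only the strength of its neighbours differs (illustrative values).
weak_neighbours = competitive_usage([2.0, 0.0, 0.0])
strong_neighbours = competitive_usage([2.0, 3.0, 0.0])

print(weak_neighbours[0], strong_neighbours[0])
```

In this sketch the first site receives a lower predicted usage when a stronger competitor is present, even though its own score is unchanged. A scorer that evaluates each site in isolation cannot express this dependence.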