Abstract
The precise identification of Human Leukocyte Antigen class I (HLA-I) binding motifs plays a central role in our ability to understand and predict (neo-)antigen presentation in infectious diseases and cancer. Here, by exploiting co-occurrence of HLA-I alleles across ten newly generated and forty publicly available in-depth HLA peptidomics datasets, we show that we can rapidly and accurately identify HLA-I binding motifs and map them to their corresponding alleles without any a priori knowledge of HLA-I binding specificity. Our approach uncovers new motifs for several alleles that until now had no known ligands. HLA-ligand predictors trained on such data substantially improve neo-antigen predictions in four melanoma and two lung cancer patients, indicating that unbiased HLA peptidomics data are ideal for in silico identification of (neo-)antigens. The new motifs further reveal allosteric modulation of HLA-I binding specificity, and we unravel the underlying mechanisms by protein structure analysis, mutagenesis and in vitro binding assays.