Abstract
Post-translational modifications (PTMs) on proteins can be targeted by antibodies associated with autoimmunity. Despite a growing appreciation for their intrinsic role in disease, there is a lack of highly multiplexed serological assays to characterize the fine specificities of PTM-directed autoantibodies. In this study, we used the programmable phage display technology, Phage ImmunoPrecipitation Sequencing (PhIP-Seq), to profile rheumatoid arthritis (RA) associated anti-citrullinated protein antibody (ACPA) reactivities. Using both an unmodified and peptidylarginine deiminases (PAD)-modified phage display library consisting of ~250,000 overlapping 90 amino acid peptide tiles spanning the human proteome, PTM PhIP-Seq robustly identifies antibodies to citrulline-dependent epitopes. PTM PhIP-Seq was used to quantify key differences among RA patients, including PAD isoform specific ACPA profiles, and thus represents a powerful tool for proteome-scale antibody-binding analyses.
Introduction
Phage ImmunoPrecipitation Sequencing (PhIP-Seq) is a programmable phage display technology that enables unbiased analysis of antibody binding specificities.1,2,3 An oligonucleotide library encoding the complete human proteome as ~250,000 overlapping 90 amino acid peptides was cloned into the mid-copy T7 phage display vector (~5-15 copies of displayed peptide per phage particle) and is immunoprecipitated by serum antibodies for analysis by high throughput DNA sequencing. This enables serum antibody profiling against hundreds of thousands of peptide epitopes at low cost.2,4 However, since the phage library is produced in E. coli, the displayed peptides lack post-translational modifications (PTMs) relevant to human disease.5,6 Here, we enzymatically modify the phage-displayed human peptidome, such that PTM-dependent autoantibody reactivities can be precisely assessed by quantitative comparison to the unmodified library.
Citrullination is the post-translational conversion of arginine in proteins to citrulline, which in humans is catalyzed by the calcium-dependent peptidylarginine deiminase (PAD) family of enzymes.7,8 This modification can impart antigenicity to self-proteins, and in the case of rheumatoid arthritis (RA), results in the production of anti-citrullinated protein antibodies (ACPAs).9 These antibodies are believed to participate in disease pathogenesis by binding citrullinated proteins in the synovial joint.10,11 The discovery of autoantibodies to peptidylcitrulline epitopes in RA serum led to the development of a diagnostic Enzyme Linked Immunosorbent Assay (ELISA) using synthetic cyclic citrullinated peptides (CCP).12 Since this anti-CCP antibody test can achieve diagnostic sensitivity of ~80% and a specificity up to 98%, anti-CCP antibodies have become a routinely utilized biomarker for RA.13,14
Of the five known PAD isoforms, PAD2 and PAD4 make up the majority of expressed PAD in inflamed synovial tissue.15 Studies using recombinantly expressed enzymes suggest that PAD2 and PAD4 can generate distinct, yet overlapping sets of epitopes that can be targeted by ACPAs.16,17,18 ACPAs that recognize these substrates are known to cross-react, making it difficult to distinguish the relative contributions of the different PAD isoforms to the expression of ACPAs.19,20,21 While aberrant citrullination is best known for its role in RA pathology, it is also implicated in other disease processes, including multiple sclerosis, Alzheimer’s disease, and cancer.22,23 Therefore, unbiased identification of antibody reactivities to citrulline epitopes is of broad interest.
Results
Citrullination of the T7 phage displayed human peptidome library
In a typical PhIP-Seq experiment, the phage display library is mixed with 0.2 μl of serum or plasma and immunoprecipitated (IPed) using magnetic beads coated with protein A and protein G, which together capture all IgG subclasses. The DNA sequences from the immunoprecipitated phage clones are then amplified by PCR and the abundance of each clone is quantified by high throughput DNA sequencing. To detect citrullinated peptide autoreactivities, we first developed a method to enzymatically citrullinate the human 90-mer phage library24 in vitro using recombinant human PAD2 and PAD4 enzymes (Methods). Sequencing read count data from PhIP-Seq with a citrullinated library can be directly compared against count data from PhIP-Seq with an unmodified library. Fig. 1 illustrates how this type of analysis is able to identify citrulline-dependent autoantibody binding specificities.
Phage display library modification and quality assessment
We reasoned it would be important to establish a rigorous and generalizable quality control pipeline for assessing the integrity of PTM phage libraries. We first confirmed that buffer exchange using standard dialysis into two common buffers (PBS and TBS) did not impact phage viability (Fig. S1a). Second, we assessed phage viability after enzymatic modification. The 90-mer library, a control clone that does not display a peptide, and a second control clone that displays a peptide that is not a PAD substrate, were all found to retain their infectivity after treatment with PAD enzyme (Fig. S1b). We also confirmed by western blot that the buffer and conditions used to citrullinate the phage display library were compatible with robust PAD activity (Fig. S1c).
We next sought to confirm that PAD does not create reactive citrulline epitopes on the phage coat proteins. If it did, the entire library might be non-specifically immunoprecipitated by ACPAs. To this end, we assessed the binding of anti-CCP+ versus anti-CCP- sera to a phage clone without a displayed peptide, which was either subjected to PAD4 modification or not. Anti-CCP+ serum did not precipitate this modified phage clone (Fig. S1d). Lastly, we confirmed that treatment with PAD enzyme does not interfere with PCR amplification of the library inserts (Fig. S1e). In summary, the phage library can be successfully citrullinated in vitro using recombinant PAD enzyme without loss of viability and without the formation of phage capsid citrulline epitopes, while remaining compatible with downstream amplification for sequencing. A flow chart illustrating this quality control pipeline, which is readily generalizable to other PTM PhIP-Seq studies, is shown in Fig. S2.
We next analyzed negative control (“mock”) IPs that omitted antibody input in order to characterize the background binding of each modified and unmodified library peptide to the protein A/G-coated magnetic beads. The mock IPed libraries were sequenced to a depth of ~20-fold. The mock IPs of the PAD2- and PAD4-modified libraries exhibited an essentially indistinguishable background binding profile compared to the unmodified library (Fig. S3). Conveniently, pairwise comparison of modified and unmodified libraries is therefore not confounded by bias in background binding to the beads, simplifying downstream analyses.
PAD PhIP-Seq robustly identifies antibodies to citrulline modified epitopes
We next sought to detect binding of PAD-dependent peptide epitopes by ACPAs from anti-CCP+ RA sera. Unmodified, PAD2-modified, and PAD4-modified libraries were separately immunoprecipitated with RA51, an anti-CCP+ serum sample. The PhIP-Seq read count data of RA51 was then subjected to a pairwise analysis pipeline25 to identify PAD-dependent antibody reactivities. Comparison of the unmodified versus PAD-modified libraries identified 294 PAD2- and 306 PAD4-dependent reactivities (Fig. 2a-b, Table S1). In total, 427 unique peptides were found to exhibit PAD-dependent reactivity; 121 modified peptide reactivities were specific to PAD2, whereas 133 were specific to PAD4. The remaining 173 modified peptide reactivities were shared among both enzymes (Fig. 2c).
All RA51 serum reactive peptides were searched for shared sequences using the shared epitope detection algorithm epitopefindr.26 The results are visualized as a network graph in which peptides are represented as nodes and a potentially shared epitope as edges (Fig. 2d). Peptides in the largest cluster were then analyzed using the motif discovery software MEME,27 which revealed a conserved sequence modified by both PAD2 and PAD4. The motif features -RGGGGK- and likely contains the citrullinated arginine (Fig. 2e).27 Crystallographic data suggests that PAD4 recognizes five successive residues with a ØXRXX consensus where Ø represents amino acids with small side-chains and X denotes any amino acid.28 This is in agreement with an in silico -RXXXXK-PAD2 model and a -RGXXXX-PAD4 model (with a strong preference for glycine at the +1, +2, and +3 positions),29 as well as the experimentally determined PAD4 citrullination sequences -RG/RGG- of FET proteins30 and -SGRGK- of histone H2A.31 PTM PhIP-Seq can thus be used to characterize the substrate specificity of enzymes imparting post-translational modifications.
Longitudinal analysis of ACPA fine specificities using PAD PhIP-Seq
ACPA are expressed at all stages of RA and can precede the onset of clinical disease by over a decade.32 Clinically, ACPAs are routinely used to aid in the diagnosis of RA and provide prognostic information about disease severity and phenotype (e.g., risk of erosive disease and interstitial lung disease).33 Moreover, changes in anti-CCP levels correlate with treatment responses.34 By monitoring ACPA fine specificities, however, it may be possible to increase the sensitivity and specificity of this diagnostic modality. As a proof of concept, we longitudinally profiled serum from an anti-CCP+ RA patient with high disease activity. PAD4 PhIP-Seq was employed to probe the individual’s ACPA repertoire at seven timepoints, both pre- and post-treatment with the B cell depletion therapy, rituximab (Figs. 3, S4). Reactivity profiles obtained prior to treatment harbored consistent levels of PAD4-dependent reactivity, directed at ~100 citrullinated peptides comprising ~80 distinct protein targets. Subsequent to B cell depletion, the number of PAD4-dependent reactivities declined at a rate of approximately 9 peptides per week for 9 weeks (R2=0.8 during this time period). This fall in citrulline-dependent antibody levels and diversity was followed by clinical improvement of synovitis as measured by Clinical Disease Activity Index (36 on Day 1 and 10.5 a month after the first rituximab treatment). These data indicate that PAD PhIP-Seq can be used to characterize the temporal evolution of citrulline-dependent autoantibody fine specificities in response to treatment.
PAD PhIP-Seq positivity is concordant with clinical CCP test results
We next asked how well the PAD PhIP-Seq assay correlates with clinical anti-CCP serostatus. Unmodified, PAD2-modified and PAD4-modified libraries were separately immunoprecipitated using sera from a cohort of 30 RA patients and 10 demographically matched healthy controls. Upon sample unblinding, 20 of 21 anti-CCP+ patients were found to exhibit significant reactivity to a large set of PAD2 and/or PAD4-dependent peptide epitopes (Fig. 4a-b, Table S1). Using a cutoff of ≥15 PAD-dependent reactive peptides, PAD PhIP-Seq positivity was highly concordant with the clinical antigen-agnostic CCP assay; only one anti-CCP+ serum recognized neither PAD2 nor PAD4 PhIP-Seq peptides, and one additional anti-CCP+ serum recognized PAD4 PhIP-Seq peptides, but did not recognize PAD2 PhIP-Seq peptides above threshold (Table S1). It is important to note that the observed discordance may be due to CCP testing and PAD PhIP-Seq analysis not always being performed on the same sample timepoints. None of the sera from the 10 healthy controls, nor that of a Sjogren’s Syndrome patient, reacted with PAD PhIP-Seq peptides above threshold (Table S1). In this cohort, the PAD PhIP-Seq assay performed with a sensitivity of 90-95% and a specificity of 100%, when compared to the standard-of-care clinical assay that globally captures ACPAs using artificial CCP peptides.
PAD PhIP-Seq reveals PAD2- and PAD4-dependent autoantibody reactivity signatures
Small molecule inhibitors of PAD enzymatic activity have emerged as promising candidates for the treatment RA.35,36,18 Pan-PAD inhibitors and those with distinct inhibitory profiles against PAD2 and PAD4 have been developed.37 It is therefore of great interest to determine whether subgroups of RA patients harbor ACPAs that exhibit a discernable preference for PAD2-versus PAD4-modified epitopes. Isoform-preferring antibody signatures, if they exist, may have utility in distinguishing subgroups of individuals more likely to benefit from one class of PAD inhibitors over another.
To identify potential PAD2- and PAD4-specific antibody signatures, we hierarchically clustered the anti-CCP+ patient subset based on their reactivity profiles (Fig. 5a). While we observe diverse reactivity to both PAD2- and PAD4-modified peptides, the clustering suggested the existence of distinct, patient-specific reactivity preferences among PAD2 versus PAD4 substrates. A total of 1,161 PAD-dependent peptides were reactive to at least three sera; 303 were unique to PAD2, 272 were unique to PAD4, and 293 peptides were reactive after modification by either enzyme (Fig. 5b). Among these dominant peptides are the known PAD substrates hornerin (HNNR)38, keratin (KRT4)39, and collagen (COL2A1)40. Other well-known substrates, however, including histone, vimentin, and fibrinogen, were not represented. The absence of these likely reflect a requirement for a higher level of conformational structure to serve as a PAD substrate and/or an antibody epitope, compared to the 90 amino acid peptides presented on the surface of the T7 phage.
We observed that some of the dominant peptides appeared to be differentially reactive, depending on whether they were modified by PAD2 or PAD4. The number of anti-CCP+ individuals reactive to a given PAD modified peptide is shown in Fig. 6a, illustrating peptide dominance as a function of PAD2 versus PAD4 modification. Approximately 2.5% percent of all peptides were reactive in at least one anti-CCP+ individual. The most dominant PAD-dependent peptide reactivities were substrates that could be modified by either PAD2 or PAD4. Several peptides, however, were converted into citrullinated epitopes exclusively by one or the other enzyme: 65 by PAD2 and 50 by PAD4. We reasoned that these particular peptides could be used to probe the PAD specificity profile of ACPAs. A second potential application for these peptides would be as sensors for PAD2 or PAD4 enzymatic activity in a biological matrix.
The of PAD2- and PAD4-specific peptide reactivities across all the anti-CCP+ individuals are shown in Fig. 6b-c. Many peptides are strongly reactive after modification by one, but not the other, PAD enzyme (“mono-reactive”). A shared epitope network graph was used to characterize the degree of sequence homology among these 115 mono-reactive peptides. Approximately 50% of the mono-reactive peptides form nodal networks of >3 edges with the largest network consisting of 42 peptides (Fig. 6d, Fig. S5). A total of 32 mono-reactive peptides remains as singletons and do not align to any other mono-reactive peptide. A comparison of the number of PAD2-versus PAD4-specific reactivities detected in each anti-CCP+ individual is shown in Fig. 7. Overall preference for PAD2 versus PAD4 was quantified using an exact binomial test. A significant differential PAD2/4 “preference” was detected in some anti-CCP+ individuals (Table S2). These data support further investigation into specific PAD isoform-associated ACPA profiles and their potential relevance for prognostication and/or PAD-inhibitor companion diagnostic testing.
Discussion
Serum antibody profiling can inform understanding of human immune processes, as well as suggest disease-specific antibody biomarkers. Despite advances in highly multiplexed serological analyses, there exists an unmet need for methods that can characterize antibodies to post-translationally modified or haptenized epitopes. Here, we adapted the antibody profiling platform PhIP-Seq to the quantitative analysis of antibody reactivity towards an enzymatically citrullinated human peptidome library. This allowed us to characterize and track the fine specificities of ACPA responses of individual RA patients over time, without bias, in high-throughput and at relatively low cost. The use of PTM PhIP-Seq is broadly applicable to studies of diverse PTMs, potentially adaptable to studies of glycosylation,41 carbamylation,42 phosphorylation,43 deamidation,44 and other modifications.6 PTMs of the phage displayed human peptidome may also be used to profile protease activity via the SEPARATE assay.45,46 PTM PhIP-Seq is therefore a powerful addition to the programmable phage display toolkit with broad utility for genome-wide proteomic analyses.
Methods
Human subjects
Sera from 30 individuals with RA from a cross-sectional cohort and 10 demographically matched healthy controls were studied. Patient serum was collected under a study protocol approved by the Office of Human Subjects Research Institutional Review Boards of the Johns Hopkins University School of Medicine; written consent was obtained prior to participation. Healthy control sera were obtained from volunteers who self-identified, were not pregnant, and did not have a history of cancer, autoimmune disease, or active tuberculosis/HIV/hepatitis infection. RA patients fulfilled 2010 ACR classification criteria for RA, as described.47 Anti-CCP antibody levels were determined by the QUANTA Lite® CCP3 IgG ELISA (Inova Diagnostics, # 704535). For one RA patient, serum was available at several time points pre- and post-treatment with the chimeric anti-CD20 monoclonal antibody rituximab. At baseline (day 0), the patient was treated with methotrexate and glucocorticoids with insufficient control of arthritis symptoms. Two doses of rituximab were administered on days 22 and 35, respectively. On day 0, anti-CCP and RF normalized units were 252 and 139 respectively.
Expression and purification of peptidylarginine deiminases
Recombinant human PAD4 and PAD2 containing N-terminal 6 x histidine tags were expressed in BL21(DE3) pLysS competent cells and purified by immobilized metal affinity chromatography with a nickel resin, as previously described.48,49
Phage modification quality control
A negative control clone that lacks a displayed peptide (due to a premature stop codon, introduced as a frameshift mutation of the DNA insert) and a positive control clone that displays a 90-mer arginine-containing peptide from POU3F1 were isolated by randomly picking plaques from the human peptidome phage library. Peptide sequences were determined by Sanger sequencing. PAD modification was performed as described below. Dialysis, plaque assay, and western blotting were performed according to standard molecular biology protocols. Histone citrullination was detected using anti-citrullinated histone antibody (citrulline R2 + R8 + R17, Abcam #ab5103) 1:1,000 in TBST and visualized by chemiluminescence. Quantitative PCR assays were performed using SYBR green on a 1:1,000 dilution of product obtained from 10 cycles of PCR1.1
Human peptidome T7 phage library citrullination
Citrullination was carried out in 42 mL containing buffer alone or 150 μg of either recombinant PAD4 or recombinant PAD2, 1012 pfu human peptidome T7 phage library (pre-dialyzed into 100 mM Tris-HCl, pH 7.5 using a Spectra/Por 7 RC 50 kDa molecular weight cut off dialysis membrane, Spectrum Labs, #132130), 5 mM CaCl2 and 2 mM DTT. The reactions were incubated at 37 °C for 3 hours with end-over-end rotation and quenched by dialysis against 100 mM Tris-HCl, pH 7.5 in order to remove free Ca2+ ions using a 10 kDa molecular weight cutoff dialysis membrane (Thermo Fisher, #88245). A second round of dialysis against TBS, pH 7.4 was performed in order to prepare the libraries for antibody binding and immunoprecipitation. Phage viability after modification was assessed by plaque assay.
Phage ImmunoPrecipitation Sequencing (PhIP-Seq)
Screening, PCR, and peptide read count data generation was performed as previously described.1 Briefly, 0.2 μL of sample (~2 μg IgG) was added to the phage library (1010 pfu) in TBS, pH 7.4. The phage/serum mixtures were rotated overnight at 4°C after which 40 μL of a 1:1 Protein A/G coated magnetic bead mixture (Invitrogen, #10002D and #10004D) was added to each well and rotated for an additional 4 hours at 4°C to capture all IgG. The beads were then washed three times with TBS, pH 7.4 containing 0.1% NP-40 and resuspended in 20 μL of a Herculase II Fusion Polymerase (Agilent, #600679) PCR1 master mix to amplify library inserts. Sample-specific barcoding and the Illumina P5/P7 adapters were then incorporated during a subsequent PCR2 reaction. PCR2 amplicons were pooled and sequenced using an Illumina NextSeq 500 SE 1×50 protocol.
Pairwise analysis of PTM PhIP-Seq data
We used pairwise enrichment analysis to identify peptides that were differentially reactive between treatment conditions. Robust regression of the top 1,000, by abundance, unmodified read counts (Unmod) was used to calculate the ‘expected’ modified read counts (PAD2 & PAD4). The observed modified read counts minus the expected modified read counts for each peptide was calculated to determine peptide residuals. Peptides were grouped in bins and a standard deviation was calculated between all peptides in each bin. From these binned standard deviations, a linear regression was developed and used to assign each peptide an expected standard deviation. Each peptide’s residual was normalized to its expected standard deviation, in order to calculate a ‘pairwise z-score’. Peptide reactivities were considered positive for pairwise z-scores ≥7.
Statistical analyses
Differences among the total number of peptide enrichments between sample groups was calculated using a Wilcoxon test (Fig. 4). Differences among peptide read counts were calculated using a Kruskal-Wallis test with Dunn’s correction (Fig. S4). Differences between the total number of anti-CCP+ patient specific PAD2- and PAD4-unique reactivities were determined using an exact binomial test assuming a null probability of 0.5 for independent PAD2/4-specific peptides (Fig. 5e). Independent sets of peptides were determined using the maximal independent vertex set function from the R package igraph.
Competing Interests
HBL is a founder of Portal Bioscience, Alchemab and ImmuneID, and is an advisor to TScan Therapeutics. ED is author on a licensed patent (US patent no. 8,975,033) and licensed provisional patent (US patent no. 62/481,158) related to the use of antibodies to PAD3 and PAD2, respectively, in identifying clinically informative disease subsets in RA, has received consulting fees from Celgene and Bristol Myers Squibb, and has received research support from Pfizer, Celgene, and Bristol Myers outside of this work.
Author Contributions
Conceptualization: GDRM, RSQ, MA, ED, HBL; Data curation: GDRM, DRM; Formal analysis: GDRM, DRM, MFK, ED, HBL; Funding acquisition: MA, HBL; Methodology: GDRM, HBL; Resources: JMM, RSQ, MFK, MA, ED, HBL; Software: DRM; Visualization: GDRM, DRM, MFK, ED, HBL; Writing - original draft: GDRM, HBL; Writing - review and editing: DRM, JMM, MFK, MA, ED, HBL.
Supplemental Figures
Acknowledgements
This study was made possible by a National Institute of General Medical Sciences (NIGMS) grant R01 GM136724 (HBL) and funding from the Intelligence Advanced Research Projects Activity (HBL, MA). MFK was supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) grant T32AR048522. We are grateful to Stephen J. Elledge (Harvard Medical School) for generously providing the human peptidome library used in this study.