Motivation: Phosphoproteomic experiments are increasingly used to study changes in signalling across different conditions. It has been proposed that changes in the phosphorylation of kinase target sites can be used to infer when the activity of a kinase is regulated. However, these approaches have not yet been benchmarked due to a lack of appropriate benchmarking strategies. Results: We curated public phosphoproteomic experiments to identify a gold standard dataset containing a total of 184 kinase-condition pairs where regulation is expected to occur. A list of kinase substrates was compiled and used to estimate changes in kinase activities using the following methods: Z-test, Kolmogorov-Smirnov test, Wilcoxon rank sum test, gene set enrichment analysis (GSEA), and a multiple linear regression model (MLR). We also tested weighted variants of the Z-test and GSEA that include information on kinase sequence specificity as a proxy for affinity. Finally, we tested how the number of known substrates and the type of evidence (in vivo, in vitro or in silico) supporting them influence the predictions. Conclusions: Most models performed well, with the Z-test and GSEA performing best as determined by the area under the ROC curve (mean AUC = 0.722). Weighting kinase targets by the kinase target sequence preference improves the results only marginally. However, the number of known substrates and the evidence supporting the interactions have a strong effect on the predictions.
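As an illustration of the simplest method listed above, the following is a minimal sketch of a Z-test for kinase activity. It assumes one common formulation, in which the mean log fold change of a kinase's known substrate sites is compared against the mean of all quantified sites and scaled by the overall standard deviation; the abstract does not give the exact formula used in the study. The function name `kinase_z_score`, the use of the population standard deviation, and the toy data are hypothetical.

```python
import math
from statistics import mean, pstdev


def kinase_z_score(log_fold_changes, substrate_sites):
    """Z-score for a kinase from the fold changes of its substrate sites.

    log_fold_changes: dict mapping site identifier -> log fold change
                      for every quantified phosphosite in one condition.
    substrate_sites:  iterable of site identifiers known to be targets
                      of the kinase.

    Assumed formulation (not taken from the source):
        z = (mean_substrates - mean_all) * sqrt(m) / sd_all
    where m is the number of quantified substrate sites.
    """
    all_values = list(log_fold_changes.values())
    if not all_values:
        return None
    background_mean = mean(all_values)
    background_sd = pstdev(all_values)  # population SD; an assumption

    # Keep only substrates that were actually quantified.
    substrate_values = [
        log_fold_changes[site]
        for site in substrate_sites
        if site in log_fold_changes
    ]
    if not substrate_values or background_sd == 0:
        return None  # activity cannot be estimated

    m = len(substrate_values)
    return (mean(substrate_values) - background_mean) * math.sqrt(m) / background_sd


# Toy example with made-up site identifiers and fold changes.
toy_sites = {"A": 2.0, "B": 1.0, "C": 0.0, "D": -1.0, "E": -2.0}
toy_score = kinase_z_score(toy_sites, {"A", "B"})
```

A positive score suggests increased kinase activity in the condition and a negative score suggests decreased activity. A kinase with no quantified substrates returns `None`, which reflects the dependence of the predictions on the number of known substrates.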