While methods for gene annotation are increasingly reliable, the exact identification of translation initiation sites remains a challenging problem. Since the N-termini of proteins often contain regulatory and targeting information, developing a robust method for start site identification is crucial. Ribosome profiling reads show distinct read length distributions around translation initiation sites. These patterns are typically lost in standard ribosome profiling analysis pipelines, in which footprint reads are adjusted to determine the specific codon being translated. Using these unique signatures, we build a model capable of predicting translation initiation sites and demonstrate its high accuracy using N-terminal proteomics. Applying this model to prokaryotic samples, we re-annotate translation initiation sites and provide evidence of N-terminal truncations and elongations of annotated coding sequences. These re-annotations are supported by the presence of Shine-Dalgarno sequences, structural and sequence-based features, and N-terminal peptides. Finally, our model identifies 61 previously undiscovered genes in the genome.