Abstract
Genome sequencing is a key strategy in the surveillance of SARS-CoV-2, the virus responsible for the COVID-19 outbreak. Latin America is the hardest-hit region of the world, accumulating almost 25% of COVID-19 cases worldwide. Costa Rica was initially exemplary for the region in its pandemic control, swiftly declaring a state of emergency on March 16th that kept the number of cases low until measures were lifted in early May. From the first detected case on March 6th to November 30th, almost 140 000 cases have been reported in Costa Rica, 99.5% of them from May onwards. We analyzed the genomic variability during the SARS-CoV-2 pandemic in Costa Rica using 138 sequences, 52 from the first months of the pandemic and 86 from the current wave.
Three GISAID clades (G, GH, and GR) and three PANGOLIN lineages (B.1, B.1.1, and B.1.291) are predominant, with phylogenetic relationships that are in line with results from other Latin American countries, suggesting an introduction and multiple re-introductions from other regions of the world. The sequences from the first months of the pandemic grouped mainly in lineages B.1 and B.1.5, suggesting low undetected circulation and re-introductions of new lineages that were not detected in the country during the early stages of the pandemic due to the extreme lockdown measures. The whole-genome variant calling analysis identified a total of 177 distinct variants. Most of these were non-synonymous mutations (54.8%, n = 97), while 41.2% (n = 73) were synonymous mutations. The 177 variants showed an expected power-law distribution: 106 single nucleotide mutations were identified in single sequences, only 16 single nucleotide mutations were found in >5% of sequences, and only three single nucleotide mutations in >25% of genomes. These mutations were distributed all over the genome; however, 61.5% were present in ORF1ab, 15.0% in the Spike gene, and 9.6% in the Nucleocapsid. Additionally, the prevalence of the worldwide-found variants D614G in the Spike (98.6% in Costa Rica) and L84S in ORF8 (1.5%) is similar to what is found elsewhere. Interestingly, the prevalence of mutation T1117I in the Spike has increased in Costa Rica during the current pandemic wave, which began in May 2020, reaching 14.5% detection in the full genome analyses in August 2020. This variant has been observed in less than 1% of the GISAID reported sequences in other countries. Structural modeling of the Spike protein with the T1117I mutation suggests a possible effect on the viral oligomerization needed for cell infection. Nevertheless, in vitro experiments are required to confirm these in silico analyses.
In conclusion, genome analyses of the SARS-CoV-2 sequences over the course of the COVID-19 pandemic in Costa Rica suggest re-introduction of lineages from other countries as travel bans and other measures were lifted, similar to results found in other studies. However, the Spike-T1117I variant needs to be monitored and studied in further analyses as part of the surveillance program during the pandemic.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Text modified to correct typographical errors.