June 5, 2022

Michael Hills

Tracing COVID-19 variants with SARS-CoV-2 sequencing

In the United States, the Centers for Disease Control and Prevention (CDC) uses SARS-CoV-2 sequencing to track the emergence of mutations in SARS-CoV-2, the virus that causes COVID-19. The CDC has set up multiple pipelines that link sequencing data from the CDC, public health research facilities, and commercial testing centers to the publicly available databases maintained by the National Center for Biotechnology Information (NCBI) and the Global Initiative on Sharing All Influenza Data (GISAID).

As part of the National SARS-CoV-2 Strain Surveillance (NS3) system, health centers provide de-identified specimens to the CDC; the system aims to supply a representative set of viruses for sequencing. Generating SARS-CoV-2 sequencing data from these specimens and making it available to the public involves four main steps.

SARS-CoV-2 sequencing is carried out in the following steps:

1. Specimen collection and initial processing: Specimens are collected, undergo initial processing, and are then registered in the laboratory information management system.

2. Specimen preparation and sequencing: SARS-CoV-2 RNA is extracted, converted to complementary DNA, enriched, and loaded onto next-generation sequencing instruments.

3. Sequence data generation and collection: Samples are sequenced, data is collected from the sequencers, and initial quality control is performed. In parallel, sequencing data from commercial laboratories is merged with CDC data for processing.

4. Submission of sequence data to public repositories: Scientists perform quality control checks. Sequences that public repositories initially reject are evaluated and may be re-sequenced for resubmission. Public repositories make the published data available to scientists worldwide.
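The quality control checks mentioned in steps 3 and 4 can be illustrated with a small sketch. It is not the CDC's actual procedure: the function names and both thresholds (`min_length`, `max_n_fraction`) are assumptions chosen for illustration, based only on the fact that the SARS-CoV-2 genome is roughly 29,900 bases long and that uncalled bases are conventionally written as `N`.

```python
def qc_metrics(sequence: str) -> dict:
    """Return basic quality metrics for one consensus genome sequence."""
    seq = sequence.upper()
    length = len(seq)
    n_count = seq.count("N")  # "N" marks a base the sequencer could not call
    # An empty sequence is treated as entirely ambiguous.
    n_fraction = n_count / length if length else 1.0
    return {"length": length, "n_fraction": n_fraction}


def passes_qc(sequence: str,
              min_length: int = 29000,        # hypothetical threshold
              max_n_fraction: float = 0.05    # hypothetical threshold
              ) -> bool:
    """Decide whether a sequence looks complete enough to submit."""
    metrics = qc_metrics(sequence)
    return (metrics["length"] >= min_length
            and metrics["n_fraction"] <= max_n_fraction)
```

A sequence that fails such a check would be a candidate for the evaluation and re-sequencing described in step 4.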

The Application of Bioinformatics to Analysis of SARS-CoV-2 Sequencing Data

Public repositories make sequence data accessible to researchers around the world. The CDC also routinely receives genomic sequence data from several sources to support national surveillance. In the third step of the process, commercial laboratories provide genetic sequence data to the CDC, which analyzes the data and submits it for publication. Public health laboratories, research institutes, and academic institutions contribute data by submitting it directly to public repositories. CDC scientists study the sequencing data in depth to identify variants and actively monitor how prevalent they are. The goal is to evaluate the potential impact of these variants on critical SARS-CoV-2 countermeasures such as vaccines, treatments, and diagnostics.
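Monitoring how prevalent each variant is comes down to counting lineage assignments among sequenced specimens. The sketch below shows that arithmetic only; the function name and the sample data are illustrative assumptions, and real surveillance estimates typically involve weighting and other adjustments not shown here.

```python
from collections import Counter


def lineage_prevalence(lineages: list[str]) -> dict[str, float]:
    """Return each lineage's share of the sequenced specimens."""
    counts = Counter(lineages)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


# Illustrative input: one lineage label per sequenced specimen.
sample = ["BA.2", "BA.2", "BA.2.12.1", "BA.1.1"]
for name, share in sorted(lineage_prevalence(sample).items()):
    print(f"{name}: {share:.0%}")
```

Running the same calculation on each week's specimens gives a simple view of how a variant's share changes over time.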

The CDC encourages state public health laboratories to “classify” the sequence data they generate and post in public databases so that it can be included in CDC analyses, which helps the CDC understand how different sequences relate to one another. Systematic, consistent tagging of contributed sequences strengthens the ability to find, analyze, and share data produced anywhere in the United States.

SARS-CoV-2 Strain Surveillance (NS3): Spearheading National Systems

SARS-CoV-2 data is gathered and examined through the NS3 program, which gives the US a comprehensive, population-based system for tracking changes in the virus over time and for responding to new variants that have the potential to limit diagnostics, therapeutics, or vaccines, or to affect the infectiousness of the virus.

In partnership with state and local public health organizations, the CDC seeks up to 750 specimens per week from states and municipalities for sequencing and additional viral analysis. NS3 has three key objectives:

1. National-level surveillance of the virus: Every week or every other week, US public health laboratories send SARS-CoV-2–positive clinical specimens to the CDC to support federal efforts to sequence, genetically analyze, and phenotypically characterize the viruses circulating in the population over time. This also makes it possible to build a repository of SARS-CoV-2 sequencing data and specimens.

2. Enhanced surveillance: Since SARS-CoV-2 sequencing programs began, changes in the SARS-CoV-2 genome arising from transmission and evolution in people and animals have been detected. These genetic changes could significantly affect several aspects of public health, including disease transmission, severity, diagnostics, treatments, and vaccines.

The CDC may therefore ask public health laboratories for additional specimens as part of NS3 monitoring in order to examine variants of interest, variants of concern, or other specific categories of virus, as well as vaccine breakthrough cases.

3. Virus characterization: Guided by genomic analysis, SARS-CoV-2 variants are isolated from positive specimens submitted to public health laboratories in the United States. CDC laboratories evaluate these isolated viruses to determine their potential impact on existing vaccines, therapies, and diagnostics, as well as their overall threat to public health.

As laboratories increase their sequencing capacity, the CDC is working to strengthen and expand its technology infrastructure and procedures so that sequence information can be submitted rapidly to public repositories, where it is available for scientists to study and open to the public. COVID sequencing is a multi-step process that involves both laboratory and bioinformatics procedures. About 10 days pass between the time a specimen arrives at the CDC and the time a sequence is produced and deemed eligible for submission to public databases. State, local, academic, and commercial partners often work on a similar timeline.

Notifying Public Health Departments of the Results of SARS-CoV-2 Sequencing

To be sent to state, municipal, tribal, or territorial health officials, a SARS-CoV-2 genetic sequencing result must be included in an existing electronic laboratory report. Sequencing results should be submitted to the same public health organization that received the original positive viral test result. The electronic report should include all of the patient’s original demographic information, the content of the viral test report, and the second ordered test with the identified viral genetic lineage.
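The reporting requirements above amount to a completeness check on each report. The sketch below models that check with a hypothetical record type; the class name, field names, and sample values are assumptions made for illustration and do not correspond to any actual electronic laboratory reporting format.

```python
from dataclasses import dataclass, fields


@dataclass
class SequencingReport:
    """Hypothetical record holding the elements a report should carry."""
    patient_demographics: dict   # original demographic information
    original_test_result: str    # content of the first viral test report
    receiving_agency: str        # agency that received the first positive result
    lineage: str                 # viral genetic lineage from the second test


def missing_fields(report: SequencingReport) -> list[str]:
    """List the names of required elements that are empty."""
    return [f.name for f in fields(report) if not getattr(report, f.name)]


# Illustrative report with the lineage not yet filled in.
report = SequencingReport(
    patient_demographics={"age_group": "30-39"},
    original_test_result="SARS-CoV-2 RNA detected",
    receiving_agency="Example State Health Department",
    lineage="",
)
print(missing_fields(report))
```

A report with an empty list of missing fields carries every element named above and could be sent on to the receiving agency.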