Comparative Analysis of Wuhan-Hu-1 and Delta-1/2021 Strains in SARS-CoV-2 RNA Sequences

Problem Statement

SARS-CoV-2, the virus responsible for COVID-19, carries its genetic material as RNA. This RNA encodes the viral proteins and is copied during replication, so changes in its sequence are the basis for tracking the virus's evolution and identifying emerging variants. This project compares the RNA sequences of the Wuhan-Hu-1 (reference) and Delta-1/2021 strains and examines variations that may affect the virus's behavior, transmissibility, and immune evasion.

Background

Coronavirus RNA serves as the template for viral protein synthesis, so changes in the RNA sequence can significantly alter the virus's properties. This study draws on a dataset of SARS-CoV-2 variant records and compares RNA sequences between strains. Such comparisons can inform the development of targeted interventions, including vaccines and treatments.

Project Overview

Data Collection: The dataset of coronavirus variants was obtained from the NCBI website.

```
import pandas as pd

# Read the dataset exported from NCBI
df = pd.read_csv("ncbi_datasets.csv", low_memory=False)
```

Data Cleaning and Exploration (EDA): Examined data types and missing values, then performed exploratory data analysis.

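```
# Data cleaning
# parse_date is not defined in this README; see the notebook for its definition
df['Collection Date'] = df['Collection Date'].apply(parse_date)
# Keep only the text before the first ";" of "Geo Location" as the continent
df["Continents"] = df["Geo Location"].str.replace(";.+", "", regex=True)

# Exploratory Data Analysis
# (e.g., checking null values, data types, and summary statistics)
```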
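The cleaning and exploration steps above can be sketched on a small toy table. This is a minimal sketch, not the notebook's code: the `parse_date` shown here is a hypothetical stand-in for the notebook's helper, and the sample values are made up. Only the column names `Collection Date` and `Geo Location` come from the snippet above.

```python
import pandas as pd


def parse_date(value):
    """Hypothetical stand-in for the notebook's helper.

    Returns a Timestamp, or NaT when the value is missing or unparseable.
    """
    if pd.isna(value):
        return pd.NaT
    return pd.to_datetime(value, errors="coerce")


# Toy records (made-up values) using the column names from the snippet above
toy = pd.DataFrame({
    "Collection Date": ["2020-01", "2021-04-15", "not recorded", None],
    "Geo Location": ["Asia; China", "Asia; India", "Europe; Germany", None],
})

# Same cleaning steps as above, applied to the toy table
toy["Collection Date"] = toy["Collection Date"].apply(parse_date)
toy["Continents"] = toy["Geo Location"].str.replace(";.+", "", regex=True)

# Basic exploration: data types, missing values per column, category counts
print(toy.dtypes)
null_counts = toy.isna().sum()
print(null_counts)
print(toy["Continents"].value_counts())
```

Parsing each value separately, rather than the whole column at once, lets partial dates such as `2020-01` and full dates coexist in the same column.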

Comparative Analysis: Analyzed nucleotide lengths and the number of samples collected per month, and compared different strains.
```
# Comparative Analysis
# (e.g., nucleotide length statistics, samples per month, strain comparisons)
```
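As an illustration of the length statistics and the samples-per-month count, here is a minimal sketch on a toy table. The column name `Nucleotide Length` is an assumption (the dataset's actual column name is not shown in this README), and all values are made up.

```python
import pandas as pd

# Toy records; "Nucleotide Length" is an assumed column name, values are made up
toy = pd.DataFrame({
    "Collection Date": pd.to_datetime(
        ["2020-01-05", "2020-01-20", "2021-04-02", "2021-04-18", "2021-05-09"]
    ),
    "Nucleotide Length": [29903, 29899, 29836, 29850, 29841],
})

# Summary statistics of sequence lengths (count, mean, std, min, quartiles, max)
length_stats = toy["Nucleotide Length"].describe()
print(length_stats)

# Number of samples collected in each calendar month
samples_per_month = toy.groupby(toy["Collection Date"].dt.to_period("M")).size()
print(samples_per_month)
```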
Sequence Alignment: Utilized BioPython to align RNA sequences and calculate the similarity percentage.
```
# Sequence Alignment
# (e.g., using BioPython to align RNA sequences and calculate similarity)
```
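One way to perform this step is with BioPython's `Bio.Align.PairwiseAligner`, sketched below on two short toy sequences. The scoring values are illustrative and not taken from the notebook. Similarity is assumed here to mean identical columns divided by total alignment columns; the notebook may define it differently.

```python
from Bio import Align  # BioPython


def percent_identity(ref_seq, var_seq):
    """Globally align two sequences; return identical columns as a percentage."""
    aligner = Align.PairwiseAligner()
    aligner.mode = "global"
    # Illustrative scoring values, not taken from the notebook
    aligner.match_score = 1
    aligner.mismatch_score = -1
    aligner.open_gap_score = -2
    aligner.extend_gap_score = -0.5

    best = aligner.align(ref_seq, var_seq)[0]

    aligned_columns = 0
    matches = 0
    # best.aligned holds the (start, end) ranges of gap-free blocks per sequence
    for (r_start, r_end), (v_start, v_end) in zip(*best.aligned):
        aligned_columns += r_end - r_start
        matches += sum(
            r == v for r, v in zip(ref_seq[r_start:r_end], var_seq[v_start:v_end])
        )

    # Every column outside a gap-free block contains exactly one gap
    total_columns = len(ref_seq) + len(var_seq) - aligned_columns
    return 100.0 * matches / total_columns


reference = "AUGCAUGC"  # toy sequence
variant = "AUGGAUGC"    # toy sequence with one substitution
similarity = percent_identity(reference, variant)
print(f"Similarity: {similarity:.1f}%")
```

Counting matches from the gap-free blocks avoids relying on how a particular BioPython version prints or indexes the alignment.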
Mismatches Visualization: Visualized sequence mismatches to identify deletions, substitutions, and insertions.
```
# Mismatches Visualization
# (e.g., color-coded display of sequence mismatches)
```
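A color-coded display can be sketched in plain Python once the two sequences are available as gapped strings of equal length. The function names, the color choices, and the toy alignment below are all hypothetical. A gap in the variant is treated as a deletion relative to the reference, and a gap in the reference as an insertion.

```python
from collections import Counter

# ANSI escape codes for terminal colors (arbitrary choices)
COLORS = {
    "match": "\033[0m",          # default color
    "substitution": "\033[93m",  # yellow
    "deletion": "\033[91m",      # red
    "insertion": "\033[92m",     # green
}
RESET = "\033[0m"


def classify_columns(ref_aligned, var_aligned, gap="-"):
    """Label each alignment column as match, substitution, deletion, or insertion."""
    if len(ref_aligned) != len(var_aligned):
        raise ValueError("aligned sequences must have the same length")
    labels = []
    for ref_base, var_base in zip(ref_aligned, var_aligned):
        if var_base == gap and ref_base != gap:
            labels.append("deletion")      # base present in reference only
        elif ref_base == gap and var_base != gap:
            labels.append("insertion")     # base present in variant only
        elif ref_base != var_base:
            labels.append("substitution")
        else:
            labels.append("match")
    return labels


def colorize(var_aligned, labels):
    """Return the variant row with each character wrapped in its color code."""
    return "".join(
        f"{COLORS[label]}{base}{RESET}" for base, label in zip(var_aligned, labels)
    )


# Toy gapped alignment (made up): one substitution, one deletion, one insertion
ref_row = "AUGGCUAG-CUA"
var_row = "AUGACU-GGCUA"
labels = classify_columns(ref_row, var_row)
counts = Counter(labels)
print(ref_row)
print(colorize(var_row, labels))
print(dict(counts))
```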

Results and Conclusions

The project compares the Wuhan-Hu-1 and Delta-1/2021 SARS-CoV-2 strains and highlights variations in their RNA sequences. Percentage similarity, deletions, substitutions, and insertions were examined to understand the impact of mutations. This information supports monitoring of the virus's evolution and can guide public health strategies.

Acknowledgments

The project acknowledges the National Center for Biotechnology Information (NCBI) for providing the dataset and tools used in the analysis.

References

Cleveland Clinic. (2020). Image source: clevelandclinic.org
Images: biologydictionary.net, National Human Genome Research Institute, Thomas Splettstoesser, freepik.com

darshan-panchal1/Genomic-Analysis-of-Mutations-in-SARS-CoV-2-Variants
