pyABraOM

Summary

Introduction
Installation
Search Types
Batch search
BibTeX entry
Acknowledgement

Introduction

PyABraOM is an API developed to facilitate the access to human variant data from ABraOM database to access data from GRCh37/hg19 and GRCh38/hg38 genome version. The package offers support to three kinds of different searches and the possibility to search in batches.For plotting the data, please take a look at the BIOVARS package. If you have scientific interests or want to use our package in formal reports, we kindly ask you to cite us in your publication: Carneiro, P., Colombelli, F., Recamonde-Mendoza, M., and Matte, U. (2022). Pynoma, PyABraOM and BIOVARS: Towards genetic variant data acquisition and integration. bioRxiv.

Installation

Currently there is not a PyPI version for the PyABraOM API, so the installation needs that you clone this repository and install it as local package.

$ git clone https://github.com/bioinfo-hcpa/pyABraOM.git
$ pip install -e pyABraOM #outside the directory

Search Types

There are 3 kinds of search by using pyABraOM and all returned in the same format: Gene, Genomic Region and Variant search.

Since the ABraOM database has the two latest human genome versions, the user must specify the version in GRCh37/hg19 or GRCh38/hg38 to retrieve data from the database for the specific genome version. Besides, when searching by genes, genomic regions or variant for chromosomes X or Y, a new column "Number of Hemizygous" will be added to the output dataframe, so the user should have caution when performing pandas concatenation operations, resulting in table cells with NaN values. Futhermore, the GATK and CEGH filter provided from ABraOM are available, but latter is not in the default output dataframe.

Search by gene

The search method expects the gene name to be considered in the function Search_gene(version:str,gene:str,CEGH_Filter= False,Variant_ID= False, Process= True).

version (str):the reference genome version (either "GRCh37/hg19" or "GRCh38/hg38")
gene (str): the gene name to search in database
CEGH_Filter (bool): whether to provide in-house quality control filter CEGH-USP
Variant_ID (bool): whether to provide variant identifier
Process (bool): whether to provide data in JSON format or pandas dataframe

import pyabraom
from pyabraom import Seach_gene
df = Search_gene('hg38', 'ACE2')

Search by genomic region

The genomic region method expects the genome region to be considered in the function Search_region(version:str,chromosome,start,end,CEGH_Filter= False,Variant_ID= False, Process= True). The X and Y chromosome must be provided as character in the chromosome parameter.

version (str):the reference genome version (either "GRCh37/hg19" or "GRCh38/hg38")
chromosome(obj): chromosome
start (int): where the region of interest starts
end (int):where the region of interest ends
CEGH_Filter (bool): whether to provide in-house quality control filter CEGH-USP
Variant_ID (bool): whether to provide variant identifier
Process (bool): whether to provide data in JSON format or pandas dataframe

import pyabraom
from pyabraom import Search_region
df = Search_region('hg38',4,980883,984868)

Search by variant

The variant search method expects the variant identifier rsID i the function Variant_ID(version:str,variant:str,CEGH_Filter= False,Process= True):

version (str):the reference genome version (either "GRCh37/hg19" or "GRCh38/hg38")
variant (str): the variant identifier rsID to search in database
CEGH_Filter (bool): whether to provide in-house quality control filter CEGH-USP
Variant_ID (bool): whether to provide variant identifier
Process (bool): whether to provide data in JSON format or pandas dataframe

import pyabraom
from pyabraom import Variant_ID
df = Variant_ID('hg38','rs115134980')

Batch Search

The batch search method expects as parameter a list of one provided type of search in the function Searches(searches_list:list)

import pyabraom
df = Searches([Search_gene('hg38','ACE2'),Search_gene('hg38','TMPRSS2')])

Features

Not available yet.

Variant Filter

BibTeX entry

@article {Carneiro2022.06.07.495190,
	author = {Carneiro, Paola and Colombelli, Felipe and Recamonde-Mendoza, Mariana and Matte, Ursula},
	title = {Pynoma, PyABraOM and BIOVARS: Towards genetic variant data acquisition and integration},
	elocation-id = {2022.06.07.495190},
	year = {2022},
	doi = {10.1101/2022.06.07.495190},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2022/06/09/2022.06.07.495190},
	eprint = {https://www.biorxiv.org/content/early/2022/06/09/2022.06.07.495190.full.pdf},
	journal = {bioRxiv}
}

Acknowledgement

This research was supported by the National Council for Scientific and Technological Development (CNPq) and the Research Incentive Fund (FIPE) from Hospital de Clínicas de Porto Alegre.

References

Naslavsky, Scliar, Yamamoto, et al., 2022. Whole-genome sequencing of 1,171 elderly admixed individuals from São Paulo, Brazil.

Name		Name	Last commit message	Last commit date
Latest commit History 58 Commits
docs/examples		docs/examples
pyabraom		pyabraom
.gitignore		.gitignore
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
setup.cfg		setup.cfg
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pyABraOM

Summary

Introduction

Installation

Search Types

Search by gene

Search by genomic region

Search by variant

Batch Search

Features

BibTeX entry

Acknowledgement

References

About

Releases 1

Packages

Contributors 2

Languages

License

bioinfo-hcpa/pyABraOM

Folders and files

Latest commit

History

Repository files navigation

pyABraOM

Summary

Introduction

Installation

Search Types

Search by gene

Search by genomic region

Search by variant

Batch Search

Features

BibTeX entry

Acknowledgement

References

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Packages