gds2bgen: Format Conversion from BGEN to GDS

GNU General Public License, GPLv3

Description

This package provides functions for format conversion from bgen files to SeqArray GDS files.

Version

v0.9.3

Package Maintainer

Dr. Xiuwen Zheng (zhengxwen@gmail.com)

Installation

Requires R (≥ v3.5.0), gdsfmt (≥ v1.20.0), SeqArray (≥ v1.24.0)
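The version requirements above can be checked from an R session before installing. The following is a minimal sketch using base R only; `check_pkg()` is a hypothetical helper written for this illustration and is not part of gds2bgen.

```r
# Minimum versions, copied from the requirements line above.
required <- c(gdsfmt = "1.20.0", SeqArray = "1.24.0")

# getRversion() returns the version of the running R session.
r_ok <- getRversion() >= "3.5.0"
cat("R >= 3.5.0:", r_ok, "\n")

# check_pkg() is a hypothetical helper (not part of gds2bgen):
# NA if the package is not installed, otherwise TRUE/FALSE for the version test.
check_pkg <- function(pkg, min_version) {
    if (!requireNamespace(pkg, quietly = TRUE)) return(NA)
    utils::packageVersion(pkg) >= min_version
}

status <- mapply(check_pkg, names(required), required)
print(status)  # TRUE = OK, FALSE = too old, NA = not installed
```

A result of `NA` or `FALSE` for either package means it should be installed or updated before installing gds2bgen.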

Installation from GitHub:

library("devtools")
install_github("zhengxwen/gds2bgen")

The install_github() approach builds the package from source, so make and compilers must be installed on your system (see the R FAQ for your operating system). You may also need to install the dependencies manually.
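If the dependencies need to be installed manually, one possible approach is sketched below. It assumes that gdsfmt and SeqArray are obtained from Bioconductor via the BiocManager package, and it needs network access; adapt it if you use a different repository setup.

```r
# Dependencies listed in the requirements above.
deps <- c("gdsfmt", "SeqArray")

# Keep only the packages that are not installed yet.
missing_deps <- deps[!vapply(deps, requireNamespace, logical(1), quietly = TRUE)]

if (length(missing_deps) > 0) {
    # Assumption: the dependencies are installed from Bioconductor.
    if (!requireNamespace("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    BiocManager::install(missing_deps)
}
```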

Or manually install the package:

git clone https://github.com/zhengxwen/gds2bgen
cd gds2bgen/src
unzip bgen_v1.1.8.zip
cd bgen_v1.1.8
python2 ./waf configure
python2 ./waf
cp build/libbgen.a ..
cp build/3rd_party/zstd-1.1.0/libzstd.a ..
rm -rf build
sleep 1; touch ../libbgen.a
cd ../../..
R CMD INSTALL gds2bgen

Copyright Notice

This package includes the sources of the bgen library (https://enkre.net/cgi-bin/code/bgen/dir?ci=trunk), Boost (the C++ libraries, https://www.boost.org) and Zstandard (https://zstd.net).

Citations for GDS

Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS (2012). A High-performance Computing Toolset for Relatedness and Principal Component Analysis of SNP Data. Bioinformatics. DOI: 10.1093/bioinformatics/bts606.

Zheng X, Gogarten S, Lawrence M, Stilp A, Conomos M, Weir BS, Laurie C, Levine D (2017). SeqArray -- A storage-efficient high-performance data format for WGS variant calls. Bioinformatics. DOI: 10.1093/bioinformatics/btx145.

Examples

library(gds2bgen)

seqBGEN_Info()  # bgen library version
## "bgen_lib_v1.1.8"

bgen_fn <- system.file("extdata", "example.8bits.bgen", package="gds2bgen")
# or bgen_fn <- "your_bgen_file.bgen"
seqBGEN_Info(bgen_fn)

## File: gds2bgen/extdata/example.8bits.bgen
## # of samples: 500
## # of variants: 199
## Compression method: zlib
## Layout version: v1.2
## Unphased: TRUE
## # of bits: 8
## Ploidy: 2
## sample id: sample_001, sample_002, sample_003, sample_004, ...


# example.8bits.bgen ==> example.gds, using 4 cores
seqBGEN2GDS(bgen_fn, "example.gds",
    storage.option="LZMA_RA",  # compression option, e.g., ZIP_RA for zlib or LZ4_RA for LZ4
    float.type="packed8",      # 8-bit packed real numbers
    geno=FALSE,     # 2-bit integer genotypes, stored in 'genotype/data'
    dosage=TRUE,    # numeric alternative allele dosages, stored in 'annotation/format/DS'
    prob=FALSE,     # numeric genotype probabilities, stored in 'annotation/format/GP'
    parallel=4      # the number of cores
)


# show file structure
library(SeqArray)
(f <- seqOpen("example.gds"))
seqClose(f)

## File: example.gds (137.7K)
## +    [  ] *
## |--+ description   [  ] *
## |--+ sample.id   { Str8 500 LZMA_ra(7.02%), 393B } *
## |--+ variant.id   { Int32 199 LZMA_ra(33.9%), 277B } *
## |--+ position   { Int32 199 LZMA_ra(60.6%), 489B } *
## |--+ chromosome   { Str8 199 LZMA_ra(15.7%), 101B } *
## |--+ allele   { Str8 199 LZMA_ra(11.8%), 101B } *
## |--+ genotype   [  ] *
## |--+ phase   [  ]
## |--+ annotation   [  ]
## |  |--+ id   { Str8 199 LZMA_ra(18.6%), 321B } *
## |  |--+ qual   { Float32 199 LZMA_ra(11.8%), 101B } *
## |  |--+ filter   { Int32 199 LZMA_ra(11.3%), 97B } *
## |  |--+ info   [  ]
## |  \--+ format   [  ]
## |     |--+ DS   [  ] *
## |     |  \--+ data   { PackedReal8U 500x199 LZMA_ra(55.6%), 54.0K } *
## \--+ sample.annotation   [  ]
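Once the file has been converted, the dosages can be read back with SeqArray. The sketch below repeats the conversion into a temporary file so that it is self-contained; the temporary file name is an assumption made for illustration. The layout of the object returned for 'annotation/format/DS' can differ between SeqArray versions, so it is inspected with str() rather than assumed.

```r
library(gds2bgen)
library(SeqArray)

# Convert the bundled example file into a temporary GDS file (dosages only).
bgen_fn <- system.file("extdata", "example.8bits.bgen", package="gds2bgen")
gds_fn <- tempfile(fileext=".gds")
seqBGEN2GDS(bgen_fn, gds_fn, geno=FALSE, dosage=TRUE, prob=FALSE)

f <- seqOpen(gds_fn)

# Sample and variant identifiers written by the conversion
sample_id  <- seqGetData(f, "sample.id")
variant_id <- seqGetData(f, "variant.id")
length(sample_id)
length(variant_id)

# Alternative allele dosages, stored in 'annotation/format/DS'
ds <- seqGetData(f, "annotation/format/DS")
str(ds)

seqClose(f)
unlink(gds_fn)  # remove the temporary file
```

For the example file, the two lengths should match the 500 samples and 199 variants reported by seqBGEN_Info().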

Also See

seqVCF2GDS() in the SeqArray package: conversion from VCF files to GDS files.

seqBED2GDS() in the SeqArray package: conversion from PLINK BED files to GDS files.
