Meta-analysis of genome-wide association data from UK Biobank and FinnGen highlights risk loci for pregnancy complications

Code

Launch meta-analysis

To run the meta-analysis using METAL:

./1.1_parallel.sh data/file_mapping.csv analysis_n

Then run MTAG (launch this inside the dedicated MTAG environment):

./1.1_parallel_mtag.sh data/file_mapping.csv analysis_n

To post-process all output files afterwards, you can use: ./1.3_process_mtag_output.py

The launch scripts also need a TSV mapping file (passed as the first argument) with 4 columns:

name_of_trait
path_to_ukb_file
path_to_finngen_file
n_samples_in_finngen_file (from manifest)
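
As an illustration of the four-column layout above, here is a minimal sketch of reading and validating such a mapping file. It is not part of the repository: the function name `read_mapping` and the `TraitMapping` type are hypothetical, and it assumes a tab-separated file with no header row and an integer sample count in the last column.

```python
import csv
from typing import NamedTuple


class TraitMapping(NamedTuple):
    """One row of the mapping file (field names follow the column list above)."""
    name_of_trait: str
    path_to_ukb_file: str
    path_to_finngen_file: str
    n_samples_in_finngen_file: int


def read_mapping(path, delimiter="\t"):
    """Parse the 4-column mapping file into a list of TraitMapping rows.

    Assumption: no header row; blank lines are skipped.
    """
    rows = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        for line_no, fields in enumerate(reader, start=1):
            if not fields or all(not f.strip() for f in fields):
                continue  # skip blank lines
            if len(fields) != 4:
                raise ValueError(
                    f"line {line_no}: expected 4 columns, got {len(fields)}"
                )
            name, ukb, finngen, n_samples = (f.strip() for f in fields)
            rows.append(TraitMapping(name, ukb, finngen, int(n_samples)))
    return rows
```

Running a check like this before the parallel launch may catch a malformed row early, rather than partway through a long job.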

To draw Manhattan and Q-Q plots for all FinnGen traits: 1.2_parallel_draw_fg.sh

Auxiliary scripts:

1.0.0_pipe_go_R6.py - runs the METAL meta-analysis for a specific trait
1.0.0_pipe_go_R6_mtag.py - the same analysis with the MTAG tool (this script uses the output of the previous one)
1.0.1_DRAW.R - draws draft Manhattan and Q-Q plots

Genetic correlations

Run the same meta-analysis on many pairs of traits.
The traits of interest should be located in one directory (PREG_DATA in our case), and the other traits in directories whose names start with analysis_n_.

If you re-launch the correlation calculations, first clean up the old results:

find . -type f -name '*.cors' -delete
find . -type f -name '*.cors.full' -delete
find . -type f -name '*.labels' -delete
find . -type f -name '*.overlap' -delete
find . -type f -name '*.progress' -delete
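
Because the `find ... -delete` commands above remove files immediately, it can be useful to preview what would be deleted. The following is a minimal sketch of the same clean-up with a dry-run mode; the helper names `find_stale_outputs` and `delete_stale_outputs` are hypothetical and not part of the repository.

```python
from pathlib import Path

# Suffixes taken from the find commands above.
STALE_SUFFIXES = (".cors", ".cors.full", ".labels", ".overlap", ".progress")


def find_stale_outputs(root="."):
    """Return all regular files under root whose names end with a stale suffix."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.name.endswith(STALE_SUFFIXES)
    )


def delete_stale_outputs(root=".", dry_run=True):
    """List stale outputs and delete them only when dry_run is False."""
    stale = find_stale_outputs(root)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale


if __name__ == "__main__":
    # Preview only; pass dry_run=False to actually delete.
    for path in delete_stale_outputs(".", dry_run=True):
        print(path)
```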

Prepare many pairs of traits in parallel: 2.1.0_parallel_prepare_traits_for_ldak.sh (2.0.0_prepare_traits_for_ldak.py is its auxiliary script).

Prepare the traits of interest in the PREG_DATA directory: ./2.1.1_prepare_pregnancy_for_ldak.py

Launch LDAK itself:

./2.2_parallel_launch_ldak.sh

Assemble all correlations:

TEMP_T=("GEST_DIABETES1" "I9_HYPTENSPREG1" "O15_PRETERM1")

for t in "${TEMP_T[@]}" ; do for i in analysis_n* ; do ls ${i}/data/${t}*.cors 2> /dev/null ; cat ${i}/data/${t}*.cors 2> /dev/null | grep Cor_All | awk '{  if ($2 > 2.57*$3)  print }' | grep -v nan ; done | grep -B 1 Cor_All > data/cor_${t}.txt ; done

for t in "${TEMP_T[@]}" ; do for i in analysis_n* ; do ls ${i}/data/${t}*.cors 2> /dev/null ; cat ${i}/data/${t}*.cors 2> /dev/null | grep Cor_All | awk '{ print }' | grep -v nan ; done | grep -B 1 Cor_All > data/cor_full_${t}.txt ; done

As a result, the data/cor_full_*.txt files contain all non-NaN genetic correlations for a specific trait, and the data/cor_*.txt files contain only the genetic correlations that pass the significance filter.
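
The significance filter in the first loop keeps a `Cor_All` line only when its second field exceeds 2.57 times its third field. The sketch below mirrors that test in Python. It assumes, as the awk expression implies, that the second and third whitespace-separated fields are the correlation estimate and its standard deviation. The function names are hypothetical, and the `two_sided` option is an illustrative extension that is not in the original commands.

```python
import math


def parse_cor_all(lines):
    """Return (estimate, sd) from the first Cor_All line, or None.

    None is returned when there is no Cor_All line or a value is NaN or
    unparsable, which corresponds to the `grep -v nan` step.
    """
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "Cor_All":
            try:
                estimate, sd = float(fields[1]), float(fields[2])
            except ValueError:
                return None
            if math.isnan(estimate) or math.isnan(sd):
                return None
            return estimate, sd
    return None


def is_significant(estimate, sd, z=2.57, two_sided=False):
    """Mirror the awk test `$2 > 2.57*$3`.

    With two_sided=False (the behaviour of the original one-liner) only
    positive estimates can pass. two_sided=True compares the absolute value
    instead; this is an assumption for illustration, not the source's method.
    """
    value = abs(estimate) if two_sided else estimate
    return value > z * sd
```

Note that, as written, the awk condition cannot select a negative correlation, however large in magnitude; whether that is intended is not stated in this README.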

Draw genetic correlation plot:

Launch 2.3.1_make_table_for_r.ipynb and 2.3.1_make_table_for_r_FG.ipynb to prepare tables for the meta-analysis GWAS and the FinnGen (FG) GWAS, respectively.
Launch 2.3.2_draw_gen_cor.R to draw the forest plots and heatmaps of genetic correlations.

Select and annotate top SNPs

3.1_making_top_snp_table_FG.ipynb and 3.2_making_top_snp_table_META.ipynb - selection and annotation of top SNPs for the FinnGen-only analysis and the meta-analysis, respectively.

Final Manhattan and Q-Q plots

4_final_mh_qq.R - for drawing final versions of Q-Q and Manhattan plots.

Images

All images are located in the img directory.

img/*_gen_cor.pdf and img/*_gen_cor_heatmap.pdf - forest plots and heatmaps (respectively) of genetic correlations:
- meta_* - for the meta-analysis GWAS;
- fg_supp_* - for the FG GWAS, traits supported by previous research;
- fg_not_supp_* - for the FG GWAS, traits not supported by previous research.
img/QQplot.pval__*.pdf - Q-Q plots of significant traits:
- img/QQplot.pval__FG_*.pdf - for FinnGen data only;
- img/QQplot.pval__MET_*.pdf - for meta-analysis data only.
img/Rectangular-Manhattan..pval__*.pdf - Manhattan plots of significant traits:
- img/Rectangular-Manhattan..pval__FG_*.pdf - for FinnGen data only;
- img/Rectangular-Manhattan..pval__MET_*.pdf - for meta-analysis data only.

Data

All data is located in the data directory:

Selected summary statistics:
- data/f_special/ - directory with filtered FinnGen GWAS summary statistics (only selected as significant).
- data/f_special/ - directory with meta-analysis summary statistics (only selected as significant).
Genetic correlations:
- data/cor_full_<trait>.txt - file with all genetic correlations for selected traits
- data/cor_<trait>.txt - file with significant genetic correlations for selected traits
- data/meta_feature.csv - annotated table with significant genetic correlations for meta-analysis
- data/fg_feature.csv - annotated table with significant genetic correlations for FG GWAS:
  - fg_feature_supp.csv - only traits supported by previous research
  - fg_feature_not_supp.csv - only traits not supported by previous research
Annotated SNPs:
- data/finn_top.csv - significant annotated summary statistics from the FinnGen GWAS.
- data/finn_top_short.csv - significant and filtered (one SNP selected per locus) annotated summary statistics from the FinnGen GWAS.
- data/meta_top.csv - significant annotated summary statistics from the meta-analysis.
- data/meta_top_short.csv - significant and filtered (one SNP selected per locus) annotated summary statistics from the meta-analysis.
Other:
- data/file_mapping - mapping of the 24 selected trait files and N_samples for FinnGen
- All summary statistics can be found here:
  - maf_fg_*.tsv - FinnGen summary statistics filtered by MAF.
  - extended_*.TBL - summary statistics from meta-analysis by METAL.
