A data aggregation strategy to model specific transcriptional processes based on transcription factor and cell-state specificity.
Publication in American Journal of Human Genetics: IMPACT: Genomic Annotation of Cell-State-Specific Regulatory Elements Inferred from the Epigenome of Bound Transcription Factors
Publication in Nature Genetics: Improving the trans-ancestry portability of polygenic risk scores by prioritizing variants in predicted cell-type-specific regulatory elements
- A motif finding algorithm, such as HOMER as used in our implementation of IMPACT.
- R version 3.2.2 or higher
- R package GenomicRanges
Transcription factor binding motif PWMs (position weight matrices) and genome-wide positions of motifs
Motifs/
Epigenomic and sequence features used in the IMPACT framework to predict cell-state-specific regulatory elements.
Features/
- Create positive (regulatory) and negative (non-regulatory) training sets:
Training/ Pipeline.sh "TF" "cell type" "number input (ChIP-seq) files"
- Run model on a per-TF basis, to get betas on epigenomic and sequence features:
GenomeWide/IMPACT_modelfit/ENet_fit7_TFArgument.R "TF"
- Predict cell-state-specific regulatory elements genomewide based on model fits:
GenomeWide/GenomeTracks/Predictvalues_TFarg.R "TF"
- Then compute minimum and maximum values per chromosome:
GenomeWide/GenomeTracks/MinMaxValues_TFarg.R "TF"
- Finally, scale logistic regression output to be between 0 and 1:
GenomeWide/GenomeTracks/Scale_TFarg.R "TF"
- Create IMPACT annotations formatted to be run through S-LDSC to compute heritability estimates:
GenomeWide/ldsc/CreateAnnotations_TFarg.R "TF"
- Run S-LDSC in European and East Asian populations:
sLDSC/
Compute AUC and sensitivity of IMPACT to predict TF binding of a motif:
Performance/
Compute AUC and sensitivity after parameter variation and categorical feature knock outs:
KnockOuts/
Find loci with predicted IMPACT regulatory elements that do not have evidence of TF binding:
CandidateLoci/
Compare IMPACT predictions at TF target genes, defined by ChIP-seq, to TF non-target genes.
TargetvNonTargets/