VCF files: separated or merged #105
Hi,

Thanks for sharing your experience.

For Q1, I would expect "separated" and "merged" to give very similar results if the configuration (coverage, number of cells per donor, balance of donors, etc.) is within a reasonable range. In that case, I would say "separated" is good enough, as the number of SNPs is often sufficient. However, if the number of cells for each donor (or for some minor donor) is very limited (e.g., <100 cells), merging multiple time points may help increase the cell numbers for each donor, while the batches being merged may use different sets of SNPs. For the "merged" option, I would run cellsnp-lite on all batches together, followed by vireo. Alternatively, for the problematic batch, you can simply run vireo without a reference genotype and see whether the result aligns better with "separated" or "merged".

For Q2, this is a less commonly used option. It is similar to mode 1 (without genotype), but it uses the donor genotype only as a prior, which can be updated during the estimation. If you feel your genotype is noisy (e.g., from very shallow bulk RNA-seq), you may consider trying it.

Yuanhua
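To check how well two vireo runs agree (for example "separated" vs. "merged", or either one vs. a run without reference genotype), one option is to cross-tabulate the per-cell donor assignments. The following is a minimal sketch, not part of vireo itself. It assumes each run wrote a tab-separated `donor_ids.tsv` with `cell` and `donor_id` columns and that cell barcodes are identical between the two runs; the function names are hypothetical. Because donor labels from a run without genotype are arbitrary (`donor0`, `donor1`, ...), a cross-table is more informative than a direct label comparison.

```python
import csv
from collections import Counter


def read_assignments(path):
    """Read a cell -> donor_id mapping from a tab-separated file.

    Assumes a header row with 'cell' and 'donor_id' columns
    (the layout assumed here for vireo's donor_ids.tsv).
    """
    with open(path, newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        return {row["cell"]: row["donor_id"] for row in reader}


def cross_tab(run_a, run_b):
    """Count (donor in run A, donor in run B) pairs over shared cells.

    Cells present in only one run are ignored.
    """
    shared = run_a.keys() & run_b.keys()
    return Counter((run_a[cell], run_b[cell]) for cell in shared)


def cells_per_donor(assignments):
    """Count how many cells were assigned to each donor label."""
    return Counter(assignments.values())


# Hypothetical usage (paths are placeholders):
# separated = read_assignments("separated_batch1/donor_ids.tsv")
# merged = read_assignments("merged_all/donor_ids.tsv")
# for (a, b), n in sorted(cross_tab(separated, merged).items()):
#     print(a, b, n, sep="\t")
# print(cells_per_donor(separated))
```

If the two runs agree, each donor in one run should map almost entirely onto a single donor in the other. `cells_per_donor` gives a quick view of whether any donor falls in the low-cell-count range mentioned above.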
Thanks very much for your valuable reply. It helps me a lot.
Hi developers,
Thanks for developing this helpful tool. I encountered two questions when I used vireo.
I have 10x scRNA-seq data from an experiment with different time points (batch1 for D1, batch2 for D2, ...).
cellsnp-lite was used to call common SNPs for each batch, followed by vireo to demultiplex.
We also have a donor.vcf.gz.
Q1: I wonder which approach gives a more reasonable result:
1) Separated:
CELL_FILE: the .cellSNP.cells.vcf.gz for each batch from cellsnp-lite (e.g., batch1.cellSNP.cells.vcf.gz)
DONOR_FILE: bcftools view donor.vcf.gz -R batch1.cellSNP.cells.vcf.gz -Oz -o donors.sub_Batch1.vcf.gz
~/miniconda3/bin/vireo -c batch1.cellSNP.cells.vcf.gz -d donors.sub_Batch1.vcf.gz -o ${re} -N $n --randSeed 2
2) Merged:
CELL_FILE: "bcftools merge" was used to merge the cellSNP.cells.vcf.gz files of all batches from cellsnp-lite, generating all.cellSNP.cells.vcf.gz.
DONOR_FILE: bcftools view donor.vcf.gz -R all.cellSNP.cells.vcf.gz -Oz -o donors.sub_All.vcf.gz
~/miniconda3/bin/vireo -c all.cellSNP.cells.vcf.gz -d donors.sub_All.vcf.gz -o ${re} -N $n --randSeed 2
When I tried this, even though --randSeed was set to the same value, cells in batch1 were assigned to different donors in the Separated and Merged runs.
Could you tell me which one gives a more reasonable result, and why? Many thanks.
Q2: Mode 4 in vireo is applicable when a genotype is available but not confident (or only available for a subset of SNPs).
The command is: vireo -c $CELL_DATA -d $DONOR_GT_FILE -o $OUT_DIR --forceLearnGT.
Could you give some examples for this mode? Sorry for my questions.
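The reply in this thread describes this mode as using the donor genotype only as a prior that can be updated during estimation. The toy sketch below illustrates that general idea for a single SNP in a single donor: a prior over the three genotypes is combined with observed allele counts by Bayes' rule. It is not vireo's actual model or code; the function name, the binomial likelihood, and the alternative-allele rates in `theta` are all assumptions chosen for illustration.

```python
from math import comb


def genotype_posterior(prior, alt, depth, theta=(0.01, 0.5, 0.99)):
    """Update a genotype prior with allele counts (toy example).

    prior : probabilities for genotypes 0/0, 0/1, 1/1 (e.g. from a noisy donor VCF)
    alt   : number of reads supporting the alternative allele
    depth : total number of reads at the SNP
    theta : assumed alt-allele rate for each genotype (illustrative values)
    """
    # Binomial likelihood of the observed counts under each genotype.
    lik = [comb(depth, alt) * t**alt * (1 - t) ** (depth - alt) for t in theta]
    # Bayes' rule: posterior is proportional to prior times likelihood.
    unnorm = [p * l for p, l in zip(prior, lik)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# A prior that favours 0/0, confronted with balanced allele counts:
posterior = genotype_posterior(prior=(0.8, 0.1, 0.1), alt=10, depth=20)
```

With enough reads, the data can override a prior genotype that was wrong, while with no reads (`depth=0`) the posterior equals the prior. This is the intuition behind treating a noisy genotype as a starting point and not as fixed truth.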