Variant not detected within 16bp from the start/end #257
Comments
Could you please try Clair3 with the |
Sure, here's the VCF output for each vcf.gz in the output dir:
Clair3 was seeing no read at 2265. Clair3 filters the alignments with the following four flags: |
I filtered the alignments using Filtered BAM bcftools call results:
Unfiltered BAM bcftools call results:
Is there anything else that might be going on?
That's challenging. To make things easier, would you mind sending us a mini BAM with reads covering position 2265?
Sure thing! Please find attached the samtools view filtered BAM: Let me know if you need anything else!
We found that position 2265 is the 16th base from the end of your reference (sequence length 2280). Clair3 doesn't call variants in the first 16bp and last 16bp of a sequence because of 1) an algorithmic limit, and 2) the usually degraded coverage and alignment quality at the head and tail of a sequence, which make variant calling unreliable. A workaround for your case is to append, say, 10 'N' bases to the tail of your reference so that 2265 falls outside the 16bp tail limit.
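To make the workaround concrete, here is a minimal Python sketch, not part of Clair3. The helper names `in_edge_region` and `pad_fasta_tail` are hypothetical, and the exact boundary rule (first or last 16 bases excluded) is an assumption taken from the description above, not from Clair3's source.

```python
EDGE_BP = 16  # margin described in this thread; the exact rule is an assumption


def in_edge_region(pos, seq_len, edge=EDGE_BP):
    """Return True if a 1-based position lies in the first or last `edge` bases."""
    return pos <= edge or pos > seq_len - edge


def pad_fasta_tail(fasta_text, n_pad=10, width=60):
    """Append `n_pad` 'N' bases to the end of every record in a FASTA string."""
    records = []
    header, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))

    out = []
    for header, seq in records:
        seq += "N" * n_pad
        out.append(header)
        # Re-wrap the padded sequence at a fixed line width.
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


# Position 2265 in a 2280bp reference is inside the tail margin;
# after padding with 10 N's (length 2290) it no longer is.
print(in_edge_region(2265, 2280))       # True
print(in_edge_region(2265, 2280 + 10))  # False
```

Appending to the tail leaves existing coordinates unchanged, but the reference length changes, so the reads would most likely need to be realigned against the padded reference before rerunning Clair3.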
Thanks. This is very good to know. For Influenza virus sequencing, there's a set of primers targeting the conserved 5' and 3' ends of each of the 8 genome segments, generating amplicons that give good sequencing coverage even at the ends of each segment.
Understood. Leaving this issue open and will come back later with a better solution.
Hello, what are Clair3's filter criteria for supplementary alignments? I see mismatches in IGV that were not called as SNPs by Clair3. I clicked on the reads and almost all of them are supplementary alignments.
@lagphase By default, Clair3 discards all supplementary alignments. You could hide the supplementary alignments in IGV to see whether the variant is still evident.
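As a rough way to check how many reads are affected, the sketch below counts supplementary records in SAM-formatted text by testing the supplementary bit (0x800) defined in the SAM specification. The function names are hypothetical, and this only illustrates the flag test; it is not how Clair3 itself applies the filter.

```python
SUPPLEMENTARY = 0x800  # SAM flag bit for a supplementary alignment


def is_supplementary(flag):
    """Return True if the SAM flag has the supplementary bit set."""
    return bool(flag & SUPPLEMENTARY)


def count_supplementary(sam_text):
    """Return (supplementary, total) alignment counts for SAM-formatted text."""
    supp = total = 0
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        total += 1
        if is_supplementary(flag):
            supp += 1
    return supp, total
```

The input text could come from `samtools view -h` run on the region of interest; a high supplementary fraction would suggest the mismatches seen in IGV come mostly from reads that are discarded before calling.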
Hello,
I noticed that Clair3 is not calling a variant near the end of a short reference sequence (position 2265 of Influenza A virus PB2 sequence OP597571.1; sequence length=2280).
Bcftools calls a G->A variant at position 2265. It's also clear from looking at the read alignment in IGV that there's a high-AF variant at that position.
Clair3 merge_output.vcf.gz:
Bcftools mpileup and call output:
Clair3 command:
Other details:
Are there any parameters that need to be adjusted for variant calling of RNA virus sequence data?
Any help would be much appreciated!
Thanks!