
Error about Transcript * appears in the reference but did not appear in the BAM #946

Open
YIGUIz opened this issue Jul 18, 2024 · 0 comments


YIGUIz commented Jul 18, 2024

Hi, I hope you're well. Here is my question:

[Bulk mode] Error: Transcript * appears in the reference but did not appear in the BAM
I want to quantify expression from my ONT data in alignment-based mode. The command:
singularity exec ${code_path}/singularity_images/salmon:1.10.3--h6dccd9a_2 salmon quant \
    --ont -p 16 -t ${ref_trans_fa} -l U -a ${LR_bam} -o ${output_tmp1}

I tried several different transcripts.fa files, but salmon still reports "Transcript * appears in the reference but did not appear in the BAM".

  1. First, I used the transcripts.fa provided by NCBI: GCF_002263795.3_ARS-UCD2.0_genomic.fna

  2. Second, I used gffread to obtain the transcripts.fa, but it failed with "Error: no valid ID found for GFF record". So I filtered the GTF file (version 2.2) with a shell command, as you recommended. The commands:

singularity exec /public/home/b20223040336/Workspace/long_read_rna/02code/singularity_images/gffread:0.12.7--hdcf5f25_4 gffread -w GCF_002263795.3_ARS-UCD2.0_transcripts.fa -g GCF_002263795.3_ARS-UCD2.0_genomic.fna -w GCF_002263795.3_ARS-UCD2.0_genomic.gtf
grep -P '\btranscript_id\s+"[^"]+"' GCF_002263795.3_ARS-UCD2.0_genomic.gtf > GCF_002263795.3_ARS-UCD2.0_genomic_fixed.gtf
singularity exec /public/home/b20223040336/Workspace/long_read_rna/02code/singularity_images/gffread:0.12.7--hdcf5f25_4 gffread GCF_002263795.3_ARS-UCD2.0_genomic_fixed.gtf -g GCF_002263795.3_ARS-UCD2.0_genomic.fna -w GCF_002263795.3_ARS-UCD2.0_transcripts_gtf.fa

  3. Finally, I used the GFF3 file provided by NCBI to obtain the transcripts.fa. The command:
GCF_002263795.3_ARS-UCD2.0_genomic.gff -g GCF_002263795.3_ARS-UCD2.0_genomic.fna -w GCF_002263795.3_ARS-UCD2.0_transcripts_gff.fa
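
One check that might help narrow this down: alignment-based mode expects the BAM to contain alignments against the same transcript sequences that are passed with -t, so the names in the transcript FASTA should match the @SQ names in the BAM header. Below is a minimal sketch of that comparison. The function names (fasta_names, sam_header_names, missing_from_bam) and the file name header.sam are hypothetical and not part of salmon or gffread; the header text is assumed to have been dumped beforehand with samtools view -H.

```shell
# Hypothetical helpers for comparing FASTA names with BAM header names.
# Assumes the BAM header was saved as text first, e.g.:
#   samtools view -H "${LR_bam}" > header.sam

# Print sequence names from a FASTA file: the text after '>' up to the
# first whitespace, one per line, sorted and de-duplicated.
fasta_names() {
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' | LC_ALL=C sort -u
}

# Print reference names from SAM header text: the SN: field of @SQ lines.
sam_header_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' "$1" | LC_ALL=C sort -u
}

# Print names that are in the FASTA but absent from the header.
# Usage: missing_from_bam transcripts.fa header.sam
missing_from_bam() {
    fa_list=$(mktemp)
    sq_list=$(mktemp)
    fasta_names "$1" > "$fa_list"
    sam_header_names "$2" > "$sq_list"
    # comm -23 keeps lines unique to the first (sorted) file.
    LC_ALL=C comm -23 "$fa_list" "$sq_list"
    rm -f "$fa_list" "$sq_list"
}
```

If missing_from_bam "${ref_trans_fa}" header.sam prints nearly every transcript name, and the header lists chromosome-style names instead, that would suggest the reads were aligned to the genome rather than to the transcriptome.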
