IndexError: sequence length is not enough in R1 read #285

Open
calcaidecabello opened this issue May 14, 2024 · 7 comments

Comments

@calcaidecabello

Hi,

I am trying to run celescope. I built the reference genome and I have created the shell scripts, but I obtain this error:

sh Bnsortednuclei1.sh
2024-05-14 14:13:14,148 - celescope.tools.sample.sample - INFO - start...
CeleScope version: 2.0.7 Args: Namespace(subparser_assay='rna', outdir='.//Bnsortednuclei1/00.sample', sample='Bnsortednuclei1', thread='16', debug=False, fq1='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R1.fastq.gz', chemistry='auto', wells=384, func=<function sample at 0x7fa2b78e0820>)
2024-05-14 14:13:14,209 - celescope.tools.sample.run - INFO - start...
2024-05-14 14:13:14,872 - celescope.tools.barcode.check_chemistry - INFO - start...
2024-05-14 14:13:14,872 - celescope.tools.barcode.check_chemistry - INFO - /netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R1.fastq.gz
2024-05-14 14:13:14,872 - celescope.tools.barcode.get_chemistry - INFO - start...
Traceback (most recent call last):
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/bin/celescope", line 8, in <module>
sys.exit(main())
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
args.func(args)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/sample.py", line 68, in sample
runner.run()
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/sample.py", line 36, in run
chemistry = ch.check_chemistry()
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 56, in check_chemistry
chemistry = self.get_chemistry(fastq1)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 131, in get_chemistry
chemistry = self.seq_chemistry(seq)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 87, in seq_chemistry
linker_flv_rna = Barcode.get_seq_str(seq, self.pattern_dict_flv_rna["L"])
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 276, in get_seq_str
raise IndexError(f"sequence length is not enough in R1 read: {seq}")
IndexError: sequence length is not enough in R1 read: TGCCTGATCCGAACATGTAGGTCTCTGTCGGTGTGACTACGTATTAGCATGTGTCTGAACT
CeleScope version: 2.0.7 Args: Namespace(subparser_assay='rna', chemistry='auto', pattern=None, whitelist=None, adapter_3p='AAAAAAAAAAAA', genomeDir='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/ZS11/', outFilterMatchNmin='50', soloCellFilter='EmptyDrops_CR 3000 0.99 10 45000 90000 500 0.01 20000 0.001 10000', starMem='32', STAR_param='', SAM_attributes='', soloFeatures='Gene GeneFull_Ex50pAS', fq1='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R1.fastq.gz', fq2='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R2.fastq.gz', outdir='.//Bnsortednuclei1/01.starsolo', sample='Bnsortednuclei1', thread='16', debug=False, func=<function starsolo at 0x7fc688421ee0>)
2024-05-14 14:13:21,219 - celescope.tools.barcode.check_chemistry - INFO - start...
2024-05-14 14:13:21,220 - celescope.tools.barcode.check_chemistry - INFO - /netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R1.fastq.gz
2024-05-14 14:13:21,220 - celescope.tools.barcode.get_chemistry - INFO - start...
Traceback (most recent call last):
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/bin/celescope", line 8, in <module>
sys.exit(main())
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
args.func(args)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/starsolo.py", line 185, in starsolo
with Starsolo(args) as runner:
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/starsolo.py", line 41, in __init__
chemistry_list = ch.check_chemistry()
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 56, in check_chemistry
chemistry = self.get_chemistry(fastq1)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 131, in get_chemistry
chemistry = self.seq_chemistry(seq)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 87, in seq_chemistry
linker_flv_rna = Barcode.get_seq_str(seq, self.pattern_dict_flv_rna["L"])
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 276, in get_seq_str
raise IndexError(f"sequence length is not enough in R1 read: {seq}")
IndexError: sequence length is not enough in R1 read: TGCCTGATCCGAACATGTAGGTCTCTGTCGGTGTGACTACGTATTAGCATGTGTCTGAACT
2024-05-14 14:13:26,722 - celescope.rna.analysis.analysis - INFO - start...
CeleScope version: 2.0.7 Args: Namespace(subparser_assay='rna', genomeDir='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/ZS11/', matrix_file='.//Bnsortednuclei1/outs/filtered', outdir='.//Bnsortednuclei1/02.analysis', sample='Bnsortednuclei1', thread='16', debug=False, func=<function analysis at 0x7fc0797f64c0>)
CeleScope version: 2.0.7 Args: Namespace(subparser_assay='rna', genomeDir='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/ZS11/', matrix_file='.//Bnsortednuclei1/outs/filtered', outdir='.//Bnsortednuclei1/02.analysis', sample='Bnsortednuclei1', thread='16', debug=False, func=<function analysis at 0x7fc0797f64c0>)
Traceback (most recent call last):
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/bin/celescope", line 8, in <module>
sys.exit(main())
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
args.func(args)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/rna/analysis.py", line 64, in analysis
runner.run()
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/rna/analysis.py", line 38, in run
with analysis_wrapper.Scanpy_wrapper(self.args, display_title=self.display_title) as scanpy_wrapper:
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/analysis_wrapper.py", line 59, in __init__
self.adata = sc.read_10x_mtx(
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/scanpy/readwrite.py", line 490, in read_10x_mtx
adata = read(
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/scanpy/readwrite.py", line 554, in _read_v3_10x_mtx
adata = read(
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/scanpy/readwrite.py", line 112, in read
return _read(
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/scanpy/readwrite.py", line 737, in _read
raise FileNotFoundError(f'Did not find file {filename}.')
FileNotFoundError: Did not find file Bnsortednuclei1/outs/filtered/matrix.mtx.gz.
(Cris_singleron_snseq) calcaide@dell-node-4:/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/shell$ cd ../
(Cris_singleron_snseq) calcaide@dell-node-4:/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460$ sh ./shell/Bnsortednuclei1.sh
2024-05-14 14:20:03,762 - celescope.tools.sample.sample - INFO - start...
CeleScope version: 2.0.7 Args: Namespace(subparser_assay='rna', outdir='.//Bnsortednuclei1/00.sample', sample='Bnsortednuclei1', thread='16', debug=False, fq1='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R1.fastq.gz', chemistry='auto', wells=384, func=<function sample at 0x7fe755b11820>)
2024-05-14 14:20:03,835 - celescope.tools.sample.run - INFO - start...
2024-05-14 14:20:04,488 - celescope.tools.barcode.check_chemistry - INFO - start...
2024-05-14 14:20:04,489 - celescope.tools.barcode.check_chemistry - INFO - /netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R1.fastq.gz
2024-05-14 14:20:04,489 - celescope.tools.barcode.get_chemistry - INFO - start...
Traceback (most recent call last):
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/bin/celescope", line 8, in <module>
sys.exit(main())
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
args.func(args)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/sample.py", line 68, in sample
runner.run()
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/sample.py", line 36, in run
chemistry = ch.check_chemistry()
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 56, in check_chemistry
chemistry = self.get_chemistry(fastq1)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 131, in get_chemistry
chemistry = self.seq_chemistry(seq)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 87, in seq_chemistry
linker_flv_rna = Barcode.get_seq_str(seq, self.pattern_dict_flv_rna["L"])
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 276, in get_seq_str
raise IndexError(f"sequence length is not enough in R1 read: {seq}")
IndexError: sequence length is not enough in R1 read: TGCCTGATCCGAACATGTAGGTCTCTGTCGGTGTGACTACGTATTAGCATGTGTCTGAACT
CeleScope version: 2.0.7 Args: Namespace(subparser_assay='rna', chemistry='auto', pattern=None, whitelist=None, adapter_3p='AAAAAAAAAAAA', genomeDir='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/ZS11/', outFilterMatchNmin='50', soloCellFilter='EmptyDrops_CR 3000 0.99 10 45000 90000 500 0.01 20000 0.001 10000', starMem='32', STAR_param='', SAM_attributes='', soloFeatures='Gene GeneFull_Ex50pAS', fq1='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R1.fastq.gz', fq2='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R2.fastq.gz', outdir='.//Bnsortednuclei1/01.starsolo', sample='Bnsortednuclei1', thread='16', debug=False, func=<function starsolo at 0x7fb7396a1ee0>)
2024-05-14 14:20:10,529 - celescope.tools.barcode.check_chemistry - INFO - start...
2024-05-14 14:20:10,530 - celescope.tools.barcode.check_chemistry - INFO - /netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/00_rawdata/Bnsortednuclei_R1.fastq.gz
2024-05-14 14:20:10,530 - celescope.tools.barcode.get_chemistry - INFO - start...
Traceback (most recent call last):
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/bin/celescope", line 8, in <module>
sys.exit(main())
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
args.func(args)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/starsolo.py", line 185, in starsolo
with Starsolo(args) as runner:
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/starsolo.py", line 41, in __init__
chemistry_list = ch.check_chemistry()
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 56, in check_chemistry
chemistry = self.get_chemistry(fastq1)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 131, in get_chemistry
chemistry = self.seq_chemistry(seq)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 87, in seq_chemistry
linker_flv_rna = Barcode.get_seq_str(seq, self.pattern_dict_flv_rna["L"])
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/barcode.py", line 276, in get_seq_str
raise IndexError(f"sequence length is not enough in R1 read: {seq}")
IndexError: sequence length is not enough in R1 read: TGCCTGATCCGAACATGTAGGTCTCTGTCGGTGTGACTACGTATTAGCATGTGTCTGAACT
2024-05-14 14:20:15,806 - celescope.rna.analysis.analysis - INFO - start...
CeleScope version: 2.0.7 Args: Namespace(subparser_assay='rna', genomeDir='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/ZS11/', matrix_file='.//Bnsortednuclei1/outs/filtered', outdir='.//Bnsortednuclei1/02.analysis', sample='Bnsortednuclei1', thread='16', debug=False, func=<function analysis at 0x7f3cca87e4c0>)
CeleScope version: 2.0.7 Args: Namespace(subparser_assay='rna', genomeDir='/netscratch/dep_coupland/grp_turck/People/Cristina/GC_6460/ZS11/', matrix_file='.//Bnsortednuclei1/outs/filtered', outdir='.//Bnsortednuclei1/02.analysis', sample='Bnsortednuclei1', thread='16', debug=False, func=<function analysis at 0x7f3cca87e4c0>)
Traceback (most recent call last):
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/bin/celescope", line 8, in <module>
sys.exit(main())
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/celescope.py", line 54, in main
args.func(args)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/utils.py", line 45, in wrapper
result = func(*args, **kwargs)
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/rna/analysis.py", line 64, in analysis
runner.run()
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/rna/analysis.py", line 38, in run
with analysis_wrapper.Scanpy_wrapper(self.args, display_title=self.display_title) as scanpy_wrapper:
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/celescope/tools/analysis_wrapper.py", line 59, in __init__
self.adata = sc.read_10x_mtx(
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/scanpy/readwrite.py", line 490, in read_10x_mtx
adata = read(
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/scanpy/readwrite.py", line 554, in _read_v3_10x_mtx
adata = read(
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/scanpy/readwrite.py", line 112, in read
return _read(
File "/netscratch/dep_coupland/grp_turck/lib/micromamba/envs/Cris_singleron_snseq/lib/python3.9/site-packages/scanpy/readwrite.py", line 737, in _read
raise FileNotFoundError(f'Did not find file {filename}.')
FileNotFoundError: Did not find file Bnsortednuclei1/outs/filtered/matrix.mtx.gz.

It seems that the main error is "IndexError: sequence length is not enough in R1 read: TGCCTGATCCGAACATGTAGGTCTCTGTCGGTGTGACTACGTATTAGCATGTGTCTGAACT"
I don't know whether it is related to a sequencing problem or whether I should change some parameters.

Can you help me, please?

Thank you in advance!

@zhouyiqi91
Collaborator

It looks like the R1 fastq file is truncated, which causes an error during automatic chemistry detection. You can specify the chemistry explicitly to skip automatic detection:

multi_rna \
 --chemistry scopeV3.0.1 \
..

@calcaidecabello
Author

calcaidecabello commented May 15, 2024 via email

@calcaidecabello
Author

calcaidecabello commented May 15, 2024 via email

@zhouyiqi91
Collaborator

zhouyiqi91 commented May 15, 2024

Would you please check the read length of the R1 fastq reads? Are they all 61bp, like the read in the error report?
scopeV3.0.1 requires a minimum R1 read length of 72bp.
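
One way to answer the read-length question is to tabulate the R1 read lengths directly. The following is a minimal Python sketch and not part of CeleScope; the function names and the `MIN_R1_LEN` constant are made up for illustration, and the 72bp threshold is simply the scopeV3.0.1 figure quoted above. It assumes a standard gzipped FASTQ with four lines per record.

```python
import gzip
from collections import Counter

# Minimum R1 length for scopeV3.0.1 as stated in this thread (assumption:
# other chemistries may need a different value).
MIN_R1_LEN = 72


def r1_length_counts(fastq_gz_path):
    """Count how many reads of each length occur in a gzipped FASTQ file."""
    counts = Counter()
    with gzip.open(fastq_gz_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # In a 4-line FASTQ record the sequence is the second line.
            if line_number % 4 == 1:
                counts[len(line.rstrip("\n"))] += 1
    return counts


def short_read_count(counts, min_len=MIN_R1_LEN):
    """Number of reads shorter than min_len."""
    return sum(n for length, n in counts.items() if length < min_len)
```

Calling `r1_length_counts` on the R1 file returns a `Counter` keyed by read length. If `short_read_count` is greater than zero, the file contains reads like the 61bp one in the error message.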

It seems that the main error is "IndexError: sequence length is not enough in R1 read: TGCCTGATCCGAACATGTAGGTCTCTGTCGGTGTGACTACGTATTAGCATGTGTCTGAACT"

@calcaidecabello
Author

calcaidecabello commented May 15, 2024 via email

@zhouyiqi91
Collaborator

It seems that at least one R1 read is 61bp in length, which does not meet the minimum requirement of 72bp.
You can use grep to find this read.
zcat {R1 fastq.gz} | grep -B 1 -A 2 TGCCTGATCCGAACATGTAGGTCTCTGTCGGTGTGACTACGTATTAGCATGTGTCTGAACT

In theory, every read in the raw data should have the same length. Why does this short read appear? Possible reasons:

  • The fastq file is truncated.
  • The sequencing software has done some special processing.
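
The first possibility, a truncated file, can be tested without CeleScope. This is a rough Python sketch under two assumptions: the file is a plain gzip stream, and each FASTQ record has exactly four lines. The function name `check_fastq_gz` is hypothetical. It cannot detect the second possibility, special processing by the sequencing software.

```python
import gzip
import zlib


def check_fastq_gz(path):
    """Return (is_intact, n_lines) for a gzipped FASTQ file.

    is_intact is False when the gzip stream ends early or is corrupt,
    or when the number of lines is not a multiple of 4.
    """
    n_lines = 0
    try:
        with gzip.open(path, "rt") as handle:
            for _ in handle:
                n_lines += 1
    except (EOFError, zlib.error, gzip.BadGzipFile):
        # A gzip stream that stops before its end-of-stream marker
        # raises EOFError while it is being read.
        return False, n_lines
    return n_lines % 4 == 0, n_lines
```

If this reports an intact file that still contains short reads, the sequencing or demultiplexing software is the more likely cause.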

While it may be important to find the reason for the short read length, here are some ways to bypass this error:

  • Use the new nextflow-based pipeline: https://github.com/singleron-RD/scrna ; it performs no R1 minimum read length check.
  • Use the v2.0.7_no_R1_len_check branch, where the R1 minimum read length check is disabled:
git clone -b v2.0.7_no_R1_len_check --single-branch https://github.com/singleron-RD/CeleScope.git
conda activate celescope
pip uninstall celescope
cd CeleScope
pip install .

@tuqiang2014

tuqiang2014 commented May 24, 2024

Hello, when the sequencing R1 pattern is C9L16C9L16C9U12 = 71bp, how should celescope be run correctly?

Reply: see #276. Set the following parameters: --chemistry customized --pattern --whitelist
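
The 71bp figure comes from summing the segment lengths in the pattern string. The small Python sketch below makes that arithmetic explicit. It is not CeleScope's own parser, and the helper names are invented for illustration. It assumes each segment is one uppercase letter followed by a length; in this thread's pattern, C, L, and U appear to stand for cell barcode, linker, and UMI.

```python
import re


def parse_pattern(pattern):
    """Split a pattern such as 'C9L16C9L16C9U12' into (letter, length) pairs."""
    segments = re.findall(r"([A-Z])(\d+)", pattern)
    # Guard against stray characters that the regex would silently skip.
    if "".join(letter + length for letter, length in segments) != pattern:
        raise ValueError(f"unrecognised pattern: {pattern!r}")
    return [(letter, int(length)) for letter, length in segments]


def pattern_length(pattern):
    """Total number of bases the pattern covers in R1."""
    return sum(length for _, length in parse_pattern(pattern))


print(pattern_length("C9L16C9L16C9U12"))  # 9 + 16 + 9 + 16 + 9 + 12 = 71
```

An R1 read must be at least as long as this total for every segment to be extracted, which is why a 61bp read fails against a chemistry that expects 72bp.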
