Multiple errors trying to analyze flv_trust4 data #288

Open
panapapa14 opened this issue May 31, 2024 · 5 comments

Comments

@panapapa14

As I mentioned above, I am trying to find out what is going on with some flv_trust4 data, so far without any success.
Besides the lack of effective guidance from the company, we have made numerous attempts and repeatedly get the following errors:

multi_flv_trust4 \
--mapfile ./vdj.mapfile \
--ref GRCm38 \
--thread 8 \
--seqtype TCR \
--mod shell

CONDA_DEFAULT_ENV is not set. sjm mode may not available.
2024-05-31 13:38:17,140 - celescope.tools.multi.parse_mapfile - INFO - start...
Allowed R1 patterns:
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C2311300011.fq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C2311300011.fq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C2311300011.fastq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C2311300011.fastq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1.fq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1.fq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1.fastq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1.fastq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1_001.fq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1_001.fq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1_001.fastq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1_001.fastq.gz
Traceback (most recent call last):
  File "/home/diopap/.local/bin/multi_flv_trust4", line 8, in <module>
    sys.exit(main())
  File "/home/diopap/.local/lib/python3.10/site-packages/celescope/flv_trust4/multi_flv_trust4.py", line 80, in main
    multi.run()
  File "/home/diopap/.local/lib/python3.10/site-packages/celescope/tools/multi.py", line 420, in run
    self.prepare()
  File "/home/diopap/.local/lib/python3.10/site-packages/celescope/tools/multi.py", line 199, in prepare
    self.fq_dict, self.col4_dict, self.col5_dict = self.parse_mapfile(self.args.mapfile, self.col4_default, self.args.use_R3)
  File "/home/diopap/.local/lib/python3.10/site-packages/celescope/tools/utils.py", line 45, in wrapper
    result = func(*args, **kwargs)
  File "/home/diopap/.local/lib/python3.10/site-packages/celescope/tools/multi.py", line 149, in parse_mapfile
    fq1, fq2 = get_fq(library_id, library_path, use_R3)
  File "/home/diopap/.local/lib/python3.10/site-packages/celescope/tools/multi.py", line 452, in get_fq
    fq1_list = get_read(library_id, library_path, read='1')
  File "/home/diopap/.local/lib/python3.10/site-packages/celescope/tools/multi.py", line 442, in get_read
    raise Exception(
Exception:
Invalid Read1 path!
library_id: C231130001
library_path: /mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz

diopap@DIO:/mnt/c/users/dipap/Documents/tcr$ ls

@zhouyiqi91
Collaborator

https://github.com/singleron-RD/CeleScope/blob/master/doc/assay/multi_flv_trust4.md#arguments

When running multi_flv_trust4, the mapfile needs 4 columns.
1st column: Fastq file prefix
2nd column: Fastq file directory path

The full path of each R1 fastq file will be {Fastq file directory path}/{Fastq file prefix}_*_{R1 Fastq file suffix}. Several R1 fastq file suffixes are allowed, e.g. R1.fq.gz, R1.fastq.gz, etc. You can find all the valid R1 fastq patterns in the log. It seems that the 2nd column should be /mnt/c/users/dipap/Documents/tcr/fastq_files_backup/, without C231130001_sCT_.fastq.gz.

Allowed R1 patterns:
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C2311300011.fq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C2311300011.fq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C2311300011.fastq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C2311300011.fastq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1.fq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1.fq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1.fastq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1.fastq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1_001.fq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1_001.fq.gz
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1_001.fastq
/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/C231130001_sCT_.fastq.gz/C231130001R1_001.fastq.gz
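
The path rule described in this comment can be tried locally before running the pipeline. The following is a minimal sketch, not CeleScope's own code: the function names are hypothetical, and the list of read-1 endings is an assumption modeled on the "Allowed R1 patterns" log above. It builds one glob pattern per allowed ending and reports which files in the directory match.

```python
import glob
import os

# Assumed read-1 endings, modeled on the "Allowed R1 patterns" log
# (not taken from the CeleScope source).
R1_MARKERS = ["_1", "R1", "R1_001"]
EXTENSIONS = [".fq", ".fq.gz", ".fastq", ".fastq.gz"]


def candidate_r1_patterns(prefix, directory):
    """Build glob patterns of the form {directory}/{prefix}*{ending}."""
    return [
        os.path.join(directory, f"{prefix}*{marker}{ext}")
        for marker in R1_MARKERS
        for ext in EXTENSIONS
    ]


def find_r1_files(prefix, directory):
    """Return the sorted, de-duplicated files matching any candidate pattern."""
    hits = set()
    for pattern in candidate_r1_patterns(prefix, directory):
        hits.update(glob.glob(pattern))
    return sorted(hits)


if __name__ == "__main__":
    # Column 1 (prefix) and column 2 (directory) as suggested in the comment.
    found = find_r1_files(
        "C231130001",
        "/mnt/c/users/dipap/Documents/tcr/fastq_files_backup/",
    )
    print(found if found else "no R1 file matched; check columns 1 and 2")
```

If the second mapfile column points at a fastq file instead of the directory that contains it, every pattern is searched underneath that file path and nothing can match, which is consistent with the "Invalid Read1 path!" error in the traceback.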

@diopapamath

The error seems to have gone away, but although multi_flv_trust4 now runs without errors, it finishes almost immediately and produces no output, so something is still wrong. I am sharing my files below to help pinpoint which file or setting causes the command to finish so quickly without output.

This is what I get:
(tcr_analysis) diopap@DIO:/mnt/c/users/dipap/Documents/tcr$ multi_flv_trust4 --mapfile ./vdj.mapfile --ref GRCm38 --thread 8 --seqtype TCR --mod shell
2024-06-01 19:36:20,839 - celescope.tools.multi.parse_mapfile - INFO - start...
2024-06-01 19:36:20,924 - celescope.tools.multi.parse_mapfile - INFO - done. time used: 0:00:00.085204

The path to my files is: /mnt/c/users/dipap/Documents/tcr/fastq_files_backup
The fastq file names are: 'C231130001_sCT_*R1_001.fastq.gz' 'C231130001_sCT_*R2_001.fastq.gz' (one sample with 2 paired-end read files, to test my code)
My vdj.mapfile has these 4 columns: C231130001 /mnt/c/users/dipap/Documents/tcr/fastq_files_backup/ C231130001 /mnt/c/users/dipap/Documents/tcr//matched_dir
(matched.dir exists in the ./tcr directory)

Thanks in advance for the help
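
A mapfile like the one described above can be sanity-checked with a short script. This is a minimal sketch under stated assumptions: `check_mapfile` is a hypothetical helper, not part of CeleScope; columns are assumed to be whitespace-separated (so paths containing spaces would be split incorrectly); and the labels for columns 3 and 4 (sample name, matched directory) are inferred from the example row rather than from the documentation.

```python
import os


def check_mapfile(mapfile_path):
    """Return a list of human-readable problems found in a 4-column mapfile."""
    problems = []
    with open(mapfile_path) as handle:
        for number, raw in enumerate(handle, start=1):
            fields = raw.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) != 4:
                problems.append(
                    f"line {number}: expected 4 columns, found {len(fields)}"
                )
                continue
            # Column meanings for 3 and 4 are assumptions (see lead-in).
            prefix, fastq_dir, sample, matched_dir = fields
            if not os.path.isdir(fastq_dir):
                problems.append(
                    f"line {number}: column 2 is not a directory: {fastq_dir}"
                )
            if not os.path.isdir(matched_dir):
                problems.append(
                    f"line {number}: column 4 is not a directory: {matched_dir}"
                )
    return problems


if __name__ == "__main__":
    for problem in check_mapfile("./vdj.mapfile"):
        print(problem)
```

An empty result only means the row is well formed and both directories exist; it does not confirm that the fourth column contains the scRNA-Seq results the pipeline expects.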

@singleron-RD singleron-RD deleted a comment from Chenjunjie1996 Jun 3, 2024
@zhouyiqi91
Collaborator

Have you already processed the scRNA-Seq data? You need the scRNA-Seq cell barcodes to run the flv_trust4 pipeline.

multi_{assay} only generates the shell scripts in the shell folder; it does not actually run the pipeline.
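
Since the multi_{assay} step only writes scripts, those scripts still have to be executed to run the pipeline. The sketch below is one way to do that, under assumptions not confirmed in this thread: the generated scripts are assumed to be `*.sh` files directly inside `./shell`, and `run_generated_scripts` is a hypothetical helper, not a CeleScope command. It defaults to a dry run that only prints what would be executed.

```python
import glob
import os
import subprocess


def run_generated_scripts(shell_dir="./shell", runner="bash", dry_run=True):
    """List the *.sh scripts in shell_dir and, unless dry_run, run each one.

    Returns the sorted list of script paths that were found.
    """
    scripts = sorted(glob.glob(os.path.join(shell_dir, "*.sh")))
    for script in scripts:
        if dry_run:
            print(f"would run: {runner} {script}")
        else:
            # check=True stops at the first script that exits with an error.
            subprocess.run([runner, script], check=True)
    return scripts


if __name__ == "__main__":
    found = run_generated_scripts()  # dry run by default
    if not found:
        print("no scripts found; was multi_flv_trust4 run with --mod shell?")
```

Calling `run_generated_scripts(dry_run=False)` executes the scripts one after another; running a single script by hand with `bash` is equivalent for one sample.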

@diopapamath

diopapamath commented Jun 3, 2024

Hello, I thought demultiplexing was part of multi_flv_trust4. siCircle was performed for 6 samples at Singleron Germany, and we have the fastq files.
Yes, I see the commands in the shell script, but I don't have anything apart from the fastqs and the HTML report.
Thank you!!

@panapapa14
Author

Hello there! The problem @diopapamath and I have is that we can only find guidelines for multi_{assay}, which, as you said, just creates a shell folder. What about the actual pipeline, i.e. the script that we need to execute?
