
Segmentation fault (core dumped) #26

Open
BJ-Chen-Eric opened this issue Jun 7, 2021 · 14 comments

Comments

@BJ-Chen-Eric

Hi, thanks for developing the tool. As the title says, running rattle cluster returns Segmentation fault (core dumped).
This is the command:

rattle/rattle cluster -i ~/Analysis/data/process/rna1/rna1.filter.fastq.gz -o ~/Analysis/tool/isoform_detection/rat/ --iso --rna

and the output is

RNA mode: 1
Reading fasta file... Done
Segmentation fault (core dumped)

The input file has fewer than 500 thousand reads, and the machine has 16 cores/32 threads with 1 TB of memory. From previous discussions, limited memory might be the problem, but my input seems far too small for that. I hope someone can help me look into it.

@novikk
Collaborator

novikk commented Jun 7, 2021

Hi, from what I see from the command you are trying to use a compressed fastq file, which RATTLE doesn't support as of now. You will need to uncompress it first.
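Both steps can be done up front. Below is a minimal sketch using the paths from the command above; the awk filter assumes standard 4-line FASTQ records (tools like seqkit work just as well), and the output filename is only a suggestion:

```shell
# Decompress first (-k keeps the original); RATTLE reads plain FASTQ/FASTA.
gunzip -k ~/Analysis/data/process/rna1/rna1.filter.fastq.gz

# Keep only reads of at least 150 bp (assumes 4-line FASTQ records).
awk 'BEGIN { OFS = "\n" }
     { header = $0; getline seq; getline plus; getline qual
       if (length(seq) >= 150) print header, seq, plus, qual }' \
    ~/Analysis/data/process/rna1/rna1.filter.fastq > rna1.min150.fastq
```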

Also, be sure to filter out small reads (we usually filter out those smaller than 150bp).

Best,
Ivan

@BJ-Chen-Eric
Author

Thanks for your rapid response. I tried again with an uncompressed and filtered file, but the result is the same. Another update: when I use the split.fastq file as RATTLE input, it does run, but the output is empty. Should I send the fastq file to help figure out the problem?

Best wishes

@novikk
Collaborator

novikk commented Jun 8, 2021

Hi! Yes, please send me the fastq file if that's possible to ivan.delarubia@upf.edu

@ziweiwuzw

I encountered the same issue.

@ziweiwuzw

I have 128 CPUs and 1 TB of memory, but I still cannot run the command. Could you help me?

@eileen-xue
Contributor

Hi there,

Do you run into any problems when using RATTLE with the example toyset dataset? Can you please check whether your reads contain any invalid bases? RATTLE can run into this issue when generating k-mers from reads that contain invalid bases.

Hope this helps,
Eileen
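One quick way to check this, as a sketch: scan the sequence lines for anything outside the valid alphabet (A/C/G/T/U, plus N, which is mentioned later in the thread as filtered out). Here `reads.fastq` is a placeholder for your input, assuming standard 4-line FASTQ records:

```shell
# Count reads whose sequence line contains a character outside A/C/G/T/U/N
# (case-insensitive). NR % 4 == 2 selects the sequence line of each record.
awk 'NR % 4 == 2 && toupper($0) ~ /[^ACGTUN]/ { bad++ }
     END { print (bad + 0), "reads with invalid bases" }' reads.fastq
```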

@ziweiwuzw

ziweiwuzw commented Jun 6, 2023

I did not encounter any issues while using RATTLE with the example toyset dataset. I used fastp to filter out low-quality bases. However, both the original fastq file (20 GB) and the trimmed fastq file (12 GB) ran into the same problem with RATTLE. What does 'invalid bases' refer to? Does it mean the base 'N'?

@ziweiwuzw

> Do you run into any problems when using RATTLE with the example toyset dataset? Can you please check whether your reads contain any invalid bases? RATTLE could run into this issue when generating k-mers with reads containing invalid bases.
>
> Hope this helps, Eileen

I encountered no issues using RATTLE with the example toyset dataset. To eliminate low-quality bases, I used fastp. Nevertheless, both the original fastq file (20 GB) and the trimmed fastq file (12 GB) ran into identical problems with RATTLE. Additionally, when I counted the bases in my fastq file, I did not observe any "N" bases. Could you help me check this issue? :)

@eileen-xue
Contributor

Hi,

Valid bases are A, T, C, G, and U. All other bases in reads are considered invalid, including 'N'. No need to worry about 'N', though: RATTLE will filter it out.

Could you please provide your RATTLE command? If possible, can you also run RATTLE on your dataset with the '--verbose' flag and share the progress bar? That would give me more information to identify why and where RATTLE went wrong.

Thanks,
Eileen

@EduEyras
Member

EduEyras commented Jun 7, 2023 via email

@ziweiwuzw

ziweiwuzw commented Jun 7, 2023

Dear Eileen,
Thank you for your assistance! I retried filtering the data with fastp. Fortunately, I obtained the desired result when I ran the same commands. However, when I processed the original fastq file with the command you suggested, I ran into difficulties and couldn't identify the cause. It is possible that my original fastq file contains duplicate bases and low-quality bases, which could explain the issue. The following screenshots show my command and the error output.

[screenshots of the command and the error output]

@ziweiwuzw

> Do you have very short or extremely long reads in your input? E.

Yes, you might be right. I checked my fastq file and confirmed that it contains reads longer than 150 bp; however, I neglected to check the length of the longest reads.
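The length distribution is quick to check, for example with awk (a sketch assuming standard 4-line FASTQ records; `reads.fastq` is a placeholder for the input file):

```shell
# Report read count and min/max/mean length; extreme outliers stand out here.
awk 'NR % 4 == 2 { n++; len = length($0); total += len
       if (min == "" || len < min) min = len
       if (len > max) max = len }
     END { printf "reads=%d min=%d max=%d mean=%.1f\n", n, min, max, total / n }' reads.fastq
```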

@improudofmyself

Facing this issue:

==========================================
SLURM_JOB_ID = 2797830
SLURM_NODELIST = hm02

Starting at Tue May 21 18:49:46 CDT 2024
Job name: Rattle, Job ID: 2797830
I have 4 CPUs on compute node hm02
RNA mode: true
Reading fasta file...
Reads: 10527128
Done
/var/spool/slurmd/job2797830/slurm_script: line 82: 2113221 Killed ./rattle cluster -i "$filtered_file" -o "$output_folder" --rna -B 0.5 -b 0.3 -f 0.2
Reading fasta file... Done
Reading fasta file... Done
Using the bigmem partition of our cluster:

$ slurminfo
QUEUE      FREE   TOTAL  FREE   TOTAL  RESORC   OTHER    MAXJOBTIME  CORES  NODE    GPU
PARTITION  CORES  CORES  NODES  NODES  PENDING  PENDING  DAY-HR:MN   /NODE  MEM-GB  (COUNT)
bigmem     48     96     0      2      0        0        7-00:00     48     1500    -

My slurm script looks like this:
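A bare "Killed" (rather than a segfault) usually means the kernel's OOM killer ended the process, which would fit a 10.5M-read input. A sketch for confirming this, assuming Slurm's default `slurm-<jobid>.out` stdout filename and that your site's `sacct` exposes the standard accounting fields:

```shell
# Find the kill message in the job's stdout log (exact filename may differ).
grep -n 'Killed' slurm-2797830.out || true

# Then compare peak memory to the request (run where sacct is available):
#   sacct -j 2797830 --format=JobID,State,MaxRSS,ReqMem,Elapsed
```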

@improudofmyself

echo "Starting at $(date)"
echo "Job name: ${SLURM_JOB_NAME}, Job ID: ${SLURM_JOB_ID}"
echo "I have ${SLURM_CPUS_ON_NODE} CPUs on compute node $(hostname -s)"

# Navigate to Porechop directory
cd /home/rkumar/Porechop || exit

# Define input and output paths
input_file="/scratch/g/........./Nanopore_cDNA/A3HE/A3HE.fastq"
output_folder="/scratch/g/...../Nanopore_cDNA/A3HE/Rattle_A3HE"

# Check if input file exists
if [ ! -f "$input_file" ]; then
    echo "Error: Input file not found!"
    exit 1
fi

# Create output directory if it does not exist
mkdir -p "$output_folder/clusters"

# Step 1: Filter reads by length (if needed, adjust according to your data)
filtered_file="${input_file%.fastq}_filtered.fastq"
porechop -i "$input_file" -o "$filtered_file" --discard_middle --min_split_read_size 150

# Check if filtered file was created
if [ ! -f "$filtered_file" ]; then
    echo "Error: Filtered file not created!"
    exit 1
fi

# Navigate to RATTLE directory
cd /home/rkumar/RATTLE || exit

# Periodically log memory usage while RATTLE runs (started before the
# pipeline and killed at the end, so it actually captures peak usage)
( while true; do
      echo "Memory usage at $(date):"
      free -h
      sleep 600  # Log every 10 minutes
  done ) &
mem_logger_pid=$!

# Step 2: Run the RATTLE clustering commands
./rattle cluster -i "$filtered_file" -o "$output_folder" --rna -B 0.5 -b 0.3 -f 0.2
./rattle cluster_summary -i "$filtered_file" -c "$output_folder/clusters.out" > "$output_folder/cluster_summary.tsv"
./rattle extract_clusters -i "$filtered_file" -c "$output_folder/clusters.out" -o "$output_folder/clusters" --fastq

# Step 3: Correct reads
./rattle correct -i "$filtered_file" -c "$output_folder/clusters.out" -o "$output_folder"

# Step 4: Merge consensi files and run polishing step
consensi_file="$output_folder/consensi.fq"
cat "$output_folder"/*/consensi.fq > "$consensi_file"

# Check if consensi file was created
if [ ! -f "$consensi_file" ]; then
    echo "Error: Consensi file not created!"
    exit 1
fi

./rattle polish -i "$consensi_file" -o "$output_folder" --rna

# Stop the memory logger
kill "$mem_logger_pid"

echo "Finished at $(date)"
