Update 04-snippy's lablog #377

Open
victor5lm opened this issue Dec 16, 2024 · 0 comments

A few adjustments should be made to the lablog file from 04-snippy:

  1. On line 8, add the command that includes external genomes, with their corresponding FASTA files, when generating input.tab. Keep this line commented out, since it is only needed when FASTA files are part of the analysis.

    cat ../samples_id.txt | while read in; do echo -e "${in}\t${scratch_dir}/../02-preprocessing/${in}/${in}_R1_filtered.fastq.gz\t${scratch_dir}/../02-preprocessing/${in}/${in}_R2_filtered.fastq.gz"; done >> input.tab

  2. Rewrite this block so that it is easier to understand. More importantly, add the command that removes low-coverage reads: right now the only command in this file is the one that removes complex variants.

    # Execute core genome SNIPPY
    # CODE CONTEXT: this block was used in the service: AZORHIZOBIOUMOUTBREAK01 on november 2022
    # Comment las line from _00_snippy.sh
    # echo "grep \"complex\" ./*/snps.vcf | cut -f 1,2,4,5 | cut -d \":\" -f 2 | sort -u | awk '{pos1=\$2; len_ref=length(\$3); printf \"%s\t%s\t%s\n\", \$1, pos1-1, pos1+len_ref+1}' | grep -v \"^#\" > mask_complex_variants.bed" > _01_snippy_core.sh
    # ls ${scratch_dir}/../../../REFERENCES | xargs -I %% echo "snippy-core --debug --mask ./mask_complex_variants.bed --mask-char 'N' --ref '../../../REFERENCES/%%' $(cat ../samples_id.txt | xargs)" >> _01_snippy_core.sh

  3. Add a comment explaining what this command is for:

    # awk 'BEGIN{FS="[> ]"} /^>/{val=$2;next} {print val,length($0)}' phylo.aln

  4. Make sure this is still correct:

    #code to compare samples inpairs
    # awk '$4 != $5 || $4 != $6 || $5 != $6' core.tab > differences.txt
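
For items 3 and 4, a minimal sketch on toy data may help decide what to document. It assumes (not confirmed here) that phylo.aln is a FASTA alignment and that core.tab has the columns CHR, POS, REF followed by one column per sample. The file contents and sample names below are made up. The first command prints each sequence ID with the length of every sequence line, which looks like a check that all aligned sequences have the same length. The second shows that the `$4`/`$5`/`$6` filter only compares the first three sample columns, and gives one possible generalisation to any number of samples.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)

# Toy alignment (hypothetical data): two sequences, one line each.
printf '>sampleA some description\nACGTN\n>sampleB\nACG-NNA\n' > "$workdir/phylo.aln"

# Item 3: FS="[> ]" splits header lines on ">" or a space, so $2 is the sequence ID.
# Every non-header line prints the current ID and the length of that line.
# If a sequence is wrapped over several lines, one row is printed per line.
lengths=$(awk 'BEGIN{FS="[> ]"} /^>/{val=$2;next} {print val,length($0)}' "$workdir/phylo.aln")
echo "$lengths"

# Toy core.tab (hypothetical data), assuming CHR, POS, REF, then one column per sample.
{
  printf 'CHR\tPOS\tREF\ts1\ts2\ts3\ts4\n'
  printf 'chr1\t10\tA\tA\tA\tA\tG\n'   # only s4 differs
  printf 'chr1\t20\tC\tC\tT\tC\tC\n'   # s2 differs
  printf 'chr1\t30\tG\tG\tG\tG\tG\n'   # no differences
} > "$workdir/core.tab"

# Item 4, as currently written: only columns 4, 5 and 6 are compared,
# so position 10 is missed and the header line is kept by accident.
original=$(awk '$4 != $5 || $4 != $6 || $5 != $6' "$workdir/core.tab")
echo "$original"

# One possible generalisation (an assumption, not the lablog's code):
# keep the header, then print any row where some sample differs from the first one.
general=$(awk -F'\t' 'NR==1{print;next} {for(i=5;i<=NF;i++) if($i!=$4){print;next}}' "$workdir/core.tab")
echo "$general"

rm -r "$workdir"
```

If the analysis always has exactly three samples, the current filter is enough. Otherwise the column numbers would need to be adapted by hand, or a loop over the sample columns like the one above could be used.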

victor5lm self-assigned this Dec 16, 2024