Influenza reassortment detection program (FluReD)

This repository analyses influenza genomes to detect reassortment, one of the main drivers of influenza evolution and spread. It is currently available for influenza A and B.

Genotype is assigned independently for each segment through a method based on distances.
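
To illustrate the idea of distance-based, per-segment assignment, here is a minimal Python sketch. It is not the pipeline's implementation (FluReD relies on MAFFT and distmat): the p-distance measure, the nearest-reference rule, the function names and the toy sequences are all assumptions made for illustration.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or an N are skipped.
    (Assumed distance measure; the pipeline itself uses distmat.)
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differing += 1
    # With no comparable sites, treat the pair as maximally distant.
    return differing / compared if compared else 1.0


def assign_segment_clades(query: dict, references: dict) -> dict:
    """Assign each query segment the clade of its nearest reference.

    query:      {segment: aligned_sequence}
    references: {segment: {clade: aligned_sequence}}
    """
    assigned = {}
    for segment, seq in query.items():
        distances = {
            clade: p_distance(seq, ref)
            for clade, ref in references[segment].items()
        }
        assigned[segment] = min(distances, key=distances.get)
    return assigned


def is_candidate_reassortant(assigned: dict) -> bool:
    """Flag a genome whose segments were not all assigned the same clade."""
    return len(set(assigned.values())) > 1


# Hypothetical toy data: two clades, two segments.
references = {
    "HA": {"cladeX": "ACGTACGT", "cladeY": "ACGTTTTT"},
    "NA": {"cladeX": "GGGGCCCC", "cladeY": "GGGGAAAA"},
}
query = {"HA": "ACGTACGA", "NA": "GGGGAAAT"}

assigned = assign_segment_clades(query, references)
print(assigned, is_candidate_reassortant(assigned))
```

In this toy case the HA segment is closest to cladeX and the NA segment to cladeY, so the genome is flagged as a candidate reassortant.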

Important

Clade assignment is based on the available sequences. For some clades, not all segments are available. All segments of a reference strain are assigned the clade defined for its HA segment.

Available references

Reference datasets by season are available here. A guide for creating your own reference dataset is also provided.

Installation

Prerequisites

Ensure that the following programs/packages are installed on your system before running the pipeline:

Nextflow (developed with version 23.10.1)
Python 3.x and the following modules:
- pandas
- sys (standard library)
- re (standard library)
- os (standard library)
MAFFT
distmat
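
Before running the pipeline, the prerequisites listed above can be checked with a short script. This is a sketch, not part of FluReD: it assumes the tools are exposed on the PATH under the executable names nextflow, mafft and distmat, and the function name is hypothetical.

```python
import importlib.util
import shutil

# Assumed executable names; adjust if your installation uses different ones.
REQUIRED_EXECUTABLES = ("nextflow", "mafft", "distmat")
# sys, re and os ship with Python, so only pandas needs checking.
REQUIRED_MODULES = ("pandas",)


def missing_prerequisites(executables=REQUIRED_EXECUTABLES,
                          modules=REQUIRED_MODULES) -> list:
    """Return the executables and Python modules that cannot be found."""
    missing = [exe for exe in executables if shutil.which(exe) is None]
    missing += [
        f"python module: {name}"
        for name in modules
        if importlib.util.find_spec(name) is None
    ]
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing prerequisites:", ", ".join(problems))
    else:
        print("All prerequisites found.")
```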

Installation Steps

Clone the repository:

git clone https://github.com/ValldHebron-Bioinformatics/FluReD.git
cd FluReD

Install Nextflow (the tool was developed with version 23.10.1). Follow the Nextflow installation guide.
Install the required Python packages. Only pandas needs to be installed, with pip install pandas; sys, re and os are part of the Python standard library.
Install MAFFT for sequence alignment. Follow the MAFFT installation guide.
Install distmat (for generating distance matrices). You can refer to the distmat installation guide.

Execution details

Input

Parameters:

--fluType: Type of influenza to be analysed. Available options are [ A/H1 | A/H3 | B/Vic | B/Yam | Zoonotic ]. To analyse an influenza A dataset that contains both H1 and H3, set fluType to either A/H1 or A/H3.
--outdir: Output directory to store the results.
--fasta: FASTA format file with the segments to be analysed. Currently, the tool is adapted for the GISAID format, which is necessary for its operation. The headers must have the format >Isolation_name|Collection_date|Clade|Segment. The options ‘Replace spaces with underscores in FASTA header’ and ‘Remove spaces before and after values in FASTA header’ must be selected.
--Npercent: Threshold for filtering sequences based on the percentage of Ns. Default is 0.05.
--chunk: Chunk size for parallelisation. Default is 100.
--references: Path to the reference dataset directory, as used in the example below (for instance ./references/2023-2024/A-H1N1).
--segments: Segments for analysis. Default is ‘all’, but a single segment can be selected.
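
The header format, the Npercent threshold and the chunk size described above can be pre-checked on your own data with a short Python sketch. It is not taken from the pipeline: the function names and the example header are hypothetical, and it assumes that 0.05 is a fraction of the sequence length and that sequences exactly at the threshold are kept.

```python
HEADER_FIELDS = ("isolate", "collection_date", "clade", "segment")


def parse_header(header: str) -> dict:
    """Split a header of the form >Isolation_name|Collection_date|Clade|Segment."""
    fields = header.strip().lstrip(">").split("|")
    if len(fields) != len(HEADER_FIELDS):
        raise ValueError(f"expected 4 '|'-separated fields, got {len(fields)}")
    if any(" " in field for field in fields):
        raise ValueError("header fields must not contain spaces")
    return dict(zip(HEADER_FIELDS, fields))


def n_fraction(seq: str) -> float:
    """Fraction of the sequence made up of N characters."""
    seq = seq.upper()
    return seq.count("N") / len(seq) if seq else 1.0


def passes_n_filter(seq: str, threshold: float = 0.05) -> bool:
    """Keep sequences whose N fraction does not exceed the threshold (assumed rule)."""
    return n_fraction(seq) <= threshold


def chunked(items: list, size: int = 100):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical header, for illustration only.
record = parse_header(">A/Example/0001/2024|2024-01-15|ExampleClade|HA")
print(record["segment"], passes_n_filter("ACGT" * 24 + "NNNN"))
```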

Output

A folder named 'results' will be created in the directory defined by the user. Results will be stored in this folder in CSV format.

Example:

nextflow run flured.nf --fluType A/H1 --outdir ./outputDir --fasta ./sequences.fasta --chunk 100 --references ./references/2023-2024/A-H1N1 --segments all
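
When launching many runs, the same command can be assembled from Python. The sketch below only uses the parameters documented or shown in this README; the function name and its argument names are hypothetical, and the command is printed rather than executed.

```python
import shlex

# Options listed for --fluType in this README.
FLU_TYPES = ("A/H1", "A/H3", "B/Vic", "B/Yam", "Zoonotic")


def build_flured_command(flu_type, outdir, fasta, references,
                         chunk=100, segments="all", n_percent=None) -> list:
    """Build the argument list for a FluReD run (not executed here)."""
    if flu_type not in FLU_TYPES:
        raise ValueError(f"fluType must be one of {FLU_TYPES}")
    cmd = [
        "nextflow", "run", "flured.nf",
        "--fluType", flu_type,
        "--outdir", outdir,
        "--fasta", fasta,
        "--chunk", str(chunk),
        "--references", references,
        "--segments", segments,
    ]
    # Only pass --Npercent when overriding the pipeline default.
    if n_percent is not None:
        cmd += ["--Npercent", str(n_percent)]
    return cmd


cmd = build_flured_command(
    "A/H1", "./outputDir", "./sequences.fasta",
    "./references/2023-2024/A-H1N1",
)
print(shlex.join(cmd))
# To actually launch the run: subprocess.run(cmd, check=True)
```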

Credits

If you use this tool in your research, please consider citing the repository.

Get in touch

To report a bug, error, or feature request, please open an issue.

For questions, email us at alejandra.gonzalez@vallhebron.cat; we're happy to help!
