About the quality standard
Briefly, the following metrics and criteria are to be considered in particular:
Contiguity
- Contig N50
- Scaffold N50
Correctness
- Quality Value (QV)
Completeness
- Single-copy genes presence (BUSCO)
- Kmer Completeness
Other criteria
- % Sequence assigned to candidate chromosomal sequences
- Gaps per Gbp
- Sex chromosome identification
Contiguity and Correctness metrics can be summarised with the EBP quality format: log10(Cont_N50).log10(Scaf_N50).Q(QV_value)
log10(10,000)=4, log10(100,000)=5, log10(1,000,000)=6, log10(10,000,000)=7 ...
If N50 is "chromosomal scale", then the log10 value is replaced by a C.
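As an illustration of the notation above, here is a minimal sketch that builds the summary string from N50 values given in base pairs. The function names (`log10_floor`, `ebp_notation`) and the `*_chromosomal` flags are hypothetical, introduced only for this example. It also assumes that the log10 value is rounded down to a whole number and that the QV is truncated to an integer, which the text above does not state explicitly.

```python
def log10_floor(n50_bp: int) -> int:
    """Whole-number part of log10 for a positive integer.

    Counting digits avoids floating-point rounding at exact powers of ten.
    Rounding down is an assumption of this sketch.
    """
    if n50_bp <= 0:
        raise ValueError("N50 must be a positive number of base pairs")
    return len(str(int(n50_bp))) - 1


def _n50_part(n50_bp: int, chromosomal: bool) -> str:
    # "C" replaces the log10 value when the N50 is chromosomal scale.
    return "C" if chromosomal else str(log10_floor(n50_bp))


def ebp_notation(contig_n50_bp: int,
                 scaffold_n50_bp: int,
                 qv: float,
                 contig_chromosomal: bool = False,
                 scaffold_chromosomal: bool = False) -> str:
    """Format log10(Cont_N50).log10(Scaf_N50).Q(QV_value)."""
    contig = _n50_part(contig_n50_bp, contig_chromosomal)
    scaffold = _n50_part(scaffold_n50_bp, scaffold_chromosomal)
    # Truncating the QV to an integer is an assumption of this sketch.
    return f"{contig}.{scaffold}.Q{int(qv)}"


if __name__ == "__main__":
    print(ebp_notation(1_500_000, 12_000_000, 40.2))
    print(ebp_notation(1_500_000, 120_000_000, 45, scaffold_chromosomal=True))
```

Whether an N50 counts as "chromosomal scale" is left to the caller here, since that judgement depends on the species' karyotype rather than on the number alone.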
Minimum requirements differ depending on sample characteristics (for example, whether there is sufficient tissue for DNA extraction) and genome size (chromosomal N50 < 1 Mbp). See the EBP Report on Assembly Standards for further details. The following table summarises the proposed metric values in each case:
Sometimes, not all the metrics can be met (e.g., low BUSCO score due to a taxonomic group not being well represented).