flextaxd error: "ValueError: not enough values to unpack (expected 2, got 1)" #70

morien · 2024-02-08T02:18:15Z

I'm attempting to follow along with this part of the tutorial/wiki, to get a better understanding of how to create my own custom DB. Things are okay until I get to the database creation step:

# flextaxd -db 16S_database.db -tf GTDB_arc_bact_taxo_tree_unique.txt -tt CanSNPer --genomeid2taxid g2id.txt --dump --dbprogram kraken2 -o taxonomy --verbose --logs logs/zenodo
2024-02-07 18:08:45,291 custom_taxonomy_databases [INFO ]  FlexTaxD logging initiated!
Warning: 16S_database.db already exists, overwrite? (y/n): y
2024-02-07 18:08:49,303 custom_taxonomy_databases [INFO ]  Loading module: ReadTaxonomyCanSNPer
2024-02-07 18:08:49,352 DatabaseConnection [INFO ]  16S_database.db opened successfully.
2024-02-07 18:08:49,353 ReadTaxonomyCanSNPer [INFO ]  GTDB_arc_bact_taxo_tree_unique.txt
2024-02-07 18:08:49,353 ReadTaxonomyCanSNPer [INFO ]  Fetching root name from file
2024-02-07 18:08:49,353 ReadTaxonomyCanSNPer [INFO ]  Adding, cellular organism node
2024-02-07 18:08:49,354 ReadTaxonomyCanSNPer [INFO ]  Adding root node root!
2024-02-07 18:08:49,355 custom_taxonomy_databases [INFO ]  Parse taxonomy
2024-02-07 18:08:49,355 ReadTaxonomyCanSNPer [INFO ]  Parse CanSNP tree file
2024-02-07 18:08:49,902 ReadTaxonomyCanSNPer [INFO ]  New taxonomy ids assigned 12929
Traceback (most recent call last):
  File "/home/nnnnnn/mambaforge/lib/python3.9/site-packages/flextaxd/modules/ReadTaxonomy.py", line 153, in parse_genomeid2taxid
    genomeid,taxid = row.strip().split("\t")
ValueError: not enough values to unpack (expected 2, got 1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/nnnnnn/mambaforge/bin/flextaxd", line 8, in <module>
    sys.exit(main())
  File "/home/nnnnnn/mambaforge/lib/python3.9/site-packages/flextaxd/custom_taxonomy_databases.py", line 330, in main
    read_obj.parse_genomeid2taxid(args.genomeid2taxid)
  File "/home/nnnnnn/mambaforge/lib/python3.9/site-packages/flextaxd/modules/ReadTaxonomy.py", line 156, in parse_genomeid2taxid
    genomeid,taxid,reference = row.strip().split("\t")
ValueError: not enough values to unpack (expected 3, got 1)

Here are the first few lines of my two input files:

# head g2id.txt 
GB_GCA_000010565.1      Pelotomaculum thermopropionicum
GB_GCA_000018565.1      Herpetosiphon aurantiacus
GB_GCA_000024525.1      Spirosoma linguale
GB_GCA_000091165.1      Methylomirabilis oxyfera_B
GB_GCA_000146855.1      Peptoanaerobacter margaretiae
GB_GCA_000147015.1      Zinderia insecticola
GB_GCA_000163995.1      Campylobacter_D jejuni_A
GB_GCA_000165065.1      Longicatena sp000165065
GB_GCA_000166295.1      Marinobacter adhaerens
GB_GCA_000168735.1      Endoriftia persephone

 # head GTDB_arc_bact_taxo_tree_unique.txt 
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;Aenigmatarchaeales;Aenigmatarchaeaceae;Aenigmatarchaeum;Aenigmatarchaeum_subterraneum
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;CG10238-14;CG10238-14;CG10238-14;CG10238-14_sp002789635
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;CG10238-14;CG10238-14;RBG-16-49-10;RBG-16-49-10_sp001784635
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;CG10238-14;EX4484-224;EX4484-224;EX4484-224_sp002254545
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;CG10238-14;SCSR01;SCSR01;SCSR01_sp004297575
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;GW2011-AR5;GCA-2688965;GCA-2688965;GCA-2688965_sp002688965
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;GW2011-AR5;GW2011-AR5;GW2011-AR5;GW2011-AR5_sp000806115
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;GW2011-AR5;GW2011-AR5;GW2011-AR5;GW2011-AR5_sp10154u
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;QMZP01;QMZP01;QMZP01;QMZP01_sp003663225
root;Archaea;Aenigmatarchaeota;Aenigmatarchaeia;QMZP01;QMZP01;QMZY01;QMZY01_sp003663415

I'd like to use this tool so any help is greatly appreciated

davve2 · 2024-02-13T12:06:55Z

Hi Morien,

It looks like the headers may be the problem (if they are included in the files). If not, I think the best option is for you to supply the head of your files as text files, so we can replicate the error locally. The error itself says that the program finds too few columns separated by . What do you use for separation in your files? The default separator is \t
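One way to check the separator locally is to count the tab-separated fields on every line of the genomeid2taxid file. The sketch below is not part of FlexTaxD; `find_bad_rows` is a hypothetical helper that only mirrors the `row.strip().split("\t")` call shown in the traceback, and it assumes (based on that traceback) that a valid row has either 2 or 3 fields. The sample rows are illustrative.

```python
import io


def find_bad_rows(lines, sep="\t", allowed=(2, 3)):
    """Return (line_number, field_count, raw_line) for rows with an unexpected field count.

    Hypothetical helper, not a FlexTaxD function. It splits each row the same
    way the traceback shows: row.strip().split("\t").
    """
    bad = []
    for lineno, row in enumerate(lines, start=1):
        fields = row.strip().split(sep)
        if len(fields) not in allowed:
            bad.append((lineno, len(fields), row.rstrip("\n")))
    return bad


# Illustrative input: one tab-separated row, one space-separated row, one blank line.
sample = io.StringIO(
    "GB_GCA_000010565.1\tPelotomaculum thermopropionicum\n"
    "GB_GCA_000018565.1      Herpetosiphon aurantiacus\n"
    "\n"
)

for lineno, count, raw in find_bad_rows(sample):
    # repr() makes tabs (\t) and runs of spaces visible in the output.
    print(f"line {lineno}: {count} field(s): {raw!r}")
```

To run it on the real file, pass an open file handle instead of the sample, for example `find_bad_rows(open("g2id.txt"))`. Any line it reports (a header without tabs, a blank line, or a row separated by spaces) would yield a single field and so could trigger the same "expected 2, got 1" error.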

morien · 2024-02-14T00:09:03Z

g2id.txt.gz
GTDB_arc_bact_taxo_tree_unique.txt.gz
Okay great. Yes, the default separator is \t and that's what I see reflected in my input files. Should it be . instead? Here are my input files (entire files, gzipped).

davve2 self-assigned this Feb 13, 2024

davve2 added the question Further information is requested label Feb 13, 2024
