Indexing Error for VariantTable, requires values to be monotonically increasing #384

Open
npb596 opened this issue Jul 23, 2022 · 0 comments

Hello,

I have been receiving the error below:

    vcf_first_vt = allel.VariantTable({'CHROM' : vcf_first['variants/CHROM'], 'POS' : vcf_first['variants/POS'], 'REF' : vcf_first['variants/REF'], 'ALT' : vcf_first['variants/ALT'][:,0], 'GT' : vcf_first['calldata/GT'][:,0,0], 'PS' : vcf_first['calldata/PS'][:,0]}, index=('CHROM','POS'))
  File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 4517, in __init__
    self.set_index(index)
  File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 4542, in set_index
    index = SortedMultiIndex(self[index[0]], self[index[1]],
  File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 4036, in __init__
    l1 = SortedIndex(l1, copy=copy)
  File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 3384, in __init__
    raise ValueError('values must be monotonically increasing')
ValueError: values must be monotonically increasing

To clarify, my Python script contains the vcf_first_vt definition shown above, and that line triggers the traceback. I can avoid the error as long as the chromosomes are in lexicographic order (e.g. chr1, chr10, chr2 instead of chr1, chr2, chr10) and I remove chromosome names without numbers (e.g. chrX and chrY). This is odd to me, as I assume something like "chr1" should be treated as a string (as per the example here: https://scikit-allel.readthedocs.io/en/stable/model/ndarray.html?highlight=sortedmultiindex#sortedmultiindex).

I suppose the lexicographic sorting makes sense when the numbers are treated as strings, though I don't understand why the chromosomes need to be sorted in any particular order at all. Is there a way of defining a VariantTable that I'm missing that would allow chromosomes to appear in an arbitrary order? If not, would it be possible to make this requirement more explicit?
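For anyone hitting the same error, here is a minimal pure-Python sketch of the ordering behaviour described above. It does not call scikit-allel: `is_monotonic` and `sort_records` are hypothetical helper names, and the adjacent-pair comparison is only an assumption about what the "monotonically increasing" check amounts to. It shows that chromosome names compared as strings follow lexicographic order, and how records could be reordered by (CHROM, POS) before the table is built.

```python
# Hypothetical helpers for illustration; not part of scikit-allel.

def is_monotonic(values):
    """Return True if no element is smaller than its predecessor."""
    return all(a <= b for a, b in zip(values, values[1:]))


def sort_records(chrom, pos):
    """Return indices that order records by (CHROM, POS).

    CHROM is compared as a plain string, so the result is lexicographic.
    """
    return sorted(range(len(chrom)), key=lambda i: (chrom[i], pos[i]))


# "Natural" chromosome order is not increasing under string comparison,
# because "chr10" sorts before "chr2".
natural = ["chr1", "chr2", "chr10", "chrX"]
lexicographic = sorted(natural)

# Made-up example records, reordered so that CHROM is increasing.
chrom = ["chr1", "chr2", "chr2", "chr10", "chrX"]
pos = [100, 50, 75, 20, 5]
order = sort_records(chrom, pos)
sorted_chrom = [chrom[i] for i in order]
sorted_pos = [pos[i] for i in order]
```

The same `order` could then be applied to every column (REF, ALT, GT, PS) before passing them to `allel.VariantTable`. Whether a string-sorted CHROM column is enough in every case is not confirmed here.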
