Immunedeconv terminates for (non-)duplicate gene symbols #146

Open
axelgschwind opened this issue May 15, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@axelgschwind

axelgschwind commented May 15, 2023

Hello,

I'm struggling with some samples for which immunedeconv throws errors.

The error message says:

Error in .rowNamesDF<-(x, value = value) :
duplicate 'row.names' are not allowed
Calls: ... row.names<- -> row.names<-.data.frame -> .rowNamesDF<-
In addition: Warning message:
non-unique value when setting 'row.names': ‘entry withdrawn’
Execution halted

However, I have double checked the input dataframes. There are no duplicates concerning the HUGO gene symbols in the input data.
I also tried converting the HUGO symbols to ENSEMBL, removing duplicate entries again and back transforming it to HUGO symbols (see https://bioinformatics.stackexchange.com/questions/19584/error-of-duplicated-rownames-although-there-are-no-duplicates for explanation).
However, I still get the same error for many samples.

I have attached a tar.gz file with two dataframes as examples. With quantiseq, one dataframe works ("df_working.Rda") while the other does not ("df_not_working.Rda").

I would be glad if you could have a look at these files and check what is going wrong with them.

Cheers,

Axel


Brief description of the problem

library(immunedeconv)
load("df_not_working.Rda")
res <- immunedeconv::deconvolute(data_not_working, "quantiseq", tumor=TRUE)

Versions

R version 4.2.0 (2022-04-22)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS/LAPACK: /home/ukt.ad.local/ahgscha1/miniconda3/envs/deconvolution/lib/libopenblasp-r0.3.21.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] annotables_0.2.0 lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0
[5] dplyr_1.1.2 purrr_1.0.1 readr_2.1.4 tidyr_1.3.0
[9] tibble_3.2.1 ggplot2_3.4.2 tidyverse_2.0.0 optparse_1.7.3

loaded via a namespace (and not attached):
[1] magrittr_2.0.3 hms_1.1.3 tidyselect_1.2.0 munsell_0.5.0
[5] getopt_1.20.3 timechange_0.2.0 colorspace_2.1-0 R6_2.5.1
[9] rlang_1.1.1 fansi_1.0.4 tools_4.2.0 grid_4.2.0
[13] gtable_0.3.3 utf8_1.2.3 cli_3.6.1 withr_2.5.0
[17] lifecycle_1.0.3 tzdb_0.3.0 vctrs_0.6.2 glue_1.6.2
[21] stringi_1.7.6 compiler_4.2.0 pillar_1.9.0 generics_0.1.3
[25] scales_1.2.1 pkgconfig_2.0.3

dataframes.tar.gz

@axelgschwind axelgschwind added the bug Something isn't working label May 15, 2023
@mlist
Collaborator

mlist commented May 15, 2023

I also don't see duplicates but could the problem be that some signature genes are missing in your data? The number of genes differs between both data frames...

> table(duplicated(rownames(data_not_working)))

FALSE 
18945 

> table(duplicated(rownames(working)))

FALSE 
22663 
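
To see which symbols account for the difference in gene counts, the row names of the two inputs can be compared directly. The following is a minimal sketch on toy matrices; the symbols `OLDSYM1` and the object name `not_working` are hypothetical stand-ins, not values taken from the attached files.

```r
# Toy stand-ins for the two expression matrices (hypothetical values).
working <- matrix(
  1, nrow = 4, ncol = 1,
  dimnames = list(c("CD8A", "CD4", "FOXP3", "MS4A1"), "s1")
)
not_working <- matrix(
  1, nrow = 3, ncol = 1,
  dimnames = list(c("CD8A", "CD4", "OLDSYM1"), "s1")
)

# Symbols that appear in only one of the two inputs.
only_in_working <- setdiff(rownames(working), rownames(not_working))
only_in_not_working <- setdiff(rownames(not_working), rownames(working))

print(only_in_working)
print(only_in_not_working)
```

Symbols that show up only in the failing input are natural candidates to inspect first.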

@LorenzoMerotto
Collaborator

@axelgschwind I got the same error as you had.
The problem is related to quantiseq. We will dig into that, but in the meantime you can use other methods. For instance, EPIC is working.

@czackl you curated the inclusion of quantiseqr, right?

@fernangomezv

Hi!
I have the same problem. However, I confirm that EPIC analysis works.
Thanks in advance.

@axelgschwind
Author

@LorenzoMerotto Thank you very much. I have some additional information:
I ran the non-working sample a second time, this time with R 4.3.0, and surprisingly it worked with the newer version of R.
However, the requirements say R >= 4.1.

Best,

Axel

@huyukai0126

I got the same error:
Error in .rowNamesDF<-(x, value = value) :
duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique value when setting 'row.names': ‘entry withdrawn’
I am sure my row names are not duplicated.
When I use only the first thousand rows (exprMatrix[1:1000,]), the deconvolution succeeds.
I wonder if you could help me solve this problem. Thanks in advance.

@JohannesKersting

I get the same error.
However, it works when I restrict my expression data to the intersection of its symbols (which I checked for uniqueness) and the symbols in the example data, rownames(immunedeconv::dataset_racle$expr_mat).

This is why I assume that only certain symbols cause trouble.
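
The intersection workaround described above could be sketched as follows. The helper `restrict_to_reference()` and the toy symbols are hypothetical; with immunedeconv installed, the reference vector would be `rownames(immunedeconv::dataset_racle$expr_mat)`.

```r
# Keep only the rows of `expr` whose symbols also occur in a reference
# symbol vector. Hypothetical helper, not part of immunedeconv.
restrict_to_reference <- function(expr, reference_symbols) {
  common <- intersect(rownames(expr), reference_symbols)
  expr[common, , drop = FALSE]
}

# Toy expression matrix with made-up symbols and values.
expr <- matrix(
  1:8, nrow = 4,
  dimnames = list(c("CD8A", "CD4", "OLDSYM1", "OLDSYM2"), c("s1", "s2"))
)
reference_symbols <- c("CD4", "CD8A", "FOXP3")

filtered <- restrict_to_reference(expr, reference_symbols)
print(rownames(filtered))

# With immunedeconv available, the filtered matrix would then be passed on:
#   immunedeconv::deconvolute(filtered, "quantiseq", tumor = TRUE)
```

Filtering this way may also drop genes that a method relies on, so it is a workaround rather than a fix.
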
quantiseq outputs the following line:
Gene expression normalization and re-annotation (arrays: FALSE)

Is it possible that the re-annotation step introduces duplicate row names despite unique row names in the input data?
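
To illustrate how this could happen in principle, here is a small sketch in which unique input symbols collapse onto the same value after a lookup. The mapping table and the symbols `OLDSYM1` and `OLDSYM2` are invented for illustration; whether quantiseq's re-annotation actually behaves this way is not confirmed in this thread.

```r
# Input symbols are unique.
input_symbols <- c("CD8A", "OLDSYM1", "OLDSYM2")
stopifnot(anyDuplicated(input_symbols) == 0)

# Hypothetical re-annotation table: two different input symbols map to
# the same placeholder value.
annotation <- c(
  CD8A = "CD8A",
  OLDSYM1 = "entry withdrawn",
  OLDSYM2 = "entry withdrawn"
)
mapped <- unname(annotation[input_symbols])

# Assigning the mapped values as data frame row names now fails,
# although the original symbols were unique.
df <- data.frame(s1 = c(10, 20, 30))
err <- tryCatch(
  suppressWarnings({
    rownames(df) <- mapped
    NULL
  }),
  error = function(e) conditionMessage(e)
)
print(err)

# One possible way out: drop rows whose mapped symbol is duplicated.
keep <- !(mapped %in% mapped[duplicated(mapped)])
print(input_symbols[keep])
```

If this is the mechanism, removing the affected symbols from the input before deconvolution should avoid the error.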

@suhuanhou

rownames(df_bulk) <- toupper(rownames(df_bulk))

I had the same problem; converting the gene names to upper case fixed it. It's probably related to the format of human gene symbols.
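
One caveat when upper-casing: symbols that differ only by case become identical, which would itself create duplicate row names. A minimal check on made-up symbols, before assigning the result back:

```r
# Toy symbols; "CD4" and "cd4" differ only by case (hypothetical input).
symbols <- c("Cd8a", "CD4", "Foxp3", "cd4")

upper <- toupper(symbols)

# Upper-cased symbols that occur more than once.
collisions <- unique(upper[duplicated(upper)])
print(collisions)

# Only assign the upper-cased names when no collisions were found, e.g.
#   if (length(collisions) == 0) rownames(df_bulk) <- upper
```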
