Hi all, I have a conceptual comment/question regarding CellRank's current handling of aggregated terminal states. Say I compute the macrostates A, B, C, D, E, and I want to aggregate them into terminal states 1: A and B, 2: C and D, 3: E. This can be done conveniently with the `set_terminal_states_from_macrostates` method. Now, I think that under the hood, this method selects the 30 most confidently assigned cells for each aggregated terminal state and uses these to compute fate probabilities when I call `g.compute_absorption_probabilities`. I don't think this is really the intended behavior: if macrostate A is strongly dominant, then aggregated terminal state 1 will consist almost exclusively of A cells and won't really represent the combination of A and B. The same holds for the fate probabilities: they won't be representative of the aggregated terminal state, but of whichever individual macrostate is dominant. I have only observed this behavior in one data example so far, but I find it a bit troubling.
What might work better is to keep all 30 cells from both A and B, so that 60 cells represent terminal state 1, and to do the same for all aggregated terminal states. What do you think, @michalk8 @WeilerP? An alternative would be to randomly sample from these 60 cells until we have 30, but I'm not sure that's what we want.
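To make the concern concrete, here is a minimal NumPy-only sketch of the two selection strategies on toy data. It does not call CellRank and is not its implementation: the helper names (`top_n_cells`, `aggregate_by_sum`, `aggregate_by_union`), the toy membership matrix, and the assumption that aggregation sums the membership columns before picking the top cells are all hypothetical and only meant to illustrate the behavior I suspect.

```python
import numpy as np


def top_n_cells(membership, n):
    """Return the indices of the n cells with the highest membership."""
    return np.argsort(membership)[::-1][:n]


def aggregate_by_sum(memberships, cols, n):
    """Assumed current behavior: sum the columns, then keep n cells overall."""
    return top_n_cells(memberships[:, cols].sum(axis=1), n)


def aggregate_by_union(memberships, cols, n):
    """Proposed behavior: keep n cells per macrostate and take the union."""
    picks = [top_n_cells(memberships[:, c], n) for c in cols]
    return np.unique(np.concatenate(picks))


# Toy data: 200 cells, 2 macrostates (column 0 = A, column 1 = B).
# A is dominant: its cells have clearly higher membership than B's cells.
memberships = np.zeros((200, 2))
memberships[:100, 0] = np.linspace(0.99, 0.80, 100)    # cells 0..99 belong to A
memberships[100:150, 1] = np.linspace(0.60, 0.40, 50)  # cells 100..149 belong to B

sum_pick = aggregate_by_sum(memberships, [0, 1], n=30)
union_pick = aggregate_by_union(memberships, [0, 1], n=30)

# Count how many of the selected cells come from macrostate B (index >= 100).
print("sum strategy:   total", len(sum_pick), "from B", int((sum_pick >= 100).sum()))
print("union strategy: total", len(union_pick), "from B", int((union_pick >= 100).sum()))
```

In this toy setting the summed selection contains no B cells at all, while the union keeps 30 cells from each macrostate, which is the kind of representation I would expect from an aggregated terminal state.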
After aggregating with `g.set_terminal_states_from_macrostates(names=['Excretory_gland, AMso', 'ASH, AWC', 'RIM, SIB, AVK'])`, I get the following terminal states:
I would argue that these cells are not fully representative of the terminal states I would like to use.