Refactoring ideas for log_rewards (#200)
Comments
IMHO, solution 2 seems more reasonable to me. I think the possible issue can be resolved by removing `log_reward` from `State` with further modifications (I've quickly checked, and it seems not very tricky).
The reason we have `log_rewards` in these containers is to prevent re-computing it. The log reward can be very expensive to recompute - or very cheap. It completely depends on the environment.

I'm open to refactoring this somehow (not sure how atm), but we must cache these results somewhere - it's very inefficient not to - and I think where it is currently is a decent spot.
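For illustration, a minimal sketch of the caching pattern being described (the class and attribute names are hypothetical simplifications, not the actual torchgfn code; `env.log_reward` is the environment call mentioned later in this thread):

```python
class Trajectories:
    """Minimal sketch of the caching idea: compute the log rewards once,
    store them on the container, and reuse them on every later access."""

    def __init__(self, env, states, log_rewards=None):
        self.env = env
        self.states = states
        self._log_rewards = log_rewards  # None means "not computed yet"

    @property
    def log_rewards(self):
        # hit the (possibly very expensive) environment reward at most once
        if self._log_rewards is None:
            self._log_rewards = self.env.log_reward(self.states)
        return self._log_rewards
```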
Joseph Viviano
@josephdviviano <https://twitter.com/josephdviviano>
viviano.ca
I investigated it further, and it seems torchgfn/src/gfn/containers/trajectories.py, lines 94 to 98 (at commit 9c9e1af), ensures `log_rewards` is always not None. So, the computation here is never triggered: torchgfn/src/gfn/containers/trajectories.py, lines 155 to 165 (at commit 9c9e1af).

I checked that this is the case in this commit (tests run correctly): https://github.com/younik/torchgfn/tree/test-log-rewards-comp

This makes it easy to implement solution 2 and directly remove the env dependency. It also allows for a bunch of code cleanup (in several places we check whether `log_rewards` is None).
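Roughly, the situation described above, as a simplified sketch (hypothetical names and mechanics; the real logic lives in trajectories.py at the cited lines):

```python
class Trajectories:
    """Simplified sketch of the observation above: the constructor
    guarantees that the cached log_rewards is never None."""

    def __init__(self, states, log_rewards):
        # analogue of lines 94 to 98: every construction path supplies a
        # real tensor, so the cache below is never None afterwards
        assert log_rewards is not None
        self.states = states
        self._log_rewards = log_rewards

    @property
    def log_rewards(self):
        # analogue of lines 155 to 165: given the assert above, this
        # branch is dead code, and removing it removes the only place
        # the container would need access to the environment
        if self._log_rewards is None:
            raise AssertionError("unreachable: recomputation never triggers")
        return self._log_rewards
```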
Hi @younik, I think the easiest fix is to replace line 157 with the appropriate check (checking whether the log rewards tensor is empty, rather than whether it is None). I'm curious what you think?
I believe the semantics of empty should be "the trajectory is empty", and it shouldn't happen that we have n states with an empty log reward tensor. To indicate that something must be computed, it is better to use `None`. However, it looks like we don't need line 158: `assert self._log_rewards.shape == (self.n_trajectories,)`.
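A small illustration of the distinction (plain PyTorch, not repository code):

```python
import torch

# "Empty" should mean "there are no trajectories", not "not yet computed":
no_trajectories = torch.empty(0)   # shape (0,): zero trajectories
assert no_trajectories.shape == (0,)

# Using None as the "must be computed" marker keeps the two cases distinct,
# and any cached tensor can then satisfy the shape invariant quoted above:
n_trajectories = 4
cached = torch.zeros(n_trajectories)
assert cached.shape == (n_trajectories,)
```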
Yes, I agree with this. Sorry for the lag in my reply.
Computing `log_rewards` requires access to the environment. However, `Transitions`, `Trajectories` and `States` all provide `log_rewards`, which creates a complicated dependency among these classes. I propose two solutions:

1. Remove `log_rewards` from `Transitions`, `Trajectories` and `States`. I suppose `log_rewards` is only needed in the `GFlowNet` classes, so we can compute it directly there. The only exception is `PrioritizedReplayBuffer`, to which we can add a scoring-function attribute or a score for each added object. This solution has the drawback of removing the caching mechanism (is `log_rewards` computed multiple times on the same object? Is it a heavy computation?)

2. Compute `log_rewards` at the initialization of `Transitions`, `Trajectories` and `States`, without accepting None. This is problematic for `States`, as `env.log_reward` works on states, making it a chicken-and-egg problem. (See the sketch after this list.)
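A minimal sketch of what solution 2 could look like (hypothetical, simplified names; `env.log_reward` as referenced above, not a definitive implementation):

```python
import torch

class Transitions:
    """Hypothetical sketch of solution 2: resolve log_rewards eagerly at
    construction time, never storing None."""

    def __init__(self, env, states, log_rewards=None):
        if log_rewards is None:
            # computed exactly once, at construction; after this line the
            # container never needs the environment again
            log_rewards = env.log_reward(states)
        assert isinstance(log_rewards, torch.Tensor)
        self.states = states
        self._log_rewards = log_rewards

    @property
    def log_rewards(self):
        return self._log_rewards  # always a tensor, no None-checks needed

# States itself cannot follow this recipe: env.log_reward expects an
# already-built States object, which is the chicken-and-egg problem noted
# in the issue text.
```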