fixed entropy for multi-discrete action spaces
MarcoMeter committed Jun 25, 2024
1 parent 3297da4 commit 8ff1edf
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion cleanrl/ppo_trxl/ppo_trxl.py
@@ -315,7 +315,7 @@ def get_action_and_value(self, x, memory, memory_mask, memory_indices, action=No
         log_probs = []
         for i, dist in enumerate(probs):
             log_probs.append(dist.log_prob(action[:, i]))
-        entropies = torch.stack([dist.entropy() for dist in probs], dim=1)
+        entropies = torch.stack([dist.entropy() for dist in probs], dim=1).sum(1).reshape(-1)
         return action, torch.stack(log_probs, dim=1), entropies, self.critic(x).flatten(), memory
 
     def reconstruct_observation(self):
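
Note on the change: the commit does not state its reasoning, so the following is a sketch of one plausible reading, not code from the repository. Assuming the branches of a multi-discrete action are sampled independently, the entropy of the joint action equals the sum of the per-branch entropies, so summing over the branch dimension gives one entropy value per batch element instead of one per branch. The sketch below checks this in plain Python; the `batch` values and the helper names `entropy` and `joint_entropy` are hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def entropy(p):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def joint_entropy(branches):
    """Entropy of the joint over independent branches, by brute-force enumeration."""
    total = 0.0
    for combo in product(*branches):
        p = math.prod(combo)  # probability of this combination of branch choices
        if p > 0.0:
            total -= p * math.log(p)
    return total


# Hypothetical batch of 2 samples, each with 2 action branches (3 and 2 choices).
batch = [
    [[0.2, 0.3, 0.5], [0.5, 0.5]],
    [[1.0, 0.0, 0.0], [0.9, 0.1]],
]

# Before the change: one entropy per branch, i.e. shape (batch, n_branches).
per_branch = [[entropy(p) for p in sample] for sample in batch]

# After the change: summed over branches, i.e. shape (batch,),
# analogous to .sum(1).reshape(-1) in the diff.
summed = [sum(row) for row in per_branch]

for sample, s in zip(batch, summed):
    # The summed value matches the joint entropy up to floating-point error.
    assert abs(s - joint_entropy(sample)) < 1e-9
```

With the summed form, the entropy term has the same length as the batch, whatever the number of action branches.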