Dependent Actions in MultiDiscrete Action Space #242
Comments
Hello, Otherwise, maybe @vwxyzjn has an idea?
Hi, have you found the solution? I have the same problem.
Hi, we faced a similar situation in our use case. Maybe someone should start looking into this. I'm also open to collaborating on it.
I directly modified the
Any chance @bbarisbaturay could share a code snippet of the dependent action masking process?
I need the same thing!
❓ Question
I'm currently working on a project with my team, developing a MaskablePPO reinforcement learning model with a MultiDiscrete action space. Because our action space is very large, we wanted to split it into independent sub-actions. However, we've run into a challenge: our model needs action masking that handles dependent actions. That is, selecting one sub-action may invalidate certain values of the other sub-actions, which our current setup doesn't support because it only allows independent, per-dimension masking.

In short, with a Discrete action space we can handle these dependencies, but the action space becomes too large to manage. With a MultiDiscrete action space we can't express them. For example, if I choose 1 as my first action ([1, _, _]), the second and third actions should only be allowed to be 2 or 3. So [1, 2, 3] is valid, while [1, 1, 3] is not and should be masked.
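To make the constraint concrete, here is a minimal NumPy-only sketch of the behaviour we are after. It is not MaskablePPO code and does not plug into sb3-contrib; the helper names (`conditional_mask`, `is_valid`, `sample_valid_action`) and the sizes are hypothetical. It uses 0-based indices, so the example [1, 2, 3] above corresponds to [0, 1, 2] here. The mask for each sub-action is recomputed from the sub-actions already chosen, which is exactly what a single mask fixed before sampling cannot do.

```python
import numpy as np

N_DIMS = 3     # number of sub-actions (assumed for illustration)
N_CHOICES = 3  # choices per sub-action (assumed for illustration)


def conditional_mask(chosen, n_choices=N_CHOICES):
    """Boolean mask for the next sub-action: already chosen values are invalid."""
    mask = np.ones(n_choices, dtype=bool)
    for value in chosen:
        mask[value] = False
    return mask


def is_valid(action):
    """A joint action is valid if each sub-action passes the mask implied by earlier ones."""
    return all(bool(conditional_mask(action[:i])[a]) for i, a in enumerate(action))


def sample_valid_action(rng, n_dims=N_DIMS):
    """Sample sub-actions one at a time, re-masking after every choice."""
    chosen = []
    for _ in range(n_dims):
        allowed = np.flatnonzero(conditional_mask(chosen))
        chosen.append(int(rng.choice(allowed)))
    return chosen


rng = np.random.default_rng(0)
print(is_valid([0, 1, 2]))       # the [1, 2, 3] case from the text
print(is_valid([0, 0, 2]))       # the [1, 1, 3] case from the text
print(sample_valid_action(rng))  # always a valid joint action
```

Doing the same inside a policy would presumably require sampling the sub-actions sequentially and masking the logits of each one after the previous choice, rather than supplying all per-dimension masks up front.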
I'm keen to find efficient solutions or workarounds for this issue. I'd appreciate any suggestions or advice. I'm also open to discussing this further if you're interested in collaborating. Thank you for your time!