MCTS
Peter Shih edited this page Jun 19, 2017 · 7 revisions
- Requirements
- Chance nodes
- Two different random outcomes might lead to identical states
- So we don't want to create two separate tree nodes for them in MCTS
- Share nodes
- Playing A then B leads to the same state as playing B then A
- So we need to find identical states within MCTS
- Hidden information
- The cards in the opponent's hand are hidden
- Partially-observable moves
- Play secret cards
- Chance nodes
- Conclusions
- Find identical states
- A hash table to quickly look things up
- Look up this table to find an identical state
- Information set
- Use two MCTS trees to track the two players' points of view
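The hash-table lookup described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `TranspositionTable`, `NodeStats`, and `state_key` are hypothetical names, and the toy state (a tuple of card names on the board) is an assumption chosen only to show that two play orders map to one node.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Tuple


@dataclass
class NodeStats:
    visits: int = 0
    total_reward: float = 0.0


class TranspositionTable:
    """Maps a hashable state key to the single node that represents it."""

    def __init__(self) -> None:
        self._nodes: Dict[Hashable, NodeStats] = {}

    def get_or_create(self, key: Hashable) -> NodeStats:
        # Reuse the existing node if an identical state was seen before.
        node = self._nodes.get(key)
        if node is None:
            node = NodeStats()
            self._nodes[key] = node
        return node

    def __len__(self) -> int:
        return len(self._nodes)


def state_key(board: Tuple[str, ...]) -> Hashable:
    # Hypothetical toy key: the cards on the board, ignoring play order.
    return tuple(sorted(board))


table = TranspositionTable()
node_ab = table.get_or_create(state_key(("A", "B")))  # play A, then B
node_ba = table.get_or_create(state_key(("B", "A")))  # play B, then A
```

Here `node_ab` and `node_ba` are the same object, so statistics gathered along either play order accumulate in one place. To track two points of view, one such table could be kept per player, each keyed only by what that player can observe.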
- Find identical states
- Idea: Do NOT construct a tree
- [BAD IDEA?] The path encodes the game play history
- If all the nodes are flattened, the only thing we care about is the current state of the board.
- This might be fine for aggressive decks, but not for mid-range or control decks
- Construct tree nodes
- Each node has values to be used in the MCTS-Selection phase
- Reward
- Visit count
- There's no edge between them
- Because we need to look for an identical state after each action is applied
- Each node corresponds to an information set
- [BAD IDEA?] The path encodes the game play history
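A selection step over nodes that have no edges between them might look like the sketch below. It assumes UCB1 as the selection rule (the page does not say which rule is used), and `legal_actions`, `apply_action`, and `key_of` are hypothetical callbacks supplied by the game engine. Each candidate action is applied, and the resulting state is looked up in a shared table rather than followed through a stored child pointer.

```python
import math
from typing import Any, Callable, Dict, Hashable, Iterable


class Node:
    def __init__(self) -> None:
        self.visits = 0     # visit count
        self.reward = 0.0   # accumulated reward


def ucb1(child: Node, parent_visits: int, c: float = 1.4) -> float:
    # Unvisited states are tried first.
    if child.visits == 0:
        return math.inf
    mean = child.reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_action(
    state: Any,
    nodes: Dict[Hashable, Node],
    legal_actions: Callable[[Any], Iterable[Any]],
    apply_action: Callable[[Any, Any], Any],
    key_of: Callable[[Any], Hashable],
) -> Any:
    parent = nodes.setdefault(key_of(state), Node())
    best_action, best_score = None, -math.inf
    for action in legal_actions(state):
        # No stored edge: apply the action, then find the node by state key.
        next_state = apply_action(state, action)
        child = nodes.setdefault(key_of(next_state), Node())
        score = ucb1(child, max(parent.visits, 1))
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

One consequence of sharing nodes this way is that a child's visit count can exceed its parent's, because the child may also be reached from other states.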
- Improvements
- RAVE
- Detect identical nodes in the game tree
- Use a hash table to detect an identical node, and jump to that node during traversal
- Implementation detail
- Information on one node:
- Total simulations passing through this node
- Total wins of all simulations passing through this node
- The AMAF value
- Maybe we can store this in another big table
- So we don't need to update so many nodes in back-propagation
- The RAVE weight
- Information on one edge
- Total number of playouts that chose this action
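The per-node and per-edge statistics listed above could be laid out roughly as below. This is a sketch under assumptions: the field names are hypothetical, and the RAVE weight uses one commonly cited schedule, `beta = sqrt(k / (3 * n + k))` with an equivalence parameter `k`, which the page itself does not specify.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Hashable


@dataclass
class EdgeStats:
    playouts: int = 0  # total playouts that chose this action


@dataclass
class NodeStats:
    simulations: int = 0       # simulations passing through this node
    wins: float = 0.0          # wins among those simulations
    amaf_simulations: int = 0  # all-moves-as-first sample count
    amaf_wins: float = 0.0     # all-moves-as-first wins
    edges: Dict[Hashable, EdgeStats] = field(default_factory=dict)

    def rave_weight(self, k: float = 1000.0) -> float:
        # Assumed schedule: weight starts at 1 and decays as simulations grow.
        return math.sqrt(k / (3.0 * self.simulations + k))

    def blended_value(self, k: float = 1000.0) -> float:
        # Mix the Monte Carlo mean with the AMAF mean using the RAVE weight.
        mc = self.wins / self.simulations if self.simulations else 0.0
        amaf = (
            self.amaf_wins / self.amaf_simulations
            if self.amaf_simulations
            else 0.0
        )
        beta = self.rave_weight(k)
        return (1.0 - beta) * mc + beta * amaf


node = NodeStats(simulations=1000, wins=600.0,
                 amaf_simulations=100, amaf_wins=40.0)
node.edges.setdefault("play_card_A", EdgeStats()).playouts += 1
value = node.blended_value(k=1000.0)
```

If the AMAF values were moved to a separate table, as the list suggests, `amaf_simulations` and `amaf_wins` would become entries in that table instead of fields on each node.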
- Information on one node: