VK Dice By: Kenny Gutierrez, Vyvy Pham Mentors: Sarah Eichhorn, Robert Campbell.
-
Upload
amos-obrien -
Category
Documents
-
view
218 -
download
0
Transcript of VK Dice By: Kenny Gutierrez, Vyvy Pham Mentors: Sarah Eichhorn, Robert Campbell.
Rules
Variation of the game Sequences
On each turn, a player rolls 6 dice
Player is given option to reroll once but all 1s must be kept
Larger sequences are worth more points
Three or more 1s = Restart Score
Repeats of a certain number is counted once
Winner is the first player to reach 100
Scoring
1 – 5 points
1,2 –10 points
1,2,3 –15 points
1,2,3,4 –20 points
1,2,3,4,5 –25 points
1,2,3,4,5,6 –35 points
Objective
Optimal StrategyWhen to RerollWhich dice to keep/reroll
Computer Adaptive Learning Program simulate one million rolls for each run. Programmed to run 5 times simultaneouslyDetermined which actions repeated most frequently
for all game states
Repeated actions of the computer are compared to the Expected Values for each game state.
Description
6^6= 46,656 game states 462 don’t include
repetition
Different states are grouped into sections according to the same numbers, regardless of repetition
[2, 4, 4, 5, 5, 6][2, 4, 4, 5, 6, 6][2, 2, 4, 4, 5, 6][2, 2, 4, 5, 5, 6][2, 2, 4, 5, 6, 6]
[2, 4, 5, 5, 5, 6][2, 4, 5, 5, 6, 6][2, 4, 5, 6, 6, 6][2, 4, 4, 4, 5, 6][2, 2, 2, 4, 5, 6]
[1, 2, 2, 2, 2, 6][1, 2, 2, 2, 6, 6][1, 2, 2, 6, 6, 6][1, 2, 6, 6, 6, 6]
[1, 2, 2, 2, 3, 4][1, 2, 3, 3, 4, 4][1, 2, 2, 3, 3, 4]
[1, 2, 2, 3, 4, 4][1, 2, 3, 3, 3, 4][1, 2, 3, 4, 4, 4]
Probability for Game States
Sections are further divided into: one 1, two 1s, no 1s
The probability is the same within each section
Probability is calculated for every reroll option.Game States without 1s (Reroll 1-6 Dice)Game States with one 1 (Reroll 1-5 Dice)Game States with two 1s (Reroll 1-4 Dice)
Cases For Each Reroll
Reroll 6 Dice (No 1s)
-Get 1,2,3,4,5,6,
-Get 1, 2,3,4,5 not 6
-Get 1,2,3,4 not 5, not 1,1,1
-Get 1,2,3 not 4, not 1,1,1
-Get 1,2 not 3, not 1,1,1
-Get 1 not 2, not 1,1,1
Reroll 5 Dice (one 1)
-Get 2,3,4,5,6
-Get 2,3,4,5 not 6
-Get 2,3,4 not 5, not 1,1,1
-Get 2,3 not 4, not 1,1,1
Get 2, not 3, not 1,1,1
Ex. Of Calculating ProbabilityReroll 5 for Initial Game States Without 1s
Case A: Get 1,3,4,5,6 5! (1/6)5 Probability= 5/324= 1.54%Case B: Get 1,3,4,5 not 6 *1 3 4 5 (4) = 5! = 120*1 3 4 5 (1 or 3 or 4 or 5) = 4(5!/2!) = 240Probability= 350/6^5 = 5/108 = 4.63%Case C: Get 1,3,4 not 5, not 111 *1 3 4 (2,2) or (6,6) = 2(5!/2!) = 120 * 1 3 4 (2 and 6) = 5! = 120 *1 3 4 (2 or 6) (1 or 3 or 4) = 6(5!/2) = 360 *1 3 4 (4,4) or (3,3) = 2(5!/3!) = 40 *1 3 4 (1,3) or (1,4) or (4,3) = 3(5!/(2! 2!)) = 90 Probability= 730/6^5 = 365/3888 = 9.39%Case D: Get 1, 3, not 4, not 1,1,1 *1 3 (2,2,2) or (5,5,5) or (6,6,6) = 3(5!/3!) = 60 *1 3 (2,5,6) = 5! = 120 *1 3 (2,2) or (5,5) or (6,6) (Different: 2 or 5 or 6) = 6(5!/2!) = 360 *1 3 (2,2) or (5,5) or (6,6) (1 or 3) = 6(5!/(2! 2!)) = 180
*1 3 (3,3,3) = 5!/4! = 5 *1 3 (3,3) (2 or 5 or 6) = 3(5!/3!) = 60 *1 3 (1,3) (2 or 5 or 6) = 3(5!/4) = 90 *1 3 (2,2) or (5,6) or (4,6) (1 or 3) = 6(5!/2!) = 360 *1 3 (1,3,3) = 5! (3! 2!) = 10 Probability = 1245/6^5 = 415/2592 = 16.0%
Case E: Get 1, not 3, not 1,1,1 *1 (2,2,2,2) or (4,4,4,4) or (5,5,5,5) or (6,6,6,6) = 4(5!/(2! 2!)) = 20 *1 (2,2,2) or (4,4,4) or (5,5,5) or (6,6,6) (Different: 2,4,5,6) = 12(5/3!) = 240 *1 (2,4,5,6) = 5! = 120 *1 (4,4) or (5,5) or (6,6) or (2,2) Differ: (6,2)(6,4)(4,5)(4,2)(6,5)(5,2)) = 12(5!/2!) = 720 *1 (4,4) or (5,5) or (6,6) or (2,2) Differ: (4,4) (5,5) (6,6) (2,2) = 12(5!/(2! 2!)) = 360 *1 (1) (4,4,4) or (5,5,5) or (6,6,6) or (2,2,2) = 4(5!/(3! 2!)) = 40 *1 (1) (2,2) or (4,4) or (5,5) or (6,6) Differ(2,4,5,6)= 12(5!/ (2! 2!)) = 360 *1 (1) (2,4,5) or (2,4,6) or (4,5,6) = 3(5!/2!) = 180 Probability = 2040/65 = 85/324 = 26.23%
Finding Expected ValuesSum of all possible values each multiplied by the
probability of its occurrence
Example: [1,2,3,4,4,5]Reroll 0 25.00
Reroll 1Keep:2,3,4,5
26.6667
Reroll 2Keep:2,3,4
21.5278
Reroll 3Keep:2,3 16.9676
Reroll 4Keep:2
12.7199
Reroll 5Reroll all 9.0766
Reroll 0 25
Reroll 1 ((5/6)*(25)+(1/6)*(35))
Reroll 2
((1-(1/18+1/4+1/36))*(20)+(1/18)*(35)+(1/4
)*(25))
Reroll 3
((1-(1/36+1/9+61/216+2/27))*(15)+(1/36)*(
35)+(1/9)*(25)+(61/216)*(20))
Reroll 4
((1-(1/54+7/108+91/648+311/1296+19/144)
)*(10)+ (1/54)*(35)+(7/108)*(25)+(91/648)*(20)
+(311/1296)*(15))
Reroll 5
((1-(5/324+5/108+25/324+95/648+29/144+
1466 / 7776))*(5)+(5/324)*(35)+(5/108)*(25)+(25/324)*(20)+(95/648)*(15)+(29/144)*
(10))Legend Pink: Probability of each cases multiplied by scoreYellow: Probability of getting the same score and not 1,1,1
Inside the Program Runs five times
Each Run 1,000,000 dice rolls Prints computer’s
actions for all game states
Learns based on result of each roll through reward and punish system
How Does It Work?
Set of six numbers for each initial game state. Each number pertains to one of the six dice Initially, each number in the list contains 50
Program generates random number between 1-100 for each number.
In order to reroll a die, the random number must be between the range 1-(list of number)
Ex
Game State: [1, 2, 3, 5, 5, 6]
List: [55, -10, 32, 87, 98, 103]
Random #s: [60, 39, 47, 37, 12, 18]
Action: [Keep, Keep, Keep, Reroll, Reroll, Reroll]
Rewarding & Punishing
Reward: Certain number of points based on
score after re-roll
IF final score > initial score Increase probability of repeating
that action by either adding or subtracting Adds when the computer
rerolled, subtracts when it kept dice
Punish: Only when the re-rolls end with at
least three 1s Decrease probability to avoid
that action by either adding or subtracting Adds when the computer
kept dice, subtracts when it rerolled.
TABLE FOR PUNISH & REWARDReward & Punish
5 - ±1
10 - ±2
15 - ±3
20 - ± 4
25 - ±5
35 - ±7
1,1,1 - ±5
Rewarding & Punishing
Computer will never reroll 1, regardless of the number in the list
After subtracting and adding to each list, the numbers will eventually go into the negatives or above 100Negatives= Always Keep (N)Over 100= Always Reroll (Y)
Between 1-100Undetermined (U)
Best Move Mechanic
Mechanism implemented to help computer learn the optimal strategy
Before keeping a die, the computer checks if there is a better option
Ex. [ 1, 2, 3, 5, 6, 6] If it wants to keep two 6s, it will change to keep 2 and 3.
Comparing Program with Theoretical Probability
Examples of each Initial Game States: Without any 1sWith one 1With two 1s
Adaptive learning program- used the actions of the dice most common out of the five runs
Initial State without 1s
Theoretical Expected Values
Optimal Move: Reroll 3 dice; Keeping 2,3,4
Adaptive Learning Program
After 5 runs:
Most Common Move: Reroll 3 dice; Keeping 2,3,4
Example: [2, 3, 4, 4, 6, 6]
Reroll 0 0
Reroll 1Keep: 2,3,4
3.3333
Reroll 2Keep:2,3,4,6
7.5
Reroll 3Keep: 2,3,4
9.3056
Reroll 4Keep:2,3
8.9969
Reroll 5 Keep: 2
8.6002
Reroll 6Reroll All
6.1368
Conclusion: Expected Values matched EXACTLY to the Adaptive Learning Program
[2, 3, 4, 4, 6, 6]
[N, N, Y, N, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
[2, 3, 4, 4, 6, 6]
[N, N, N, Y, Y, Y]
Initial State With one 1
Theoretical Expected Values
Optimal Move: Reroll 1 Dice;
Keep 1,2,3,4,6
Adaptive Learning Program
5 Sample Runs:
Most Common Move: Reroll 2 Dice; Keeping 1,2,3,4
Example: [1, 2, 3, 3, 4, 6]
Conclusion: The expected values and the results from the program were similar. The computer chose
the 2nd best action
Reroll 0 20
Reroll 1
Keep:2,3,4,6 22.5
Reroll 2Keep:2,3,4
21.5278
Reroll 3Keep:2,3 16.9676
Reroll 4Keep:2 12.7199
Reroll 5Reroll all
9.0766
[1, 2, 3, 3, 4, 6][N, N, N, Y, N,Y][1, 2, 3, 3, 4, 6]
[N, N, Y, N, N, Y][1, 2, 3, 3, 4, 6]
[N, N, Y, N, N, Y][1, 2, 3, 3, 4, 6]
[N, N, N, Y, N, N][1, 2, 3, 3, 4, 6]
[N, N, N, Y, N, Y]
Initial State With two 1s
Theoretical Expected Values
Optimal Move: Reroll 1 Die; Keeping
1,1,2,4,5
Adaptive Learning Program
After 5 runs:
Most Common Move: Uncertain
Example: [1, 1, 2, 4, 5, 5]
Conclusion: There is a high probability of rerolling a 1 so the move is undetermined and needs more
runs
Reroll 0 10.00
Reroll 1Keep:2, 4, 5
10.8333
Reroll 2 Keep:2,4 9.7222
Reroll 3Keep:2 7.8935
Reroll 4Reroll All
5
[1, 1, 2, 3, 5, 5][Y, Y, N, N, Y, N][1, 1, 2, 3, 5, 5][Y, Y, U, U, U, U][1, 1, 2, 3, 5, 5][Y, Y, U, U, U, U][1, 1, 2, 3, 5, 5][U, U, N, N, Y, N][1, 1, 2, 3, 5, 5][Y, Y, N, N, U, U]
Conclusion
Expected Values were found for ALL game states
Adaptive Learning Program with 5 runs and created a list of actions for the 6 dice for every game state.
Most common move from the program were compared to the expected values for each game state
Program’s common moves were either the best or 2nd best action indicated by the expected valuesGame states with double 1s