[Leetcode] Min-max problem (Game Theory)

Min-max problem

How to tell if it’s a min-max problem:

2 players
Take turns
Both play optimally (note: optimally doesn’t mean greedy, optminally is for the final score, not current score)
Use recursion or DP (for optimization)

Since the search space is huge O(d^b), we need to use memoization. Try to find the maximum relative score given a start position.

Time complexity: O(|problems| * |actions|)
Problems: ths start position: O(m)
Actions: take 1 or 2 or 3 stones: O(1)

Time: O(n) in total
Space: O(n) -> O(1) buttom up

877. Stone Game

Leetcode: https://leetcode.com/problems/stone-game/
Video: https://www.youtube.com/watch?v=xJ1Rc30Pyes

Similar question:

Predict Winner: https://leetcode.com/problems/predict-the-winner/

maxDiff returns relative score

O(2 ^ n)

def PredictTheWinner(self, nums: List[int]) -> bool:
    def maxDiff(l, r):
        if l == r:
            return nums[l]
        selectLeft = maxDiff(l + 1, r)
        selectRight = maxDiff(l, r - 1)
        return max(nums[l] - selectLeft, nums[r] - selectRight)
    return maxDiff(0, len(nums) - 1) >= 0

With memoization: Use dp[l][r] to store the best score the current player can achieve

Time: O(n^2)
Space: O(n^2)

def PredictTheWinner(self, nums: List[int]) -> bool:
    memo = {}
    def maxScore(l, r):
        nonlocal memo
        if (l, r) in memo:
            return memo[(l, r)]
        if l == r:
            return nums[l]
        selectLeft = maxScore(l + 1, r)
        selectRight = maxScore(l, r - 1)
        ans = max(nums[l] - selectLeft, nums[r] - selectRight)
        memo[(l, r)] = ans
        return ans
    return maxScore(0, len(nums) - 1) >= 0

DP solution

dp[i][j]: best relative score of subarry s[i] ~ s[j]

def stoneGame(self, piles: List[int]) -> bool:
    n = len(piles)
    dp = [[0 for _ in range(n)] for _ in range(n)]
    for i in range(n):
        dp[i][i] = piles[i]
    for length in range(2, n + 1): # at least 2 elements
        for l in range(n - length + 1):
            r = l + length - 1
            selectLeft = piles[l] - dp[l + 1][r]
            selectRight = piles[r] - dp[l][r - 1]
            dp[l][r] = max(selectLeft, selectRight)
    return dp[0][n - 1] > 0

Optimize Space:

def stoneGame(self, piles: List[int]) -> bool:
    n = len(piles)
    dp = [0 for _ in range(n)]
    for l in range(l, n + 1):
        for i in range(n - l + 1):
            dp[i] = max(piles[i] - dp[i + 1], piles[i + l - 1] - dp[i])
    return dp[0] > 0

1140. Stone Game II

Leetcode: https://leetcode.com/problems/stone-game-ii/
Video: https://www.youtube.com/watch?v=e_FrC5xavwI

For the same rest of array and the same M, the result should be the same -> use memoization
Recursion + memoization

def stoneGameII(self, piles: List[int]) -> int:
    res = 0
    M = 1
    memo = {}
    def maxRelativeScore(index, M):
        if index >= len(piles):
            return 0
        M = min(M, len(piles))
        if (index, M) in memo:
            return memo[(index, M)]
        if index + 2 * M >= len(piles): # the current player can take all piles left
            memo[(index, M)] = sum(piles[index:])
            return sum(piles[index:])
        maxScore = -float('inf')
        curr = 0
        for take in range(1, M * 2 + 1):
            curr += piles[index + take - 1]
            maxScore = max(maxScore, curr - maxRelativeScore(index + take, max(take, M)))
        memo[(index, M)] = maxScore
        return maxScore 
    return int((sum(piles) + maxRelativeScore(0, 1)) / 2)

1406. Stone Game III

Leetcode: https://leetcode.com/problems/stone-game-iii/
Video: https://www.youtube.com/watch?v=uzfsrChj8dM

Bottum up DP
dp[i]: max relative score current player can get if the game starts at i-th stone.
dp[i] = max(sum[s[i: i + k]] - dp[i + k]) while k in (1,2,3)
Explain: if A starts at i, and dp[i] is the max score A can get if A start at i, then when we move to i - 1 and let A start from there, if A pick to only have 1 step, then B would start at dp[i], which already saved the optminal score of the player who start at dp[i]

def stoneGameIII(self, stoneValue: List[int]) -> str:
    dp = [-1001 for i in range(len(stoneValue))]
    dp[-1] = stoneValue[-1]
    for i in reversed(range(len(stoneValue) - 1)):
        curSum = 0
        for k in range(3):
            if i + k < len(stoneValue):
                curSum += stoneValue[i + k]
                if i + k + 1 == len(stoneValue):
                    dp[i] = max(dp[i], curSum)
                else:
                    dp[i] = max(dp[i], (curSum - dp[i + k + 1]))
    if dp[0] < 0:
        return "Bob"
    elif dp[0] > 0:
        return "Alice"
    else:
        return "Tie"