Machine Learning with Python Projects - Rock Paper Scissors

I am stuck at the Abbey part. I can't seem to find a solution to beat Abbey. Please help. I cleared the remaining three with a 99% win rate but cannot find a solution for Abbey.

def player(prev_play, count=[0], opponent_history=[], mostcommon="P",
           predict=["R", "R", "R"], predictkris=["R", "R", "R"],
           numgames=[0], play_order=[{
               "RR": 0, "RP": 0, "RS": 0,
               "PR": 0, "PP": 0, "PS": 0,
               "SR": 0, "SP": 0, "SS": 0,
           }]):

  # record the opponent's last move so the history actually grows
  if prev_play:
    opponent_history.append(prev_play)

  # guess = random.choice(play)
  guess = "R"  # default until enough history is collected

  if len(opponent_history) > 3:
    if opponent_history[1] == "R" and opponent_history[2] == "P" and opponent_history[3] == "P":
      choices = ["S", "R", "P", "P", "S"]
      guess = choices[count[0] % len(choices)]
    elif opponent_history[1] == "R" and opponent_history[2] == "R" and opponent_history[3] == "P":
      last_ten = predict[-10:]
      most_frequent = max(set(last_ten), key=last_ten.count)
      if most_frequent == '':
        most_frequent = "P"
      # note: this branch never sets guess, so the default "R" is used
    elif opponent_history[1] == "P" and opponent_history[2] == "P" and opponent_history[3] == "P":
      if not prev_play:
        prev_play = 'R'

      last_two = "".join(opponent_history[-2:])
      if len(last_two) == 2:
        play_order[0][last_two] += 1

      potential_plays = [
          prev_play + "R",
          prev_play + "P",
          prev_play + "S",
      ]
      sub_order = {
          k: play_order[0][k]
          for k in potential_plays if k in play_order[0]
      }
      prediction = max(sub_order, key=sub_order.get)[-1:]

      ideal_response = {'P': 'S', 'R': 'P', 'S': 'R'}
      guess = ideal_response[prediction]

  return guess

Challenge: Rock Paper Scissors


Have exactly the same problem :frowning:
Actually, now I found a solution: Use exactly the same strategy as Abbey, but look back longer than she does :wink:


Hi @Mikw. I have the same problem and tried to copy Abbey's strategy, but it doesn't seem to be working. No matter how I tweak it, it's either a 100% win for her or 30% across the board. Could you please share your code for this part? Thank you.

Unfortunately I don’t seem to have saved a copy :frowning:
But basically you use the last n moves to predict what Abbey will do next. The important step here is to let your programme dynamically build up, on the fly, a list of combinations containing the last n moves + entry n+1 (Abbey's reaction) and their counts. So you start with an empty list, and every time new data rolls in, you check if you already have this entry in the list: if yes, increase its count by 1; otherwise set it to 1.
Say we have the following example (n=2 for simplicity; to beat Abbey you'll need to increase n): Incoming data: [P,P,R,S,P,P,R…]
Initially the list is empty.
When we have [P,P,R],(length=n+1) we enter ‘PPR’ = 1 (elements 0 to n of incoming data) into our list
Then ‘PRS’ = 1 (elements 1 to n+1 of incoming data)
Then ‘RSP’ = 1 (elements 2 to n+2 of incoming data)
Then ‘SPP’ = 1
Then we see, we already have ‘PPR’ in our list, so we increase it to 2…
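The counting scheme above can be sketched in a few lines. This is a minimal illustration, not code from the thread; the names `build_counts`, `history`, and `counts` are my own, and a real solution would update the counts incrementally as each move arrives rather than rescanning the history.

```python
from collections import defaultdict

def build_counts(history, n=2):
    # Count every (n+1)-gram: the last n moves plus the move that followed them.
    counts = defaultdict(int)
    for i in range(len(history) - n):
        key = "".join(history[i:i + n + 1])
        counts[key] += 1
    return counts

history = ["P", "P", "R", "S", "P", "P", "R"]
counts = build_counts(history)
# "PPR" appears twice in the stream, matching the walkthrough above
```

To predict the next move, you would then look up which of `inp+"R"`, `inp+"P"`, `inp+"S"` has the highest count for the current last-n string `inp`.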
Hope this helps, otherwise please feel free to ask! Sorry, I don’t seem to have the code anymore :frowning:

Hey farhad, very good news: I was able to access my code again, so here’s my solution. If you have any questions, please don’t hesitate to ask!

wtf = {}

def player(prev_play, opponent_history=[]):
  global wtf

  n = 5

  if prev_play in ["R","P","S"]:
    opponent_history.append(prev_play)

  guess = "R" # default, until statistic kicks in

  if len(opponent_history)>n:
    inp = "".join(opponent_history[-n:])

    # count the (n+1)-gram we just completed: if seen before, increment, else set to 1
    seq = "".join(opponent_history[-(n+1):])
    if seq in wtf.keys():
      wtf[seq] += 1
    else:
      wtf[seq] = 1

    possible =[inp+"R", inp+"P", inp+"S"]

    for i in possible:
      if not i in wtf.keys():
        wtf[i] = 0

    predict = max(possible, key=lambda key: wtf[key])

    if predict[-1] == "P":
      guess = "S"
    if predict[-1] == "R":
      guess = "P"
    if predict[-1] == "S":
      guess = "R"

  return guess

That was brilliant, @Mikw. I was trying to come up with four different strategies with four if clauses. Thank you :wink:

I think you're supposed to use Q-learning to beat it. It took me about 20 tries to find a parameter set for alpha and gamma that worked. And I used double Q-learning, not just Q-learning.
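For reference, a tabular double Q-learning update looks roughly like the sketch below. This is not the poster's code; the state/action encoding, the `update` helper, and the defaults for `alpha` and `gamma` (the parameters the poster mentions tuning) are all illustrative assumptions. The state would be something like the last few opponent moves, and the reward +1/0/-1 for win/tie/loss.

```python
import random
from collections import defaultdict

ACTIONS = ["R", "P", "S"]

def update(QA, QB, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    # Double Q-learning: randomly pick one table to update, selecting the
    # best next action with that table but evaluating it with the other one.
    if random.random() < 0.5:
        best = max(ACTIONS, key=lambda a: QA[(next_state, a)])
        QA[(state, action)] += alpha * (
            reward + gamma * QB[(next_state, best)] - QA[(state, action)])
    else:
        best = max(ACTIONS, key=lambda a: QB[(next_state, a)])
        QB[(state, action)] += alpha * (
            reward + gamma * QA[(next_state, best)] - QB[(state, action)])

QA = defaultdict(float)
QB = defaultdict(float)
# after a win (+1) in state "PP" playing "R", transitioning to state "PR":
update(QA, QB, "PP", "R", 1.0, "PR")
```

Splitting the estimate across two tables reduces the maximization bias of plain Q-learning, which may be why the double variant converged where the single one didn't.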

Really elegant solution; I modified the value of n to get a more stable result.
Thank you for sharing.