Understanding 'Abbey' and Markov chains

I’m trying to wrap my head around ML and Markov chains to make an RPS bot for the first ML project. I need examples so I tried studying Abbey’s code and I think I understand it somewhat until the sub_order variable comes into play, then I’m lost. I see that sub_order is a dictionary but I don’t understand what the code inside it does?

Just to establish a baseline, a Markov chain is a sequence of outcomes (or just things) that appear (and may be) random but we are assuming have some kind of structure of which we are not aware, hoping that if we chain enough outcomes (events, things, whatever) together that they will be able to predict the next outcome.

Abbey uses two steps to do this: the (opponent’s) previous play and the possible current plays. Her code tries to find the most likely two step response to choose the current play. So, assuming the last play was P, there are three possible current plays: PR, PP, and PS. The best play is the play that counters the most likely of these three (the most common in the history). The two step history is held and updated in play_order. The potential_plays is built each time from the last play plus the three possibilities for this play. The sub_order is a new dict built from the potential_plays that are already present in play_order, so sub_order is just the relevant part of play_order that has the history of the current potential plays.

If you edit the code for Abbey and add

    sub_order = {
        k: play_order[0][k] for k in potential_plays if k in play_order[0]
    }

    print(sub_order)

you will get output like

{'RR': 109, 'RP': 109, 'RS': 111}
{'PR': 110, 'PP': 110, 'PS': 110}
{'SR': 110, 'SP': 111, 'SS': 111}

This just makes it convenient to find the most likely two-play sequence without having to filter out the other sequences from play_order some other way.

As for a literal explanation of the code it just means create a new dict by looping over the items in potential_plays (the for k in potential_plays) if the item is already in play_order (the if k in play_order[0]), making the keys in the dict the items k (the k: ) and the values the value in play_order[0][k] (the play_order[0][k]). If you don’t like list comprehensions, you can always unroll it like

    sub_order = {} # new dict
    for play in potential_plays: # loop over plays
        if play in play_order[0]: # is the play in play_order?
            sub_order[play] = play_order[0][play] # create the entry

Once you understand the length 2 chains, try modifying the code for length 3 or length n and you can look at how the chain length affects the execution time.

1 Like

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.