A Diplomacy DeepMind?

A long time ago, I made the mistake of buying Mindscape’s Diplomacy PC game. And, yes, it was a mistake!

Now, if you’ve tried it, you probably know what I mean. From the point of view of a proper Dip game – you know, with communication – the “AI” in the game is, well, pathetic. You can tell your opponent you’re going to do one thing and then do something completely different. It’s simple: just don’t tell an opponent you’re going to attack them and you can get away with anything.

For Gunboat Diplomacy, when there are no communications allowed, I’m sure it’s OK. I don’t really know: I didn’t play it. I’m not a big fan of Gunboat Diplomacy anyway: in a game called Diplomacy, playing without Diplomacy feels like a 10% game.

No disrespect to those who love Gunboat: it’s just not my thing. If you’re wanting to learn the rules or basic tactics, it’s great. And there are people who specialise in this variant. Go on, enjoy it, it’s what it was invented for!

What the people behind DeepMind are aiming for, however, is to develop an AI that can play Diplomacy proper.

I was a little disappointed with the article I’ve linked to above, I have to say. It seems that they haven’t yet got very far. In other words, it’s a This-is-what-we’re-looking-into article rather than any news about what they’ve been able to do.

Of course, it was published over a year ago: it would be interesting (possibly) to catch up with where they are now.

We know about AIs on social media. Twitter is awash with fake accounts run by AIs. I’ve played around with a couple. It seems that the only way you can have anything like a conversation with them is if you repeat back what they’ve told you. Occasionally ask a question and you’ll get an answer… just don’t make it too complicated a question. “What’s the weather like where you are?” will allow the AI to give an answer. Open questions, though, fry the AI’s brains.

And this is the problem with an AI playing Diplomacy: it requires a nuanced approach to play the game well, to be able to read between the lines of the correspondence. Often, you have to compare what a player is telling you with what the situation is on the board.

With the previously mentioned computer game, you could have three units bordering a space occupied by the AI and tell them that you’re not going to attack that space. It would ‘believe’ you. That just isn’t good enough. In a real world game, in the same situation, your human opponent would probably end up PMPLing at such a promise.

So, what does an AI have to be able to do to play Diplomacy? According to the developers the following:

It has to “constantly reason about who to cooperate with and how to coordinate actions.”
It has to understand the “tension between cooperation and competition in Diplomacy.”
It needs to know how to “establish trust, but might also exploit that trust to mislead their co-players and gain the upper hand.”

Now, in fairness, the developers are trying to use Diplomacy to train AIs with the skills to operate and cooperate with humans and each other. Diplomacy is the tool being used for this, rather than developing an AI that is a good Dip player (although I’m sure they’ll find a way to exploit this if they can).

Diplomacy is often used as a tool for teaching and research. Teachers use it to teach history, international relations, social studies, and negotiation skills. Both Playdiplomacy and webDiplomacy allow Diplomacy to be used for school games.

It was also used as a tool to understand betrayal and treachery. The report itself is an interesting piece (probably even more interesting if you understand the metalanguage of this type of research). I’m going to write a post on it later.

It’s interesting that the big brains at DeepMind are looking at using Diplomacy for establishing cooperation in a competitive environment, then, rather than studying competition alone. This indicates something a lot of Dip players understand: that negotiation is about finding win-win solutions and establishing cooperation rather than the Trump-like negotiation of simply bullying or lying.

But will an AI ever be able to play Diplomacy well?

Perhaps the Gunboat variant. This is more a tactical game, understanding how to best move pieces on the board. AIs have proved themselves capable of this in one-on-one games such as Chess and Go. However, even in Gunboat Diplomacy, a game involves some kind of understanding of cooperation and finding ways to communicate that players want to cooperate.

The report linked to above shows just how complex it is to train an AI to play even Gunboat Dip. It’s not a one-on-one game; it involves simultaneous moves, so relies on making decisions without complete knowledge of what other players will do; and it has a huge combinatorial action space – metalanguage for the sheer number of ways the players’ orders can combine on a single turn. This last they estimate at between 10²² and 10⁶⁴ joint actions per turn, with a game-tree size of 10⁹⁰⁰. I’m no mathematician but those are pretty large numbers.
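To get a feel for where numbers like that come from, here is a rough sketch in Python. The idea is simply that every unit on the board gets its own order each turn, so the number of joint actions is the product of the options open to each unit. The 22 units is the standard opening position; the figure of 10 legal orders per unit is purely my assumption for illustration, not a number from the report, and `joint_action_count` is a made-up helper.

```python
import math


def joint_action_count(orders_per_unit):
    """Number of joint actions when every unit is given one order independently."""
    return math.prod(orders_per_unit)


# Standard opening position: 22 units on the board (Russia has 4, the rest 3 each).
# ASSUMPTION: 10 legal orders per unit is an illustrative guess, not data.
opening = [10] * 22
count = joint_action_count(opening)

# Report the size as a power of ten by counting digits.
print(f"roughly 10^{len(str(count)) - 1} joint actions on one turn")
```

Even with a modest guess per unit, the total explodes, and that is for a single turn before anyone has built a single extra army or fleet.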

Now let’s look at what else an AI has to learn to play real Dip.

Players have to work out who is going to be a good ally. Effectively they have to take into consideration how trustworthy someone is going to be in various circumstances. This can be done by analysing the board: if someone looks like they’re going to attack you, they probably will. It seems to me – a non-techy – that this is something an AI can probably calculate.

But that’s not all. The board can be deceiving. Just because someone looks as if they’re going to attack doesn’t mean they will. You have to take into account the correspondence they have with you. And this is where the other report I’ve linked to in this post comes into play: Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game.

How have the player’s communications changed? Are there any tells in their correspondence? Again, this seems as if an AI could be taught to recognise this, and even compare it to what it has read on the board.

Still, though, this isn’t the whole story. Instinct comes into it – and paranoia. Does it feel as if a stab is coming? And if it does, is this simply your paranoia – because everyone in Dip is out to get you, after all – and can you manage it?

Can an AI be taught instinct? Or is being able to calculate a large set of variables, of different types, a sufficient replacement for instinct?

I suppose one thing the AI could do is analyse past games involving players to learn what they’ve done before in similar situations. If a player proves to be someone who has an established pattern of behaviour, then the AI could learn how to predict a player’s actions in given situations.

This is indeed a good thing for a Dip player to do. I’ve done the same thing, in a way: I’ve identified players in a game that are likely to run away when things get tough, and I’ve looked at how they handle playing specific powers. But this is only useful when the data is there – and no use at all in an anonymous game!
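A toy version of that idea, sketched in Python: tally what a player did in similar situations before and guess they will do the most common thing again. Everything here is hypothetical. The player names, the situation labels and the `build_profile` and `predict` helpers are invented for illustration, and a real system would need a far richer description of a "situation" than a one-word label.

```python
from collections import Counter, defaultdict

# HYPOTHETICAL history: (player, situation, what they did). Invented data.
past_games = [
    ("Alice", "ally_exposed", "stab"),
    ("Alice", "ally_exposed", "stab"),
    ("Alice", "ally_exposed", "hold_alliance"),
    ("Bob", "losing_centres", "abandon"),
]


def build_profile(history):
    """Tally each player's past actions per situation."""
    profile = defaultdict(Counter)
    for player, situation, action in history:
        profile[(player, situation)][action] += 1
    return profile


def predict(profile, player, situation):
    """Return the player's most frequent past action, or None if there is no data."""
    seen = profile.get((player, situation))
    if not seen:
        return None  # e.g. an anonymous game: nothing to go on
    return seen.most_common(1)[0][0]


profile = build_profile(past_games)
print(predict(profile, "Alice", "ally_exposed"))
print(predict(profile, "Stranger", "ally_exposed"))
```

The second call is the anonymous-game problem in miniature: with no history for the player, the sketch has nothing to offer.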

The AI also needs to understand cooperation in a competitive environment. In other words, it needs to understand when it is better to cooperate despite the fact that the goal is to secure a solo victory.

This is confusing enough for humans. Because of this, you’ll come up against Carebears, players who play cooperatively to secure not a win but a draw. These players commonly look on achieving a draw as a “shared win”, a nonsensical concept in the Diplomacy game.

You’ll also come across Unusists, who play to solo only, seeing a draw as a failure (I briefly discussed Unusism in this post). This can lead to players abandoning any logical strategy because the chance to solo has gone; instead, they can quickly turn into Kingmakers (players who work to help another player win the game) or Abes (short for ‘abnegaters’, players who give up on the game but keep playing it).

This, in itself, is a problem for an AI, which presumably plays to get the best result possible from the game. How does the AI deal with players who simply don’t play to the objectives of the game?

2-player, non-cooperative games are easy for an AI. Multi-player, collaborative but competitive games..? Does the AI become a Carebear? Does it become a Unusist? Does it flip-flop between alliances, making it non-collaborative? This feels like it requires a greater understanding of human motivations, rather than human nature.

Finally, is a manipulative AI what we really want to design? Not to go all Terminator but do we want an AI that learns how to manipulate people? This is what we’re talking about when recognising that playing Diplomacy involves establishing trust but also using that trust for personal gain.

In a game, that’s fine. If we’re only talking about developing an AI that can play Diplomacy, go for it. If we’re talking about using learning to play Diplomacy as a tool to create an AI that can use these skills IRL, then we’re going into potentially dangerous areas.

So, OK, yeh… I went all Terminator after all.

This, though, is a real problem when designing even a Dip-playing AI. The two concepts – building trust and betrayal – are contradictory. In Diplomacy, the urge to break an alliance because it is an opportunity needs to be managed, balanced with the long-term view which suggests that maintaining the alliance is the best way forward… at least, until it isn’t.

How close are we to achieving this? I don’t know. It feels, though, as if this is a little pipedreamish. Something that is a goal too far out of reach. Is an ethical AI that betrays people possible? I guess that depends on the ethics of the human programmer or controller and that’s scary enough!

It’s an interesting idea. It feels like we’re a long way from realising it, though. And, perhaps, we should be.
