    Dota 2 – What can we learn from OpenAI?

    There is a misconception in our community about the purpose of OpenAI: many wrongly assume that it was created to play Dota. That isn’t the case. Dota is a testing ground for a team of talented developers who want to gauge how well their algorithms can learn to solve problems. Turns out, they do it pretty damn well.

    The funny thing about the whole experiment, however restricted it was in terms of game rules, is that for the most part the AI converged on the same strategies and tactics that human players have developed over the last 15 years. One hero in mid, double or triple sidelanes, rune control and so on: all of that is pretty much in line with what we as humans do when we play Dota. It looks and feels like a regular match.

    Of course, OpenAI turns things like coordination, focus fire and macro decision-making up to eleven, and it looks glorious, but this isn’t something humanly possible, and it’s not going to be the focal point of this blog post. There are other things the AI does differently: things we, as players, might want to consider and possibly incorporate into our process of learning Dota.

    Early laning

    Human players really prefer their static laning in most games. They get there, looking or hoping for a favorable matchup, maybe switch once and then mostly stand there farming, trading blows and doing regular Dota things. OpenAI disagrees with this approach.

    Rewatching the replays of both games, it is easy to see how frequently the AI switches lanes, rotating in or out to look for an edge. As soon as a hero gets to level 3, it starts looking for an opportunity to pressure and punish, and for the AI it doesn’t matter whether it is a core or a support who rotates. The only thing that matters is the success of the rotation.

    From a strictly theoretical perspective it makes a lot of sense: if you rotate a hero out of a dual lane, you still have a hero in lane to soak up XP, so you are not losing economically on the map. Killing an enemy hero is almost certainly going to yield higher returns than the cost of the TP scroll. Then, after the gank, you can rotate one or two heroes out, potentially starting another gank on the other side of the map.

    The result is not an XP-starved trilane that stayed in lane from the start, sharing XP, but rather three fully capable heroes who converge on two enemy heroes with equal farm and levels. And then, almost immediately, depending on how many resources were used, the same can be done in the second lane.
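
    To make the trade-off above concrete, here is a minimal Python sketch of the comparison. It is purely illustrative: the `Rotation` fields, the gold values and the kill chances are assumptions made up for the example, not numbers from the matches and not a description of how OpenAI actually evaluates a rotation.

```python
from dataclasses import dataclass


@dataclass
class Rotation:
    """Hypothetical inputs for one gank attempt (all values are made up)."""
    kill_chance: float   # estimated probability that the gank gets a kill
    kill_reward: float   # gold-equivalent value of the kill
    tp_cost: float       # gold spent on the TP scroll
    farm_missed: float   # gold-equivalent farm given up while rotating


def rotation_ev(r: Rotation) -> float:
    """Expected gain of rotating, relative to staying in lane (scored as 0)."""
    return r.kill_chance * r.kill_reward - r.tp_cost - r.farm_missed


def should_rotate(r: Rotation) -> bool:
    """Rotate only when the expected gain is positive."""
    return rotation_ev(r) > 0


# Two illustrative ganks that differ only in how likely the kill is.
likely = Rotation(kill_chance=0.6, kill_reward=400, tp_cost=50, farm_missed=100)
unlikely = Rotation(kill_chance=0.2, kill_reward=400, tp_cost=50, farm_missed=100)

print(rotation_ev(likely), should_rotate(likely))
print(rotation_ev(unlikely), should_rotate(unlikely))
```

    With these made-up inputs the first rotation comes out ahead and the second does not. That is the whole decision in miniature: the rotation is worth it only when the chance of a kill is high enough to pay for the scroll and the missed farm.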

    Naturally, humans are limited in this regard, as they can’t precisely calculate the chance of success and compare the expected values of rotating and not rotating, but we saw a similar approach back in 2015 from a team called CDEC and their star player, Agressif. Incidentally, his signature hero for the tournament was Gyrocopter as well, and he would also constantly rotate to maximize his hero’s efficiency. Up to a point it certainly worked: no one gets to second place at TI by chance.

    Farm Distribution

    The outcome of this chaotic laning stage is that the spread of levels and gold on the OpenAI team is a lot more even than what we are accustomed to, but the important part is that the sum is generally higher than that of the human team.

    Unfortunately, the observers didn’t show the experience advantage of the teams in the early game, but simply looking at levels alone, OpenAI always seemed to have a lead. It didn’t have “sacrificed” supports running around at level three, 10 minutes into the game, without boots. It preferred to have five capable and battle-ready heroes.

    There is a concept of “playing around a hero” in Dota: professional teams in the current meta generally prefer to have two tempo heroes who come online decently early and can make plays around the map with the help of their supports. OpenAI simply turns every hero it has into a hero that can be played around, at least until 15-20 minutes into the game, drastically increasing the number of opportunities it has.

    Naturally, over time, the inequality starts to grow and better-farming heroes start getting more resources, but it is never as drastic as with human teams. The AI doesn’t want to have a “top net worth” hero that will eventually win it the game. It simply wants to maximize its chances of winning and has decided that spreading resources is the best way to do it.

    XP priority and buybacks

    Another interesting aspect we’ve touched upon slightly is that OpenAI really likes its XP. Because of that, from a certain point in the game, it starts buying back frequently and seemingly unnecessarily. Buying back, teleporting to a shrine and farming the jungle is usually a sign of tilt for a human player, but the AI doesn’t get toxic or salty. It just wants to win.

    Starting at around level nine, the AI would instantly buy back on most of its heroes to start getting things done on the map. The even net worth spread ensured that even the “core” buybacks weren’t too costly, while most of its lineup had ways of pushing waves or flash-farming.

    This ensured that the AI retained its XP lead, almost always had its lanes pushed out and was capable of fighting as a team at all times. Ironic that OG was the team to experience FTW-buybacks from the opponent.
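
    The buyback point can be sketched in a few lines of Python, assuming that buyback cost grows with a hero’s net worth. The `buyback_cost` function, its `base` and `divisor` constants and both team line-ups are hypothetical placeholders, not figures from the games, so check them against the patch you care about.

```python
def buyback_cost(net_worth: float, base: float = 100.0, divisor: float = 13.0) -> float:
    """Assumed cost model: a flat base plus a fixed share of net worth."""
    return base + net_worth / divisor


# Two hypothetical teams with the same total net worth (50,000 gold).
top_heavy = [22000, 13000, 8000, 4000, 3000]
even = [13000, 12000, 10000, 8000, 7000]

for name, team in (("top-heavy", top_heavy), ("even", even)):
    costs = [buyback_cost(nw) for nw in team]
    # Show the priciest single buyback and the price of buying back all five.
    print(name, "max:", round(max(costs)), "total:", round(sum(costs)))
```

    Under this assumed linear model the total price of five buybacks is the same for both teams, but the most expensive single buyback is noticeably cheaper on the even team, so no one hero is priced out of coming back.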

    Closing thoughts

    There is no denying that OpenAI won, for the most part, through its ability to make correct split-second decisions. Looking at how the bots focus-fired targets, used their spells and calculated how long they could survive in teamfights, it is hard to identify with them and believe that there is anything we can learn from them. After all, at this point in time, in a restricted game mode, OpenAI bots are a lot closer to “perfect Dota players” than any human players are.

    Human players are not AI: they can’t know exactly how much HP they will have after an enemy spell, how much damage they will take over the course of a disable when being attacked by a certain hero, or exactly how long it is going to take for the enemy hero to cover a certain distance.

    The question is, given our knowledge of our own imperfection, is there a reason to try and replicate the strategies and tactics applied by OpenAI, an AI that is fully aware of its own high capabilities through millions of trials and errors?

    We don’t know if there is a reason, or whether there will be a reward for it in the end, but we believe it is absolutely worth trying, because unlike AI, which gets smarter and progresses from one generation to the next through victories, humans are more than capable of learning from mistakes and losses.

    As seen on Dotabuff
