Why AlphaStar Does Not Solve Gaming’s AI Problems | Design Dive


Hi, I’m Tommy Thompson, and welcome to Design Dive here on AI and Games. In this episode let’s talk AlphaStar: DeepMind’s grandmaster-level AI StarCraft 2 player. AlphaStar made headlines throughout 2019 as the competence of the system grew, first defeating the pro StarCraft players TLO and MaNa, then playing in public matchmaking on the Battle.net European servers, where it climbed into the top 0.15% of all players. The big question I hear a lot is how the games industry can capitalise on this and build its own Deep Learning AI players. But it isn’t as straightforward as that: despite the real innovation and excitement around AlphaStar, this isn’t going to have an immediate impact on the way game AI is developed – or at least not in the way you think. In this video I’m going to explain why.

Let me stress that this video isn’t intended to speak ill of DeepMind and their work. AlphaStar is an incredible achievement that – even in academic circles – still felt like it was years away. Rather, I want to temper people’s enthusiasm a little. The media’s sensationalising of AI often makes the actual capability of these systems difficult to grasp. But the bigger issue is that the way in which AlphaStar has been built does not make it easy to adapt and translate into a game development pipeline. So let’s talk about what this all really means for the video game industry in the short term, rather than treating it as the next big innovation that will transform into Skynet and eventually kill us all. **On that note, it is legit both funny and depressing how everyone and their aunty knows what Skynet is, yet Terminator movies are bombing at the box office.**

I won’t be talking about how AlphaStar works in this video, because I did that already over in episode 48 of the main show. So if you want to get a grasp of what’s actually happening under the hood of these StarCraft AI players, go watch that video first. I do make reference to some of the points raised in that video, but hopefully it’s all easy to follow along with.

The first issue is that the games industry needs to see the benefits of adopting this approach for non-player character AI before it embraces it. This isn’t the first time machine learning has reared its head offering to fix problems for the video games industry: there was an initial exploration of machine learning back in the late ’90s and early 2000s – which led to games like the original Total War and Black & White using neural networks and genetic algorithms – but to mixed success. One of the big reasons that interest died out was the lack of control or authority designers and programmers have over these systems once they’ve been trained to solve the task at hand. Deep Learning creates complex artificial neural networks carrying thousands, if not millions, of connections, each given a numeric weight. Training those connection weights is what gives the system its intelligence, but when a human reads the result, it’s just numbers. Lots and lots of numbers. So if you build an AI player using Deep Learning and it does something weird, you can’t crack it open and debug it. You need to isolate what in the design of the network, the learning process or the training data may have caused the erroneous behaviour, and then re-train it.

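To make that concrete, here’s a minimal sketch (a hypothetical toy network in Python, nothing to do with AlphaStar’s actual architecture) of what a programmer actually sees when they crack open a trained model:

```python
import numpy as np

# A tiny, purely hypothetical "policy network": two dense layers
# mapping a 16-value game-state vector to scores for 4 actions.
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(16, 32)), np.zeros(32)  # 512 trained weights
w2, b2 = rng.normal(size=(32, 4)), np.zeros(4)    # plus 128 more

def policy(state):
    hidden = np.maximum(0.0, state @ w1 + b1)  # ReLU layer
    return hidden @ w2 + b2                    # one score per action

print(policy(rng.normal(size=16)))  # picks an action... but why?
print(w1[:2])  # "debugging" means staring at rows of raw floats
```

There’s no branch to step through and no rule to tweak: if the output is wrong, the fix lives somewhere upstream in the training setup.
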
Then if you want to create AI that caters to particular situations or configurations, you’d need to build the training process to reflect that. This isn’t remotely accessible or friendly to game designers who want control over how an AI character will behave within the game, and who are working with the programming team to make that a reality.

If you consider episode 47 of AI and Games, where I looked at Halo Wars 2, that whole system is built in a modular, data-driven fashion precisely to give designers a huge amount of control. Right now Deep Learning technologies do not cater to that level of interaction and oversight for a designer to work with. It’s why behaviour trees are so pervasive in game AI: they’re arguably the most accessible technique for both designers and developers, allowing each team to focus on its specialism without stepping on the other’s toes.

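For contrast, here’s a minimal behaviour-tree sketch, with hypothetical node names and engine-agnostic Python rather than any particular middleware’s API, to show why the technique is so legible:

```python
# A minimal behaviour-tree sketch: hypothetical and engine-agnostic.
class Selector:   # tries children in order until one succeeds
    def __init__(self, *children): self.children = children
    def tick(self, npc): return any(c.tick(npc) for c in self.children)

class Sequence:   # runs children in order until one fails
    def __init__(self, *children): self.children = children
    def tick(self, npc): return all(c.tick(npc) for c in self.children)

class Action:     # a leaf node wrapping one named behaviour
    def __init__(self, name, fn): self.name, self.fn = name, fn
    def tick(self, npc): return self.fn(npc)

# The tree itself reads like design documentation:
guard_ai = Selector(
    Sequence(Action("enemy_visible", lambda npc: npc["enemy_seen"]),
             Action("attack",        lambda npc: True)),
    Action("patrol", lambda npc: True),
)
print(guard_ai.tick({"enemy_seen": False}))  # falls through to "patrol"
```

Every leaf is a named, authorable unit: a designer can reorder the tree or swap a behaviour without touching the systems underneath, which is precisely the authority a trained network takes away.
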
This isn’t to say machine learning won’t have an impact within the industry; more specifically, I don’t see it being used pervasively for in-game behaviours. Sure, we’ve seen the likes of Forza Horizon and MotoGP adopt it for their opposing racers, but those are very bespoke situations where the problem space happens to suit it quite nicely. The industry is still evolving and adapting to this latest surge in machine learning, and while big publishers are investing in their own AI R&D teams, that investment isn’t reflected even across AAA studios. Over time we’re going to see Deep Learning used more and more in games, but not in the ways you might think, and I’d argue rarely for in-game character behaviour.

The second issue is that – irrespective of the technology’s capabilities – the requirements for training AlphaStar don’t allow it to be easily replicated for games in active development. As mentioned in my other video, AlphaStar’s first phase of learning is achieved by watching and then mimicking behaviours from match replays of human players. This is a chicken-and-egg problem: if you want to train super-intelligent AI in your game, you need existing examples of high-level play that it can replicate through supervised learning. To get that training data, you either need expert players playing the game before release, or you build a separate AI player to bootstrap the machine learning player by creating examples for it to learn from – and that kinda defeats the point. AlphaStar benefits heavily from the ecosystem that StarCraft exists within. The game has been out for nearly a decade and is relatively bug-free, it’s been a popular eSports title for several years, and Blizzard’s cult of personality helps maintain an active and lively fanbase around their products. This means lots of data already exists for AlphaStar to work with.

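To see that chicken-and-egg constraint in miniature, here’s a deliberately toy-sized sketch of the two-phase pipeline, imitation first and reinforcement second, where the “game” is just guessing the sign of a number. Everything here is hypothetical and schematic; it illustrates the shape of the process, not anything resembling DeepMind’s system:

```python
import numpy as np

# Toy two-phase pipeline. The "game": given a number s, the correct
# action is its sign. The "policy": a single learned weight w.
rng = np.random.default_rng(1)
act = lambda w, s: int(s * w > 0)

# Phase 1: supervised imitation. Note it cannot even begin without a
# corpus of expert play -- the chicken-and-egg problem above.
replays = [(s, int(s > 0)) for s in rng.normal(size=2000)]
w = 0.0
for s, expert_action in replays:
    w += 0.01 * (expert_action - act(w, s)) * s  # perceptron-style update

# Phase 2: reinforcement against the game's *current* rules. If a
# "patch" flipped the reward below, w would have to be retrained.
def reward(s, a):
    return 1.0 if a == int(s > 0) else -1.0

for _ in range(2000):
    s = rng.normal()
    w_try = w + rng.normal() * 0.1  # simple hill-climbing step
    if reward(s, act(w_try, s)) >= reward(s, act(w, s)):
        w = w_try

print(f"learned weight: {w:.3f}")  # only valid for this game's rules
```
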
Now, all that said, AlphaStar is still quite a fickle system. The two versions of the AI player were built against two specific versions of StarCraft 2: version 1 ran on patch 4.6.2 and version 2 on patch 4.9.2. The unspoken problem here is that any change to the game’s design that influences the multiplayer meta in any significant way will break AlphaStar. The reinforcement learning trains the bots against the current meta, which means the system can’t simply adapt to the changes brought on by a patch: you need to retrain it. Even the human expert play it was bootstrapped on might no longer be applicable in that context. I can’t say with any certainty, but there’s a small chance that, as of version 4.10 of StarCraft 2, AlphaStar might not be able to play as well as it once did.

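In practical terms, a shipped learned policy is pinned to the exact build it was trained against. You’d effectively need a guard like this around it (an illustrative sketch, not anything AlphaStar actually does):

```python
# Illustrative only: a trained policy is coupled to one exact game build.
TRAINED_ON_PATCH = "4.9.2"  # the build the reinforcement learning saw

def load_policy(current_patch: str):
    if current_patch != TRAINED_ON_PATCH:
        # A balance change that shifts the meta invalidates the training,
        # so the honest answer is "retrain", not "adapt".
        raise RuntimeError(f"Policy trained on {TRAINED_ON_PATCH}, "
                           f"game is on {current_patch}: retraining needed.")
    return ...  # deserialise the network weights here

load_policy("4.9.2")  # fine; load_policy("4.10.0") would refuse to run
```
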
The third and most critical element preventing AlphaStar from being adopted en masse is cost. Training the AlphaStar agents is an incredibly expensive process: you need dedicated processing systems for the training to run in a large, distributed, heterogeneous fashion. DeepMind utilised Google’s own cloud infrastructure to achieve this, with the training executed on Google’s Cloud Tensor Processing Units, or TPUs. These are custom-developed application-specific integrated circuits (ASICs) designed to accelerate machine learning training and inference.

The more recent version of AlphaStar, from November 2019, trained on 384 TPU v3 accelerators for a period of 44 days. If you consider Google’s public pricing model for these TPUs, which runs at around $8 an hour for a single TPU, then even a naive estimate of the cost amounts to $3,072 per hour, $73,728 a day, and $3,244,032 in total. Though I’m sure DeepMind got a heavy discount.

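For what it’s worth, the arithmetic behind those figures is easy to check:

```python
# Back-of-the-envelope cost check using the publicly quoted figures:
# 384 TPU v3 accelerators at roughly $8 per TPU-hour for 44 days.
tpus, usd_per_tpu_hour, days = 384, 8, 44

per_hour = tpus * usd_per_tpu_hour  # $3,072 per hour
per_day = per_hour * 24             # $73,728 per day
total = per_day * days              # $3,244,032 overall

print(f"${per_hour:,}/hour  ${per_day:,}/day  ${total:,} total")
```
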
Now, you might think this isn’t a big deal when some AAA productions have budgets in the tens if not hundreds of millions of dollars, but $3.5 million just to train your AI is a ridiculous amount of money. Sure, publishers like EA, Take-Two, Ubisoft or Activision might have that kind of cash available, but this is just the cost of running the training: not the staff, the infrastructure, the development time and all the other critical parts of game development. Bear in mind this is but one tiny part of a much larger puzzle when building a game of a scale akin to StarCraft. Plus, as cool as this ridiculous expenditure is, DeepMind is actually haemorrhaging money right now, posting losses for Alphabet (Google’s parent company) exceeding $1 billion over the last three years. This technology is not yet stable enough, without further investigation, for a AAA publisher to take seriously.

Perhaps even more critically, costs like these exclude all but the top 2% of games studios and publishers. The training costs suggested here are bigger than most games’ entire development budgets. This technology can’t permeate the industry if it costs that much to train. And of course, if you need to train it again because your design needs force you to reconsider something – boom – that’s more money being thrown at Google to solve the problem. Alternatively, a company invests in its own Deep Learning infrastructure or uses another provider. In any case: money, money, money.

I will stress this isn’t just an issue of unabated capitalism: the data and compute resources needed to train Deep Learning systems are not a solved problem, and it’s one of the larger issues being addressed not just in AI research but by hardware companies such as Intel, which are building the next generation of compute hardware to deliver machine learning training and inference cheaper and faster than is currently possible.

Now, while I’m stressing that AlphaStar isn’t going to change gaming just yet, that’s not to say machine learning isn’t having an impact within the games industry. As I mentioned earlier, the initial enthusiasm for machine learning largely petered out by the mid-2000s, but the recent Deep Learning revolution has seen renewed interest. This new and more concerted effort is addressing issues beyond just the creation of traditional AI players. EA’s SEED division revealed work in 2018 on training Deep Learning agents to play Battlefield, as well as exploring imitation learning from samples of human play to bootstrap AI behaviours. Meanwhile, Ubisoft’s La Forge research lab in Montreal is experimenting with machine learning for testing gameplay systems, AI assistants that help programmers commit bug-free code, motion-matching animation frameworks for character behaviours, and lip syncing for dialogue in different languages. Plus, the most obvious applications in data science are long established at this point, as analytics teams use machine learning to learn more about how people play their games and to provide insight into the changes that could be made going forward. I mean, let’s look on the bright side: I’m going to have plenty more to talk about on this channel in the coming years!

Thanks for watching this episode of Design Dive; I figured it was worth giving my two cents on why we shouldn’t expect Deep Learning to invade all of game AI just yet. I hope you found it interesting! If you’ve got questions, comments, or just flat-out disagree with me, then slap that down in the comments and once I’ve had enough to drink I’ll go take a look! Don’t forget you can help support my work by joining the AI and Games Patreon or by becoming a YouTube member, just like Scott Reynolds, Ricardo Monteiro and Viktor Viktorov have done right here, plus all the other lovely folk you see in the credits. Take care folks, I’ll be back.

26 Comments
  1. Jahrazz Jahrazz

    nice seeing the video 46 seconds after upload

  2. AI and Games

    It's time to put the AlphaStar chat to rest. With this Design Dive episode I'm giving my 2 cents on the practical applications (and otherwise) of Deep Learning in games right now. Currently got some fun topics lined up for later in the Spring. But first, I gotta big deadline to hit by the end of the month.

    Don't worry, you'll know what I'm talking about when it hits.

  3. EnterpriseKnight

    5:27 "Active and lively fanbase around their products" yeah not so much lately huh?

  4. Krystal Myth

    Your first point is moot to me. You either want something to learn, which requires you to teach it, or you go back to the dark age and program it yourself.

  5. Jahrazz Jahrazz

    I think it is also really important to differentiate between fun AI and hard AI. AlphaStar is great for pro players to have an opponent they can still learn from, but for the average gameplay AI you don't want the AI to be hard, you want the AI to create a fun gameplay experience.
    For competitive RTS games it is good to have an AI that plays like an experienced player, so new players can learn from it by watching/playing it, but it is not necessary to use machine learning; for example, Age of Empires 2: Definitive Edition upgraded the old AoE2 AI to use actual meta strategies and micro.

  6. Krystal Myth

    If the industry isn't up to the task, it's not a failure of the system when it's capable of learning, but that we simply can't be bothered. It's blaming the young honor student for not meeting the needs and wishes of the parent. It's not like AI is getting any better under other technologies. This is the future. Either we grasp it, or we pretend we never cared.

  7. The Left can't Meme

    Y'know, I kept hearing AlphaStar learned from watching tens of thousands of games, but it never really struck me what that meant.

    AlphaStar didn't learn to play the game from the ground up. It probably doesn't have any implicit understanding of what it's doing, because it's just a tens-of-millions-of-dollars exercise in "monkey see, monkey do" (or: AI sees what the monkeys are doing, AI does).

    This is further evidenced by the fact AlphaStar cannot adjust itself to new gameplay when Blizzard makes a change to the units and the game's meta changes.

    Now, don't get me wrong. AlphaStar has done things in game that baffle the players; it's part of the reason it's able to win. AlphaStar has done seemingly novel things in game, assuming it has watched tens of thousands of pro player games, or at least GM-level play… especially with the economic mechanics of the game. But those actions could have just been iterative, not necessarily novel. Meaning, AlphaStar hadn't detected or calculated a better way to make mineral collection and mineral gathering more efficient (as was first thought); it may have just been iterating on a basic mechanic of "never stop building workers, no matter what", whereas a pro player knows exactly when to cut building workers and when to resume.

    Sorry if that was a bit detailed on the game's mechanics. If you're curious, a StarCraft pro by the name of Beasty QT has done some very high-level commentary on AlphaStar's play style.

  8. Scrysis

    Now I'm wishing we had a deep dive on Black & White. I loved that game and always wondered how the creature worked.

  9. Radiant Silver Labs

    Hi Tommy, I watch your vids every day 🙂 I'd love to know your thoughts on a "nav mesh in the sky". I'm trying to code flying AI right now and wondering how to get something at least close to as good as a nav mesh, but in the sky… or any other way to make a simple flying AI that avoids walls etc. I guess I could just raycast and change direction, etc. I even thought you could put cubes up there, bake navigation, then turn them off, but then it wouldn't really work in the true y-axis etc. Thanks for all the great vids; my AI of about 8 months now was built originally from your tuts.

  10. M

    So many videos! Hey Tommy, while I've been very interested in AI from a more abstract and theoretical designer perspective for a long time, I'm only now getting into the programming, and feel quite lost… I've got a partial prototype of an enemy that uses pretty much all the different factors I think I want in a full package, and was wondering if you could give me a lead on what kind of AI programming system I should invest my time focusing on in my learning process.
    (It's gonna be simple and pure top-down 2D with zero verticality simulation)

    "Tries to keep line of sight on as much of the area spanning x distance in any direction the player could move; this even takes priority over direct line of sight on the player if the area is big enough.
    Has a less heavily weighed preference to stay close to a certain sweet-spot distance away from the player."

    Basically, I'm describing three conflicting goals it has that it tries to balance to get the most overall value; each goal has its own value to the unit.
    Direct line of sight is an absolute goal, but "staying close to the sweet spot" isn't, and that thing about keeping line of sight on x distance over any direction the player could move… Where do I even start with that one?

    Anyway, if you, or anyone else, has any good sources or advice on how to achieve these things, I'd be VERY appreciative.

  11. Starch Wars

    Great video! I love the realistic view of the situation. ML is amazing but not widely applicable yet

    I also didn't know Black & White used neural networks! Time to learn more

  12. IMMentat

    Good gameplay AI and game-winning AI are worlds apart.
    Games like Half-Life and F.E.A.R. nailed FPS AI 20 years ago, but the industry frequently fails to learn from past glories or failures.

  13. Topher Doll

    On the topic of adapting to patches: I found that interesting, because your notes describe exactly what humans deal with. They use old-patch builds and ideas initially and adjust as those builds are proven good or bad. I think the difference is that humans theorycraft the moment patch notes show up, trying to work out how changes will impact their build and then testing those theories, something AlphaStar wouldn't do. So humans should (big "should", because oftentimes the theories we come up with don't match the actual game in testing) have a leg up in adjusting to changes. I thought that topic, and how it compares to how humans adjust to change, was interesting.

  14. Kalenz

    The shocking revelation of AlphaStar's capabilities is not that it will solve AI problems in game development. It's that we now have AI that beats players at fast-paced, multitasking strategy games.
    It's a milestone, way more impressive and frightening than chess AIs beating chess masters.
    It's another proof that AI can replace any human output whether physical or mental.
    Reeducation of workers is a big topic in order to deal with automation and prevent people from becoming unemployed.
    There is no guarantee that by the time their reeducation is complete there won't be an AI that can take over the job they just learned.

  15. antdgar

    10:08 Obama looking pretty good!

  16. The NetherOne

    If I was going to use Deep Learning in a video game as the default AI, I wouldn't train it in a lab…
    I'd use the game's own single-player mode as the training ground; every player connected to the internet would be training the AI whether they realised it or not. (insert evil villain laugh)

  17. Remember Comics

    The last Terminator movie didn't do well at the box office because people got pissy and totally forgot that it largely rehashes the original movie with a fresh modern take and instead thought it was a heretical aberration, the same way they did with Ghostbusters. It's less that Terminator was bad and more that "female led movies" are being assailed by whiny fanboys.

  18. Skynet

    Why kill them all, when you can beat them all at every video game they come up with and make them feel inferior for all of eternity

  19. Dave Churchill

    AlphaStar also ran 80,000 simultaneous instances of SC2 during training. So they had at LEAST 80,000 CPUs running on top of those TPUs.

  20. Kevin Griffith

    If it weren't for the prohibitive cost of building and training the AI, I would say that an AI like Alphastar could be a very effective tool for balancing gameplay that is generally difficult to quantify. Essentially by taking out factors like player skill or community bias (one of the characters/factions being favored over the others for non-balance reasons) and repeatedly testing you can get a substantial amount of high quality game balance information, particularly if you're already set up to record useful data from the matches (how much of a unit gets made, how many times this ability gets used, such like that). At the moment though, Blizzard is already very effective at gathering data from online matches, so unless the process becomes much cheaper then the industry will likely stay the course.

  21. DartStone

    It may be possible, though, to use a subset of GANs (Generative Adversarial Networks). But for now there are no out-of-the-box solutions, and you weren't wrong in stressing how much cost effectiveness matters. Even if EA or Activision have millions to put into this kind of R&D, they won't do it, because they have no guarantee of a return on investment. They are more focused on designing new manipulative microtransaction systems. But you are also right in saying that ML is mostly used, for now, in graphics and sound. Nvidia showed promising results back in 2016-2017 with lip syncing and on-the-fly animation, and with audio synthesis. I will be the happiest person when I'm able to hear an NPC call me by my character's name. I also witnessed in research labs astonishing tech used to apply and deform textures according to the deformation of the 3D mesh.

    Damn, that's a long post. All of that to say that video games and computer graphics have good days ahead of them.

    And again, nice vid 🙂

  22. Nicholas Perkins

    Would training AI be cheaper if they used GPUs vs TPUs? I know TPUs are better as far as results go, but could the price of GPUs be a factor for using them instead? Also, when are tier lists in games going to be determined by AI? I'd rather know which character an AI chooses most often to beat a game than the opinion of pros and hobbyists. Objective tier lists (which some websites approximate by showing which characters winning teams have most of the time, like in TFT) would have a strong use for all game designers. The only way to balance a game is to know which characters in the game have an advantage.

  23. Random Schmid

    I think you forgot an aspect of AI:
    a perfect AI will immediately abuse game design flaws and expose weak and overpowered concepts – lowering the game's enjoyability

  24. Skaitan

    So, AlphaStar's skill is reset to 0 when you change the game? Is it impossible to transfer the raw strategy data from one game and see if it can apply it to others? Because it seems to me that the core elements of RTS are relatively similar: resource management, micro- and macro-controlling units, map domination. Can AlphaStar play Age of Empires, for example?

  25. Magos Errant Malleator

    New Terminator sucks.

  26. Sky Yin

    You don't get it. If DL AI succeeds (at the next level), there is no (human) designer…
