[N] OpenAI Five DoTA2 Finals match livestream has begun: match against OG, plus additional OA announcement at end

gwern · 2019-04-13T19:00:29+00:00

Notes:

a big surprise is promised at the end: https://twitter.com/gdb/status/1117085741710860288 EDIT: coop matches, 2 humans+3 OA vs matches. So far they're pretty boring, like the AlphaGo-human colab matches were. The robots don't need to worry about being unemployed by 'centaurs' anytime soon, it seems... EDITEDIT: after 59 minutes of this, I've become increasingly embarrassed on humanity's behalf
Ilya estimated total training time now at 45,000 years of DoTA2 experience

A few weeks wallclock on the latest DoTA2 patch, so it should be converged (unlike TI). Post summary:

In total, the current version of OpenAI Five has consumed 800 petaflop/s-days and experienced about 45,000 years of Dota self-play over 10 realtime months (up from about 10,000 years over 1.5 realtime months as of The International), for an average of 250 years of simulated experience per day. The Finals version of OpenAI Five has a 99.9% winrate versus the TI version [2].
Match settings: current DoTA2 patch; still no summons/illusions; 17 heros for drafting (no mirror draft?); best of 3; 1 courier like TI (supposedly fully adapted down from 5 invulns now)

Matches: OA5 wins over OG, 2-0 (a flawless victory wiping away the shame of TI!); total: 8-0.

victory for OA5; qualitatively, commentators are describing it as being similar to before (constant intense pushes, excellent exploitation of mistakes, paying to bring back heroes immediately to continue the fight), no major mistakes, OG fought hard but while it seemed even up to the middle, they started to crumble afterwards

My impression is that OA5 is overall somewhat better, but it's hard to tell. We didn't see OA5 fall behind or any real exploitation of its apparent long-term strategy blindspots. OG might've been surprised by just how good OA5 really was and how fluidly & quickly it reacts, and wrongfooted by it, so we'll see if they can attempt to exploit OA5's weaknesses in game 2.
victory for OA5: this game was a huge mess for OG. Despite going in with more of a draft advantage (OA5-estimated 60% vs 70%), OG just fell apart and lost towers within like 10 minutes. What a disappointment.
OA also revealed other matches against pro teams:

2-0
2-0
2-0

Links:

previous Benchmark stream: https://www.reddit.com/r/reinforcementlearning/comments/94uziv/openai_five_benchmark_crushes_audience_team/ ; previous TI discussion: https://www.reddit.com/r/reinforcementlearning/comments/99ieuw/n_first_openai_oa5_dota2_match_begins/
/r/DoTA2 speculation: https://www.reddit.com/r/DotA2/comments/bcptv7/psa_reminder_openai_vs_og_in_5h30m_from_this_post/ ; discussion: https://www.reddit.com/r/DotA2/comments/bctq80/openai_vs_og_match_discussions/?sort=new
/r/OpenAI discussion: https://www.reddit.com/r/OpenAI/comments/bctp85/open_ai_five_finals_dota2_are_live_right_now/ (no comments yet)
/r/MachineLearning: https://www.reddit.com/r/MachineLearning/comments/bcumrs/d_openai_5_vs_dota_2_world_champions_happening_now/
OA5 will be opened to public play for a limited time: https://arena.openai.com/
OA5 writeup: https://openai.com/blog/how-to-train-your-openai-five/ discussing total compute, experience trying to scale to 25 heros (successful but too slow to use for Finals), replays of the 3 games with the OA5 'planning view'

Commentary:

this was practically impressive, but it's also a little disappointing. Has OA5 repaired its long-term strategy understanding? Does it still fall apart when behind? Or did it simply improve its early game to the point where OG couldn't even try the TI stalling strategies & accumulation of long-term game? OG was unable to push OA hard enough to reveal anything interesting: we already knew it was eerily efficient at coordinating and timing and pouncing on mistakes, this 2-0 match (or the total 8-0) was mostly just more of the same.

On the plus side, the Arena looks really nice. If the global DoTA2 community can't push it into the long game or otherwise cheese it while coordinating attempts, then we can safely conclude that OA5 really is damn good.

panties_in_my_ass · 2019-04-13T19:03:27+00:00

Looking forward to this! I expect we’ll see a similar limit on character choice to TI, but that they’ll do much better this time. I’d give OpenAI Five good odds of beating OG.

Still, even if they win the series, today’s outcome will only reinforce the need for transfer learning, as it sounds like training on the full set of characters is compute and time prohibitive due to lack of transfer preventing learning new characters the way a human can.

As an aside, I initially misheard GDB to be saying they had a StarCraft bot they would show later, and got really excited, since DeepMind seems to have found the naive approach (LSTM combined with self play) to get stuck in bad strategies. My guess for the surprise is some kind of human plus bot cooperative play.

Edit: restrictions announced: captain’s draft limited to 17 characters, no summon or illusions.

Reflections: very impressive showing from OA5, but the restrictions make this a somewhat hollow victory, and I don’t think this result tells us anything we didn’t already know. They still can’t play the full game, and it doesn’t look like current approaches will be able to any time soon. Current AI is still so brittle and inflexible compared to human cognition.

Part two: Cooperative gameplay confirmed!

Teradimich · 2019-04-14T01:27:44+00:00

45,000 years of Dota 2 gameplay experience it is years totally or counting each hero separately? In the first case, it should be ~14 days of training, in the second — it is 250 days.

MasterScrat · 2019-05-14T06:47:29+00:00

/u/gwern maybe we can unsticky this now?

reinforcementlearning

MODERATORS

Welcome to Reddit.

Want to add to the discussion?