AlphaGo Zero

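AlphaGo Zero is the latest version of AlphaGo, the Go program developed by DeepMind. On 19 October 2017, the AlphaGo team published a paper in Nature introducing AlphaGo Zero, stating that this version was trained without human game records and is stronger than every previous version.^[1] Learning through self-play, AlphaGo Zero surpassed AlphaGo Lee within three days, beating it 100 games to 0; it reached the level of AlphaGo Master in 21 days and exceeded all earlier versions in 40 days.^[2] Demis Hassabis, co-founder and CEO of DeepMind, said that AlphaGo Zero is so strong because it is no longer constrained by the limits of human knowledge.^[3] Because expert data is "often expensive, unreliable or simply unavailable", training an AI without datasets from human experts has major implications for developing AI with superhuman skills,^[4] since such an AI does not imitate humans but surpasses them directly by reflecting on its own play and drawing on its own form of creativity. David Silver, one of the paper's authors, said that removing the need to learn from humans may make it possible to generalise existing AI algorithms.^[5]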

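Training

AlphaGo Zero's neural network was trained using TensorFlow on 64 GPUs and 19 CPU parameter servers; only four TPUs were used for inference. Initially the network knew nothing about Go beyond the rules. The AI engaged in what was described as "unsupervised learning", playing against itself until it could anticipate its own moves and how each move would affect the outcome of the game.^[6] In the first three days, AlphaGo Zero played 4.9 million games against itself in succession.^[7] Within days it had developed the skills needed to beat top human players, whereas the earlier AlphaGo had needed months of training to reach the same level.^[8] For comparison, the researchers also trained a version of AlphaGo Zero on human game data; that version learned more quickly at first but performed worse in the long run.^[9]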
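The self-play idea described above can be illustrated with a toy sketch. This is not AlphaGo Zero's method, which relies on a deep neural network combined with tree search; here a simple lookup table and a one-pile stone-taking game stand in for the network and for Go. All names (`legal_moves`, `self_play_game`, `train`) and the game rules are assumptions made for illustration only.

```python
import random
from collections import defaultdict

MAX_TAKE = 3  # assumed toy rule: remove 1 to 3 stones; taking the last stone wins


def legal_moves(pile):
    """Moves available from a pile of the given size."""
    return list(range(1, min(MAX_TAKE, pile) + 1))


def self_play_game(values, pile, epsilon, rng):
    """Play one game in which the same value table controls both sides.

    Returns (pile_before_move, move, outcome) records, where outcome is +1
    for moves made by the eventual winner and -1 for moves by the loser.
    """
    history = []
    while pile > 0:
        moves = legal_moves(pile)
        if rng.random() < epsilon:
            move = rng.choice(moves)  # explore a random move
        else:
            move = max(moves, key=lambda m: values[(pile, m)])  # exploit the table
        history.append((pile, move))
        pile -= move
    # The side that made the last move won; walk backwards, alternating sides.
    records = []
    outcome = 1
    for pile_before, move in reversed(history):
        records.append((pile_before, move, outcome))
        outcome = -outcome
    records.reverse()
    return records


def train(num_games=5000, start_pile=10, epsilon=0.2, seed=0):
    """Improve the table using only the results of games against itself."""
    rng = random.Random(seed)
    values = defaultdict(float)  # running mean outcome per (pile, move)
    counts = defaultdict(int)
    for _ in range(num_games):
        for pile, move, outcome in self_play_game(values, start_pile, epsilon, rng):
            key = (pile, move)
            counts[key] += 1
            values[key] += (outcome - values[key]) / counts[key]
    return values
```

After calling `values = train()`, the table's preferred move from a pile of size `p` is `max(legal_moves(p), key=lambda m: values[(p, m)])`. The point of the sketch is only that the training signal comes entirely from the outcomes of self-played games, with no human examples.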

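Applications

Hassabis said that AlphaGo's algorithms are likely to be most useful in domains that require an intelligent search through an enormous space of possibilities, such as protein folding or the accurate simulation of chemical reactions.^[10] They are probably less useful in domains that are hard to simulate, such as learning how to drive a car.^[11]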

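Reception

AlphaGo Zero was widely regarded as a significant advance, even when compared with its groundbreaking predecessor, AlphaGo. Oren Etzioni of the Allen Institute for Artificial Intelligence called AlphaGo Zero "a very impressive technical result", in "both their ability to do it and their ability to train the system in 40 days, on four TPUs".^[6] The Guardian called it a "major breakthrough for artificial intelligence", citing Eleni Vasilaki of the University of Sheffield and Tom Mitchell of Carnegie Mellon University, who described it as an impressive feat and an "outstanding engineering accomplishment" respectively.^[11] Mark Pesce of the University of Sydney called AlphaGo Zero "a big technological advance" that takes us into "undiscovered territory".^[12]

Gary Marcus, a psychologist at New York University, was more cautious. For all we know, he said, AlphaGo may contain "implicit knowledge that the programmers have about how to construct machines to play problems like Go", and it will need to be tested in other domains before one can be sure that its underlying architecture is effective at much more than playing Go. In contrast, DeepMind is "confident that this approach is generalisable to a large number of domains".^[7]

South Korean professional Go player Lee Sedol responded: "The previous version of AlphaGo wasn't perfect, and I believe that's why AlphaGo Zero was made." As for AlphaGo's potential for further development, Lee said he would have to wait and see, but added that it would affect young players. Mok Jin-seok, coach of the South Korean national Go team, said that the Go world had already been imitating the playing styles of previous versions of AlphaGo and creating new ideas from them, and that he hoped AlphaGo Zero would bring new ideas as well. Mok added that general trends in the Go world are now influenced by AlphaGo's style of play. "At first, it was hard to understand and I almost felt like I was playing against an alien. However, having had a great amount of experience, I've become used to it," he said. "We are now past the point where we debate the gap between the capability of AlphaGo and humans. It's now between computers." Mok has reportedly already begun analysing AlphaGo Zero's playing style together with players from the national team: "Though having watched only a few matches, we received the impression that AlphaGo Zero plays more like a human than its predecessors."^[13] Chinese professional Ke Jie wrote on his Weibo account: "A pure, purely self-taught AlphaGo is the strongest ... for AlphaGo's self-improvement ... humans are simply superfluous."^[14]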

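Comparison with previous versions

Architecture and strength^[15]
Version	Hardware	Elo rating	Matches
AlphaGo Fan	176 GPUs,^[4] distributed	3144^[1]	5:0 against Fan Hui
AlphaGo Lee	48 TPUs,^[4] distributed	3739^[1]	4:1 against Lee Sedol
AlphaGo Master	4 second-generation TPUs,^[4] single machine	4858^[1]	Online games: 60:0 against 44 professional players; Future of Go Summit (Wuzhen, China): 3:0 against Ke Jie, 1:0 against a team of five top players
AlphaGo Zero	4 second-generation TPUs,^[4] single machine	5185^[1]	100:0 against AlphaGo Lee; 89:11 against AlphaGo Master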
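
The ratings in the table can be related to the head-to-head results with a short calculation. The sketch below assumes the ratings follow the standard Elo model, in which a 400-point gap corresponds to 10:1 odds; the function name `elo_expected_score` is illustrative and does not come from the source.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings copied from the table above.
ratings = {
    "AlphaGo Fan": 3144,
    "AlphaGo Lee": 3739,
    "AlphaGo Master": 4858,
    "AlphaGo Zero": 5185,
}

for opponent in ("AlphaGo Lee", "AlphaGo Master"):
    p = elo_expected_score(ratings["AlphaGo Zero"], ratings[opponent])
    print(f"AlphaGo Zero vs {opponent}: expected score {p:.3f}")
```

Under this assumption, the 327-point gap between AlphaGo Zero and AlphaGo Master gives an expected score of about 0.87, broadly in line with the reported 89:11 result, while the much larger gap to AlphaGo Lee gives an expected score very close to 1.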

References

Citations

[1] Mastering the game of Go without human knowledge. Nature. 2017-10-19 [2017-10-19]. (Archived from the original on 2017-10-19).
[2] Google's New AlphaGo Breakthrough Could Take Algorithms Where No Humans Have Gone. Yahoo!. 2017-10-19 [2017-10-19]. (Archived from the original on 2017-10-19).
[3] AlphaGo Zero: Google DeepMind supercomputer learns 3,000 years of human knowledge in 40 days. The Telegraph. 2017-10-18 [2017-10-19]. (Archived from the original on 2017-10-20).
[4] Hassabis, Demis; Silver, David. AlphaGo Zero: Learning from scratch. DeepMind. 2017-10-18 [2017-10-19]. (Archived from the original on 2017-10-19).
[5] DeepMind AlphaGo Zero learns on its own without meatbag intervention. ZDNet. 2017-10-19 [2017-10-20]. (Archived from the original on 2017-10-20).
[6] Greenemeier, Larry. AI versus AI: Self-Taught AlphaGo Zero Vanquishes Its Predecessor. Scientific American. [2017-10-20]. (Archived from the original on 2017-10-19).
[7] Computer Learns To Play Go At Superhuman Levels 'Without Human Knowledge'. NPR. 2017-10-18 [2017-10-20]. (Archived from the original on 2017-10-20).
[8] Google's New AlphaGo Breakthrough Could Take Algorithms Where No Humans Have Gone. Fortune. 2017-10-19 [2017-10-20]. (Archived from the original on 2017-10-19).
[9] This computer program can beat humans at Go—with no human instruction. Science | AAAS. 2017-10-18 [2017-10-20]. (Archived from the original on 2017-10-19).
[10] The latest AI can work things out without being taught. The Economist. [2017-10-20]. (Archived from the original on 2017-10-19).
[11] Sample, Ian. 'It's able to create knowledge itself': Google unveils AI that learns on its own. The Guardian. 2017-10-18 [2017-10-20]. (Archived from the original on 2017-10-19).
[12] How Google's new AI can teach itself to beat you at the most complex games. Australian Broadcasting Corporation. 2017-10-19 [2017-10-20]. (Archived from the original on 2017-10-20).
[13] Go Players Excited About 'More Humanlike' AlphaGo Zero. Korea Bizwire. 2017-10-19 [2017-10-21]. (Archived from the original on 2017-10-21).
[14] 柯洁:对于AlphaGo的自我进步来讲人类太多余了. Huanqiu.com (in Chinese). 2017-10-20 [2017-11-08]. (Archived from the original on 2017-11-09).
[15] 【柯洁战败解密】AlphaGo Master最新架构和算法，谷歌云与TPU拆解. Sohu (in Chinese). 2017-05-24 [2017-06-01]. (Archived from the original on 2017-09-17).

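External links

AlphaGo blog (archived copy at the Internet Archive)
Nature news on AlphaGo Zero (archived copy at the Internet Archive)
Full Nature article on AlphaGo Zero (archived at Archive.is on 2018-01-03)
AlphaGo Zero Games (archived copy at the Internet Archive)
AMA on Reddit (archived copy at the Internet Archive)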
