Hallucination (artificial intelligence)

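In the field of artificial intelligence, a hallucination (also called an artificial hallucination^[1]) is a response generated by an AI that contains false or misleading information presented as if it were fact.^[2]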

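The term is borrowed from the psychological concept of hallucination, because the two share similar features. The closer analogue is actually confabulation, but "hallucination" has already become the established term in the AI field. One of the dangers of AI hallucination is that the model's output looks correct when it is in fact wrong.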

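In natural language processing

In natural language processing, a hallucination is often defined as "generated content that is nonsensical or unfaithful to the provided source content". Errors in encoding and decoding between text and representations can cause hallucinations. Training an AI to produce diverse responses can also lead to hallucination. Hallucinations can also occur when the AI is trained on a dataset in which the labeled summaries, despite being factually accurate, are not directly grounded in the data they are supposed to summarize. In systems such as GPT-3, the AI generates each next word based on the preceding sequence of words (including the words it has itself produced earlier in the response), so hallucinations can keep accumulating as the conversation grows longer.^[1] By 2022, newspapers such as The New York Times were expressing concern that, as the use of bots based on large language models continued to grow, excessive user trust in bot output could lead to problems.^[3]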
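The word-by-word generation described above can be illustrated with a minimal sketch. It is not how GPT-3 is implemented: the `BIGRAMS` table, its words, and its probabilities are all invented for illustration, and a real model conditions on the whole preceding sequence rather than on the last word only. The point is that every step consumes the model's own earlier output, so a locally plausible choice can lead to a fluent but false sentence.

```python
import random

# Hypothetical toy "language model": for each word, the possible next words
# and their probabilities. All entries are made up for this illustration.
BIGRAMS = {
    "<s>": [("the", 1.0)],
    "the": [("capital", 0.6), ("moon", 0.4)],
    "capital": [("of", 1.0)],
    "of": [("France", 0.5), ("cheese", 0.5)],
    "France": [("is", 1.0)],
    "cheese": [("is", 1.0)],
    "moon": [("is", 1.0)],
    "is": [("Paris", 0.5), ("made", 0.5)],
    "Paris": [("</s>", 1.0)],
    "made": [("</s>", 1.0)],
}


def generate(max_len=10, seed=0):
    """Sample a sentence one word at a time.

    Each new word is drawn conditioned on the word the model itself
    produced in the previous step, so an early choice shapes everything
    that follows.
    """
    rng = random.Random(seed)
    tokens = ["<s>"]
    while len(tokens) < max_len:
        words, probs = zip(*BIGRAMS[tokens[-1]])
        nxt = rng.choices(words, weights=probs, k=1)[0]
        if nxt == "</s>":  # end-of-sentence marker
            break
        tokens.append(nxt)
    return tokens[1:]  # drop the start marker


if __name__ == "__main__":
    for seed in range(5):
        print(" ".join(generate(seed=seed)))
```

Every sentence this toy model can produce is locally fluent, yet some paths (for example one ending in "the moon is Paris") are false, because nothing in the sampling loop checks the output against facts.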

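In August 2022, Meta warned during its release of BlenderBot 3 that the system was prone to "hallucinations", which Meta defined as "confident statements that are not true".^[4] On 15 November 2022, Meta released a demo of Galactica, designed to "store, combine and reason about scientific knowledge". Content generated by Galactica came with the warning "Outputs may be unreliable! Language Models are prone to hallucinate text." In one case, when asked to draft a paper on creating avatars, Galactica cited a fictitious paper attributed to someone who works in the relevant field. Meta withdrew the Galactica demo on 17 November because of its offensiveness and the inaccuracies caused by hallucination.^[5]^[6]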

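OpenAI's ChatGPT, released to the public in beta on 30 November 2022, is based on the GPT-3.5 family of large language models. Professor Ethan Mollick of the Wharton School has called ChatGPT an "omniscient, eager-to-please intern who sometimes lies to you". Data scientist Teresa Kubacka has recounted deliberately making up the phrase "cycloidal inverted electromagnon" and testing ChatGPT by asking it about this nonexistent phenomenon. ChatGPT invented a plausible-sounding answer, backed by plausible-looking citations, that compelled her to double-check whether she had accidentally typed in the name of a real phenomenon. Other scholars such as Oren Etzioni have joined Kubacka in assessing that such software can often give the user "a very impressive-sounding answer that's just dead wrong".^[7]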

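Mike Pearl of Mashable tested ChatGPT with multiple questions. In one example, he asked the model for "the largest country in Central America that isn't Mexico". ChatGPT answered Guatemala, whereas the correct answer is Nicaragua.^[8] When CNBC asked ChatGPT for the lyrics to "The Ballad of Dwight Fry", ChatGPT supplied invented lyrics.^[9] While writing a review of the new iPhone 14 Pro, ChatGPT incorrectly listed the relevant chipset as the A15 rather than the A16, although this can be attributed to ChatGPT having been trained on a dataset that ends in 2021.^[10] Asked questions about New Brunswick, ChatGPT got many answers right but incorrectly classified Samantha Bee as a "person from New Brunswick".^[11] Asked about astrophysical magnetic fields, ChatGPT incorrectly volunteered that "(strong) magnetic fields of black holes are generated by the extremely strong gravitational forces in their vicinity".^[12] Fast Company asked ChatGPT to generate a news article on Tesla's last financial quarter; ChatGPT produced a coherent article but made up the financial figures it contained.^[13]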

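Natural language models are thought to produce hallucinated data for several possible reasons.^[1] For example:

Hallucination from data: there are divergences in the source content (which often happens with large training datasets).
Hallucination from training: hallucination still occurs when there is little divergence in the dataset. In that case, it derives from the way the model is trained. Many factors can contribute to this type of hallucination, such as:
- Erroneous decoding from the transformer
- A bias from the historical sequences that the model previously generated
- A bias arising from the way the model encodes its knowledge in its parameters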
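As a rough illustration of "hallucination from data", the sketch below estimates how much of a reference text is not supported by its source. This is only a crude lexical proxy and not a method taken from the cited survey; the function names `tokenize` and `unsupported_fraction` and the example sentences are hypothetical. A model trained on pairs with a high unsupported fraction is being taught to emit content that its input does not contain.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unsupported_fraction(source, reference):
    """Fraction of reference tokens that never appear in the source.

    0.0 means every reference word also occurs in the source;
    values near 1.0 mean the reference is mostly ungrounded.
    """
    source_vocab = set(tokenize(source))
    ref_tokens = tokenize(reference)
    if not ref_tokens:
        return 0.0
    missing = [tok for tok in ref_tokens if tok not in source_vocab]
    return len(missing) / len(ref_tokens)


if __name__ == "__main__":
    # Invented example pair: the reference adds a detail absent from the source.
    source = "Alice joined the lab in 2019."
    reference = "Alice joined the lab in 2019 after leaving Google."
    print(round(unsupported_fraction(source, reference), 3))
```

Word overlap cannot tell a paraphrase from a fabrication, so a measure like this would at most flag training pairs for closer inspection.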

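In other artificial intelligence

The concept of "hallucination" is applied more broadly than just natural language processing. A confident response from any AI that appears unjustified by the training data can be labeled a hallucination.^[1] Wired noted in 2018 that, although no attacks had been recorded other than proof-of-concept attacks by researchers, there was "little dispute" that consumer gadgets and systems such as automated driving were susceptible to adversarial attacks, which can cause other kinds of AI to hallucinate. Examples include a stop sign rendered unrecognizable to computer vision, and an audio clip engineered to sound as if it conveys nothing in particular but which software transcribed as "evil.com".^[14]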
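How a small, targeted change to an input can flip a model's output can be sketched with a toy linear classifier. The weights, input, and step size `eps` below are invented for illustration, and the perturbation is in the spirit of gradient-sign attacks rather than a reproduction of the attacks reported by Wired, which target far larger vision and speech models.

```python
def score(weights, bias, x):
    """Linear score: positive means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0


def sign(value):
    return (value > 0) - (value < 0)


def perturb(weights, x, label, eps):
    """Shift every feature by at most eps in the direction that hurts `label`.

    For a linear score the gradient with respect to x is just `weights`,
    so stepping against sign(weights) lowers the score for class 1.
    """
    direction = -1 if label == 1 else 1
    return [xi + direction * eps * sign(w) for xi, w in zip(x, weights)]


if __name__ == "__main__":
    # Hypothetical model and input, chosen so the effect is easy to follow.
    weights, bias = [0.5, -0.25, 1.0, -0.75], 0.0
    x = [0.2, 0.1, 0.1, 0.1]
    x_adv = perturb(weights, x, label=1, eps=0.1)
    print(predict(weights, bias, x), predict(weights, bias, x_adv))
```

In this example each feature moves by only 0.1, yet the score drops by `eps` times the sum of the absolute weights (0.25), which is enough to push it from 0.1 to about -0.15 and change the predicted class. With many input dimensions, as in images or audio, much smaller per-feature changes can add up in the same way.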

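Analysis

Various researchers cited by Wired have classified adversarial hallucinations as a high-dimensional statistical phenomenon, or have attributed hallucinations to insufficient training data. Some researchers believe that, in the case of object recognition, some "incorrect" AI responses that humans classify as "hallucinations" may in fact be justified by the training data, or even that the AI may be giving the "correct" answer that the human reviewers are failing to see. For example, an adversarial image that looks to a human like an ordinary image of a dog may in fact be seen by the AI to contain tiny patterns that (in authentic images) would only appear when viewing a cat. The AI is detecting details in the source image to which humans are insensitive.^[15]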

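However, these findings have been challenged by other researchers.^[16] For example, it has been objected that the models can be biased towards superficial statistics, so that adversarial training is not robust in real-world scenarios.^[16]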

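Mitigation methods

The hallucination phenomenon is still not completely understood.^[1] Practitioners are therefore still carrying out research to try to mitigate its occurrence.^[17] In particular, research has shown that language models not only hallucinate but also amplify hallucinations, and that even models designed to alleviate the issue run into the same problem.^[18]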

References

[1] Ji, Ziwei; Lee, Nayeon; Frieske, Rita; Yu, Tiezheng; Su, Dan; Xu, Yan; Ishii, Etsuko; Bang, Yejin; Dai, Wenliang; Madotto, Andrea; Fung, Pascale. Survey of Hallucination in Natural Language Generation (PDF). ACM Computing Surveys (Association for Computing Machinery). November 2022 [15 January 2023]. S2CID 246652372. doi:10.1145/3571730. Archived from the original on 2023-03-26.
[2] Definition of HALLUCINATION. www.merriam-webster.com. 2024-02-23 [2024-03-06]. Archived from the original on 2023-10-07.
[3] Metz, Cade. The New Chatbots Could Change the World. Can You Trust Them? The New York Times. 10 December 2022 [30 December 2022]. Archived from the original on 2023-04-18.
[4] Tung, Liam. Meta warns its new chatbot may forget that it's a bot. ZDNet (Red Ventures). 8 August 2022 [30 December 2022]. Archived from the original on 2023-03-26.
[5] Edwards, Benj. New Meta AI demo writes racist and inaccurate scientific literature, gets pulled. Ars Technica. 18 November 2022 [30 December 2022]. Archived from the original on 2023-04-10.
[6] Michael Black [@Michael_J_Black]. I asked #Galactica about some things I know about and I'm troubled. In all cases, it was wrong or biased but sounded right and authoritative. (Tweet). 17 November 2022 – via Twitter.
[7] Bowman, Emma. A new AI chatbot might do your homework for you. But it's still not an A+ student. NPR. 19 December 2022 [29 December 2022]. Archived from the original on 2023-01-20.
[8] Pearl, Mike. The ChatGPT chatbot from OpenAI is amazing, creative, and totally wrong. Mashable. 3 December 2022 [5 December 2022]. Archived from the original on 2022-12-10.
[9] Pitt, Sofia. Google vs. ChatGPT: Here's what happened when I swapped services for a day. CNBC. 15 December 2022 [30 December 2022]. Archived from the original on 2023-01-16.
[10] Wan, June. OpenAI's ChatGPT is scary good at my job, but it can't replace me (yet). ZDNet (Red Ventures). 8 December 2022 [30 December 2022]. Archived from the original on 2023-02-15.
[11] Huizinga, Raechel. We asked an AI questions about New Brunswick. Some of the answers may surprise you. CBC.ca. 2022-12-30 [30 December 2022]. Archived from the original on 2023-03-26.
[12] Zastrow, Mark. We Asked ChatGPT Your Questions About Astronomy. It Didn't Go so Well. Discover (Kalmbach Publishing Co.). 2022-12-30 [31 December 2022]. Archived from the original on 2023-03-26.
[13] Lin, Connie. How to easily trick OpenAI's genius new ChatGPT. Fast Company. 5 December 2022 [6 January 2023]. Archived from the original on 2023-03-29.
[14] Simonite, Tom. AI Has a Hallucination Problem That's Proving Tough to Fix. Wired (Condé Nast). 2018-03-09 [29 December 2022]. Archived from the original on 2018-03-12.
[15] Matsakis, Louise. Artificial Intelligence May Not 'Hallucinate' After All. Wired. 8 May 2019 [29 December 2022]. Archived from the original on 2023-03-26.
[16] Gilmer, Justin; Hendrycks, Dan. A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarial Example Researchers Need to Expand What is Meant by 'Robustness'. Distill. 2019-08-06, 4 (8) [2023-01-24]. S2CID 201142364. doi:10.23915/distill.00019.1. Archived from the original on 2023-03-26.
[17] Nie, Feng; Yao, Jin-Ge; Wang, Jinpeng; Pan, Rong; Lin, Chin-Yew. A Simple Recipe towards Reducing Hallucination in Neural Surface Realisation (PDF). Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (Association for Computational Linguistics). July 2019: 2673–2679 [15 January 2023]. S2CID 196183567. doi:10.18653/v1/P19-1256. Archived (PDF) from the original on 2023-03-27.
[18] Dziri, Nouha; Milton, Sivan; Yu, Mo; Zaiane, Osmar; Reddy, Siva. On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models? (PDF). Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics. July 2022 [15 January 2023]. doi:10.18653/v1/2022.naacl-main.38. Archived (PDF) from the original on 2023-04-06.
