Oh no, the AI scientists have made fools of themselves again.

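swanswan · Post by **swanswan (OP)** » Oct 19, 2025 17:30

They claimed generative AI had cracked hard math problems. It turns out it had just retrieved existing answers. And you still expect these idiots to achieve AGI?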

OpenAI researchers recently claimed a major math breakthrough on X, but quickly walked it back after criticism from the community, including DeepMind CEO Demis Hassabis, who called out the sloppy communication.

It started with a now-deleted tweet from OpenAI manager Kevin Weil, who wrote that GPT-5 had "found solutions to 10 (!) previously unsolved Erdős problems" and made progress on eleven more. He described these problems as "open for decades." Other OpenAI researchers echoed the claim.

The wording made it sound like GPT-5 had independently produced mathematical proofs for tough number theory questions - a potential scientific breakthrough and a sign that generative AI could uncover unknown solutions, showing its ability to drive novel research and open the door to major advances.

Mathematician Thomas Bloom, who runs erdosproblems.com, pushed back right away. He called the statements "a dramatic misinterpretation," clarifying that "open" on his site just means he personally doesn't know the solution - not that the problem is actually unsolved. GPT-5 had only surfaced existing research that Bloom had missed.

DeepMind CEO Demis Hassabis called the episode "embarrassing", and Meta AI chief Yann LeCun pointed out that OpenAI had basically bought into its own hype ("Hoisted by their own GPTards").

swanswan · Post by **swanswan (OP)** » Oct 19, 2025 17:32

Perma-postdoc scientists could crush them on IQ.

ql2015 · Post by **ql2015** » Oct 19, 2025 17:35

This is a fun one.

swanswan · Post by **swanswan (OP)** » Oct 19, 2025 17:37

ql2015 wrote: Oct 19, 2025 17:35
This is a fun one.

Your US of A is crawling with clowns like these.
They are even worse than the ones in India.

coltzhao

Today's LLMs have no original creativity of their own.

To be fair, though, neither do most humans.

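ql2015 · Post by **ql2015** » Oct 19, 2025 17:42

swanswan wrote: Oct 19, 2025 17:37
Your US of A is crawling with clowns like these.
They are even worse than the ones in India.

I saw a news headline earlier (I didn't click through) saying that the ethnic Chinese mathematician, Tao something, had also used AI to solve a problem or make major progress. When I first saw your news I thought it was about him.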

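ql2015 · Post by **ql2015** » Oct 19, 2025 17:46

swanswan wrote: Oct 19, 2025 17:30
They claimed generative AI had cracked hard math problems. It turns out it had just retrieved existing answers. And you still expect these idiots to achieve AGI?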


Haha, this is the one:

https://baijiahao.baidu.com/s?id=184501 ... der&for=pc

NativeMan8080

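Fantasizing that an algorithm that guesses the next token can possess intelligence: this generation of Silicon Valley is not very reliable.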

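freelikewind · Post by **freelikewind** » Oct 19, 2025 17:49

Today's AI generally means LLMs, which are built on the attention mechanism.
What is attention? It is an inner product; information gets related through inner products.
How much reasoning ability can that have? It just piles the most relevant information together.
That does not count as intelligence, even though in many cases it gives you the feeling of intelligence.
AGI is even further out of reach. AGI is something that starts from a very small kernel and keeps growing, and inner products cannot achieve that.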
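
As a concrete picture of the "attention is an inner product" description above, here is a minimal sketch of scaled dot-product attention for a single query, in plain Python. The function name `attention` and the toy vectors are made up for illustration; real models add learned projections, multiple heads, and many stacked layers, none of which is shown here.

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns (weights, output): softmax weights over the keys and the
    weighted average of the value vectors.
    """
    d = len(query)
    # Relevance score of each key = inner product with the query, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Softmax turns the scores into positive weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Output = weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy example (hypothetical numbers): the query points the same way as the first key.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attention([2.0, 0.0], keys, values)
print(weights)  # the first weight is the largest
print(output)
```

The output is pulled toward the value whose key has the largest inner product with the query, which is the "most relevant information" step the post describes.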

delphi · Post by **delphi** » Oct 19, 2025 17:51

Today's AI is just a "small-town test-taker" that grinds practice problems a bit more diligently, nothing more.

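FGH · Post by **FGH** » Oct 20, 2025 07:11

The Decoder points out that the hype around this incident obscured a more real point: GPT-5 did show practical value on literature search, that is, retrieval tasks where the terminology is inconsistent. Terence Tao has also said repeatedly that in the near term the most realistic role for AI in mathematics is to speed up retrieval and routine reasoning, not to crack recognized deep, hard open problems.

That said, although GPT-5 has not yet "become a god" (cracked recognized deep, hard open problems), it has already reached PhD-student level and can assist mathematicians with proofs.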

xiaoju

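That's not right.

In fact GPT has a mathematical limit on its capability; it cannot even do addition and subtraction. Any hard problem it solves directly must have been peeked at from somewhere. Only the CoT deep-thinking mode might solve hard problems, and in that case you just need to check and verify the steps.

This incident shows that OpenAI is staffed from top to bottom by clueless amateurs.

FGH wrote: Oct 20, 2025 07:11
The Decoder points out that the hype around this incident obscured a more real point: GPT-5 did show practical value on literature search, that is, retrieval tasks where the terminology is inconsistent. Terence Tao has also said repeatedly that in the near term the most realistic role for AI in mathematics is to speed up retrieval and routine reasoning, not to crack recognized deep, hard open problems.

That said, although GPT-5 has not yet "become a god" (cracked recognized deep, hard open problems), it has already reached PhD-student level and can assist mathematicians with proofs.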

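FGH · Post by **FGH** » Oct 20, 2025 07:22

xiaoju wrote: Oct 20, 2025 07:19
That's not right.

In fact GPT has a mathematical limit on its capability; it cannot even do addition and subtraction. Any hard problem it solves directly must have been peeked at from somewhere. Only the CoT deep-thinking mode might solve hard problems, and in that case you just need to check and verify the steps.

This incident shows that OpenAI is staffed from top to bottom by clueless amateurs.

I use it every day. Either you are talking nonsense or you are stuck in the past.
If you use the current GPT-5 with thinking enabled, you will see what its math ability is.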

xiaoju

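That shows you don't understand GPT at all and only read the nonsense hyped by liberal-arts types.

GPT is a language model: an algorithm that takes a fixed-length group of tokens and, after a fixed number of computation steps, generates the next token. Its computational power has a mathematical upper bound, AC0, so it cannot solve arbitrary parity checks, addition and subtraction, or more complex problems. But its enormous volume of data lets it simply memorize an astronomical number of worked solutions and fool humans.

The only way around this is GPT + CoT. CoT (thinking) is equivalent to a Turing machine and can solve every problem that humans can solve.

But CoT is 100 times more expensive than plain GPT, and nobody except DeepSeek can afford it, so abroad they usually build a dedicated mini model for reasoning, but that kind of model has to be dumbed down.

Since the answers this time were copied directly, OpenAI clearly did not use a thinking model at all. Pure amateur-hour embarrassment.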
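
To make the parity example above concrete, here is a small sketch of the step-by-step idea the post attributes to CoT: instead of producing the answer in one shot, the computation writes out one intermediate result per input bit and only ever looks at the previous result. This only illustrates serial computation with a scratchpad. It is not a model of how GPT or any thinking model works internally, and the function name `parity_with_scratchpad` is hypothetical.

```python
def parity_with_scratchpad(bits):
    """Return (parity, scratchpad) for a string of '0'/'1' characters.

    The scratchpad records the running parity after each bit, the way a
    written-out chain of steps would.
    """
    state = 0
    scratchpad = []
    for i, b in enumerate(bits):
        if b not in "01":
            raise ValueError(f"not a bit: {b!r}")
        state ^= int(b)  # flip the running parity on every 1
        scratchpad.append(f"step {i + 1}: saw {b}, parity so far = {state}")
    return state, scratchpad

parity, steps = parity_with_scratchpad("1011")
print("\n".join(steps))
print("parity:", parity)  # 1, because "1011" contains three 1s
```

Each step needs only one bit of carried state, but the number of steps grows with the input length, which is the contrast with a fixed number of computation steps drawn in the post.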

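FGH wrote: Oct 20, 2025 07:22
I use it every day. Either you are talking nonsense or you are stuck in the past.
If you use the current GPT-5 with thinking enabled, you will see what its math ability is.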

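FGH · Post by **FGH** » Oct 20, 2025 08:16

How about this: you give me two math problems, I'll hand them to GPT, and then I'll post its answers. I don't have GPT Pro, just the regular GPT-5 with the thinking feature turned on.

xiaoju wrote: Oct 20, 2025 07:39
That shows you don't understand GPT at all and only read the nonsense hyped by liberal-arts types.

GPT is a language model: an algorithm that takes a fixed-length group of tokens and, after a fixed number of computation steps, generates the next token. Its computational power has a mathematical upper bound, AC0, so it cannot solve arbitrary parity checks, addition and subtraction, or more complex problems. But its enormous volume of data lets it simply memorize an astronomical number of worked solutions and fool humans.

The only way around this is GPT + CoT. CoT (thinking) is equivalent to a Turing machine and can solve every problem that humans can solve.

But CoT is 100 times more expensive than plain GPT, and nobody except DeepSeek can afford it, so abroad they usually build a dedicated mini model for reasoning, but that kind of model has to be dumbed down.

Since the answers this time were copied directly, OpenAI clearly did not use a thinking model at all. Pure amateur-hour embarrassment.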

xiaoju

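Just generate two random long integers, say 50 digits each, and have GPT add them. The prompt should require it to write the answer directly.

Note that you need to call it through the API, because the chat app can quietly call tools.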
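
The test described above can be set up without trusting anyone's eyesight on 50-digit numbers. The sketch below only generates the problem and checks a reply. It deliberately does not call any model API, so `reply` stands in for whatever text the model returns, and the names `random_n_digit`, `make_addition_case`, and `is_correct` are hypothetical.

```python
import random
import re

def random_n_digit(rng, digits):
    """Random integer with exactly `digits` decimal digits (no leading zero)."""
    return rng.randrange(10 ** (digits - 1), 10 ** digits)

def make_addition_case(rng, digits=50):
    """Return (prompt, expected) for one long-addition problem."""
    a = random_n_digit(rng, digits)
    b = random_n_digit(rng, digits)
    prompt = f"Compute {a} + {b}. Write only the final answer, with no steps."
    return prompt, a + b

def is_correct(reply, expected):
    """Compare a model's text reply with the exact sum.

    Commas, spaces, and other non-digit characters are ignored.
    """
    digits_only = re.sub(r"\D", "", reply)
    return digits_only == str(expected)

rng = random.Random(0)  # fixed seed so the case is reproducible
prompt, expected = make_addition_case(rng)
print(prompt)
print(is_correct(f"{expected:,}", expected))     # True: right answer, comma formatting
print(is_correct(str(expected + 10), expected))  # False: the tens digit is off
```

Python integers are arbitrary precision, so `expected` is exact. Running many seeds and reporting the fraction of correct replies would say more than a single case.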

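FGH wrote: Oct 20, 2025 08:16
How about this: you give me two math problems, I'll hand them to GPT, and then I'll post its answers. I don't have GPT Pro, just the regular GPT-5 with the thinking feature turned on.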

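FGH · Post by **FGH** » Oct 20, 2025 09:03

xiaoju wrote: Oct 20, 2025 08:51
Just generate two random long integers, say 50 digits each, and have GPT add them. The prompt should require it to write the answer directly.

Note that you need to call it through the API, because the chat app can quietly call tools.

GPT is neither a calculator nor Mathematica. You should pose the kind of problems that undergraduate math students have to do.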

xiaoju

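Adding 50-digit numbers is elementary-school summer-homework difficulty.

There is an even simpler, kindergarten-level problem: count how many 0s there are in a random 50-digit number.

This is a simple way for you to understand the limit of GPT's capability.

If you want to get around that limit, the method you use necessarily goes beyond GPT. Whether it is DeepSeek's forced "deep thinking" CoT or external tool calls, it is no longer GPT.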
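
The zero-counting variant can be checked the same way. A minimal sketch, again with no model call: `make_zero_count_case` is a hypothetical helper that produces a random 50-digit string and the exact count to compare a reply against.

```python
import random

def make_zero_count_case(rng, digits=50):
    """Return (number_string, expected_zero_count) for a random n-digit number."""
    first = rng.choice("123456789")  # no leading zero
    rest = "".join(rng.choice("0123456789") for _ in range(digits - 1))
    number = first + rest
    return number, number.count("0")

rng = random.Random(0)  # fixed seed for a reproducible case
number, expected = make_zero_count_case(rng)
print(f"How many 0s are in {number}? Write only the count.")
print("expected:", expected)
```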

FGH wrote: Oct 20, 2025 09:03
GPT is neither a calculator nor Mathematica. You should pose the kind of problems that undergraduate math students have to do.

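FGH · Post by **FGH** » Oct 20, 2025 09:21

xiaoju wrote: Oct 20, 2025 09:13
Adding 50-digit numbers is elementary-school summer-homework difficulty.

There is an even simpler, kindergarten-level problem: count how many 0s there are in a random 50-digit number.

This is a simple way for you to understand the limit of GPT's capability.

If you want to get around that limit, the method you use necessarily goes beyond GPT. Whether it is DeepSeek's forced "deep thinking" CoT or external tool calls, it is no longer GPT.

I have never seen an elementary-school student asked to do an addition problem like that. You should use GPT on problems you actually need solved.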

赖美豪中

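The way LLMs currently work means they are destined to be a weighted search engine.

swanswan wrote: Oct 19, 2025 17:30
They claimed generative AI had cracked hard math problems. It turns out it had just retrieved existing answers. And you still expect these idiots to achieve AGI?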

