2026-05-17 Hacker News Top Stories

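Hashimoto warns that the industry has slipped into "AI psychosis": speed and high change volume are masking architectural decay. He calls for a return to reliability and understandability, and the thread raises worries about capital mania, job displacement, and collapsing demand.

California advances a bill requiring publishers that shut down an online game to refund players or provide an offline-playable version, with advance notice. The industry says the bill misunderstands licensing and adds burden; players see a preservation win and call for open-sourced servers, though cost and DRM stand in the way.

The U.S. Department of Justice, investigating emissions tampering, subpoenas Apple and Google for data on more than 100,000 users, sparking a privacy dispute. Commenters favor targeted enforcement over bulk data collection, and inconsistent state inspection rules complicate regulation.

A satirical piece uses an npm supply-chain attack to mock the fragility of the JS ecosystem: layers of unvetted dependencies make malicious code hard to stop, the ecosystem is more fragile than Go or Rust, and the official line concedes that the problem cannot be eliminated, only met with resilience.

An author leaves Tailwind for modular native CSS plus a few utility classes; HN debates semantics, DRY, and the efficiency-versus-complexity trade-offs of atomic classes.

ABC News takes the entire FiveThirtyEight site offline, drawing criticism that the move stems from a personal grudge and does nothing for company value. The thread also gets into brand ownership, model licensing, and Nate Silver's role.

Frontier LLMs can now solve open CTF challenges in one shot, turning competitions into "pay to win"; the author suggests restricting AI in competitive settings to preserve fairness and fun.

An author shares six rule-first practices for spotting transaction fraud with SQL; HN points out that features such as round amounts and unusual hours produce false positives and vary by region and user behavior.

Zulip establishes a nonprofit foundation that takes over Kandra Labs to secure long-term independence and its privacy commitments; the community is optimistic about sustainability and discusses future technology choices and dependency simplification.

NVIDIA open-sources the 2.6B-parameter SANA-WM, which generates up to one minute of 720p video from a single image and a camera trajectory on a single GPU. Speed and image quality impress, but generation control and consistency remain pain points for real use.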

1. I believe there are entire companies right now under AI psychosis

https://twitter.com/mitchellh/status/2055380239711457578

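This is a tweet from Mitchell Hashimoto expressing concern about what he calls "AI psychosis" in how some companies are applying AI. He says it makes rational discussion difficult; although he respects many of the people involved, some of them personal friends, he worries about where this is heading.

He recalls the infrastructure debate over MTBF (mean time between failures) versus MTTR (mean time to repair) that he lived through during the shift to cloud computing and cloud automation, and says the same debate is now replaying across software development and even the wider world. He criticizes a mindset that leans too heavily on fast repair (MTTR), arguing that it neglects resilience: the system looks healthy on the surface while risk quietly accumulates.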
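To make the MTBF/MTTR tension concrete, here is a minimal sketch using the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The two systems and all of their numbers are invented for illustration and do not come from the tweet.

```python
HOURS_PER_YEAR = 8760


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def failures_per_year(mtbf_hours: float, mttr_hours: float) -> float:
    """Approximate failures per year, one per up/down cycle."""
    return HOURS_PER_YEAR / (mtbf_hours + mttr_hours)


# Hypothetical systems, for illustration only.
systems = {
    "rarely_fails_slow_fix": (999.0, 1.0),   # MTBF 999 h, MTTR 1 h
    "often_fails_fast_fix": (9.99, 0.01),    # MTBF ~10 h, MTTR ~36 s
}

for name, (mtbf, mttr) in systems.items():
    print(name, round(availability(mtbf, mttr), 4),
          round(failures_per_year(mtbf, mttr), 1))
```

Both hypothetical systems report the same availability, yet the second one fails roughly a hundred times as often. An availability dashboard alone would not distinguish them, which is one way to read the concern that a fast-repair mindset can hide accumulating risk.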

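He notes that higher test coverage and fewer bug reports do not mean a system is free of latent problems. Rapid change masks architectural decay, and systems can become hard to understand and maintain. He fears this will end in catastrophe, but when he raises it with people around him he is often brushed off or countered with lines like "test coverage is fine" and "bugs are down."

Overall, Hashimoto is urging the industry to be wary of the hidden costs of over-relying on fast automated fixes, and to keep system resilience and overall architectural health in view.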

HN: 1838 points | 1033 comments | posted by reasonableklout | 1 day ago

https://news.ycombinator.com/item?id=48153379

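Society and capital markets have enormous expectations for AI and are investing heavily, but there is also bubble risk; this may itself be a collective "AI psychosis."
Behind the massive data center buildout is the hope that AI will replace large amounts of white-collar work and deliver big gains in efficiency and cost.
Google and other large companies are spending extremely heavily on AI, a huge bet that AI will replace software engineers and cut labor costs.
Self-driving cars are new technology, but they mostly substitute for the existing transportation market and may not add overall economic growth.
Companies are focused on using AI to make existing business more efficient rather than on creating new markets, so the demand-side problem remains unsolved.
AI replacing human work could cause mass unemployment and falling demand; giving displaced workers income and spending power is the key problem.
Work and income can be decoupled, and people need to rethink where income comes from and how social services are provided.
Automation may eliminate some industries, but humans will still need to take part in work; AI is more likely to assist than to fully replace.
For now, companies mostly use AI to cut costs rather than to share the productivity gains it brings.
Transportation jobs such as truck driving employ huge numbers of people, so AI replacement would have major social impact.
Adoption of AI still takes time, and rollout across different regions and complex environments is challenging.
AI replacing jobs could trigger deflation and recession: unemployment reduces demand, and the cost of AI services will fall as well.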

2. Bill to block publishers from killing online games advances in California

https://arstechnica.com/gaming/2026/05/bill-to-keep-online-games-playable-clears-key-hurdle-in-california/

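The California legislature has advanced a bill called the "Protect Our Games Act," which aims to keep online games usable after their servers shut down. Under the bill, a digital game publisher that ends support for an online game must either give players a full refund or provide an updated version that works independently of the operator's services. Publishers must also notify players 60 days before ending the necessary services.

The bill does not apply to completely free games or to games offered only for the duration of a subscription; it applies to other games sold in California after January 1, 2027. The advocacy group Stop Killing Games, made up of players, developers, and publishers working on game preservation, sees the bill's progress as a major victory.

The main opponent is the Entertainment Software Association (ESA), which argues that the bill misunderstands modern game distribution: players receive a license to use a game rather than ownership, and games going offline is a natural part of the software lifecycle. The ESA also worries that the bill would impose excessive copyright and licensing burdens on publishers, especially where music and other intellectual property licenses are time-limited.

Overall, the bill is meant to protect players from losing access to games when servers shut down, and to push the industry toward more transparent and responsible handling of online service shutdowns.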

HN: 563 points | 389 comments | posted by Lihh27 | 1 day ago

https://news.ycombinator.com/item?id=48152994

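When an online game is shut down, the server code should be open-sourced so the community can maintain it, and players should get 60 days' notice.
Open-sourcing server code at a large company is complex and expensive, involving legal, copyright, and compliance review, and may raise development cost and time.
A legal mandate to open-source could push companies to build the necessary processes, but it would also add hidden costs to game development.
Only games with a client-server architecture would be affected; small teams and low-budget games would be affected less.
Client-server architecture mainly serves multiplayer games and is not a design exclusive to big-budget titles.
Players used to be able to run their own servers or play over LAN; modern games' reliance on developer-hosted servers is mostly a commercial choice.
Modern games could use a VPN or similar technology to let players host their own servers and reduce dependence on official ones.
Valve's SDK supports P2P network relays that do not expose IP addresses and suits games large and small, though cross-platform and cross-server play remains a challenge.
Pure P2P real-time games face latency and NAT traversal problems, but these are becoming manageable as networking improves.
A hybrid of client/server and P2P can balance performance and flexibility, letting players opt into a fully P2P mode.
Many games use online features as digital rights management (DRM), which prevents players from running local servers.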

3. U.S. DOJ demands Apple and Google unmask over 100k users of car-tinkering app

https://macdailynews.com/2026/05/15/u-s-doj-demands-apple-and-google-unmask-over-100000-users-of-popular-car-tinkering-app-in-emissions-crackdown/

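The U.S. Department of Justice is demanding that Apple and Google hand over personal information on more than 100,000 users of Auto Agent, a car-tuning app from EZ Lynk, as part of an investigation into the company's alleged violations of the Clean Air Act. EZ Lynk is accused of selling "defeat devices" that use its app and hardware to bypass emissions controls on diesel vehicles. EZ Lynk denies the allegations and says its products are mainly used for vehicle performance monitoring and legal modifications.

The DOJ has issued subpoenas to Apple, Google, Amazon, and Walmart seeking download, account, and purchase data in order to identify witnesses and investigate how users employed the tool. The move has raised privacy concerns; EZ Lynk and privacy groups argue that the subpoenas are overbroad and violate users' privacy rights. Apple and Google plan to contest them.

The case reflects a trend of the government using app store data to strengthen enforcement, at a scale far beyond earlier comparable cases. The court has already rejected EZ Lynk's claim of immunity under Section 230 of the Communications Decency Act, and the litigation continues. For owners who use car-tuning tools, the government is paying growing attention to linking app downloads with user identities.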

HN: 460 points | 325 comments | posted by tencentshill | 1 day ago

https://news.ycombinator.com/item?id=48151383

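Commenters question the government's demand for information on all users, arguing that investigations should target only those who broke the rules so as not to violate privacy.
Some support reporting violations but think the answer is enforcement and fines, not mass data collection.
California regulates emissions strictly while other states are more lenient, and vehicle inspections and reporting channels are limited.
The U.S. system of vehicle safety and emissions inspections is criticized as incomplete and inconsistent: some states require annual inspections and others require none.
Rules requiring new cars to be tested are questioned, on the view that a new car should be compliant when it leaves the factory.
Some say vehicle safety inspections help keep roads safe; others say they are unfair to low-income drivers and inefficient.
Emissions testing is seen as effective at improving air quality, but frequent testing may be excessive and burdensome.
Some suggest replacing blanket testing with eyewitness reports and targeted enforcement, arguing that this is more efficient and lowers the cost to society.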

4. ‘No way to prevent this,’ says only package manager where this regularly happens

https://kevinpatel.xyz/posts/no-way-to-prevent-this/

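This is a satirical article about a severe supply-chain attack on the npm package manager. It describes an attack on the npm registry that compromises millions of enterprise applications and exposes billions of users' data, while developers across the JavaScript ecosystem express helplessness and sorrow and regard the crisis as unavoidable.

The article quotes a senior front-end engineer saying that modern web applications depend heavily on deeply nested, barely vetted third-party packages, which makes malicious code injection almost impossible to guard against. Node.js community members broadly describe the remote code execution attack as a completely unforeseeable tragedy and offer sympathy to the affected operations teams.

By contrast, developers using languages such as Go and Rust, or relying on native Web APIs, did not run into similar problems, because those ecosystems have more complete standard libraries and strict cryptographic verification, which greatly reduces dependence on third-party code.

An npm spokesperson says that, given the existence of malicious actors, no registry policy or build sandboxing can fully stop such attacks, and urges everyone to stay resilient and prepare for the next inevitable incident. The satirical tone highlights the major security challenges facing today's open-source ecosystems.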

HN: 411 points | 203 comments | posted by alligatorplum | 23 hours ago

https://news.ycombinator.com/item?id=48155690

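A cooldown period can effectively blunt supply-chain attacks on npm and similar package managers: ignore any version published less than a few days ago.
A 1-day cooldown is already enough for most attacks; 3 or 7 days is safer but may be overly conservative.
For urgent vulnerability fixes, the cooldown can be bypassed to update quickly.
A cooldown as long as 7 days may delay access to new features, but it improves security.
A cooldown is not a cure-all, just a simple and effective way to raise the cost of an attack.
Maintainers can usually detect a compromised account and pull the malicious version within hours.
Cooldowns should be enforced uniformly at the package manager level rather than left to users to apply by hand.
Separate release channels like those of Linux distributions (for example stable and testing) are another workable approach.
Scanning of newly published packages by automated tools and security researchers is key to finding malicious code.
A cooldown may lead some users to skip new versions and leave vulnerabilities open longer, but the overall security benefit is large.
Developers should monitor their accounts continuously to catch anomalous activity early.
Moving too fast in the name of "progress" can create security risk; slowing the update cadence somewhat can improve software quality.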
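The cooldown idea from the comments can be sketched as a small version filter. This is a minimal illustration, not an npm feature or API: `newest_allowed`, the version numbers, and the publish timestamps are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def newest_allowed(versions: Dict[str, datetime], now: datetime,
                   cooldown_days: int = 3) -> Optional[str]:
    """Return the most recently published version older than the cooldown.

    `versions` maps a version string to its publish time. Versions
    published within the last `cooldown_days` days are ignored.
    """
    cutoff = now - timedelta(days=cooldown_days)
    eligible = {v: t for v, t in versions.items() if t <= cutoff}
    if not eligible:
        return None
    return max(eligible, key=lambda v: eligible[v])


# Hypothetical publish history for one package.
utc = timezone.utc
published = {
    "1.4.0": datetime(2026, 4, 1, tzinfo=utc),
    "1.4.1": datetime(2026, 5, 10, tzinfo=utc),
    "1.4.2": datetime(2026, 5, 16, 12, 0, tzinfo=utc),  # under a day old
}
now = datetime(2026, 5, 17, tzinfo=utc)

print(newest_allowed(published, now, cooldown_days=3))
print(newest_allowed(published, now, cooldown_days=0))
```

With a 3-day cooldown the freshly published release is skipped in favor of the previous one; setting the cooldown to zero corresponds to the bypass for urgent fixes that commenters mention.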

5. Moving away from Tailwind, and learning to structure my CSS

https://jvns.ca/blog/2026/05/15/moving-away-from-tailwind--and-learning-to-structure-my-css-/

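Julia Evans shares what she learned moving from Tailwind CSS to more semantic HTML and native CSS. She looks back at why she chose Tailwind in the first place: she was not good at organizing CSS, and Tailwind helped her ship many small sites quickly.

She summarizes the approaches to structuring CSS that she picked up during the migration, stressing that systems and conventions are what keep the code from becoming a mess. She borrowed several systems from Tailwind, such as its reset stylesheet, color palette, and font size scale, and adapted them to her own understanding.

Specifics:

Reset styles: she copied Tailwind's "preflight" reset, keeping habits such as box-sizing: border-box.
Component CSS: CSS is split by component, each with its own class name and CSS file, which avoids styles overriding each other and improves maintainability.
Colors: all color variables are defined in a single file, which makes them easy to reuse and discourages ad hoc colors.
Font sizes: a variable-based scale of font sizes and line heights, with naming borrowed from Tailwind, simplifies choosing sizes.
Utility classes: a few general utilities are kept, such as .sr-only for screen readers, and they are changed cautiously so they stay general.
Base styles: only a small number of global base styles are defined, such as the max width of section and the color of a links; they are kept simple and refined gradually.
Spacing: she tries to make layout components responsible for spacing rather than adding padding and margin ad hoc, which keeps the code more disciplined.
Responsive design: instead of many media queries, she uses CSS Grid's auto-fit and grid-template-areas for more flexible responsive layouts with fewer breakpoints.
Build system: modern CSS imports and nesting mean no build system is needed in development; production bundles are built with esbuild.

Overall, the migration taught her a more semantic, modular, and systematic way to write CSS, improved maintainability and flexibility, and let her explore some advanced CSS features that Tailwind could not express.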

HN: 383 points | 246 comments | posted by mpweiher | 14 hours ago

https://news.ycombinator.com/item?id=48158400

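Tailwind goes against semantic HTML and the separation of HTML and CSS, fills code with meaningless divs and spans, and erodes developers' skills.
Tailwind does not force "div soup"; misuse is the developer's own problem, and Tailwind can speed up development and reduce the complexity that style abstractions bring.
Writing styles directly on markup adds noise, violates DRY, hurts caching and debugging, and increases cognitive load.
CSS's global scope and cascade make style conflicts easy; Tailwind's atomic classes reduce them, which suits developers who lack strict discipline.
As a more concise way of writing CSS, Tailwind balances fine-grained control with ease of use and beats the high-level abstractions of traditional CSS frameworks.
CSS frameworks and templates are opaque, which makes styles hard to change; Tailwind offers more direct control.
Early CSS lacked nesting, variables, and similar features, which prompted tools like Tailwind to reduce style boilerplate.
Chasing DRY and separation of concerns too early can itself produce messy code, and Tailwind avoids that in some cases.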

6. ABC News has taken all FiveThirtyEight articles offline

https://twitter.com/baseballot/status/2055309076209492208

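This is a post on X (formerly Twitter) by Nathaniel Rakich, managing editor at Votebeat and formerly a senior editor and elections analyst at FiveThirtyEight. He reports that ABC News has taken all FiveThirtyEight articles completely offline and that the links now all redirect to abcnews.com/politics. The move is seen as a needless erasure of thousands of pages of knowledge.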

HN: 374 points | 164 comments | posted by cmsparks | 1 day ago

https://news.ycombinator.com/item?id=48152553

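ABC refused to sell the FiveThirtyEight brand because of a personal grudge, which comes across as petty.
After the acquisition by a large company, FiveThirtyEight's content and style changed, and some people were disappointed in founder Nate Silver, feeling he sold out the brand for money.
FiveThirtyEight performed differently at different stages; after Nate Silver left, strong political data reporters carried on its spirit.
The forecasting model was the core asset; Nate Silver retained ownership of it and only licensed it to ABC.
ABC's decision to delete the FiveThirtyEight articles is seen as driven by a grudge and as contributing nothing to company value.
FiveThirtyEight has value as a brand in its own right, beyond the model and the data.
Nate Silver has matured in how he handles media and business, but remains controversial, especially over gambling-related partnerships.
FiveThirtyEight was always a licensed product and operated independently only briefly.
A forecasting model gives probabilities, not certain outcomes; it should be judged by how well its probabilities match what actually happens.
Market-based and data-driven forecasting is seen as a valuable signal and price discovery tool.
FiveThirtyEight's probabilistic forecasts were broadly accurate across elections, with 2016 as the exception.
Large companies act for profit, and an individual criticizing management will not change their business decisions.
Others argue that selling FiveThirtyEight was Nate Silver's personal choice and should not be called greedy.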
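The point about judging probabilistic forecasts by how well the probabilities match outcomes can be illustrated with the Brier score, a standard measure (mean squared difference between forecast probability and the 0/1 outcome; lower is better). The forecasts and outcomes below are invented for illustration and are not FiveThirtyEight data.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally sized, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical forecasts: probability that candidate X wins each race,
# and whether X actually won (1) or lost (0).
model_forecasts = [0.9, 0.7, 0.3, 0.8]
actual_outcomes = [1, 1, 0, 0]

# Baseline that always says 50/50.
coin_flip = [0.5] * len(actual_outcomes)

print(brier_score(model_forecasts, actual_outcomes))
print(brier_score(coin_flip, actual_outcomes))
```

In this made-up example the model misses one race it rated at 80% and still scores better than the coin-flip baseline, which is the sense in which a single "wrong" call does not by itself show that a probabilistic forecast was bad.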

7. Frontier AI has broken the open CTF format

https://kabir.au/blog/the-ctf-scene-is-dead

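This blog post discusses how CTF (capture the flag) competitions are changing under rapid progress in AI, especially large language models (LLMs). The author, an experienced CTF player, recounts competing since 2021 with strong results and stresses that CTFs used to be an important venue for learning and demonstrating security skills.

The post says that with GPT-4 and later models, medium-difficulty CTF challenges became solvable by AI from a single prompt, which fundamentally changed the competitions. The release of Claude Opus 4.5 in particular made it easy to automate medium and even some hard challenges, so contests have become a matter of who uses AI models and automation best rather than a pure test of human skill.

The author adds that the latest GPT-5.5 model can solve even extremely hard exploitation challenges, turning open CTFs into a "pay to win" game in which rankings reflect who can spend the most compute rather than real security skill. As a result, the meaning and value of traditional CTFs have been badly eroded: leaderboards have lost credibility, strong teams and players are less active, and challenge authors have lost motivation.

For beginners, the author thinks CTFs can still teach, but with leaderboards dominated by AI, beginners easily lean on AI and lose the chance to think and learn for themselves, which holds back real skill growth. Education-focused practice platforms such as picoGym and HackTheBox are a better fit for them.

The post also rebuts the view that "CTF is not dead, just augmented by AI": top-tier finals remain hard and small, but most open competitions are now dominated by AI automation and leaderboards no longer reflect human ability. The author stresses that CTF is at heart not security research but a technical craft and a showcase of skill, and that AI takes away the human's central role in solving.

Finally, the author draws an analogy to chess: engines are widely used for training and analysis but are banned in competition, which preserves fairness and spectator appeal. If CTFs allow unrestricted AI use, they lose their competitive meaning and fun and can no longer push the limits of human skill.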

HN: 325 points | 301 comments | posted by frays | 16 hours ago

https://news.ycombinator.com/item?id=48157559

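An acronym should be spelled out on first use so that unfamiliar readers can follow and the piece reaches more people.
Some acronyms, such as CTF and BGP, are so common that the target audience already knows them and no explanation is needed.
The post is aimed at a specific community, and spelling things out could make it seem too basic and less professional.
Spelling out the name alone may not convey what the thing actually is; background explanation is needed too.
CTF is a game mode as well as a security competition, so the expansion alone does not convey the full concept.
For many technical acronyms, the expanded form does little to explain the actual meaning.
Having the full name helps with searching and further study, but understanding still depends on the specific content.
Computer-assisted learning has limited effect in education; traditional handwriting and whiteboard teaching do more for memory and understanding.
How well someone learns depends on their initiative and persistence; resources are plentiful, but self-discipline is the key.
PhD-level mastery of a subject does not necessarily translate directly into job opportunities.
Online resources and video teaching help self-learners, especially when face-to-face teaching is unavailable.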

8. SQL patterns I use to catch transaction fraud

https://analytics.fixelsmith.com/posts/sql-fraud-patterns/

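This article presents six SQL query patterns for detecting transaction fraud, applicable wherever there are transaction records: credit cards, medical claims, e-commerce, point of sale, and so on. The author stresses that fraud detection mostly comes down to SQL queries rather than machine learning or other newer techniques.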

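The first pattern is "velocity": count transactions per unit of time to spot abnormally rapid activity, which is typical of a thief trying to drain a card quickly. The author suggests using several time windows (such as 1 minute, 5 minutes, and 1 hour) and combining them with an allowlist to cut false positives.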
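As a rough sketch of what a velocity check can look like, the example below runs a self-join query against an in-memory SQLite database. The `transactions` table, its columns, the sample rows, and the threshold of three transactions per 60 seconds are all assumptions made for illustration; they are not the article's schema or queries.

```python
import sqlite3

# Hypothetical schema and data; ts is seconds since an arbitrary epoch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, card_id TEXT, ts INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions (card_id, ts, amount) VALUES (?, ?, ?)",
    [
        ("A", 0, 12.0), ("A", 10, 8.5), ("A", 20, 30.0), ("A", 30, 5.0),
        ("B", 0, 40.0), ("B", 3600, 22.0),
    ],
)

# For each transaction, count the same card's transactions in the
# preceding window (inclusive), then keep cards that hit the threshold.
VELOCITY_SQL = """
SELECT DISTINCT card_id FROM (
    SELECT a.id, a.card_id, COUNT(*) AS txn_count
    FROM transactions AS a
    JOIN transactions AS b
      ON b.card_id = a.card_id
     AND b.ts BETWEEN a.ts - :window AND a.ts
    GROUP BY a.id, a.card_id
    HAVING COUNT(*) >= :threshold
)
ORDER BY card_id
"""


def flag_velocity(db, window_seconds, threshold):
    """Return card_ids with at least `threshold` transactions in any window."""
    cur = db.execute(VELOCITY_SQL,
                     {"window": window_seconds, "threshold": threshold})
    return [row[0] for row in cur]


print(flag_velocity(conn, window_seconds=60, threshold=3))
```

Running the same query with several `window_seconds` values corresponds to the multi-window suggestion; an allowlist could be applied as an extra filter on `card_id` or merchant before flagging.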

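The second is "impossible travel": if the same card number appears in two geographically distant places within a short time, the card may have been cloned. The query computes the great-circle distance between the two locations and applies a speed threshold (such as 600 miles per hour) to filter out anomalies.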
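The distance-and-speed check can be sketched in a few lines. The article does this in SQL; the version below is plain Python using the haversine formula for great-circle distance, with the 600 mph threshold taken from the summary. The function names and the sample coordinates (roughly New York and Los Angeles) are illustrative assumptions.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_impossible_travel(prev, curr, max_mph=600.0):
    """prev and curr are (lat, lon, epoch_seconds) for consecutive uses of a card."""
    distance = haversine_miles(prev[0], prev[1], curr[0], curr[1])
    hours = (curr[2] - prev[2]) / 3600
    if hours <= 0:
        # Same instant in two different places is impossible by definition.
        return distance > 0
    return distance / hours > max_mph


new_york = (40.7128, -74.0060)
los_angeles = (34.0522, -118.2437)

one_hour_apart = is_impossible_travel((*new_york, 0), (*los_angeles, 3600))
ten_hours_apart = is_impossible_travel((*new_york, 0), (*los_angeles, 36000))
print(one_hour_apart, ten_hours_apart)
```

The one-hour case is flagged because it implies a speed far above the threshold, while the ten-hour case is consistent with an ordinary flight.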

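The third is "amount anomalies": check whether transaction amounts cluster at particular values. Round amounts of $1, $5, or $10 are often card testing, and amounts just under a limit (such as $99.99 or $499.99) may be a fraudster deliberately staying below an identity verification threshold.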
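A minimal sketch of tagging suspicious amounts follows. The $1, $5, and $10 test amounts and the $99.99 and $499.99 examples come from the summary above; the $100 and $500 limits, the $1 "just under" margin, and the function name are assumptions. Amounts are kept in integer cents to avoid floating-point comparison issues.

```python
from typing import Optional

SMALL_ROUND_CENTS = {100, 500, 1000}   # $1, $5, $10
LIMIT_CENTS = (10_000, 50_000)         # assumed $100 and $500 verification limits
MARGIN_CENTS = 100                     # assumed: within $1 below a limit


def amount_flag(amount_cents: int) -> Optional[str]:
    """Tag an amount as a possible card test or a just-under-limit charge."""
    if amount_cents in SMALL_ROUND_CENTS:
        return "possible_card_test"
    for limit in LIMIT_CENTS:
        if limit - MARGIN_CENTS <= amount_cents < limit:
            return "just_under_limit"
    return None


for cents in (100, 9_999, 49_999, 2_350, 10_000):
    print(cents, amount_flag(cents))
```

A single tagged transaction is weak evidence on its own; in line with the summary's emphasis on clustering, these tags would more plausibly be counted per card or per merchant before anything is escalated.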

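The fourth is "suspicious merchants": a merchant that suddenly sees many different card numbers with unusual amounts in a short period may have had a skimming device installed. The author proposes a dynamic baseline that compares current transaction volume with the merchant's own history to detect abnormal growth.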
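One way to express a per-merchant dynamic baseline is a z-score of today's distinct-card count against that merchant's own recent history. The z-score rule, the threshold of 3, the minimum history length, and the sample counts are assumptions chosen for illustration; the summary only says that current volume is compared with the merchant's history.

```python
from statistics import mean, pstdev
from typing import Sequence


def merchant_spike(history: Sequence[int], today: int,
                   z_threshold: float = 3.0, min_history: int = 7) -> bool:
    """Flag `today` if it is far above the merchant's own historical counts.

    `history` holds past daily counts of distinct cards seen at one merchant.
    """
    if len(history) < min_history:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return today > mu
    return (today - mu) / sigma > z_threshold


# Hypothetical daily distinct-card counts for one merchant.
past_week = [20, 22, 19, 21, 20, 23, 18]

print(merchant_spike(past_week, today=60))
print(merchant_spike(past_week, today=22))
```

Because the baseline is the merchant's own history, a busy store and a quiet one are each judged against what is normal for them. As HN commenters note for this pattern, a legitimate promotion would also look like a spike, so a flag here is a prompt for review rather than a verdict.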

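The fifth is "off-hours transactions": build a profile of when each cardholder normally transacts, then flag transactions in hours when that cardholder is usually inactive as possible misuse.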
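A per-cardholder hour profile can be sketched as a simple histogram check. The 2% share cutoff, the minimum of 30 past transactions, and the sample history are assumptions for illustration, not values from the article.

```python
from collections import Counter
from typing import Sequence


def is_unusual_hour(past_hours: Sequence[int], new_hour: int,
                    min_share: float = 0.02, min_history: int = 30) -> bool:
    """Flag a transaction hour (0-23) that is rare in the cardholder's history."""
    if len(past_hours) < min_history:
        return False  # too little history to judge
    counts = Counter(past_hours)
    return counts[new_hour] / len(past_hours) < min_share


# Hypothetical cardholder who has only ever transacted between 09:00 and 17:59.
daytime_history = list(range(9, 18)) * 5   # 45 past transactions

print(is_unusual_hour(daytime_history, new_hour=3))
print(is_unusual_hour(daytime_history, new_hour=12))
```

Profiling each cardholder separately is what keeps a night-shift worker's 3 a.m. purchases from being flagged, which addresses part of the false-positive concern raised in the HN discussion.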

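The article also mentions a sixth pattern, but its content was not fully shown. Overall, the article offers practical, concrete SQL examples that help data analysts build an effective fraud detection system.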

HN: 307 points | 124 comments | posted by redbell | 24 hours ago

https://news.ycombinator.com/item?id=48155212

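Whether a price is a round number depends on the merchant's pricing strategy, and pricing habits differ widely between countries and regions, so round amounts alone cannot mark a transaction as anomalous.
Judging whether a transaction time is abnormal is contested: some users make perfectly normal purchases at unusual hours, and false positives are costly.
Users traveling abroad or across time zones notify their bank in advance to avoid being flagged.
Certain patterns, such as filling up a gas tank twice in a row or refueling at distant locations within a short time, do get flagged by bank systems.
Some users habitually spend in round amounts, and gas stations or merchants in some regions require a preset amount.
U.S. prices usually exclude tax and are set in complicated ways, while round prices are more common in other countries.
The available metadata is rich enough that, in theory, analysis could identify anomalies accurately, but in practice it remains hard.
$1 test transactions were once used in some industries (such as hotels and car rental) to validate a card, but modern practice usually uses a larger pre-authorization.
Simple SQL patterns are easy to evade, and more sophisticated statistical analysis and AI techniques are better suited to the problem.
Fraudsters usually lack expertise, so simple anomaly detection still stops most fraud, though sophisticated criminals may get around it.
The "suspicious merchant" logic is disputed: watching only for surges in high-value transactions may misflag legitimate promotions and struggles to catch sustained small-scale fraud.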

9. The Zulip Foundation

https://blog.zulip.com/2026/05/15/announcing-zulip-foundation/

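This post announces a major transition for the Zulip project: founder Tim Abbott is stepping down from full-time leadership to join Anthropic, and is donating Kandra Labs, the company behind Zulip, to a newly created nonprofit, the Zulip Foundation. The goal is to give the project a more stable governance structure and secure its long-term independence and sustainability.

Zulip is a topic-based team chat tool, popular with companies, open-source projects, and research communities because it supports many threaded conversations in parallel. The recent Zulip 12.0 release includes nearly 5,500 commits from 160 contributors around the world.

The Zulip Foundation will become the project's formal steward, focused on building the best team chat experience for public-interest organizations and communities. Kandra Labs will be wholly owned by the foundation and will continue to provide hosting and support, keeping its commitments and transparency toward commercial customers. The foundation's initial board includes founder Tim Abbott, co-lead Greg Price, product lead Alya Abbott, and Josh Triplett of the Rust community. There is also an advisory board of mathematicians, academics, and community leaders.

During the leadership transition, Zulip's operations will remain stable, including the cloud service, mobile push notifications, self-hosting support, and the Google Summer of Code program. Kim Vandiver has been appointed interim president of Kandra Labs, responsible for a smooth transition and for leading a global leadership search.

The restructuring also lets Zulip publicly commit to values it has long held, such as protecting user data privacy and focusing on product quality, and it opens new funding channels, including grant applications and tax-deductible donations, avoiding the risk of compromising those values for outside investment.

Tim Abbott says he is leaving Zulip for Anthropic because of Anthropic's commitment to developing AI responsibly for the benefit of humanity. He and three other members of Zulip's leadership team will join Anthropic together, but he remains committed to Zulip and will support its future.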

HN: 295 points | 78 comments | posted by boramalper | 1 day ago

https://news.ycombinator.com/item?id=48152168

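As an open-source project, Zulip's move to a nonprofit foundation helps secure long-term stability and user privacy and avoids commercial pressure to sell data or insert ads.
The Zulip team invests in mentoring new developers and building contribution pipelines, which lays a foundation for sustainability.
Core team members leaving is normal; staying focused on one project for more than a decade is hard, and commenters wish the project continued engineering quality and community principles.
Some users look forward to an upcoming Rust rewrite, while others think the likelihood of a Rust rewrite is overstated.
Some suggest that any server rewrite should reduce supply-chain risk and simplify dependency management, and even hope the server could be written in Go.
Some users hope AI resources can be used to build a modern Android client.
Announcing on a Friday was a matter of timing and release logistics, not an attempt to bury the news.
For many users Zulip is an important open-source project and communication tool, especially suited to serious discussion and finding information.
Some users find Zulip's interface somewhat complex for beginners or small teams and less approachable than Discord.
The mobile app is sometimes slow to start, but new versions are improving long-lived connection support, which should help startup speed.
Zulip is regarded as a critical business communication tool with a large active user base and community support.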

10. SANA-WM, a 2.6B open-source world model for 1-minute 720p video

https://nvlabs.github.io/Sana/WM/

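This page introduces SANA-WM, an efficient 2.6-billion-parameter open-source world model that generates controllable 720p video up to one minute long from a single image and a camera trajectory, running on a single GPU. SANA-WM is comparable in visual quality to large industrial baselines such as LingBot-World and HY-WorldPlay while being significantly more efficient.

SANA-WM's core design has four parts. First, a hybrid linear attention mechanism combines frame-wise Gated DeltaNet with periodic softmax attention for memory-efficient modeling of long contexts. Second, dual-branch camera control ensures precise six-degree-of-freedom (6-DoF) trajectory following. Third, a two-stage generation pipeline uses a 1.7-billion-parameter long-video refiner to improve texture, motion, and quality late in the video. Fourth, a robust annotation pipeline extracts accurate metric-scale 6-DoF camera poses from public videos to produce high-quality, spatiotemporally consistent action labels.

For training, SANA-WM uses about 213,000 public video clips with metric-scale pose supervision and trains for 15 days on 64 H100 GPUs. At inference, a single H100 GPU can generate one minute of 720p video, and the distilled version, using NVFP4 quantization on a single RTX 5090 GPU, completes video denoising in 34 seconds.

The page also reports SANA-WM's results on a one-minute world model benchmark: its action-following accuracy beats existing open-source baselines, and at comparable visual quality its inference is 36 times faster. Through several examples that each start from a still first-person view, the page describes generated videos across natural and fantasy scenes, with rich environmental detail and smooth, natural dynamic elements, illustrating the model's strengths in long-video synthesis and precise action control.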

HN: 284 points | 116 comments | posted by mjgil | 11 hours ago

https://news.ycombinator.com/item?id=48159445

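World models for video games struggle to capture the "intentionality" of design, so generated content often lacks immersion and consistency.
Traditional game design and procedural generation each have strengths and weaknesses; as scale grows, procedural generation may become the higher-quality option, but both require careful polish.
Current AI generation tools offer limited control; finer control mechanisms will need to be built around user needs, and some professional tools already offer a high degree of control but have few users.
AI can help non-professionals create things that were previously out of reach, but making something special still takes effort and careful tuning.
AI-generated content may flood the world with work that is superficially plausible but shallow, forcing users to spend more time filtering for value.
AI tools can speed up creating content that satisfies human experience, but truly high-quality work still takes a long time and strict discipline.
Widespread AI-generated content may lower the average quality of work while increasing the number of excellent works.
AI tools are more like a "microwave" than a "circular saw": better for getting something done fast than for deep craft, and that convenience may affect both quality and creative attitude.
Microwave cooking is convenient but usually lower in quality; likewise, fast AI generation may sacrifice depth and refinement.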

Hacker News Highlighted Comments

Mullvad exit IPs are surprisingly identifying

https://news.ycombinator.com/item?id=48145679

I work at Mullvad. (co-CEO, co-founder)

Some aspects of the described behavior are as we intended and some are not. The cause is not exactly as described in the blog post. As for mitigation, we are already testing a patch of the unintended behavior on a subset of our infrastructure. If any of you try to reproduce the blog post’s findings you may get confusing results throughout the day.

We will also re-evaluate whether the intended behaviors are acceptable or not. Some of this is a trade-off between multiple aspects of privacy, and multiple aspects of user experience.

Please note that this is my current understanding, which may change. I was only made aware of this an hour ago, and most of that time was spent talking with Ops, considering what to do immediately, and writing this post.

Finally, for those of you who do security research: when you find a security or privacy issue, please consider notifying the maintainer/vendor before publishing your findings, even if you intend to publish right away.

kfreds

Amazon workers under pressure to up their AI usage…

https://news.ycombinator.com/item?id=48149107

Not just Amazon, too. It feels like all of big tech (and some smaller firms) have simultaneously gone insane. Imagine if your CEO woke up one day and told the company: “We need to encourage travel spending. Please book as many business trips as you can, and spend as much money as possible. Fly first class to our satellite offices! Take limos instead of Ubers! Eat at fine restaurants! Make sure you are constantly traveling. In fact, we are going to make Travel Spending part of your annual performance review: If you don’t spend enough on business travel, you’ll get a low rating!”

We are living in a totally bonkers time.

ryandrake

ABC News has taken all FiveThirtyEight articles of…

https://news.ycombinator.com/item?id=48153035

BTW, I approached ABC about buying back the former FiveThirtyEight IP*, and they said they wouldn’t sell at any price because I’d criticized their management of the brand.

–Nate Silver (538 founder)

ABC seem pretty petty here.

applfanboysbgon

Project Gutenberg – keeps getting better

https://news.ycombinator.com/item?id=48150432

Hi! I’m one of the programmers at Gutenberg. We’ve been improving the site a lot over the past few months (and more is coming!). If you haven’t visited the page recently, it’s worth checking out again: https://www.gutenberg.org/

JSeiko

I believe there are entire companies right now und…

https://news.ycombinator.com/item?id=48154116

I’m pretty sure he’s talking about companies and people outsourcing their decision making and thinking to AI and not really about using AI itself.

I don’t think using AI to write code is AI psychosis or bad at all, but if you just prompt the AI and believe what it tell you then you have AI psychosis. You see this a lot with financial people and VC on twitter. They literally post screenshots of ChatGPT as their thinking and reasoning about the topic instead of just doing a little bit of thinking themselves.

These things are dog shit when it comes to ideas, thinking, or providing advice because they are pattern matchers they are just going to give you the pattern they see. Most people see this if you just try to talk to it about an idea. They often just spit out the most generic dog shit.

This however it pretty useful for certain tasks were pattern matching is actually beneficial like writing code, but again you just can’t let it do the thinking and decision making.

impulser_

I believe there are entire companies right now und…

https://news.ycombinator.com/item?id=48154451

I thinking that it’s quite a different experience going all Jackson Pollock with AI in your own studio on your own terms, compared to the sorry state of affairs of having 100s of Pollocks throwing paint around wildly within a corp to meet a paint quota.

elktown

I believe there are entire companies right now und…

https://news.ycombinator.com/item?id=48157017

I feel like I’m in a different field compared to the rest of hacker news.

And below you repeat what all of Hacker News hypemen say about AI (“I have stopped writing code”, “it’s mature and the next step of engineering”)

Thank you for reinforcing the point of OP

EDIT: you’re the same person that a month ago said your company feels git is outdated now that you have agentic coding, and you don’t even need to write your own commit messages. This is next-level trolling, or a serious case of AI psychosis.

sph

I believe there are entire companies right now und…

https://news.ycombinator.com/item?id=48154252

Correct. I use AI a ton and I’m having more fun every day than I ever did before thanks to it (on average, highs are higher, lows are lower). Your characterization is all very accurate. Thank you.

Here’s some other topics I’ve written on it:

https://mitchellh.com/writing/my-ai-adoption-journey
https://mitchellh.com/writing/building-block-economy
https://mitchellh.com/writing/simdutf-no-libcxx (complex change thanks to AI, shows how I approach it rationally)

mitchellh

I believe there are entire companies right now und…

https://news.ycombinator.com/item?id=48154496

I think AI rescue consulting is going to be come a significant mode of high value consulting, similar to specialists who come in to try and deal with a security breach or do data recovery.

Purely AI written systems will scale to a point of complexity that no human can ever understand and the defect close rate will taper down and the token burn per defect rate scale up and eventually AI changes will cause on average more defects than they close and the whole system will be unstable. It will become a special kind of process to clean room out such a mess and rebuild it fresh (probably still with AI) after distilling out core design principles to avoid catastrophic breakdown.

Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first, place but it will take us 20 years to learn them, just like original software eng took a lot longer than expected to reach a stable set of design principles (and people still argue about them!).

zmmmmm

Moving away from Tailwind, and learning to structu…

https://news.ycombinator.com/item?id=48160046

I got curious about what writing more semantic HTML would feel like.

I’ve been teaching semantic HTML / accessible markup for a long time, and have worked extensively on sites and apps designed for screen readers.

The biggest problem with Tailwind is that it inverts the order that you should be thinking about HTML and CSS.

HTML is marking up the meaning of the document. You should start there. Then style with CSS. If you need extra elements for styling at that point, you might use a div or span (but you should ask yourself if there’s something better first).

Tailwind instead pushes the dev into a CSS-first approach. You think about the Tailwind classes you want, and then throw yet-another-div into the DOM just to have an element to hang your classes on.

Tailwind makes you worse as a web developer from a skill standpoint, since part of your skill should be to produce future-proof readable HTML and CSS that it usable by all users and generally matches the HTML and CSS specs. But devs haven’t cared about that for years, so it makes sense that Tailwind got so popular. It solved the “I’m building React components” approach to HTML and CSS authoring and codified div soup as a desirable outcome.

Tailwind clearly never cared about any of this. The opening example on Tailwind’s website is nothing but divs and spans. It’s proven to be a terrible education for new developers, and has contributed to the div soup that LLMs will output unless nudged and begged to do otherwise.

TonyAlicea10

Bill to block publishers from killing online games…

https://news.ycombinator.com/item?id=48154801

open source server code if you are going to cease support

When I was a senior exec at a big public tech company, there was a product we decided to discontinue and we thought would be nice to just open source. Somehow I ended up in charge of managing that process and was shocked at how complex, time-consuming and expensive it was in a multi-billion dollar, publicly-traded corp vs some code my friends and I wrote.

Legal had to verify that there was no licensed library code used and that we had clear, valid copyright to everything there. The project had been written over several years, merged with a project we’d acquired with a startup, some key people weren’t around any more, the source control had transitioned across multiple platforms, etc. And even once we nailed all that down sufficiently, we didn’t get an “all clear” from legal, we just got a formal legal opinion that any liability was probably under $1M. And then we had to convince an SVP to endorse that assumption of $1M potential liability and make a business case for approval to the CEO.

For a public company, the default assumption for any online game would be “the server side code WILL be open sourced” (under threat of prosecution). That means legal would mandate “No commercially licensed libraries can be used, any open source libraries will have to be vetted to ensure the license is compatible and everything else will need to pass IP and compliance audit.” That will certainly have an impact on development time frames and economics.

mrandish

A 0-click exploit chain for the Pixel 10

https://news.ycombinator.com/item?id=48150716

I followed the link to the Pixel 9 bug/exploit and saw this:

“Over the past few years, several AI-powered features have been added to mobile phones that allow users to better search and understand their messages. One effect of this change is increased 0-click attack surface, as efficient analysis often requires message media to be decoded before the message is opened by the user”

Haven’t we learned our lesson on this? Don’t read and act on my sms messages without me asking you to!

krupan

We are retiring our bug bounty program

https://news.ycombinator.com/item?id=48148785

Sounds a like a tactical tornado, made me think of this paragraph:

“Almost every software development organization has at least one developer who takes tactical programming to the extreme: a tactical tornado. The tactical tornado is a prolific programmer who pumps out code far faster than others but works in a totally tactical fashion. When it comes to implementing a quick feature, nobody gets it done faster than the tactical tornado. In some organizations, management treats tactical tornadoes as heroes. However, tactical tornadoes leave behind a wake of destruction. They are rarely considered heroes by the engineers who must work with their code in the future. Typically, other engineers must clean up the messes left behind by the tactical tornado, which makes it appear that those engineers (who are the real heroes) are making slower progress than the tactical tornado.” - John Ousterhout, A Philosophy of Software Design

chapinb

Ask HN: How to be SOC2 Type 2 compliant as a solo-…

https://news.ycombinator.com/item?id=48150204

Don’t. You are exactly the wrong kind of firm to be pursuing SOC2.

SOC2 is like the corporate GPL of security. It’s an infectious secret handshake company security teams swap in lieu of filling out security questionnaires. Nobody savvy takes it seriously.

There will come a time where your business will grow to the point where it makes sense to pay for the secret handshake. The overwhelming most likely scenario in which that happens is a purchase order made contingent on your SOC2 Type I attestation, where the revenue from that purchase order more than pays for the attestation.

Do not ever do a SOC2 speculatively, in the hopes that it will improve your sales prospects. Plenty of successful firms don’t have SOC2s. If you’re losing sales where SOC2 is a factor, you didn’t have those sales to begin with.

tptacek

UK government replaces Palantir software with inte…

https://news.ycombinator.com/item?id=48146859

I say this as somebody who has worked vendor side in UK public sector for a number of years.

It’s policy. It’s official Whitehall policy.

As a department you can’t hire programmers at £100k/year, because that pushes them way, way higher than civil service bands allow. But you can pay a “Systems Integrator” - a consultancy like Cap Gemini, Deloitte, Fujitsu - £600/day for the same programmer in the same seat. So, £100k/year = bad, £120k/year via an external consultancy = good.

Then we get into actually building and owning tech. Look at the history of GDS - they were empowered to pay half decent salaries and build and own things, but then had budgets slashed and programs cut. Why? Because we can “just buy it”. Yes, you won’t own the IP, it’ll cost 4x as much, it’ll take 3x-5x longer, but at least you won’t have “inefficient civil service bloat” to have to manage.

This all started in the 1980s, and there are signs of it swinging back. I was at one department last year where they were telling me they’re thinking about hiring actual engineers and embedding some devops stuff internally - absolutely jaw-droopingly revolutionary. Genuinely.

PaulRobinson

A few words on DS4

https://news.ycombinator.com/item?id=48143570

DwarfStar4 is a small LLM inference runtime that can run DeepSeek 4. The blog post implies that it currently requires 96GB of VRAM.

For others who are lacking context :-)

gcr

I believe there are entire companies right now und…

https://news.ycombinator.com/item?id=48161071

I think there’s a reasonable argument that our entire society right now is under AI psychosis:

The stock market keeps going up in the face of the indefinite closure of Hormuz. We’re investing in datacenters at a scale that only makes sense if AI capabilities continue to advance to the point where they surpass most humans at most white collar tasks, if not reach superintelligence.

And what are the possible outcomes?

Bust. We’ve come away with a useful tool but the hundreds of billions of capital expenditure were thrown away on a pipe dream.
Success! We’re the dog that’s caught the car. Then what? Currently the political debate is, to caricature only slightly, between “oh no the datacenters will use more water than golf courses” and “lol what are you going to do, regulate matrix multiplication?”. How the hell are we going to cope with introducing a new intelligent species?

Either way, it sure seems like we’re collectively operating more in the interests of the future AI than in the interests of humanity. What is this, if not a sort of psychosis?

kalkin

Bill to block publishers from killing online games… #

https://news.ycombinator.com/item?id=48153843

It seems like the fair solution to this problem is to open source server code if you are going to cease support for an online game. That way the community has the opportunity to run their own servers if they want to.

I also really support giving 60 days' notice if an online game is going to shut down. Places I have worked have had policies like that for games they are sunsetting, and I think the best game publishers think a lot about how to do that operation. It’s not simple, because if people think a game is going away their behavior changes. And nothing sucks like buying online content for a game right before it shuts down. No matter what you do, people will tell you they didn’t know the game was shutting down. And if you give away content that you previously sold, that also sometimes angers the community.

The problem is when companies know a game isn’t working they tend to want to shut it down right away because the money they spend keeping it up is never coming back. And maybe the company is going to die too. So I do support a law for a 60 day notice.

georgeecollins

UK government replaces Palantir software with inte… #

https://news.ycombinator.com/item?id=48145971

Having spent time working in UK healthcare tech, I never understood why everyone was lining up to throw buckets of money at Palantir. Quite apart from being obviously evil and so on, none of their solutions were actually very good.

Unfortunately, it’s hard to escape the feeling that friends in high places, some lobbying and some er… reciprocal back scratching might have been instrumental.

See also senior staff at NHS England (or Digital? can’t remember) handing massive NHS compute contracts to AWS, and then leaving the civil service to become… an AWS employee.

dust-jacket

U.S. DOJ demands Apple and Google unmask over 100k… #

https://news.ycombinator.com/item?id=48152071

I watched a pickup roll coal in the middle of freaking East Bay, literally within sight of downtown San Francisco, on a bicyclist. I reported their license to the California Air Resources Board, and not long after that I saw it up on jacks in a neighborhood auto shop. That made my day. Asshole.

kstrauser

Bitwarden scrubs ‘Always free’ and ‘Inclusion’ val… #

https://news.ycombinator.com/item?id=48148318

Actually, the part of the article that made me prick my ears up was this paragraph:

In February, longtime CEO Michael Crandell moved to an advisory role, according to LinkedIn, with no announcement from the company. His replacement, Michael Sullivan, former CEO of both Acquia and Insightsoftware, touts his experience with “all facets of mergers and acquisitions” on his own LinkedIn page, including experience working with leading private equity firms.

In combination with downplaying the free plan and removing any hint of now politically unfashionable DEI-like language, what this screams to me is: Bitwarden is being prepped for a sale.

chipotle_coyote

I believe there are entire companies right now und… #

https://news.ycombinator.com/item?id=48154583

Purely AI written systems will scale to a point of complexity that no human can ever understand and the defect close rate will taper down and the token burn per defect rate scale up and eventually AI changes will cause on average more defects than they close and the whole system will be unstable.

Wow, it’s true, AI really is set to match human performance on large, complex software systems! ;)

leoc

We are retiring our bug bounty program #

https://news.ycombinator.com/item?id=48148671

Which goes to show that the bottleneck isn’t in writing the code. It is in reading and understanding the code.

We all had that one “productive” engineer on our teams who would write huge PRs with large swaths of refactoring, warranted or not, and that was long before anyone could imagine in their wildest dreams that neural networks would generate such huge amounts of code.

The net effect of such a “productive” engineer was always that, instead of increasing team velocity, the team slowed to a crawl. Either his PRs had to be reviewed in detail, eating up all the time, or, if you just gave a cursory LGTM, they blew up in production and forced everyone back to the drawing board. Meanwhile the project architecture had shifted so rapidly due to his “productivity” that no one had a clear picture of the codebase, such as what’s where, except that one “super smart talented productive loyal to the company goals” guy.

wg0

New arXiv policy: 1-year ban for hallucinated refe… #

https://news.ycombinator.com/item?id=48142541

It’s just one darn hallucinated citation for heaven’s sake, not fraud or something.

It is fraud.

It doesn’t account for the substance or quality of their work at all.

References are part of the work. If you’re making up the references, what else are you making up?

People make mistakes and a good fraction of them can learn from those mistakes. There’s no need to permanently cripple someone’s ability to progress their life or contribute to humanity just because an AI hallucinated a reference one time in their life.

A one year ban is not permanent. Having a negative consequence for making poor decisions seems like an inducement to learn from the mistake?

In an ideal world, one would be keeping notes on references used while doing the research that led to writing the paper. Choosing not to do that is one poor decision.

Having a positive outlook, if asking an AI to provide references that may have been missed, one should at least verify the references exist and are relevant. Choosing not to do that is also a poor decision, even if one did take notes on references used while researching.

toast0

U.S. DOJ demands Apple and Google unmask over 100k… #

https://news.ycombinator.com/item?id=48151935

Yeah, I’d HAPPILY report every single truck rolling coal around me if there was a place to report that information.

Hell, I’ve seen a truck roll coal around cop cars and, obviously, nothing happened.

This is just gross privacy intrusion masquerading as “protecting the environment”. We don’t need 100% compliance with the law, and simple prosecution/ticketing of obvious violations would go a long way towards solving the problem outright. Much like we didn’t need our cars emailing prosecutors every time someone drove without a seat belt on. Cops giving out tickets for not wearing a seatbelt was enough.

cogman10

The main thing about P2P meth is that there’s so m… #

https://news.ycombinator.com/item?id=48156234

The government should just regulate it, control purity and production and let people access small amounts for recreation/performance

Famously, the US spent about 15-20 years attempting this with opioids. They were widely available to people via a pseudo-medical process, or via secondhand dealing. Opioids were/are manufactured by regulated, publicly traded companies with inspectors who controlled purity and production. The result? A shattering drug addiction crisis that at its height killed more people annually than the entire Vietnam War.

(For people saying ’no, that was illegal heroin or fentanyl that did all that damage’- the Wiki page for the opioid crisis is quite clear that at least 50% of all deaths were due to perfectly legal, regulated opioids).

When you make drugs legal & easy to get, lots & lots of people do them- who develop life-shattering addictions and OD en masse. They also build tolerance and then move on to even harder stuff. AFAIK out of the 300ish countries on the globe, there is not 1 that has decriminalized hard drugs in the modern era. And no don’t say Portugal, contrary to widespread myth they forced people under threat of jail to attend drug rehab, and anyways they’ve recently curtailed even that.

I realize this is not going to get a lot of upvotes on HN, but yes making it difficult to do hard drugs is a reasonable public policy goal. (Which again, is why literally every country on the planet does it). There’s room to argue about the exact tactics, but the broad goal is perfectly legitimate

hash872

Waymo updates 3,800 robotaxis after they ‘drive in… #

https://news.ycombinator.com/item?id=48152102

That’s a tough problem - distinguishing wet pavement from deep water. Humans make that mistake frequently. Autonomous vehicles should probably be equipped with a water sensor. (We did that in our DARPA Grand Challenge vehicle back in 2005). Then they can enter water very cautiously and see if it’s too deep. This may make them too cautious about shallow puddles on roads, though.

Animats

“Too dangerous to release” or just too expensive? #

https://news.ycombinator.com/item?id=48148437

It’s pretty clear at this point that Mythos’ capability to discover and exploit zero-day vulnerabilities at scale is but an incremental improvement over existing models like the ones available to OpenAI’s Plus/Pro subscribers.

Anthropic tries to create marketing hype around Mythos using two psychological tricks.

Put large numbers in the headlines.

“Mythos discovered 271 vulnerabilities in Firefox” makes the model seem extremely capable to the uninitiated.

But it’s actually meaningless as a measure of capability improvement.

Anthropic gave away $100mil specifically as Mythos credits to these projects and companies (that’s $2.5mil per project). Spending the same exorbitant amount of compute analyzing the same codebases with an older model like GPT 5.x Pro would have turned up 260 of these vulnerabilities, or could even have turned up more than 271.

No need to speculate, since this is exactly what we saw in the few code bases where we have such comparisons (like in the curl codebase). Supposedly weaker models, working with a much lower budget, turned up dozens of vulnerabilities. Mythos turned up only one, which ended up as a low severity CVE.

Do the whole “too dangerous to release” shtick. This is one of Dario Amodei’s favorite moves. When he was vice president of research at OpenAI, he declared GPT-3 (which wasn’t able to produce coherent text beyond 3-4 sentences at the time) too dangerous [1] as well.

Long story short, it’s the ChatGPT 4.5 situation again: a company trained a model that’s too slow and expensive, but not much more capable than what came before. It therefore requires these marketing stunts.

[1] https://www.itpro.com/technology/artificial-intelligence-ai/361603/openai-tool-previously-thought-too-dangerous-for-the

saithound

Show HN: Find the best local LLM for your hardware… #

https://news.ycombinator.com/item?id=48148561

The results of this tool are not good. It’s recommending outdated models like Qwen2.5 series and missing good new models.
This could have been a single web page that runs in your browser and lets you enter hardware specs, like all of the other tools like this. It is not a good idea to install and run unknown projects like this on your computer in this age.
The project is very obviously vibecoded, down to the README
Every comment from this account appears to be AI generated too.

I would recommend not installing and running this on your computer. There is no advantage over other tools and everything about the account and project looks like low effort AI generated content.

Aurornis

U.S. DOJ demands Apple and Google unmask over 100k… #

https://news.ycombinator.com/item?id=48151642

The government says it needs this information to identify and interview witnesses who can testify about how the tools were actually used.

Why start this whole thing, if you don’t already have this information and have people willing to help you as witnesses?

Sounds to me like they’re saying they don’t have this already, but why is this investigation happening in the first place then? Rather than finding every user of the tool, find the users who use the tool in the way you don’t approve of, then request the information for those?

Really bananas approach to go for “Every single user of the app” and “Everyone who bought a dongle” when it has very real and legal use cases.

embedding-shape
