13 KiB
linkTitle, title, breadcrumbs, next, description, cascade
| linkTitle | title | breadcrumbs | next | description | cascade | ||
|---|---|---|---|---|---|---|---|
| AI Daily | AI Daily-AI资讯日报 | false | /en/2025-12/2025-12-17 | Your daily source for curated AI news, practical tools, and actionable tutorials to master Artificial Intelligence; |
|
AI News Daily 2025/12/18
AI News | Daily Briefing | Web Data Aggregation | Frontier Science Exploration | Industry Voice | Open Source Innovation | AI & Human Future | Visit Web Version ↗️ | Join Group Chat 💬
Today's Rundown
Tencent's Hunyuan World Model 1.5 Launches, Supports Interactive World Generation from Text/Images
ByteDance's Seedance Achieves 100% Audiovisual Sync, Live on Jiemeng & Doubao
OpenAI Releases FrontierScience Benchmark: GPT-5.2 Scores 77% on Olympiad
Yao Shunyu Appointed Tencent Chief AI Scientist, Reports to Martin Lau
Nvidia Acquires Slurm Developer SchedMD, Strengthening Computing Power Scheduling Moat
Product & Feature Updates
-
Tencent's Hunyuan World Model 1.5 is officially live! This bad boy (the first real-time interactive experience platform in China) is now open for experience (AI News). You can literally input text or images and BAM 💥, an interactive world pops up, ready for you to explore freely with your keyboard, mouse, or even a gamepad. What's even cooler? The model's training system is open-sourced for the first time, covering everything from data to inference and deployment.
-
Kling 2.6's voice control feature has officially dropped! Kuaishou AI has rolled out Voice Control, letting you create more captivating personalized content (AI News) using your own unique voice. Plus, they've kicked off a creative competition 🌟 with a top prize of $1000 cash. Just submit your work for a chance to get featured on the homepage.

-
ByteDance's Seedance 1.5 Pro is here! This next-gen audio-video model nails 🎬 100% audiovisual synchronization, with character lip-sync, intonation, and performance rhythm all perfectly aligned. It supports natural expression across multiple languages and dialects, and it can even pull off complex camera movements (AI News) like the Hitchcock zoom. You can find it live on the Jiemeng AI and Doubao platforms.
-
Meta has launched its SAM Audio model. Following image segmentation, Meta is now extending its "🔊 segment everything" philosophy into the audio realm. It supports three prompting methods—text, visual, and time span—allowing it to precisely separate sounds, just like masking (AI News) in images. It's already available for trial on the Segment Anything Playground.

-
Xiaomi's MiMo large model is now open to developers! Xiaomi has announced the opening of its 🤖 MiMo series large models and the CarIoT hardware ecosystem. Their AIoT platform now connects over 1.04 billion devices, with a developer community reaching 1.2 Million (AI News). The MiMo-V2-Flash is already open-sourced and has landed in the TOP2 globally among open-source models in Agent evaluations.

-
Meta has rolled out AI hearing-enhancing glasses. These snazzy new glasses feature an open-ear speaker design that can amplify the voices of people you're chatting with 👓. They're perfect for noisy environments (AI News) like coffee shops or bustling streets, making everyday conversations way easier.
Frontier Research
-
OpenAI has unveiled its FrontierScience benchmark. This benchmark is specifically built to evaluate expert-level scientific capabilities, packed with hundreds of original questions across physics, chemistry, and biology. GPT-5.2 scored 77% in the Olympiad track and 🔬 25% in the research track, outperforming other cutting-edge models in both. Gemini 3 Pro showed similar performance (AI News) to GPT-5.2 in the Olympiad track.
-
The FreeKV framework is boosting LLM inference efficiency! This framework achieves algorithm-system collaborative optimization to tackle long-context KV caching issues. By using speculative retrieval and dual-buffered streaming recall, it hits 🚀 near-lossless accuracy and can speed things up by up to 13 times (AI News) compared to SOTA methods.
-
Titans is giving AI a real memory! This paper, praised by Google's Jeff Dean, solves AI's "goldfish memory" problem. It uses three distinct mechanisms—short-term, long-term, and persistent memory—each doing its job. In super-long text comprehension tasks involving 2 million tokens, it achieves over 96% accuracy, totally crushing Mamba2's 5.4% (AI News).
Industry Outlook & Social Impact
-
Yao Shunyu has been appointed Tencent's Chief AI Scientist! It's official! Tencent is upgrading its large model R&D architecture, with post-95 star scholar Yao Shunyu taking on the role of Chief AI Scientist in the "CEO/President's Office," reporting directly to Martin Lau. He'll also be heading up the AI Infra Department and the Large Language Model Department, set to 📈 fully strengthen Tencent's large model R&D system (AI News).
-
Nvidia has acquired SchedMD, the developer of Slurm. This low-key move is being hailed as "widening their moat" 💪. Slurm, developed by SchedMD, is the resource scheduling tool used by over half of the world's TOP500 supercomputers, with giants like Meta, Mistral, and Thinking Machines relying on it. Even if you're rocking AMD chips, if you need computing power scheduling, you simply can't bypass Nvidia (AI News).
-
AI context management is sparking privacy debates. Would you trust uploading all your life notes to a third-party server? Community discussions show that while 🔥 feeding Obsidian notes to Claude can net you personalized advice, most folks are leaning towards controllable solutions (AI News) like local LLMs. Others are even warning that relying too much on AI summaries might erode true knowledge acquisition.
-
GitHub Actions is starting to charge platform fees! Starting in 2026, scheduling for private repositories and self-hosted runners will be billed at $0.002 per minute 💸. Yep, even if your computing power is on your own servers, you'll be paying a "tax." Smaller teams are feeling the pinch more, and the community is already weighing up alternative solutions like GitLab or Forgejo (AI News).
-
Can AI push formal verification into the mainstream? The hot topic here is that specifications themselves are tough to formalize, and requirements are always shifting. Optimists point out that big models like Opus and GPT-5.2 🤖 have significantly sped up proof engineering. But the pessimists reckon cultural and economic barriers are the true obstacles to popularization (AI News).
Top Open Source Projects
-
Moore Threads has open-sourced its LiteGS foundational library! This 3DGS reconstruction algorithm, which snagged a silver award 🥈 at SIGGRAPH Asia 2025, is now open source! It wraps up a 60-second task in just 34 seconds, achieving the same quality with only 10% of the original training time. With full-link optimization from GPU system to algorithm design, the code is open on GitHub (AI News). ⭐ It's already grabbing attention in the academic world.
-
Nvidia has released its Nemotron 3 open-source model. This MoE architecture supports a million-token context and comes in three sizes: Nano (30B), Super (100B), and Ultra (500B). The Nano version is already out, boasting a 🚀 4x throughput improvement over its predecessor, and it's being hailed as the most open and efficient model (AI News) in its category.
-
Xiaomi's MiMo-V2-Flash is out as open source! This MoE large language model, with 309B total parameters and 15B active ones, was developed in-house for extreme inference efficiency. It's got strong code and Agent capabilities 💡, zippy generation speeds, and its API is free for a limited time, letting you hook it up to tools like Claude Code and Cursor (AI News). ⭐ Developers are super excited about it!

-
Chatterbox, an open-source TTS system, is rocking it! Hailed as the most advanced open-source text-to-speech system, it's already bagged ⭐ 15,614 stars. Check out the project at resemble-ai/chatterbox (AI News).
-
Microsoft has open-sourced its TRELLIS.2 image-to-3D model. This 4B parameter model supports generating 3D models from images. An online experience address is already live, but community feedback is a bit mixed 🤷♀️. Some folks even think it's not as good as previous versions. The model is released on Hugging Face (AI News).

-
Meituan has open-sourced its LongCat virtual human model. Similar to ByteDance's OmniHuman and Kuaishou's Avatar, it supports audio-driven photo-to-video generation 🎤. This is especially handy for streamers and music video scenes. The project homepage and model are already released on Hugging Face (AI News).
Social Media Buzz
-
Prompt Caching technology gets a deep dive! What's being cached isn't just text, it's "thought states" 🧠. Essentially, it reuses the KV matrix, saving about 90% on token costs and slashing first-token latency by 85% for long texts. Real-world tests show Anthropic's manual mode hitting a 100% hit rate (AI News), while OpenAI's automatic mode only manages 50%.

-
Gemini 3 Flash is now available! It's significantly faster than the Pro version, with basically no change to the frontend aesthetics 😎. It's still leading other models in terms of beauty. ZenMux is hosting its debut and it's currently free, so click here to use it (AI News)!
-
Moat considerations in the Vibe Coding era. Tech isn't the core competency anymore 🤔. It's super easy to grab a wave of traffic, but building a moat takes more time and thought. Some see flaws, others see opportunities (AI News)—and these opportunities aren't for the nitpickers.
-
GPT Image 1.5 image capability test. It's just a pure drawing model, not a 🌍 world model like Banana Pro. The community's saying "Google is a generation ahead this time," and you can check out Baoyu's test for the weather card generation effects here (AI News).

-
The AI hardware creative, Stickerbox, is going viral! Voice input → AI automatically draws → instant sticker printing 🖨️ – helping kids bring stories from their minds to life! With a child-safe mode that has no screen interaction, this logic will soon be migrating to the 3D printing field (AI News).
AI News Daily Audio Edition
| 🎙️ Xiaoyuzhou | 📹 Douyin |
|---|---|
| Afterlife Pub | Self-Media Account |
![]() |
![]() |

