<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[AIGC Newsletter]]></title><description><![CDATA[AIGC Newsletter is the weekly deep-dive edition of AIGC News, covering the most important AI news, new models, agents, APIs, pricing shifts, and what matters most to builders and founders]]></description><link>https://aigc.news</link><image><url>https://substackcdn.com/image/fetch/$s_!DDks!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cf5487-863a-4479-8c5c-c5abb1e31139_250x250.png</url><title>AIGC Newsletter</title><link>https://aigc.news</link></image><generator>Substack</generator><lastBuildDate>Thu, 12 Mar 2026 22:05:45 GMT</lastBuildDate><atom:link href="https://aigc.news/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[pxiaoer]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[aigc@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[aigc@substack.com]]></itunes:email><itunes:name><![CDATA[pxiaoer]]></itunes:name></itunes:owner><itunes:author><![CDATA[pxiaoer]]></itunes:author><googleplay:owner><![CDATA[aigc@substack.com]]></googleplay:owner><googleplay:email><![CDATA[aigc@substack.com]]></googleplay:email><googleplay:author><![CDATA[pxiaoer]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Today on aigc.news: Don't just take my word for it—you need to try Gemini 3 Pro yourself.]]></title><description><![CDATA[Today marks the release of Gemini 3 Pro, the new state-of-the-art model.]]></description><link>https://aigc.news/p/today-on-aigcnews-dont-just-take</link><guid 
isPermaLink="false">https://aigc.news/p/today-on-aigcnews-dont-just-take</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Wed, 19 Nov 2025 15:19:05 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/62d79518-e6ce-40ac-b8d8-9eaff5fa5c1b_1030x624.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><p>Today marks the release of Gemini 3 Pro, the new state-of-the-art model. </p><p><strong>Do yourself a favor:</strong> take a moment to explore the links below and try out the tools to see what this model can really do.</p><p></p><ol><li><p><a href="https://blog.google/products/gemini/gemini-3/">Gemini 3: Introducing the latest Gemini AI model from Google</a></p></li><li><p><a href="https://blog.google/technology/developers/gemini-3-developers/">Start building with Gemini 3</a></p></li><li><p><a href="https://blog.google/products/gemini/gemini-3-gemini-app/">Gemini 3 brings upgraded smarts and new capabilities to the Gemini app</a></p></li><li><p><a href="http://antigravity.google/">Google Antigravity</a></p></li><li><p><a href="https://research.google/blog/generative-ui-a-rich-custom-visual-interactive-user-experience-for-any-prompt/">Generative UI: A rich, custom, visual interactive user experience for any prompt</a></p></li></ol><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Today on aigc.news: Google’s Gemini 3 Pro looks set to pull half a length ahead]]></title><description><![CDATA[With Gemini 3 Pro&#8217;s capabilities surfacing, the competitive landscape may be tilting sooner than 
expected.]]></description><link>https://aigc.news/p/today-on-aigcnews-googles-gemini</link><guid isPermaLink="false">https://aigc.news/p/today-on-aigcnews-googles-gemini</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Tue, 18 Nov 2025 15:45:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ko6G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h4>2025-11-18</h4><p>In today&#8217;s issue:</p><ul><li><p><strong>AIGC Papers</strong> &#8211; <strong>&#960;</strong>*0.6 brings real-world experience into home robots, &#8220;Back to Basics&#8221; rethinks diffusion as true denoising, and Depth Anything 3 offers a cleaner geometric backbone for future world-models.</p></li><li><p><strong>AIGC Projects</strong> &#8211; Depth Anything 3 lands in ComfyUI, AgentEvolver lets LLM agents train themselves, and a new Qwen upscaling LoRA targets real-world photography.</p></li><li><p><strong>AIGC News</strong> &#8211; Gemini 3 Pro&#8217;s powerful model card leaks, xAI rolls out Grok 4.1, and Replicate joins Cloudflare right as a major outage hits the platform.</p></li></ul><p></p><h3><strong>Today&#8217;s AIGC Papers</strong></h3><ul><li><p><strong>&#960;*0.6</strong> shows how a VLA can <em>truly learn from experience</em>, turning home robots into durable, all-day performers through demonstrations, corrections, and self-practice.</p></li><li><p><strong>Back to Basics</strong> pulls diffusion models back to <em>real denoising</em>, using a pure image-space Transformer to achieve high-fidelity generation with a simpler formulation.</p></li><li><p><strong>Depth Anything 3</strong> reconstructs consistent 3D geometry from <em>any</em> view using a single Transformer, giving future world-models a more stable and cleaner geometric backbone.</p><p></p></li></ul><ol><li><p>&#960;*0.6: a 
VLA that Learns from Experience<strong>( <a href="https://pi.website/blog/pistar06">blog</a> | <a href="https://www.pi.website/download/pistar06.pdf?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">paper</a> )</strong></p><p></p><p>Physical Intelligence presents &#960;*0.6, a VLA combined with advantage-conditioned RL, trained through demonstrations, corrections, and self-experience. The system doubles success rates and throughput on real household tasks (making coffee, folding clothes, packing objects) and supports long, continuous, real-world operation. The paper also includes the model card for &#960;0.6.</p><p></p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ARbE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ARbE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!ARbE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!ARbE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ARbE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ARbE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png" width="610" height="343.125" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:610,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!ARbE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!ARbE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!ARbE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ARbE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68c0c6fe-4682-46f5-9b54-889581351925_1920x1080.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><ol start="2"><li><p>Kaiming He just introduced JiT! 
Back to Basics: Let Denoising Generative Models Denoise <strong>( <a href="https://arxiv.org/abs/2511.13720?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">paper</a> | <a href="https://github.com/LTH14/JiT?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">code</a> )</strong></p><p></p><p>MIT&#8217;s Tianhong Li and Kaiming He propose predicting the <em>clean image</em> instead of noise, using a large-patch ViT (&#8220;Just-image Transformers&#8221;) in pixel space. The method needs no tokenizer or pretraining yet achieves competitive high-res ImageNet generation, offering a simpler and clearer theoretical view of diffusion.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hiGn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hiGn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hiGn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hiGn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!hiGn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hiGn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg" width="596" height="543.5025906735751" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:704,&quot;width&quot;:772,&quot;resizeWidth&quot;:596,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!hiGn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hiGn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hiGn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!hiGn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F743747f2-12a0-436d-b5bd-b1d08c8193b6_772x704.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ol start="3"><li><p>Depth Anything 3: Recovering the Visual Space from Any Views <strong>( <a href="https://depth-anything-3.github.io/?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">webpage</a> | <a 
href="https://arxiv.org/abs/2511.10647?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">paper</a> | <a href="https://github.com/ByteDance-Seed/depth-anything-3?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">code</a>  )</strong></p><p></p><p>ByteDance introduces <strong>Depth Anything 3</strong>, using a single Transformer and depth-ray representation to recover consistent geometry from single images, multi-view inputs, or videos. It surpasses VGGT in camera-pose and geometry accuracy, and outperforms DA2 in monocular depth, enabling high-fidelity 3DGS reconstruction.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UUbd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UUbd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png 424w, https://substackcdn.com/image/fetch/$s_!UUbd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png 848w, https://substackcdn.com/image/fetch/$s_!UUbd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UUbd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UUbd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png" width="642" height="253.5370879120879" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:575,&quot;width&quot;:1456,&quot;resizeWidth&quot;:642,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!UUbd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png 424w, https://substackcdn.com/image/fetch/$s_!UUbd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png 848w, https://substackcdn.com/image/fetch/$s_!UUbd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UUbd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3395d248-97cd-417d-aab7-517c75dd79b3_1920x758.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3><strong>Today&#8217;s AIGC Projects</strong></h3><ul><li><p><strong>ComfyUI-DepthAnythingV3</strong> turns Depth Anything 3 into drop-in ComfyUI nodes, making DA3 usable in everyday image and video workflows.</p></li><li><p><strong>AgentEvolver</strong> provides a 
self-evolving training loop where LLM agents generate their own tasks, rewards, and improvements.</p></li><li><p><strong>Qwen-Edit-2509-Upscale-LoRA</strong> gives Qwen-Image-Edit a practical, detail-preserving photography upscaler for real-world enhancement.</p></li></ul><ol><li><p>ComfyUI-DepthAnythingV3 <strong>( <a href="https://github.com/PozzettiAndrea/ComfyUI-DepthAnythingV3?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">link</a> )</strong></p><p>ComfyUI-DepthAnythingV3 wraps <strong>Depth Anything 3</strong> into ready-made ComfyUI nodes, letting users run DA3 on images and, with custom graphs, on multi-view / video inputs. It exposes DA3&#8217;s spatially consistent depth prediction in a visual workflow, so you can feed depth maps directly into ControlNet, 3DGS, or other geometry-aware pipelines.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-pui!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-pui!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png 424w, https://substackcdn.com/image/fetch/$s_!-pui!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png 848w, 
https://substackcdn.com/image/fetch/$s_!-pui!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png 1272w, https://substackcdn.com/image/fetch/$s_!-pui!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-pui!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png" width="568" height="232.1153846153846" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:595,&quot;width&quot;:1456,&quot;resizeWidth&quot;:568,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!-pui!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png 424w, https://substackcdn.com/image/fetch/$s_!-pui!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png 848w, 
https://substackcdn.com/image/fetch/$s_!-pui!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png 1272w, https://substackcdn.com/image/fetch/$s_!-pui!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdfb3cf3-ad54-4d81-915d-b99192abff78_1920x785.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><ol start="2"><li><p>AgentEvolver: Towards Efficient Self-Evolving Agent System ( <strong><a href="https://github.com/modelscope/AgentEvolver?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">link</a></strong> )</p><p>AgentEvolver is an end-to-end framework where LLM-based agents <em>teach themselves</em> via three loops: <strong>self-questioning</strong> (generate new tasks), <strong>self-navigating</strong> (reuse past trajectories with hybrid policies), and <strong>self-attributing</strong> (fine-grained credit assignment over states/actions). 
It cuts dataset engineering cost, improves exploration efficiency, and yields faster capability gains than traditional RL-style agent training.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sHld!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sHld!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png 424w, https://substackcdn.com/image/fetch/$s_!sHld!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png 848w, https://substackcdn.com/image/fetch/$s_!sHld!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png 1272w, https://substackcdn.com/image/fetch/$s_!sHld!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sHld!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png" width="568" height="339.7857142857143" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:871,&quot;width&quot;:1456,&quot;resizeWidth&quot;:568,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!sHld!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png 424w, https://substackcdn.com/image/fetch/$s_!sHld!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png 848w, https://substackcdn.com/image/fetch/$s_!sHld!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png 1272w, https://substackcdn.com/image/fetch/$s_!sHld!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F184360dd-6e9a-42fb-9dbb-dfa1ee887224_1920x1149.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ol start="3"><li><p>vafipas663/Qwen-Edit-2509-Upscale-LoRA<strong> ( <a href="https://huggingface.co/vafipas663/Qwen-Edit-2509-Upscale-LoRA?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">link</a> )</strong></p><p>Qwen-Edit-2509-Upscale-LoRA is a LoRA adapter for <strong>Qwen-Image-Edit-2509</strong> focused on realistic photography. 
Trained on UltraHR-100K and Unsplash-lite, it repairs extreme low resolution, oversharpening, strong JPEG artifacts, motion blur, pixelation, and heavy noise&#8212;often up to 16&#215;&#8212;while preserving composition and structure, making it a practical replacement for many &#8220;magic&#8221; commercial upscalers.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_uMX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_uMX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_uMX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_uMX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_uMX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_uMX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg" width="695" height="641" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:641,&quot;width&quot;:695,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!_uMX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_uMX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_uMX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_uMX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F865c84df-2ee9-4cf4-84e1-43a16888f93a_695x641.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Today&#8217;s AIGC News</strong></h3><ul><li><p><strong>Gemini 3 Pro model card leak</strong> &#8211; A leaked model card outlines Google&#8217;s next-gen sparse MoE multimodal model with 1M-token context, 64K outputs, RL training, detailed agent benchmarks, safety evals, and a January 2025 knowledge cutoff.</p></li><li><p><strong>Grok 4.1 announced</strong> &#8211; xAI unveils Grok 4.1, a new iteration of the Grok family with upgraded capabilities and overall performance.</p></li><li><p><strong>Replicate &#215; Cloudflare, amid outage</strong> &#8211; Replicate is joining Cloudflare just as Cloudflare suffers a major service outage impacting large parts of its platform.</p></li></ul><ol><li><p><strong>The model card for Gemini 3 Pro has reportedly leaked, revealing a model with extremely strong capabilities</strong></p><p></p><p>The leaked Gemini 3 Pro model card describes Google&#8217;s next-generation sparse MoE 
multimodal model with 1M-token context, 64K outputs, trained on large-scale web, code and media with RL. It details agentic performance, deployment channels, benchmarks, safety evaluations, frontier safety status, remaining risks, and a January 2025 knowledge cutoff.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ko6G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ko6G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ko6G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ko6G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ko6G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ko6G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg" width="630" height="521.5279361459521" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:726,&quot;width&quot;:877,&quot;resizeWidth&quot;:630,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!ko6G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ko6G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ko6G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ko6G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f8f2356-ad33-450d-a0e6-16dade2f8848_877x726.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol start="2"><li><p>xAI has announced Grok 4.1, a new iteration of its Grok model family with upgraded capabilities and performance. ( <strong><a href="https://x.ai/news/grok-4-1?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">link</a></strong> )</p><p></p></li><li><p>Replicate is joining Cloudflare, at the same time that Cloudflare has been experiencing a significant service outage affecting its platform. 
( <strong><a href="https://replicate.com/blog/replicate-cloudflare?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">link</a></strong> )</p></li></ol><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p><blockquote><p>Always fresh, always live</p><p><strong><a href="https://live.aigc.news/?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">Real-time AIGC tracker</a></strong></p><p>New models, papers, and projects as they drop &#8212; stay ahead of the AI curve.</p><p>For deeper insights and long-form analysis, subscribe to our weekly briefings at <strong><a href="https://newsletter.aigc.news/?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news">newsletter.aigc.news</a></strong>.</p><p>&#128640;<strong><a href="https://live.aigc.news/?utm_campaign=today-on-aigc-news-google-s-gemini-3-pro-looks-set-to-pull-half-a-length-ahead&amp;utm_medium=referral&amp;utm_source=aigc.news"> See today&#8217;s AIGC highlights</a></strong></p></blockquote><p></p><p>That&#8217;s it for today.</p><p>Keep building, keep thinking for yourself &#8212; we&#8217;ll be here tracking the next wave.</p><p>The <strong><a href="https://aigc.news/">aigc.news</a></strong> Team</p><p></p>]]></content:encoded></item><item><title><![CDATA[I’m Back: The Relaunch of aigc.news & Its Newsletter]]></title><description><![CDATA[Your daily 1-minute scan of the 9 must-see signals in 
AIGC.]]></description><link>https://aigc.news/p/im-back-the-relaunch-of-aigcnews</link><guid isPermaLink="false">https://aigc.news/p/im-back-the-relaunch-of-aigcnews</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 17 Nov 2025 15:05:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!wdvf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>It&#8217;s been a while &#8212; this is pxiaoer.</p><p>My last update here was several months ago. During this break, I&#8217;ve been focused on one simple question:</p><blockquote><p>In this era of exploding AIGC information, what can I offer that is genuinely useful?</p></blockquote><p>I&#8217;ve realized that people don&#8217;t need <em>more</em> information. What&#8217;s missing are two key things:</p><ol><li><p>Someone to <strong>filter the noise</strong> for you.</p></li><li><p>A way to quickly know <strong>&#8220;the few things worth seeing today.&#8221;</strong></p></li></ol><p>So starting today, I&#8217;m tackling this with a much simpler approach.</p><h3>1. 
<a href="https://aigc.news/">aigc.news</a>: 3 News + 3 Papers + 3 Projects, Every Day</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wdvf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wdvf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png 424w, https://substackcdn.com/image/fetch/$s_!wdvf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png 848w, https://substackcdn.com/image/fetch/$s_!wdvf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png 1272w, https://substackcdn.com/image/fetch/$s_!wdvf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wdvf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png" width="488" height="343.0703012912482" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:980,&quot;width&quot;:1394,&quot;resizeWidth&quot;:488,&quot;bytes&quot;:107773,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.aigc.news/i/179146740?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wdvf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png 424w, https://substackcdn.com/image/fetch/$s_!wdvf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png 848w, https://substackcdn.com/image/fetch/$s_!wdvf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png 1272w, https://substackcdn.com/image/fetch/$s_!wdvf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642916dd-28bc-4b1a-82f3-c9fbae85a60f_1394x980.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Over the past few months, I&#8217;ve been quietly building <a href="https://aigc.news/">aigc.news</a>.</p><p>It will now be updated daily and is completely free. 
Each day, you&#8217;ll find just three essential sections:</p><ul><li><p><strong>3 AIGC News:</strong> The developments that actually matter, with one-click access to the original sources.</p></li><li><p><strong>3 AIGC Papers:</strong> A single sentence explaining why each one is worth your attention.</p></li><li><p><strong>3 AIGC Projects:</strong> Who they&#8217;re for and what they do, at a glance.</p></li></ul><p>I&#8217;m also building two small extensions:</p><ul><li><p>A &#8220;3x3 daily signal card&#8221; that&#8217;s easy to save and share.</p></li><li><p>A subdomain: <a href="https://live.aigc.news/">live.aigc.news</a>, acting as a simple live radar for paper and project updates.</p></li></ul><p>Think of aigc.news as your <strong>&#8220;1-minute scan of the 9 must-see signals in the AIGC world today.&#8221;</strong></p><div><hr></div><h3>2. <a href="https://newsletter.aigc.news/">newsletter.aigc.news</a>: Weekly Digests + Occasional Deep Dives</h3><p>This newsletter will now focus on two things:</p><ul><li><p><strong>Weekly Digest:</strong> From the week&#8217;s daily signals, I&#8217;ll pull out the ~10 most worth revisiting and add my perspective: what&#8217;s just hype, and what might shape the coming months.</p></li><li><p><strong>Occasional In-Depth Essays:</strong> When a key question surfaces repeatedly, I&#8217;ll explore it in a long-form article you can reference later. For example:</p><ul><li><p>What is the true division of labor between open-source and closed-source?</p></li><li><p>How do multimodal models, embodied intelligence, and agents fit together?</p></li><li><p>What opportunities remain for indie developers and small teams in this wave?</p></li></ul></li></ul><p>In short: <strong>aigc.news</strong> gives you the <strong>daily signals</strong>. <strong>newsletter.aigc.news</strong> helps you see the <strong>broader direction</strong>.</p><p></p><h3>3. 
Subscriptions &amp; Pricing (For Now)</h3><p>It&#8217;s very simple:</p><ul><li><p><strong>aigc.news:</strong> Free daily updates.</p></li><li><p><strong>newsletter.aigc.news:</strong> The weekly digest and most long-form pieces will also be free.</p></li></ul><p>Your subscription status won&#8217;t change, and you won&#8217;t lose access to content.</p><p>My priority right now is to get this system running smoothly and prove its usefulness to you. Everything else can wait.</p><p>The AIGC space is noisy enough. I have no intention of adding to it. I just want to do one thing well:</p><p></p><p><strong>With a steady, sustainable rhythm, help you notice the few signals that truly matter.</strong></p><p></p><p>Welcome back. I hope that starting today, you&#8217;ll make aigc.news your quick morning radar.</p><p>&#8212; pxiaoer</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[AIGC Weekly | #92]]></title><description><![CDATA[Top Papers of the week]]></description><link>https://aigc.news/p/aigc-weekly-92</link><guid isPermaLink="false">https://aigc.news/p/aigc-weekly-92</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Tue, 01 Apr 2025 13:03:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z7D5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!Z7D5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z7D5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!Z7D5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!Z7D5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!Z7D5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z7D5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png" width="1200" height="600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131056,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/160323353?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Z7D5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!Z7D5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!Z7D5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!Z7D5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb272bb07-3de1-4a06-82c8-907ed8b628da_1200x600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Top Papers of the week</strong></h2><p>1.)  
<strong>Tracing the thoughts of a large language model ( <a href="https://www.anthropic.com/research/tracing-thoughts-language-model">blog</a> | <a href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html">paper1</a> | <a href="https://transformer-circuits.pub/2025/attribution-graphs/methods.html">paper2</a> )</strong></p><div id="youtube2-Bj9BD2D3DzA" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Bj9BD2D3DzA&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Bj9BD2D3DzA?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Anthropic's research on the Claude language model reveals key insights:</p><ul><li><p><strong>Multilingual Ability</strong>: Claude uses a shared conceptual space across languages, enabling knowledge transfer and suggesting a universal "language of thought."</p></li><li><p><strong>Poetry</strong>: It plans ahead for rhyming, showing foresight and flexibility.</p></li><li><p><strong>Mental Math</strong>: Claude combines approximation and precise calculation to solve problems, reflecting complex internal strategies.</p></li><li><p><strong>Reasoning</strong>: It performs multi-step reasoning by integrating facts, demonstrating adaptability.</p></li><li><p><strong>Hallucinations</strong>: Claude avoids guessing to reduce hallucinations but can still falter in some cases.</p></li><li><p><strong>Jailbreaks</strong>: Specific prompts can bypass safety mechanisms, exploiting coherence-safety conflicts.</p><p></p></li></ul><p>2.) 
<strong>Synthetic Video Enhances Physical Fidelity in Video Synthesis </strong>( <a href="https://kevinz8866.github.io/simulation/">webpage</a> |  <a href="https://arxiv.org/abs/2503.20822">paper</a> )</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JzRb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JzRb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png 424w, https://substackcdn.com/image/fetch/$s_!JzRb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png 848w, https://substackcdn.com/image/fetch/$s_!JzRb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png 1272w, https://substackcdn.com/image/fetch/$s_!JzRb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JzRb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png" width="641" height="256.66414835164835" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:583,&quot;width&quot;:1456,&quot;resizeWidth&quot;:641,&quot;bytes&quot;:594260,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/160323353?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JzRb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png 424w, https://substackcdn.com/image/fetch/$s_!JzRb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png 848w, https://substackcdn.com/image/fetch/$s_!JzRb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png 1272w, https://substackcdn.com/image/fetch/$s_!JzRb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c19575-02cd-43ef-987e-21d80cfdd568_2082x834.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>We explore enhancing video generation models using physics-consistent synthetic videos from computer graphics. These videos maintain 3D consistency and improve model fidelity by reducing artifacts. Our method curates synthetic data and transfers its realism, boosting physical consistency across tasks. Although the model still lacks a deep understanding of physics, this work shows that synthetic videos can enhance physical fidelity in video synthesis.</p><p></p><p>3.) 
<strong>Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback ( <a href="https://arxiv.org/abs/2503.22230">paper</a> )</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h2cN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h2cN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png 424w, https://substackcdn.com/image/fetch/$s_!h2cN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png 848w, https://substackcdn.com/image/fetch/$s_!h2cN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png 1272w, https://substackcdn.com/image/fetch/$s_!h2cN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h2cN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png" width="636" height="347.60360360360363" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1332,&quot;resizeWidth&quot;:636,&quot;bytes&quot;:123645,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/160323353?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h2cN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png 424w, https://substackcdn.com/image/fetch/$s_!h2cN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png 848w, https://substackcdn.com/image/fetch/$s_!h2cN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png 1272w, https://substackcdn.com/image/fetch/$s_!h2cN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d798283-092f-4532-88ac-aea5da8a4eeb_1332x728.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Reinforcement learning from human feedback (RLHF) is essential for aligning large language models with human preferences, but prompt-data construction has been neglected. This paper explores data-driven bottlenecks in RLHF performance scaling, focusing on reward hacking and reduced response diversity. We propose a hybrid reward system combining reasoning task verifiers (RTV) and a generative reward model (GenRM) to counter reward hacking, and introduce Pre-PPO, a prompt-selection method, to maintain response diversity and boost learning efficiency. Prioritizing math and coding tasks early in training also significantly improves performance. Experiments on two model sizes show that RTV is most resistant to reward hacking, followed by GenRM with ground truth and then GenRM with SFT Best-of-N responses. Our methods capture task-specific nuances quickly, enhancing overall RLHF performance. 
This work highlights the importance of careful data construction and provides practical solutions to overcome performance barriers in RLHF.</p><p></p><p>4.) <strong>Gemini Robotics: Bringing AI into the Physical World ( <a href="https://arxiv.org/abs/2503.20020">paper</a> )</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sGqU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sGqU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png 424w, https://substackcdn.com/image/fetch/$s_!sGqU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png 848w, https://substackcdn.com/image/fetch/$s_!sGqU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png 1272w, https://substackcdn.com/image/fetch/$s_!sGqU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sGqU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png" width="617" height="353.6697819314642" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:736,&quot;width&quot;:1284,&quot;resizeWidth&quot;:617,&quot;bytes&quot;:938296,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/160323353?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sGqU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png 424w, https://substackcdn.com/image/fetch/$s_!sGqU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png 848w, https://substackcdn.com/image/fetch/$s_!sGqU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png 1272w, https://substackcdn.com/image/fetch/$s_!sGqU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a5b30cc-4041-47df-aa67-52a38ad2d63c_1284x736.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Recent advancements in multimodal models have shown promise in digital domains, but translating these capabilities to physical robots remains challenging. This report introduces Gemini Robotics, a new family of AI models built on Gemini 2.0, specifically designed for robotics. Gemini Robotics is a Vision-Language-Action (VLA) model that can directly control robots, performing complex manipulation tasks with smooth, reactive movements. It is robust to variations in objects and environments and can follow diverse instructions. With fine-tuning, it can tackle long-horizon tasks, learn new tasks from few demonstrations, and adapt to novel robot embodiments. 
This is enabled by Gemini Robotics-ER, an extended model that enhances spatial and temporal reasoning for robotics tasks such as object detection, trajectory prediction, and 3D bounding box predictions. The Gemini Robotics family represents a significant step towards general-purpose robots, addressing safety considerations and unlocking AI's potential in the physical world.</p><p></p><p>5.) <strong>Qwen2.5-Omni Technical Report</strong>( <a href="https://arxiv.org/abs/2503.20215">paper</a> )</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HRk4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HRk4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png 424w, https://substackcdn.com/image/fetch/$s_!HRk4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png 848w, https://substackcdn.com/image/fetch/$s_!HRk4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png 1272w, https://substackcdn.com/image/fetch/$s_!HRk4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!HRk4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png" width="671" height="365.06513409961684" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:568,&quot;width&quot;:1044,&quot;resizeWidth&quot;:671,&quot;bytes&quot;:234593,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/160323353?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HRk4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png 424w, https://substackcdn.com/image/fetch/$s_!HRk4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png 848w, https://substackcdn.com/image/fetch/$s_!HRk4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png 1272w, https://substackcdn.com/image/fetch/$s_!HRk4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff388ff7f-1de9-47d6-bf9d-f0890fa9f4fb_1044x568.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>We present Qwen2.5-Omni, an end-to-end multimodal model handling text, images, audio, and video inputs while generating text and speech responses in a streaming manner. It uses block-wise processing for audio and visual inputs, synchronized via TMRoPE. The Thinker-Talker architecture separates text (Thinker) and speech (Talker) generation to prevent interference, with Talker using sliding-window DiT for low-latency audio decoding. 
Qwen2.5-Omni outperforms Qwen2-Audio, matches Qwen2.5-VL, and sets new benchmarks on Omni-Bench, excelling in speech instruction following, robustness, and naturalness.</p><p></p><p>6.) <strong>Scaling Laws of Synthetic Data for Language Models</strong> ( <a href="https://arxiv.org/abs/2503.19551">paper</a> )</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yZmD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yZmD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png 424w, https://substackcdn.com/image/fetch/$s_!yZmD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png 848w, https://substackcdn.com/image/fetch/$s_!yZmD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png 1272w, https://substackcdn.com/image/fetch/$s_!yZmD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yZmD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png" width="629" height="422.4452690166976" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:724,&quot;width&quot;:1078,&quot;resizeWidth&quot;:629,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!yZmD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png 424w, https://substackcdn.com/image/fetch/$s_!yZmD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png 848w, https://substackcdn.com/image/fetch/$s_!yZmD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png 1272w, https://substackcdn.com/image/fetch/$s_!yZmD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec93fa55-8cda-4af7-bc71-da01f7741217_1078x724.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Large language models (LLMs) rely on high-quality web data, but this resource is being depleted. Synthetic data offers a solution, though its scalability remains uncertain. We propose SynthLLM, a framework that creates high-quality synthetic datasets by recombining concepts from pre-training corpora. Key findings include: (1) data generated by SynthLLM follows scaling laws reliably, (2) performance gains plateau near 300B tokens, and (3) larger models approach optimal performance with fewer training tokens. SynthLLM outperforms existing methods, positioning synthetic data as a scalable alternative for advancing LLM performance.</p><p></p><p>7.) 
<strong>GAIA-2: Pushing the Boundaries of Video Generative Models for Safer Assisted and Automated Driving( <a href="https://wayve.ai/thinking/gaia-2/">blog</a> | <a href="https://arxiv.org/abs/2503.20523">paper</a> )</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D-6w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D-6w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png 424w, https://substackcdn.com/image/fetch/$s_!D-6w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png 848w, https://substackcdn.com/image/fetch/$s_!D-6w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png 1272w, https://substackcdn.com/image/fetch/$s_!D-6w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!D-6w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png" width="637" height="418.25" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:956,&quot;width&quot;:1456,&quot;resizeWidth&quot;:637,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!D-6w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png 424w, https://substackcdn.com/image/fetch/$s_!D-6w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png 848w, https://substackcdn.com/image/fetch/$s_!D-6w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png 1272w, https://substackcdn.com/image/fetch/$s_!D-6w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc57e0e32-4e1d-4ac1-a406-1a84edfa995d_1880x1234.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Generative models enable environment simulation but lack key features for autonomous driving, like multi-agent interactions and multi-camera consistency. We present GAIA-2, a latent diffusion model that generates controllable, high-resolution, spatiotemporally consistent videos across diverse driving environments. GAIA-2 integrates structured inputs and latent embeddings to simulate complex, scalable driving scenarios, advancing autonomous system development.</p><p></p><p>8.) 
<strong>ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model</strong>( <a href="https://humanaigc.github.io/chat-anyone/">webpage</a> | <a href="https://arxiv.org/abs/2503.21144">paper</a> )</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VSWA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VSWA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png 424w, https://substackcdn.com/image/fetch/$s_!VSWA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png 848w, https://substackcdn.com/image/fetch/$s_!VSWA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png 1272w, https://substackcdn.com/image/fetch/$s_!VSWA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VSWA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png" width="643" height="169.15522875816993" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:322,&quot;width&quot;:1224,&quot;resizeWidth&quot;:643,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;MY ALT TEXT&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="MY ALT TEXT" title="MY ALT TEXT" srcset="https://substackcdn.com/image/fetch/$s_!VSWA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png 424w, https://substackcdn.com/image/fetch/$s_!VSWA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png 848w, https://substackcdn.com/image/fetch/$s_!VSWA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png 1272w, https://substackcdn.com/image/fetch/$s_!VSWA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F312eabd6-982e-4aa2-90da-3e5bf18bcd42_1224x322.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Real-time interactive video-chat portraits are advancing but struggle with synchronized body motions and fine control over expressions. We propose a framework for stylized video generation, extending from talking heads to upper-body interaction. 
Using hierarchical motion diffusion and explicit hand control, our method generates expressive, synchronized videos at 512&#215;768 resolution, 30fps, enabling real-time, natural video chats with rich gestures and realism.</p><p></p><p>9.) <strong>What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models ( <a href="https://arxiv.org/abs/2503.24235">paper</a> )</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MHo7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MHo7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png 424w, https://substackcdn.com/image/fetch/$s_!MHo7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png 848w, https://substackcdn.com/image/fetch/$s_!MHo7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png 1272w, https://substackcdn.com/image/fetch/$s_!MHo7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MHo7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png" width="645" 
height="263.1190476190476" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:514,&quot;width&quot;:1260,&quot;resizeWidth&quot;:645,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MHo7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png 424w, https://substackcdn.com/image/fetch/$s_!MHo7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png 848w, https://substackcdn.com/image/fetch/$s_!MHo7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png 1272w, https://substackcdn.com/image/fetch/$s_!MHo7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe81f264e-4ddb-4b2f-b622-15303d315ac3_1260x514.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As pretraining-era scaling wanes, test-time scaling (TTS) has gained focus, enhancing LLMs' problem-solving in tasks like math, coding, and open-ended Q&amp;A. This survey introduces a unified framework across four TTS dimensions: what, how, where, and how well to scale. We review methods, applications, and challenges, offering deployment guidelines and future directions for further scaling and broader generalization.</p><p></p><p>10.) 
<strong>Large Language Model Agent: A Survey on Methodology, Applications and Challenges( <a href="https://arxiv.org/abs/2503.21460">paper</a> | <a href="https://github.com/luo-junyu/Awesome-Agent-Papers">repo</a> )</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!axzS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!axzS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png 424w, https://substackcdn.com/image/fetch/$s_!axzS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png 848w, https://substackcdn.com/image/fetch/$s_!axzS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png 1272w, https://substackcdn.com/image/fetch/$s_!axzS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!axzS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png" width="645" height="303.00824175824175" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:684,&quot;width&quot;:1456,&quot;resizeWidth&quot;:645,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;LLM Agent Research Overview&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="LLM Agent Research Overview" title="LLM Agent Research Overview" srcset="https://substackcdn.com/image/fetch/$s_!axzS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png 424w, https://substackcdn.com/image/fetch/$s_!axzS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png 848w, https://substackcdn.com/image/fetch/$s_!axzS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png 1272w, https://substackcdn.com/image/fetch/$s_!axzS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b1e3b94-d580-4871-b1f4-054bd187f406_1870x879.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The rise of intelligent agents, powered by large language models (LLMs), marks a step toward artificial general intelligence. This survey presents a taxonomy of LLM agents, exploring their architecture, collaboration, evolution, and emergent behaviors in complex environments. It unifies fragmented research, examines evaluation methods, tools, challenges, and applications, and highlights future research directions.</p><p></p><h2><strong>AIGC News of the week</strong></h2><p>1.) deepseek-ai/DeepSeek-V3-0324 ( <a href="https://huggingface.co/deepseek-ai/DeepSeek-V3-0324">huggingface</a> )</p><p>2.) ByteDance&#8217;s MegaTTS 3 ( <a href="https://github.com/bytedance/MegaTTS3">repo</a> )</p><p>3.) OpenAI Agents SDK supports MCP ( <a href="https://openai.github.io/openai-agents-python/mcp/">link</a> )</p><p>4.) 
Gemini 2.5: Google&#8217;s most intelligent AI model ( <a href="https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/">link</a> )</p><p>5.) VGGT: Visual Geometry Grounded Transformer ( <a href="https://github.com/facebookresearch/vggt">repo</a> )</p><p></p><p>More AI news: <a href="https://live.aigc.news/">live.aigc.news</a></p><p></p>]]></content:encoded></item><item><title><![CDATA[DeepSeek OpenSourceWeek Day 6: In-Depth Analysis of DeepSeek-V3/R1 Inference System Overview]]></title><description><![CDATA[DeepSeek's One More Thing]]></description><link>https://aigc.news/p/deepseek-opensourceweek-day-6-in</link><guid isPermaLink="false">https://aigc.news/p/deepseek-opensourceweek-day-6-in</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Sat, 01 Mar 2025 14:06:16 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/bd3e736c-5de2-4992-a509-7b602f727105_1400x788.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today at 12 PM, DeepSeek added a "One More Thing" to its Open Source Week, sharing details and cost calculations for the DeepSeek-V3 / R1 inference 
system&#8212;something you're surely interested in.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eZD2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eZD2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png 424w, https://substackcdn.com/image/fetch/$s_!eZD2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png 848w, https://substackcdn.com/image/fetch/$s_!eZD2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png 1272w, https://substackcdn.com/image/fetch/$s_!eZD2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eZD2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png" width="622" height="539.9666110183639" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1040,&quot;width&quot;:1198,&quot;resizeWidth&quot;:622,&quot;bytes&quot;:518293,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158169889?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eZD2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png 424w, https://substackcdn.com/image/fetch/$s_!eZD2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png 848w, https://substackcdn.com/image/fetch/$s_!eZD2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png 1272w, https://substackcdn.com/image/fetch/$s_!eZD2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd487ce70-2d54-4824-848a-b52b1852e531_1198x1040.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">https://x.com/deepseek_ai/status/1895688300574462431</figcaption></figure></div><p></p><p><strong>Inference System Design Principles</strong></p><p>DeepSeek first introduced the design principles of the inference system, with optimization goals focused on: greater throughput and lower latency.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!prhG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!prhG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png 424w, https://substackcdn.com/image/fetch/$s_!prhG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png 848w, https://substackcdn.com/image/fetch/$s_!prhG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png 1272w, https://substackcdn.com/image/fetch/$s_!prhG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!prhG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png" width="1280" height="572" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:572,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:289791,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158169889?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" 
alt="" srcset="https://substackcdn.com/image/fetch/$s_!prhG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png 424w, https://substackcdn.com/image/fetch/$s_!prhG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png 848w, https://substackcdn.com/image/fetch/$s_!prhG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png 1272w, https://substackcdn.com/image/fetch/$s_!prhG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fd2ef98-4b8b-4122-9e32-d6749d04cb5b_1280x572.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><em>Architecture Diagram</em></figcaption></figure></div><p>DeepSeek&#8217;s current solution is large-scale cross-node Expert Parallelism (EP):</p><ul><li><p>EP significantly increases the batch size, which improves the efficiency of GPU matrix multiplication and boosts throughput.</p></li><li><p>EP distributes experts across GPUs, so each GPU computes only a small number of experts (reducing memory access demands), which lowers latency.</p></li></ul><p><strong>Introduced Complexity:</strong></p><ul><li><p>EP introduces cross-node transmission. To optimize throughput, the computation workflow must be designed so that transmission overlaps with computation.</p></li><li><p>EP involves multiple nodes and therefore inherently requires Data Parallelism (DP), with load balancing needed across DP instances.</p></li></ul><p></p><p><strong>Large-Scale Cross-Node Expert Parallelism</strong></p><p>DeepSeek employs a multi-node, multi-GPU expert parallelism strategy:</p><ul><li><p><strong>Prefill:</strong> Routed Expert EP32; MLA and Shared Expert DP32. One deployment unit consists of 4 nodes with 32 redundant routed experts, and each GPU hosts 9 routed experts and 1 shared expert.</p></li><li><p><strong>Decode:</strong> Routed Expert EP144; MLA and Shared Expert DP144. 
One deployment unit consists of 18 nodes with 32 redundant routed experts, and each GPU hosts 2 routed experts and 1 shared expert.</p></li></ul><p>Multi-node, multi-GPU expert parallelism introduces significant communication overhead, so dual-batch overlapping is used to hide communication costs and improve overall throughput.</p><ul><li><p>In the <strong>prefill phase</strong>, the computation and communication of two batches are interleaved, so one batch&#8217;s computation hides the communication overhead of the other.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VulS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VulS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png 424w, https://substackcdn.com/image/fetch/$s_!VulS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png 848w, https://substackcdn.com/image/fetch/$s_!VulS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png 1272w, https://substackcdn.com/image/fetch/$s_!VulS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!VulS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png" width="1280" height="281" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb089098-71f6-419b-9166-12f9635badb7_1280x281.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:281,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:185425,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158169889?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VulS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png 424w, https://substackcdn.com/image/fetch/$s_!VulS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png 848w, https://substackcdn.com/image/fetch/$s_!VulS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png 1272w, https://substackcdn.com/image/fetch/$s_!VulS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb089098-71f6-419b-9166-12f9635badb7_1280x281.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p></p><ul><li><p>In the <strong>decode phase</strong>, execution times vary across stages, so the attention part is split into two stages, creating a 5-stage pipeline to overlap computation and communication.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IULZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IULZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png 424w, https://substackcdn.com/image/fetch/$s_!IULZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png 848w, https://substackcdn.com/image/fetch/$s_!IULZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png 1272w, https://substackcdn.com/image/fetch/$s_!IULZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IULZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png" width="1280" height="306" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:306,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:220292,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158169889?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IULZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png 424w, https://substackcdn.com/image/fetch/$s_!IULZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png 848w, https://substackcdn.com/image/fetch/$s_!IULZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png 1272w, https://substackcdn.com/image/fetch/$s_!IULZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ffbf181-8d5d-498b-b4ff-06c61d45af57_1280x306.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><p>Due to the use of large-scale parallelism (including expert parallelism and data parallelism), some GPUs may become overloaded, necessitating computational and communication load 
balancing.</p><ul><li><p><strong>Prefill Load Balancer</strong></p><ul><li><p>Core Issue: Variations in the number and length of requests across different Data Parallelism (DP) instances lead to differences in core-attention computation and dispatch transmission volumes.</p></li><li><p>Optimization Goal: Ensure roughly equal computation load across GPUs (core-attention load balancing) and similar token input volumes (dispatch transmission load balancing) to avoid prolonged processing times on some GPUs.</p></li></ul></li><li><p><strong>Decode Load Balancer</strong></p><ul><li><p>Core Issue: Variations in the number and length of requests across DP instances result in differences in core-attention computation (related to KVCache usage) and dispatch transmission volumes.</p></li><li><p>Optimization Goal: Ensure roughly equal KVCache usage across GPUs (core-attention load balancing) and similar request volumes (dispatch transmission load balancing).</p></li></ul></li><li><p><strong>Expert-Parallel Load Balancer</strong></p><ul><li><p>Core Issue: For a given MoE model, certain naturally high-load experts exist, leading to uneven expert computation loads across GPUs.</p></li><li><p>Optimization Goal: Balance expert computation across GPUs (i.e., minimize the maximum dispatch reception volume across all GPUs).</p></li></ul></li></ul><p></p><p><strong>Real-World Statistics of the Online Inference System</strong></p><p>All DeepSeek R1/V3 services use H800 GPUs. 
Matrix multiplications and dispatch transmissions use FP8, while core-attention computations and combine transmissions use BF16; both choices match the precision used in training, to ensure the best service performance.</p><p>Because load is high during the day and low at night, all nodes serve inference at peak hours, and some are released for research and training during low-load periods.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hrWs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hrWs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png 424w, https://substackcdn.com/image/fetch/$s_!hrWs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png 848w, https://substackcdn.com/image/fetch/$s_!hrWs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png 1272w, https://substackcdn.com/image/fetch/$s_!hrWs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!hrWs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png" width="1194" height="410" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:410,&quot;width&quot;:1194,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:72772,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158169889?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hrWs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png 424w, https://substackcdn.com/image/fetch/$s_!hrWs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png 848w, https://substackcdn.com/image/fetch/$s_!hrWs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png 1272w, https://substackcdn.com/image/fetch/$s_!hrWs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8efbab03-a91b-45b9-a950-947e08ddbc62_1194x410.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p></p><p>Over the 24-hour window from 2025/02/27 12:00 to 2025/02/28 12:00 (Beijing time), the DeepSeek V3 and R1 inference services used 278 nodes at peak and 226.75 nodes on average (each node has 8 H800 GPUs). Assuming a GPU rental cost of $2/hour, the total daily cost is 226.75 &#215; 8 &#215; $2 &#215; 24 = $87,072.</p><p>Within this 24-hour statistical period, DeepSeek V3 and R1 recorded:</p><ul><li><p>Total input tokens: 608 billion, of which 342 billion tokens (56.3%) hit the on-disk KVCache.</p></li><li><p>Total output tokens: 168 billion. 
The average output speed was 20&#8211;22 tokens per second (tps), and the average KVCache length per output token was 4,989 tokens.</p></li><li><p>Average throughput per H800 node:</p><ul><li><p>For prefill tasks: ~73.7k tokens/s input throughput (including cache hits).</p></li><li><p>For decode tasks: ~14.8k tokens/s output throughput.</p></li></ul></li></ul><p>These statistics cover all load from web, app, and API usage. If every token were billed at DeepSeek R1&#8217;s rates, the theoretical daily revenue would be $562,027, a cost-profit margin of <strong>545%</strong> (($562,027 &#8722; $87,072) / $87,072).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!E-Lz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!E-Lz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png 424w, https://substackcdn.com/image/fetch/$s_!E-Lz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png 848w, https://substackcdn.com/image/fetch/$s_!E-Lz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png 1272w, https://substackcdn.com/image/fetch/$s_!E-Lz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!E-Lz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png" width="1195" height="441" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:441,&quot;width&quot;:1195,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:147690,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158169889?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!E-Lz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png 424w, https://substackcdn.com/image/fetch/$s_!E-Lz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png 848w, https://substackcdn.com/image/fetch/$s_!E-Lz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png 1272w, https://substackcdn.com/image/fetch/$s_!E-Lz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce9d85a9-398e-421e-b836-5a676b3fd890_1195x441.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p></p><p>In reality, DeepSeek earns considerably less than this, because V3 is priced lower than R1, only part of the usage is paid, and discounts apply at night.</p><p></p><p><strong>What Can We Infer?</strong></p><p>Rumors previously suggested that DeepSeek&#8217;s official deployment was an inference cluster of 320 H800s. The figures above instead show a peak of 278 nodes, or 2,224 H800s. 
Officially, DeepSeek acknowledges owning at least 10,000 H800s, so inference uses only a small share of its GPUs.</p><ul><li><p><strong>Cost:</strong> An average of 226.75 nodes (1,814 GPUs) at $2/hour per GPU comes to $87,072 per day.</p></li><li><p><strong>Revenue:</strong> 608B input tokens and 168B output tokens, all billed at R1 rates, would bring in a theoretical $562,027 per day.</p></li><li><p><strong>Gross Daily Profit:</strong> $562,027 &#8722; $87,072 = $474,955, or ~3,457,672 RMB per day (at the 7.28 RMB/USD rate this conversion implies).</p></li></ul><p>This calculation overstates real income for the reasons above. Still, even if revenue were halved, from roughly 6x cost to 3x, the margin would remain very high. Yet many vendors in China that deploy DeepSeek have shut down their API services because of losses, which raises the question of where the problem lies.</p><p>In an interview, Liang Wenfeng said: &#8220;We just do things at our own pace, then calculate costs and set prices. Our principle is not to lose money.&#8221;</p><p>This suggests DeepSeek is currently profitable, with earnings likely reinvested into R&amp;D. We look forward to R2&#8217;s release soon.</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[DeepSeek OpenSourceWeek Day 5: In-Depth Analysis of 3FS and Smallpond]]></title><description><![CDATA[February 28th, the last day of February, also marks the final day of DeepSeek's Open Source Week.]]></description><link>https://aigc.news/p/deepseek-opensourceweek-day-5-in</link><guid isPermaLink="false">https://aigc.news/p/deepseek-opensourceweek-day-5-in</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Fri, 28 Feb 2025 16:01:58 GMT</pubDate><enclosure 
url="https://substack-post-media.s3.amazonaws.com/public/images/3b95fc66-8fe9-43f5-8cad-5b55601e786a_1400x788.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>February 28th, the last day of February, also marks the final day of DeepSeek's Open Source Week. On this day, DeepSeek open-sourced two projects: <strong><a href="https://github.com/deepseek-ai/3FS">3FS</a></strong> and <strong><a href="https://github.com/deepseek-ai/smallpond">Smallpond</a></strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pOYJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pOYJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png 424w, https://substackcdn.com/image/fetch/$s_!pOYJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png 848w, https://substackcdn.com/image/fetch/$s_!pOYJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png 1272w, https://substackcdn.com/image/fetch/$s_!pOYJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!pOYJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png" width="609" height="546.9816360601002" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1076,&quot;width&quot;:1198,&quot;resizeWidth&quot;:609,&quot;bytes&quot;:633277,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158109566?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pOYJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png 424w, https://substackcdn.com/image/fetch/$s_!pOYJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png 848w, https://substackcdn.com/image/fetch/$s_!pOYJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png 1272w, https://substackcdn.com/image/fetch/$s_!pOYJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7999f92c-e545-4953-990e-eafc4ce0158e_1198x1076.png 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>According to the official introduction, the <strong>Fire-Flyer File System (3FS)</strong> is a parallel file system designed to fully utilize the bandwidth of modern SSDs and RDMA networks.</p><ul><li><p>Achieves an aggregate read throughput of <strong>6.6 TiB/s</strong> in a 180-node cluster.</p></li><li><p>Delivers <strong>3.66 TiB/minute</strong> throughput in the GraySort benchmark on a 25-node cluster.</p></li><li><p>Provides peak throughput of <strong>over 40 GiB/s</strong> per client node in KVCache 
lookups.</p></li><li><p>Features a disaggregated architecture with strong consistency semantics.</p></li><li><p>Used for training data preprocessing, dataset loading, checkpoint saving/reloading, embedding vector search, and KVCache lookups in inference for V3/R1.</p></li></ul><p><strong>Smallpond</strong>, on the other hand, is a data processing framework built on top of DuckDB and 3FS.</p><p></p><p><strong>Fire-Flyer File System (3FS)</strong></p><p>3FS is part of the <strong>Fire-Flyer AI-HPC</strong> developed by DeepSeek. It is detailed in the paper <em><a href="https://arxiv.org/abs/2408.14158">Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning</a></em>.</p><p>Fire-Flyer AI-HPC consists of three components: the <strong><a href="https://github.com/HFAiLab/hai-platform">HAI Platform</a></strong> (open-sourced two years ago), <strong>3FS</strong> (open-sourced today), and <strong>HaiScale</strong> (yet to be open-sourced).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GmOe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GmOe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png 424w, https://substackcdn.com/image/fetch/$s_!GmOe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png 848w, 
https://substackcdn.com/image/fetch/$s_!GmOe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png 1272w, https://substackcdn.com/image/fetch/$s_!GmOe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GmOe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png" width="614" height="497.9025341130604" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1026,&quot;resizeWidth&quot;:614,&quot;bytes&quot;:552823,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158109566?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GmOe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png 424w, 
https://substackcdn.com/image/fetch/$s_!GmOe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png 848w, https://substackcdn.com/image/fetch/$s_!GmOe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png 1272w, https://substackcdn.com/image/fetch/$s_!GmOe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5cb3ea3-3eed-4bc0-8f29-7fecd55a8e66_1026x832.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>In summary, 3FS has several key features:</p><ol><li><p><strong>High-Performance Design</strong>: 3FS is built to exploit the high IOPS (input/output operations per second) and throughput of NVMe SSDs together with RDMA networks, so it can serve the large volumes of data requests that deep learning and large-scale computing generate.</p></li><li><p><strong>System Architecture</strong>: 3FS comprises four roles: cluster manager, metadata service, storage service, and client. The metadata and storage services send periodic heartbeats to the cluster manager, which lets it track which services are alive. Multiple cluster managers are deployed for high availability.</p></li><li><p><strong>Request Control Mechanism</strong>: 3FS uses a request-to-send control mechanism to mitigate network congestion. After receiving a read request, the storage service asks the client for permission before transferring data. This limits the number of concurrent senders and keeps performance stable under high load.</p></li><li><p><strong>Strong Consistency with Chain Replication</strong>: 3FS adopts Chain Replication with Apportioned Queries (CRAQ) to provide strong consistency. File contents are split into chunks and replicated along a chain of storage targets, so reads can draw on the throughput and IOPS of all SSDs.</p></li><li><p><strong>High Throughput</strong>: By optimizing batched write and read operations, 3FS reaches write speeds exceeding <strong>10 GiB/s per node</strong>, which speeds up checkpoint saving and loading and reduces checkpoint-related delays during training.</p></li><li><p><strong>3FS-KV System</strong>: 3FS also underpins <strong>3FS-KV</strong>, a shared-storage distributed data processing system built on 3FS. 
It supports key-value storage, message queues, and object storage models, further enhancing system flexibility and performance.</p></li></ol><p>3FS provides robust storage support for deep learning and large-scale computing, effectively meeting demands for high throughput and low latency.</p><p></p><p><strong>Description from the Paper:</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gPYH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gPYH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png 424w, https://substackcdn.com/image/fetch/$s_!gPYH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png 848w, https://substackcdn.com/image/fetch/$s_!gPYH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png 1272w, https://substackcdn.com/image/fetch/$s_!gPYH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gPYH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png" width="636" height="1142.122105263158" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1706,&quot;width&quot;:950,&quot;resizeWidth&quot;:636,&quot;bytes&quot;:801942,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158109566?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gPYH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png 424w, https://substackcdn.com/image/fetch/$s_!gPYH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png 848w, https://substackcdn.com/image/fetch/$s_!gPYH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png 1272w, https://substackcdn.com/image/fetch/$s_!gPYH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7cb98ce7-efa9-4ba1-a83d-b9ac651d4cf1_950x1706.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6BS5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6BS5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png 424w, 
https://substackcdn.com/image/fetch/$s_!6BS5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png 848w, https://substackcdn.com/image/fetch/$s_!6BS5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png 1272w, https://substackcdn.com/image/fetch/$s_!6BS5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6BS5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png" width="634" height="444.9122807017544" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:912,&quot;resizeWidth&quot;:634,&quot;bytes&quot;:339427,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158109566?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!6BS5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png 424w, https://substackcdn.com/image/fetch/$s_!6BS5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png 848w, https://substackcdn.com/image/fetch/$s_!6BS5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png 1272w, https://substackcdn.com/image/fetch/$s_!6BS5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc86428aa-23de-4c48-8ef1-d6d576552c97_912x640.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u0FS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u0FS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png 424w, https://substackcdn.com/image/fetch/$s_!u0FS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png 848w, https://substackcdn.com/image/fetch/$s_!u0FS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png 1272w, https://substackcdn.com/image/fetch/$s_!u0FS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u0FS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png" width="610" height="509.65367965367966" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:772,&quot;width&quot;:924,&quot;resizeWidth&quot;:610,&quot;bytes&quot;:404125,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158109566?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u0FS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png 424w, https://substackcdn.com/image/fetch/$s_!u0FS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png 848w, https://substackcdn.com/image/fetch/$s_!u0FS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png 1272w, https://substackcdn.com/image/fetch/$s_!u0FS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fca88bf57-8bd3-4b84-adb3-c067be797a67_924x772.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p></p><p><strong>Performance</strong></p><p><strong>Peak Throughput</strong></p><p>Read throughput was measured on a 3FS cluster of <strong>180 storage nodes</strong>, each equipped with <strong>2&#215;200Gbps InfiniBand NICs</strong> and <strong>16&#215;14TiB NVMe SSDs</strong>. More than <strong>500 client nodes</strong>, each with a <strong>1&#215;200Gbps InfiniBand NIC</strong>, were used for the read stress test. 
Under background traffic from training jobs, the aggregate read throughput reached approximately <strong>6.6 TiB/s</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bC3l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bC3l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png 424w, https://substackcdn.com/image/fetch/$s_!bC3l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png 848w, https://substackcdn.com/image/fetch/$s_!bC3l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png 1272w, https://substackcdn.com/image/fetch/$s_!bC3l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bC3l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png" width="609" height="199.09615384615384" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:476,&quot;width&quot;:1456,&quot;resizeWidth&quot;:609,&quot;bytes&quot;:659714,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158109566?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bC3l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png 424w, https://substackcdn.com/image/fetch/$s_!bC3l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png 848w, https://substackcdn.com/image/fetch/$s_!bC3l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png 1272w, https://substackcdn.com/image/fetch/$s_!bC3l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6186eeaf-4791-4a6a-a799-d24d85a1798f_2048x669.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><p><strong>Sorting Performance</strong></p><p>The test cluster consists of <strong>25 storage nodes</strong> (2 NUMA domains per node, 1 storage service per NUMA, 2&#215;400Gbps NICs per 
node) and <strong>50 compute nodes</strong> (2 NUMA domains, 192 physical cores, 2.2 TiB RAM, and 1&#215;200Gbps NIC per node). Sorting <strong>110.5 TiB of data</strong> across 8192 partitions took <strong>30 minutes and 14 seconds</strong>, achieving an average throughput of <strong>3.66 TiB/minute</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O9Zo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O9Zo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png 424w, https://substackcdn.com/image/fetch/$s_!O9Zo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png 848w, https://substackcdn.com/image/fetch/$s_!O9Zo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png 1272w, https://substackcdn.com/image/fetch/$s_!O9Zo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O9Zo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png" width="573" height="373.86675824175825" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:950,&quot;width&quot;:1456,&quot;resizeWidth&quot;:573,&quot;bytes&quot;:1011744,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158109566?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!O9Zo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png 424w, https://substackcdn.com/image/fetch/$s_!O9Zo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png 848w, https://substackcdn.com/image/fetch/$s_!O9Zo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png 1272w, https://substackcdn.com/image/fetch/$s_!O9Zo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bb42440-857e-4604-98b9-065b905c50b7_1616x1054.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p></p><p><strong>KVCache</strong></p><p>KVCache is designed to optimize LLM inference by caching the key and value vectors of previous tokens in the decoder layers, avoiding redundant computation. The figure above shows the read throughput for all KVCache clients, highlighting peak and average values, with a peak throughput of up to <strong>40 GiB/s</strong>. 
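</p><p>To make the caching idea concrete, here is a minimal in-memory sketch in Python. It is a toy single-head illustration, not how 3FS or DeepSeek implement it, and the names <strong>KVCache</strong>, <strong>attend</strong>, and <strong>decode_step</strong> are made up for this example. Each decoding step stores only the new token&#8217;s key and value, then attends over everything already cached instead of recomputing earlier tokens.</p><pre><code><code>import math


class KVCache:
    """Toy per-layer key/value cache for single-head attention (illustrative only)."""

    def __init__(self):
        self.keys = []    # one key vector per token seen so far
        self.values = []  # one value vector per token seen so far

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def attend(query, cache):
    """Scaled dot-product attention of one query over every cached token."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in cache.keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(cache.values[0])
    return [sum(w * v[i] for w, v in zip(weights, cache.values)) / total for i in range(dim)]


def decode_step(cache, key, value, query):
    """Add only the new token's key/value, then attend over the whole cache."""
    cache.append(key, value)
    return attend(query, cache)</code></code></pre><p>Earlier tokens are never re-encoded here; each step only reads their cached vectors back, so the work per step is dominated by reads from the cache.</p><p>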
The figure below shows the IOPS of delete operations during garbage collection (GC) over the same period.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bfpx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bfpx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png 424w, https://substackcdn.com/image/fetch/$s_!Bfpx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png 848w, https://substackcdn.com/image/fetch/$s_!Bfpx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png 1272w, https://substackcdn.com/image/fetch/$s_!Bfpx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bfpx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png" width="605" height="397.654532967033" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:957,&quot;width&quot;:1456,&quot;resizeWidth&quot;:605,&quot;bytes&quot;:1403643,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158109566?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bfpx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png 424w, https://substackcdn.com/image/fetch/$s_!Bfpx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png 848w, https://substackcdn.com/image/fetch/$s_!Bfpx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png 1272w, https://substackcdn.com/image/fetch/$s_!Bfpx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467b1f42-3b83-4211-b89c-ed7624f9d626_1640x1078.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p></p><p><strong>Why is a Specialized File System Like 3FS Needed?</strong></p><p>LLM workloads need highly concurrent, high-throughput, scalable distributed file systems that also provide strong consistency, intelligent routing, and cache management. 
Systems like 3FS, which combine a high-performance design tailored for RDMA and SSDs, flexible metadata, and asynchronous zero-copy I/O, are therefore highly valuable.</p><p></p><p></p><p><strong>Code Analysis</strong></p><p>One notable aspect of 3FS&#8217;s implementation is that DeepSeek wrote the <strong>chunk_engine</strong> in <strong>Rust</strong>.</p><p>The <strong>chunk_engine</strong> is a core low-level module of the 3FS storage service, responsible for managing, allocating, and reclaiming physical disk blocks. The upper layers read and write block data through this engine. It relies on <strong>cxx</strong> to generate the C++ bindings automatically, so C++ code can call the Rust code directly.</p><p>In recent years, Rust has gained popularity in the MLSys (Machine Learning Systems) field. For example, Hugging Face&#8217;s <strong><a href="https://github.com/huggingface/tokenizers">tokenizers</a></strong> library is also implemented in Rust. The DeepSeek team likely chose Rust for the chunk_engine because of its maintainability, memory safety, and excellent performance.</p><p>The DeepSeek team may also use the Rust async runtime <strong>Tokio</strong> in backend services, as I found several Rust open-source projects, including Tokio, in Quant AI&#8217;s open-source initiatives. I sincerely hope more teams adopt Rust for developing machine learning systems.</p><p></p><p><strong>Smallpond</strong></p><p><strong>Smallpond</strong> is a lightweight data processing framework built on top of <strong>DuckDB</strong> and <strong>3FS</strong>. 
It delivers high-performance data processing and scales to petabyte-scale datasets.</p><p>Installation and usage are straightforward. The API is minimal and comes in two styles: one builds the dataflow graph dynamically, the other constructs it statically.</p><p></p><p><strong>Installation:</strong></p><pre><code><code>pip install smallpond</code></code></pre><p><strong>Usage Example:</strong></p><pre><code><code># Download example data
# Run this in a shell before starting Python: wget https://duckdb.org/data/prices.parquet

import smallpond
# Initialize session
sp = smallpond.init()
# Load data
df = sp.read_parquet("prices.parquet")
# Process data
df = df.repartition(3, hash_by="ticker")
df = sp.partial_sql("SELECT ticker, min(price), max(price) FROM {0} GROUP BY ticker", df)
# Save results
df.write_parquet("output/")
# Show results
print(df.to_pandas())</code></code></pre><p>For performance, refer to the sorting performance of 3FS.</p><p></p><p><strong>Conclusion of DeepSeek Open Source Week</strong></p><p>DeepSeek Open Source Week concludes today. Thank you, DeepSeek, for sharing valuable resources for everyone to learn and use.</p><p>Many teams have already taken action and achieved tangible performance improvements. For instance, the vLLM team replaced <strong>TRITON_MLA</strong> with <strong>FLASHMLA</strong>, boosting throughput by <strong>2-16%</strong>, delivering real results.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!80wj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!80wj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png 424w, https://substackcdn.com/image/fetch/$s_!80wj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png 848w, https://substackcdn.com/image/fetch/$s_!80wj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png 1272w, https://substackcdn.com/image/fetch/$s_!80wj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!80wj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png" width="497" height="627.9211409395973" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1506,&quot;width&quot;:1192,&quot;resizeWidth&quot;:497,&quot;bytes&quot;:588194,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158109566?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!80wj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png 424w, https://substackcdn.com/image/fetch/$s_!80wj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png 848w, https://substackcdn.com/image/fetch/$s_!80wj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png 1272w, https://substackcdn.com/image/fetch/$s_!80wj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85fefe48-aa59-4158-b085-ec4aeb5d98d0_1192x1506.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">https://x.com/vllm_project/status/1894994674630435123</figcaption></figure></div><p></p><p>These projects open-sourced by DeepSeek will continue to influence us. 
Our journey of intense learning continues.</p><p></p><p><strong>More</strong></p><ul><li><p>day0: <a href="https://aigc.openbot.ai/p/deepseek-opensourceweek-is-coming">https://aigc.openbot.ai/p/deepseek-opensourceweek-is-coming</a></p></li><li><p>day1: <a href="https://aigc.openbot.ai/p/deepseek-open-source-week-day-1-in">https://aigc.openbot.ai/p/deepseek-open-source-week-day-1-in</a></p></li><li><p>day2: <a href="https://aigc.openbot.ai/p/day-2-of-deepseek-opensourceweek">https://aigc.openbot.ai/p/day-2-of-deepseek-opensourceweek</a></p></li><li><p>day3: <a href="https://aigc.openbot.ai/p/deepseek-opensourceweek-day-3-deepgemm">https://aigc.openbot.ai/p/deepseek-opensourceweek-day-3-deepgemm</a></p></li><li><p>day4: <a href="https://aigc.openbot.ai/p/deepseek-opensourceweek-day-4-in">https://aigc.openbot.ai/p/deepseek-opensourceweek-day-4-in</a></p></li></ul><p></p>]]></content:encoded></item><item><title><![CDATA[DeepSeek OpenSourceWeek Day 4: In-Depth Analysis of DualPipe & EPLB]]></title><description><![CDATA[Today marks the fourth day of DeepSeek Open Source Week, and DeepSeek has introduced three projects, all centered around optimizing parallel strategies for V3/R1 training and inference.]]></description><link>https://aigc.news/p/deepseek-opensourceweek-day-4-in</link><guid isPermaLink="false">https://aigc.news/p/deepseek-opensourceweek-day-4-in</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Thu, 27 Feb 2025 15:25:15 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1e6dbf42-d5ce-4021-bcb5-18e006c5fe56_1000x420.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today marks 
the fourth day of DeepSeek Open Source Week, and DeepSeek has introduced three projects, all centered around optimizing parallel strategies for V3/R1 training and inference.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BqTT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BqTT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png 424w, https://substackcdn.com/image/fetch/$s_!BqTT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png 848w, https://substackcdn.com/image/fetch/$s_!BqTT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png 1272w, https://substackcdn.com/image/fetch/$s_!BqTT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BqTT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png" width="569" height="659.5443886097153" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1384,&quot;width&quot;:1194,&quot;resizeWidth&quot;:569,&quot;bytes&quot;:590801,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158040173?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BqTT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png 424w, https://substackcdn.com/image/fetch/$s_!BqTT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png 848w, https://substackcdn.com/image/fetch/$s_!BqTT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png 1272w, https://substackcdn.com/image/fetch/$s_!BqTT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30e58d2b-3cd4-4adf-8c26-16e63e27aaed_1194x1384.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p><strong><a href="https://github.com/deepseek-ai/DualPipe">DualPipe</a></strong> is a bidirectional pipeline parallelism algorithm designed for computation-communication overlap in V3/R1 training. <strong><a href="https://github.com/deepseek-ai/eplb">EPLB</a></strong> is an expert-parallel load balancer for V3/R1.</p><p>The final project, <strong><a href="https://github.com/deepseek-ai/profile-data">profile-data</a></strong>, releases profiling data from DeepSeek&#8217;s training and inference infrastructure. 
So far, it covers training and the prefilling stage of inference; the decoding-stage data for inference has not been published yet.</p><p>Today, we&#8217;ll analyze the DualPipe and EPLB projects, both of which are engineering optimizations.</p><p></p><p><strong>DualPipe</strong></p><p>DualPipe was introduced in the DeepSeek V3 paper as a bidirectional pipeline parallelism algorithm that overlaps computation with communication to improve training efficiency for large-scale models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!deYq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!deYq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png 424w, https://substackcdn.com/image/fetch/$s_!deYq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png 848w, https://substackcdn.com/image/fetch/$s_!deYq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png 1272w, https://substackcdn.com/image/fetch/$s_!deYq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!deYq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png" width="1202" height="756" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:756,&quot;width&quot;:1202,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:664621,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158040173?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!deYq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png 424w, https://substackcdn.com/image/fetch/$s_!deYq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png 848w, https://substackcdn.com/image/fetch/$s_!deYq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png 1272w, https://substackcdn.com/image/fetch/$s_!deYq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fa3f89a-58c2-4eb0-a368-76d8d1d72f1e_1202x756.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p></p><p><strong>Key Features:</strong></p><ul><li><p><strong>Computation-Communication Overlap</strong><br>DualPipe&#8217;s design aims to maximize cluster computing performance by achieving full overlap of computation and communication during forward and backward passes, reducing idle wait times typical in traditional pipeline parallelism. 
This is especially critical for cross-node expert parallelism in MoE models.</p></li><li><p><strong>Bidirectional Scheduling</strong><br>DualPipe feeds micro-batches from both ends of the pipeline simultaneously, so devices that would otherwise sit idle stay busy. Its schedule is organized into 8 steps.</p></li><li><p><strong>Memory Optimization</strong><br>DualPipe places the shallowest layers (including the embedding layer) and the deepest layers (including the output layer) on the same pipeline-parallel rank (PP rank), so their parameters and gradients can be physically shared, which further improves memory efficiency.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X2pe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X2pe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png 424w, https://substackcdn.com/image/fetch/$s_!X2pe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png 848w, https://substackcdn.com/image/fetch/$s_!X2pe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png 1272w, 
https://substackcdn.com/image/fetch/$s_!X2pe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X2pe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png" width="1456" height="477" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:477,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:562902,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158040173?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!X2pe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png 424w, https://substackcdn.com/image/fetch/$s_!X2pe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png 848w, https://substackcdn.com/image/fetch/$s_!X2pe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png 
1272w, https://substackcdn.com/image/fetch/$s_!X2pe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d35c3e1-6bdb-41e6-8bf9-e972fad22239_1674x548.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p><strong>Pipeline Bubble and Memory Usage Comparison</strong> (Pipeline bubble refers to idle wait time)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!TqRF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TqRF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png 424w, https://substackcdn.com/image/fetch/$s_!TqRF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png 848w, https://substackcdn.com/image/fetch/$s_!TqRF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png 1272w, https://substackcdn.com/image/fetch/$s_!TqRF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TqRF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png" width="1456" height="897" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:897,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:734628,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/158040173?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TqRF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png 424w, https://substackcdn.com/image/fetch/$s_!TqRF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png 848w, https://substackcdn.com/image/fetch/$s_!TqRF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png 1272w, https://substackcdn.com/image/fetch/$s_!TqRF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49840cee-cac4-4fa3-9047-de3d64db9506_1682x1036.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>If you&#8217;re interested, the code is worth reading: it&#8217;s short and a good way to learn.</p><p></p><p><strong>EPLB</strong></p><p><strong>EPLB (Expert Parallelism Load Balancer)</strong> is primarily designed to optimize the distributed deployment of MoE models. It balances load across the experts in the MoE layers by replicating shared experts and fine-grained high-load experts across multiple GPUs in the cluster. 
This lets the GPUs handle "hot data" (tokens routed to shared experts) more efficiently.</p><p>EPLB isn&#8217;t described in detail in DeepSeek&#8217;s paper, and its code is remarkably concise at just 160 lines.</p><p><strong>Key Features:</strong></p><ul><li><p><strong>Load Balancing Optimization</strong><br>It replicates high-load experts (the "redundant experts" strategy) and assigns the replicas to GPUs heuristically so that workloads stay balanced across GPUs.</p></li><li><p><strong>Hierarchical Load Balancing</strong><br>EPLB works in three stages: packing expert groups onto nodes &#8594; replicating experts within each node &#8594; packing the replicas onto GPUs. It keeps experts from the same group on the same node where possible to minimize cross-node data transfers, then balances load at each stage. Combined with DeepSeek V3&#8217;s Group-Limited Expert Routing strategy, this cuts cross-node traffic and improves efficiency in distributed deployment.</p></li><li><p><strong>Dynamic Scheduling Strategy</strong><br>EPLB selects its load balancing policy from the deployment shape: the hierarchical policy when expert groups divide evenly across nodes (the typical prefilling setup), and the global policy otherwise (the typical decoding setup).</p></li></ul><div><hr></div><p><strong>Let&#8217;s Look at the Code:</strong></p><p><strong>Redundant Expert Strategy</strong></p><pre><code><code>def replicate_experts(weight: torch.Tensor, num_phy: int):
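    n, num_log = weight.shape  # setup lines are a reconstruction (the excerpt omitted them); see the EPLB repo
    phy2log = torch.arange(num_phy, device=weight.device).repeat(n, 1)  # physical slot to logical expert
    rank = torch.zeros_like(phy2log)  # replica index of each physical slot
    logcnt = torch.ones(n, num_log, dtype=torch.int64, device=weight.device)  # replicas per logical expert
    arangen = torch.arange(n, device=weight.device)
    for i in range(num_log, num_phy):  # each extra slot goes to the expert with the highest load per replica
        redundant_indices = (weight / logcnt).max(dim=-1).indices
        phy2log[:, i] = redundant_indices
        rank[:, i] = logcnt[arangen, redundant_indices]
        logcnt[arangen, redundant_indices] += 1
    return phy2log, rank, logcnt</code></code></pre><p></p><p><strong>Hierarchical Load Balancing</strong></p><pre><code><code>def rebalance_experts_hierarchical():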
    # Step 1: Pack expert groups onto nodes so that each node receives a similar token load
    tokens_per_group = weight.unflatten(-1, (num_groups, group_size)).sum(-1)
    group_pack_index, group_rank_in_pack = balanced_packing(tokens_per_group, num_nodes)

    # Step 2: Build redundant experts within each node
    # (this excerpt omits the replication call that produces mlogcnt and phy2mlog)
    tokens_per_mlog = weight.gather(-1, mlog2log).view(-1, num_logical_experts // num_nodes)

    # Step 3: Pack physical experts onto GPUs, splitting each expert's load across its replicas
    tokens_per_phy = (tokens_per_mlog / mlogcnt).gather(-1, phy2mlog)</code></code></pre><p><strong>Dynamic Scheduling Strategy</strong></p><pre><code><code>def rebalance_experts():
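    if num_groups % num_nodes == 0:
        # Expert groups divide evenly across nodes: use the hierarchical policy
        phy2log, phyrank, logcnt = rebalance_experts_hierarchical()
    else:
        # Otherwise use the global policy, which replicates experts regardless of group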
        phy2log, phyrank, logcnt = replicate_experts()</code></code></pre><p>Interested readers can explore the full code.</p><p></p><p>Tomorrow is the final day of DeepSeek Open Source Week&#8212;will they drop a heavyweight open-source project? Let&#8217;s wait and see!</p><p></p>]]></content:encoded></item><item><title><![CDATA[DeepSeek OpenSourceWeek Day 3: DeepGEMM In-Depth Analysis]]></title><description><![CDATA[Today marks the third day of DeepSeek's Open Source Week, with the release of DeepGEMM right on schedule at 9 AM.]]></description><link>https://aigc.news/p/deepseek-opensourceweek-day-3-deepgemm</link><guid isPermaLink="false">https://aigc.news/p/deepseek-opensourceweek-day-3-deepgemm</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Wed, 26 Feb 2025 16:06:08 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/7577d166-4b4d-49bd-9b40-be89b41dd8d7_800x450.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><p>Today marks the third day of DeepSeek's Open Source Week, with the release of DeepGEMM right on schedule at 9 AM.<br></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LbJr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!LbJr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png 424w, https://substackcdn.com/image/fetch/$s_!LbJr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png 848w, https://substackcdn.com/image/fetch/$s_!LbJr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png 1272w, https://substackcdn.com/image/fetch/$s_!LbJr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LbJr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png" width="620" height="441.37353433835847" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:850,&quot;width&quot;:1194,&quot;resizeWidth&quot;:620,&quot;bytes&quot;:452047,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157971007?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LbJr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png 424w, https://substackcdn.com/image/fetch/$s_!LbJr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png 848w, https://substackcdn.com/image/fetch/$s_!LbJr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png 1272w, https://substackcdn.com/image/fetch/$s_!LbJr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef10be7c-2bcb-4ad0-89bb-5eb4d02a0ab1_1194x850.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>As of this writing, the project has garnered 3.3k stars.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M6GC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M6GC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png 424w, https://substackcdn.com/image/fetch/$s_!M6GC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png 848w, https://substackcdn.com/image/fetch/$s_!M6GC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png 1272w, https://substackcdn.com/image/fetch/$s_!M6GC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!M6GC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png" width="674" height="305.52197802197804" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:660,&quot;width&quot;:1456,&quot;resizeWidth&quot;:674,&quot;bytes&quot;:288381,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157971007?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!M6GC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png 424w, https://substackcdn.com/image/fetch/$s_!M6GC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png 848w, https://substackcdn.com/image/fetch/$s_!M6GC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png 1272w, https://substackcdn.com/image/fetch/$s_!M6GC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fb62ae-5308-4fc6-a423-d85d0084a330_2488x1128.png 
1456w" sizes="100vw"></picture></div></a><figcaption class="image-caption">https://github.com/deepseek-ai/DeepGEMM</figcaption></figure></div><p></p><p>The official introduction describes DeepGEMM as a GEMM library with FP8 support that handles both dense and MoE (Mixture of Experts) GEMM operations and is designed for training and inference of DeepSeek&#8217;s V3/R1 models.</p><p></p><p><strong>A Brief Introduction to GEMM</strong></p><p>General Matrix Multiplication (GEMM) is one of the most fundamental and critical operations in deep learning and scientific computing. 
GEMM refers to the multiplication of two matrices, A and B, to produce a result matrix C, typically expressed as C = A &#215; B.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LVpM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LVpM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png 424w, https://substackcdn.com/image/fetch/$s_!LVpM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png 848w, https://substackcdn.com/image/fetch/$s_!LVpM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png 1272w, https://substackcdn.com/image/fetch/$s_!LVpM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LVpM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png" width="1252" height="500" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:1252,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:140311,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157971007?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LVpM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png 424w, https://substackcdn.com/image/fetch/$s_!LVpM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png 848w, https://substackcdn.com/image/fetch/$s_!LVpM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png 1272w, https://substackcdn.com/image/fetch/$s_!LVpM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21c9717a-b48d-4236-aabf-82e3f313fb2a_1252x500.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p></p><p>In deep learning, GEMM underpins core components such as fully connected layers, convolutional layers, and attention mechanisms. For instance, in Transformer architectures, both the self-attention and feed-forward layers rely heavily on matrix multiplication. As model sizes grow, GEMM operations dominate the compute time of training and inference, making their performance a key factor in the efficiency of deep learning systems.</p><p>Modern GPUs ship dedicated hardware for this workload: NVIDIA&#8217;s Tensor Cores, for example, are designed specifically to accelerate matrix multiplication. 
As models keep growing, so does the demand for high-performance GEMM implementations, especially for large language models (LLMs) and MoE architectures, where efficient GEMM is critical for real-time inference and cost-effective training.</p><p>DeepSeek mentions GEMM in the paper <em>DeepSeek LLM: Scaling Open-Source Language Models with Longtermism</em>, and that discussion ties into a separate paper of theirs, <em><a href="https://arxiv.org/abs/2408.14158">Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning</a></em>, which introduces the HAI-LLM training system. If you are interested, I recommend reading the <em>Fire-Flyer AI-HPC</em> paper.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VtAi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VtAi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png 424w, https://substackcdn.com/image/fetch/$s_!VtAi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png 848w, https://substackcdn.com/image/fetch/$s_!VtAi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png 1272w, 
https://substackcdn.com/image/fetch/$s_!VtAi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VtAi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png" width="1456" height="703" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:703,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:906763,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157971007?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VtAi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png 424w, https://substackcdn.com/image/fetch/$s_!VtAi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png 848w, https://substackcdn.com/image/fetch/$s_!VtAi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png 
1272w, https://substackcdn.com/image/fetch/$s_!VtAi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35687d2b-05f0-4f94-bd16-12438e61d75b_1876x906.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>Today&#8217;s open-sourced DeepGEMM supports FP8 and is tailored for training and inference of DeepSeek&#8217;s V3/R1 models. 
In the <a href="https://arxiv.org/abs/2412.19437">V3 paper</a>, DeepSeek details several optimizations for FP8 training.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C6cE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C6cE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png 424w, https://substackcdn.com/image/fetch/$s_!C6cE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png 848w, https://substackcdn.com/image/fetch/$s_!C6cE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png 1272w, https://substackcdn.com/image/fetch/$s_!C6cE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C6cE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png" width="1312" height="828" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:828,&quot;width&quot;:1312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:487818,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157971007?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C6cE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png 424w, https://substackcdn.com/image/fetch/$s_!C6cE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png 848w, https://substackcdn.com/image/fetch/$s_!C6cE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png 1272w, https://substackcdn.com/image/fetch/$s_!C6cE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d0cf8c5-2572-43a5-9f1b-87d67e1f9325_1312x828.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p></p><p>The main challenges of FP8 training lie in precision and error handling. 
To tackle these, DeepSeek implemented the following optimizations:</p><ul><li><p><strong>Fine-Grained Quantization</strong>: Data is split into small groups, each with its own scaling factor, so that precision is preserved within every group.</p></li><li><p><strong>Online Quantization</strong>: The maximum absolute value is computed online for each 1x128 activation tile or 128x128 weight block, the scaling factor is derived from it on the fly, and the activations or weights are then converted to FP8 in real time.</p></li><li><p><strong>Improved Accumulation Precision</strong>: Low-precision accumulation in FP8 GEMM lets rounding errors build up, so intermediate results are promoted to FP32 and accumulated there.</p></li><li><p><strong>Low-Precision/Mixed-Precision Storage and Communication</strong>: For MoE model training, FP8 is mixed with BF16/FP32 to keep training numerically stable.</p></li></ul><p>For a detailed look at these optimizations, check out the DeepSeek V3 paper.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eyaI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eyaI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png 424w, https://substackcdn.com/image/fetch/$s_!eyaI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png 848w, 
https://substackcdn.com/image/fetch/$s_!eyaI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png 1272w, https://substackcdn.com/image/fetch/$s_!eyaI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eyaI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png" width="650" height="512.4012638230648" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:998,&quot;width&quot;:1266,&quot;resizeWidth&quot;:650,&quot;bytes&quot;:582532,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157971007?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eyaI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png 424w, 
https://substackcdn.com/image/fetch/$s_!eyaI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png 848w, https://substackcdn.com/image/fetch/$s_!eyaI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png 1272w, https://substackcdn.com/image/fetch/$s_!eyaI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8491a232-0faf-4f9b-8517-c7a1a58c3788_1266x998.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p><br><strong>DeepGEMM Overview</strong></p><p>Here&#8217;s a summary of its key features:</p><ul><li><p><strong>FP8 Support</strong>: DeepGEMM uses two-level accumulation (promotion) on the CUDA cores to compensate for the imprecise FP8 accumulation of the Tensor Cores.</p></li><li><p><strong>Grouped GEMM Support</strong>: Unlike CUTLASS&#8217;s traditional grouped GEMM, DeepGEMM groups only along the M axis, a design targeted at MoE models whose experts share the same shape.</p></li><li><p><strong>Just-In-Time Compilation</strong>: Kernels are compiled at runtime by a lightweight JIT module, so code is generated and optimized for the problem at hand, boosting performance and flexibility.</p></li><li><p><strong>FFMA SASS Interleaving</strong>: DeepSeek analyzed the compiled SASS in depth and tweaks the FFMA/FADD instructions in the compiled binary to improve fine-grained FP8 GEMM performance.</p></li></ul><p><strong>Performance</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zaGg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zaGg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png 424w, https://substackcdn.com/image/fetch/$s_!zaGg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png 848w, https://substackcdn.com/image/fetch/$s_!zaGg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png 
1272w, https://substackcdn.com/image/fetch/$s_!zaGg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zaGg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png" width="618" height="731.7659906396256" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1518,&quot;width&quot;:1282,&quot;resizeWidth&quot;:618,&quot;bytes&quot;:709290,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157971007?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zaGg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png 424w, https://substackcdn.com/image/fetch/$s_!zaGg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png 848w, 
https://substackcdn.com/image/fetch/$s_!zaGg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png 1272w, https://substackcdn.com/image/fetch/$s_!zaGg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12ff51ba-6e5d-446b-9954-6db1682b54f7_1282x1518.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>All metrics show improvement, with the highest gain reaching a 2.7x speedup. 
DeepSeek notes that performance isn&#8217;t optimal in some areas and welcomes PRs from those interested in further optimization.</p><p>In the DeepGEMM project&#8217;s README, the DeepSeek team provides a detailed breakdown of the optimizations. For those interested, it&#8217;s worth diving into the code alongside the documentation for a hands-on exploration.</p><p></p><p></p><p><strong>Spotlight: The interleave_ffma.py File</strong></p><p>Today, let&#8217;s focus on a specific file in the project: interleave_ffma.py under the jit directory. It contains some clever tricks worth exploring.</p><p>Here&#8217;s the code:</p><pre><code><code>import argparse
import mmap
import os
import re
import subprocess
from torch.utils.cpp_extension import CUDA_HOME

def run_cuobjdump(file_path):
    command = [f'{CUDA_HOME}/bin/cuobjdump', '-sass', file_path]
    result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    assert result.returncode == 0
    return result.stdout

def extract_ffma(sass):
    lines = sass.splitlines()
    collected = []
    current = []
    arch_name, func_name = 'N/A', 'N/A'
    skip_next_line = False
    for line in lines:
        if 'code for' in line:
            arch_name = line.lstrip().lstrip('code for ').rstrip()
        elif 'Function :' in line:
            func_name = line.lstrip().lstrip('Function :').rstrip()
        elif 'FFMA' in line:
            current.append(line)
            skip_next_line = True
        elif skip_next_line:
            current.append(line)
            skip_next_line = False
        else:
            if len(current) &gt;= 16:
                assert len(current) % 2 == 0
                collected.append((f'{arch_name}::{func_name}', current))
            current = []
    if os.getenv('DG_PRINT_REG_REUSE', None):
        print(f"Found {len(collected)} FFMA segments")
    return collected

def extract_hex_from_line(line):
    match = re.search(r'/\*\s*(0x[0-9a-fA-F]+)\s*\*/', line)
    assert match
    return int(match.group(1), 16)

def validate(m, offset, le_bytes, num_lines):
    assert len(le_bytes) == num_lines // 2
    assert m[offset:offset + 16] == le_bytes[0]
    for i in range(1, num_lines // 2):
        if m[offset + i * 16:offset + i * 16 + 16] != le_bytes[i]:
            return False
    return True

def parse_registers(line):
    import re
    line = re.sub(r'/\*.*?\*/', '', line)
    line = line.replace(';', '')
    tokens = line.strip().split(',')
    registers = []
    for token in tokens:
        token = token.strip()
        words = token.split()
        for word in words:
            if word.startswith('R'):
                reg = word.split('.')[0]
                registers.append(reg)
    return registers

def modify_segment(m, name, ffma_lines):
    num_lines = len(ffma_lines)
    assert num_lines % 2 == 0
    le_bytes, new_le_bytes = [], []
    reused_list = []
    dst_reg_set = set()
    last_reused, last_dst_reg = False, ''
    num_changed = 0
    for i in range(num_lines // 2):
        dst_reg = parse_registers(ffma_lines[i * 2])[-2]
        low_line, high_line = ffma_lines[i * 2], ffma_lines[i * 2 + 1]
        low_hex, high_hex = extract_hex_from_line(low_line), extract_hex_from_line(high_line)
        le_bytes.append(low_hex.to_bytes(8, 'little') + high_hex.to_bytes(8, 'little'))
        reused = (high_hex &amp; 0x0800000000000000) != 0
        if reused:
            is_first_occurred = dst_reg not in dst_reg_set
            if is_first_occurred or (last_reused and dst_reg == last_dst_reg):
                assert high_hex &amp; 0x0800200000000000, f"{hex(high_hex)}"
                high_hex ^= 0x0800200000000000
                reused = False
                num_changed += 1
            else:
                reused_list.append(i)
        dst_reg_set.add(dst_reg)
        new_le_bytes.append(low_hex.to_bytes(8, 'little') + high_hex.to_bytes(8, 'little'))
        last_reused, last_dst_reg = reused, dst_reg
    if os.getenv('DG_PRINT_REG_REUSE', None):
        print(f" &gt; segment `{name}` new reused list ({num_changed} changed): {reused_list}")
    offsets = []
    offset = m.find(le_bytes[0])
    while offset != -1:
        offsets.append(offset)
        offset = m.find(le_bytes[0], offset + 1)
    offsets = list(filter(lambda x: validate(m, x, le_bytes, num_lines), offsets))
    for offset in offsets:
        for i in range(num_lines // 2):
            m[offset + i * 16:offset + i * 16 + 16] = new_le_bytes[i]

def process(path):
    if os.getenv('DG_PRINT_REG_REUSE', None):
        print(f'Processing {path}')
    output = run_cuobjdump(path)
    segments = extract_ffma(output)
    with open(path, 'r+b') as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE)
        for segment in segments:
            modify_segment(mm, *segment)
        mm.close()

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Interleave FFMA reg reuse')
    parser.add_argument('--so', help='Path to the SO file')
    args = parser.parse_args()
    process(args.so)</code></code></pre><p>This script adjusts the register-reuse pattern of FFMA (FP32 fused multiply-add) instructions in CUDA-compiled code. Instead of recompiling, it patches the instruction encodings directly in the binary file, with the goal of improving GPU instruction execution efficiency.</p><p></p><p><strong>Key Function Breakdown:</strong></p><p></p><p><strong>SASS Code Extraction</strong></p><pre><code><code>def run_cuobjdump(file_path):
    command = [f'{CUDA_HOME}/bin/cuobjdump', '-sass', file_path]</code></code></pre><p>Uses NVIDIA&#8217;s cuobjdump tool to dump the SASS (the GPU&#8217;s native assembly) from the binary.</p><p></p><p><strong>FFMA Instruction Analysis</strong></p><pre><code><code>def extract_ffma(sass):</code></code></pre><p>Scans the SASS listing for runs of FFMA instructions and records the architecture and function name each run belongs to. Every FFMA line is collected together with the line that follows it, since the code reads one 64-bit word of the encoding from each, and only runs of at least 16 lines (eight instructions) are kept.</p><p></p><p><strong>Register Usage Analysis</strong></p><pre><code><code>def parse_registers(line):</code></code></pre><p>Strips comments and semicolons from an instruction line and returns every operand that starts with 'R', with any suffix after a dot (such as .reuse) removed.</p><p></p><p><strong>Binary Modification</strong></p><pre><code><code>def modify_segment(m, name, ffma_lines):</code></code></pre><p>Walks through a segment one instruction at a time. It tests the reuse flag on the high 64-bit word with the mask 0x0800000000000000 and, where the rules under Optimization Strategy apply, XORs that word with 0x0800200000000000, which clears the reuse flag and flips the yield bit with it. The patched 16-byte encodings then overwrite every matching location in the memory-mapped file.</p><p></p><p><strong>Workflow:</strong></p><ul><li><p>Reads a compiled CUDA shared library (.so file).</p></li><li><p>Extracts SASS code using cuobjdump.</p></li><li><p>Collects runs of at least eight consecutive FFMA instructions.</p></li><li><p>Parses the registers used by each FFMA instruction.</p></li><li><p>Flips the reuse and yield bits of the instructions that match specific rules.</p></li><li><p>Writes the patched instructions back to the original file in place.</p></li></ul><p></p><p><strong>Optimization Strategy:</strong></p><p>Among FFMA instructions whose reuse flag is set, the tool targets those where:</p><ul><li><p>The destination register appears for the first time in the segment.</p></li><li><p>The previous instruction kept its reuse flag and has the same destination register.</p></li></ul><p>By flipping the reuse and yield bits on these instructions, it changes how they are scheduled.</p><p><strong>Usage:</strong></p><pre><code><code>python interleave_ffma.py --so path/to/cuda_lib.so</code></code></pre><p></p><p><strong>Analysis:</strong></p><p>This tool acts as a post-processing optimizer: it runs after CUDA compilation and patches the binary to improve GPU instruction efficiency. 
Its focus on FFMA instruction register reuse is particularly impactful for compute-intensive applications like deep learning.</p><p>A regex in the file often stumps readers:</p><pre><code><code>def extract_hex_from_line(line):
    match = re.search(r'/\*\s*(0x[0-9a-fA-F]+)\s*\*/', line)
    assert match
    return int(match.group(1), 16)</code></code></pre><p>In CUDA SASS assembly, instructions often appear like this:</p><pre><code><code>FFMA R8, R8, R6, R4;                  /* 0x5c98078000870808 */</code></code></pre><p>This regex extracts 0x5c98078000870808, the hexadecimal machine instruction encoding. The function:</p><ul><li><p>Extracts the hex code from the assembly line.</p></li><li><p>Converts it to an integer for subsequent modification.</p></li><li><p>This step is crucial for locating instructions, modifying specific bits (e.g., reuse and yield flags), and writing them back.</p></li></ul><p></p><p>Honestly, DeepSeek&#8217;s engineers seem to outshine even some NVIDIA folks when it comes to CUDA mastery! Oh, and they&#8217;ve slashed their API prices again.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I1Qn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!I1Qn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png 424w, https://substackcdn.com/image/fetch/$s_!I1Qn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png 848w, https://substackcdn.com/image/fetch/$s_!I1Qn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png 1272w, 
https://substackcdn.com/image/fetch/$s_!I1Qn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I1Qn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png" width="456" height="646.59375" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1815,&quot;width&quot;:1280,&quot;resizeWidth&quot;:456,&quot;bytes&quot;:962887,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157971007?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!I1Qn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png 424w, https://substackcdn.com/image/fetch/$s_!I1Qn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png 848w, 
https://substackcdn.com/image/fetch/$s_!I1Qn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png 1272w, https://substackcdn.com/image/fetch/$s_!I1Qn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4977a542-b376-4449-815b-207e5fd9ae4a_1280x1815.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[ Day 2 of DeepSeek OpenSourceWeek: In-Depth Analysis of DeepEP]]></title><description><![CDATA[On the second day of OpenSourceWeek, the official DeepSeek X account posted an article at 10:24, introducing the second open-source project of Open Source Week: DeepEP.]]></description><link>https://aigc.news/p/day-2-of-deepseek-opensourceweek</link><guid isPermaLink="false">https://aigc.news/p/day-2-of-deepseek-opensourceweek</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Tue, 25 Feb 2025 16:08:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/8294b4e2-fba6-4172-aea7-1b95b68ae29a_1000x420.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2ZKT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2ZKT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png 424w, https://substackcdn.com/image/fetch/$s_!2ZKT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png 848w, 
https://substackcdn.com/image/fetch/$s_!2ZKT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png 1272w, https://substackcdn.com/image/fetch/$s_!2ZKT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2ZKT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png" width="500" height="411.51919866444075" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:986,&quot;width&quot;:1198,&quot;resizeWidth&quot;:500,&quot;bytes&quot;:512158,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2ZKT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png 424w, 
https://substackcdn.com/image/fetch/$s_!2ZKT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png 848w, https://substackcdn.com/image/fetch/$s_!2ZKT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png 1272w, https://substackcdn.com/image/fetch/$s_!2ZKT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f7518a4-f7a8-412d-8dab-d2641c2cc92e_1198x986.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">https://x.com/deepseek_ai/status/1894211757604049133</figcaption></figure></div><p>On the second day of OpenSourceWeek, the official DeepSeek X account posted an article at 10:24, introducing the second open-source project of Open Source Week: <a href="https://github.com/deepseek-ai/DeepEP">DeepEP</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KB8q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KB8q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png 424w, https://substackcdn.com/image/fetch/$s_!KB8q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png 848w, https://substackcdn.com/image/fetch/$s_!KB8q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png 1272w, https://substackcdn.com/image/fetch/$s_!KB8q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!KB8q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png" width="1456" height="713" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:713,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:527089,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KB8q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png 424w, https://substackcdn.com/image/fetch/$s_!KB8q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png 848w, https://substackcdn.com/image/fetch/$s_!KB8q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png 1272w, https://substackcdn.com/image/fetch/$s_!KB8q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083480d5-0321-42f6-bb57-08e7a8f91643_2048x1003.png 1456w" 
sizes="100vw"></picture></div></a><figcaption class="image-caption">https://github.com/deepseek-ai/DeepEP</figcaption></figure></div><p><br>Since its release, the repository has already garnered 4.3K stars.</p><p></p><p>Many readers may not be familiar with MoE models, so this article first gives a brief introduction to MoE and to some of DeepSeek's work on it.</p><p></p><p><strong>MoE Introduction</strong></p><p>The Mixture-of-Experts (MoE) model is a simple extension of the Transformer architecture that is rapidly becoming the preferred architecture for medium-to-large-scale language models (2 billion to 600 billion 
parameters).</p><p><strong>Key Advantages:</strong></p><ul><li><p>Faster pre-training than dense models</p></li><li><p>Faster inference than dense models with the same total number of parameters</p></li></ul><p></p><p><strong>Challenges:</strong><br>An MoE model requires significant memory because all experts must be loaded, even though only a few of them are active for any given token. In addition, because MoE is typically used for medium-to-large models, it usually has to run in parallel across multiple GPUs, so communication between them must be highly efficient.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gCHh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gCHh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png 424w, https://substackcdn.com/image/fetch/$s_!gCHh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png 848w, https://substackcdn.com/image/fetch/$s_!gCHh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png 1272w, https://substackcdn.com/image/fetch/$s_!gCHh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gCHh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png" width="1274" height="982" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:982,&quot;width&quot;:1274,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:411360,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gCHh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png 424w, https://substackcdn.com/image/fetch/$s_!gCHh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png 848w, https://substackcdn.com/image/fetch/$s_!gCHh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png 1272w, https://substackcdn.com/image/fetch/$s_!gCHh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8f01bce-ccb4-424f-9fb0-1949a4c21121_1274x982.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://arxiv.org/abs/2101.03961">Switch Transformers paper</a></figcaption></figure></div><p></p><p>MoE models primarily consist of two key components:</p><ol><li><p><strong>Sparse MoE Layers</strong><br>These layers replace the feed-forward network (FFN) layers in traditional Transformer models. An MoE layer contains several "experts" (e.g., 8), each of which is an independent neural network.</p></li><li><p><strong>Gating Network or Router</strong><br>This component determines which tokens are sent to which experts. 
Sometimes, a token may even be routed to multiple experts.</p><p></p></li></ol><p><strong>Summary:</strong><br>A notable advantage of Mixture-of-Experts (MoE) models is their ability to perform effective pre-training with far fewer computational resources than dense models. This means that, under the same computational budget, you can significantly scale up the model or dataset size. Especially during pre-training, MoE models typically reach the same quality level faster than dense models.</p><p></p><p></p><p><strong>DeepSeek MoE</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GDzr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GDzr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png 424w, https://substackcdn.com/image/fetch/$s_!GDzr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png 848w, https://substackcdn.com/image/fetch/$s_!GDzr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png 1272w, https://substackcdn.com/image/fetch/$s_!GDzr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!GDzr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png" width="1290" height="854" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:854,&quot;width&quot;:1290,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:329088,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GDzr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png 424w, https://substackcdn.com/image/fetch/$s_!GDzr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png 848w, https://substackcdn.com/image/fetch/$s_!GDzr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png 1272w, https://substackcdn.com/image/fetch/$s_!GDzr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8df25801-6501-4d73-87e0-d53cd00d1dab_1290x854.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p>On January 11, 2024, DeepSeek released the paper <em><a href="https://arxiv.org/pdf/2401.06066">DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models</a></em>, making it one of the earlier companies to publish research on MoE language models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!i3yf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!i3yf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png 424w, https://substackcdn.com/image/fetch/$s_!i3yf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png 848w, https://substackcdn.com/image/fetch/$s_!i3yf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png 1272w, https://substackcdn.com/image/fetch/$s_!i3yf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!i3yf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png" width="1242" height="860" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:860,&quot;width&quot;:1242,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:633748,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!i3yf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png 424w, https://substackcdn.com/image/fetch/$s_!i3yf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png 848w, https://substackcdn.com/image/fetch/$s_!i3yf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png 1272w, https://substackcdn.com/image/fetch/$s_!i3yf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95c4c21a-1957-4a1c-a6d0-2a78097dba84_1242x860.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The DeepSeekMoE paper introduces the concepts of <strong>Routed Expert</strong> and <strong>Shared Expert</strong> and reports experiments on splitting experts into finer-grained ones.</p><p>In the paper, DeepSeekMoE 16B has 2 shared experts and 64 routed experts per layer, with each token activating the 2 shared experts and 6 routed experts. 
The 145B version, on the other hand, has 4 shared experts and 128 routed experts, with each token activating the 4 shared experts and 12 routed experts.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fk2z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fk2z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png 424w, https://substackcdn.com/image/fetch/$s_!fk2z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png 848w, https://substackcdn.com/image/fetch/$s_!fk2z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png 1272w, https://substackcdn.com/image/fetch/$s_!fk2z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fk2z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png" width="1264" height="534" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:534,&quot;width&quot;:1264,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:429498,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fk2z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png 424w, https://substackcdn.com/image/fetch/$s_!fk2z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png 848w, https://substackcdn.com/image/fetch/$s_!fk2z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png 1272w, https://substackcdn.com/image/fetch/$s_!fk2z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ab8abc-cef4-41e1-81d9-5c98ed1ab11c_1264x534.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>The DeepSeekMoE paper also mentions today&#8217;s topic: <strong>Expert Parallelism</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bpkv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bpkv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png 424w, 
https://substackcdn.com/image/fetch/$s_!Bpkv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png 848w, https://substackcdn.com/image/fetch/$s_!Bpkv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png 1272w, https://substackcdn.com/image/fetch/$s_!Bpkv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bpkv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png" width="1308" height="1610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1610,&quot;width&quot;:1308,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:919292,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Bpkv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png 424w, https://substackcdn.com/image/fetch/$s_!Bpkv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png 848w, https://substackcdn.com/image/fetch/$s_!Bpkv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png 1272w, https://substackcdn.com/image/fetch/$s_!Bpkv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f0c8c64-94b9-4722-b043-164ea484343e_1308x1610.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Two paragraphs of the DeepSeek V3 paper give more detailed data on this. DeepSeek R1 was trained on top of V3; no further details are provided, but its setup is presumably largely consistent with V3.</p><p></p><p></p><p><strong>DeepEP Introduction</strong></p><p>DeepEP is a communication library designed for MoE models and Expert Parallelism (EP). It provides high-throughput, low-latency all-to-all GPU kernels, which are also known as MoE <strong>dispatch</strong> and <strong>combine</strong>. The library also supports low-precision operations, including FP8.</p><p>To align with the group-limited gating algorithm proposed in the DeepSeek-V3 paper, DeepEP offers a set of kernels optimized for asymmetric-domain bandwidth forwarding (e.g., forwarding data from the NVLink domain to the RDMA domain). These kernels deliver high throughput, making them suitable for training and for inference prefilling. They also let you control the number of Streaming Multiprocessors (SMs) they use.</p><p>For latency-sensitive inference decoding, DeepEP includes a set of low-latency kernels that use pure RDMA to minimize delays. 
The library also introduces a hook-based method for overlapping communication and computation that does not occupy any SM resources.</p><p><strong>Note:</strong> The implementation in this library may differ slightly from the DeepSeek-V3 paper.</p><p></p><p></p><p><strong>Performance</strong></p><p><strong>NVLink and RDMA Forwarding Test for Normal Kernels</strong></p><p>DeepSeek reports testing the normal (general-purpose) kernels on H800 GPUs (NVLink max bandwidth ~160 GB/s), each connected to a CX7 InfiniBand 400 Gb/s RDMA NIC (max bandwidth ~50 GB/s). 
The test followed the pre-training setup of DeepSeek-V3/R1: 4096 tokens per batch, hidden dimension of 7168, top-4 grouping, top-8 experts, FP8 dispatch, and BF16 combine.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1zk1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1zk1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png 424w, https://substackcdn.com/image/fetch/$s_!1zk1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png 848w, https://substackcdn.com/image/fetch/$s_!1zk1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png 1272w, https://substackcdn.com/image/fetch/$s_!1zk1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1zk1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png" width="1456" height="391" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:391,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:98429,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1zk1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png 424w, https://substackcdn.com/image/fetch/$s_!1zk1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png 848w, https://substackcdn.com/image/fetch/$s_!1zk1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png 1272w, https://substackcdn.com/image/fetch/$s_!1zk1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdb22bc7-7f9f-47dc-9c76-abdc6f8c09a4_1564x420.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p><strong>Low-Latency Kernel Test with Pure RDMA</strong></p><p>DeepSeek reports testing the low-latency kernels on H800 GPUs, each connected to a CX7 InfiniBand 400 Gb/s RDMA NIC (max bandwidth ~50 GB/s). 
The test followed a typical DeepSeek-V3/R1 production environment setup: 128 tokens per batch, hidden dimension of 7168, top-8 experts, FP8 dispatch, and BF16 combine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!m2v9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!m2v9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png 424w, https://substackcdn.com/image/fetch/$s_!m2v9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png 848w, https://substackcdn.com/image/fetch/$s_!m2v9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png 1272w, https://substackcdn.com/image/fetch/$s_!m2v9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!m2v9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png" width="1456" height="535" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:535,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:107584,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!m2v9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png 424w, https://substackcdn.com/image/fetch/$s_!m2v9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png 848w, https://substackcdn.com/image/fetch/$s_!m2v9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png 1272w, https://substackcdn.com/image/fetch/$s_!m2v9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59df353f-4f55-4c8b-b7cc-6396c9cddcc4_1530x562.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p></p><p><strong>Code Analysis</strong></p><p>The optimization strategy focuses heavily on low-latency communication. Here, we&#8217;ll highlight this aspect, while other features like dynamic scheduling, asynchronous communication, and stream management can be explored in the relevant documentation and code.</p><p><strong>Double Buffering</strong></p><pre><code><code>auto buffer = layout.buffers[low_latency_buffer_idx];
auto next_buffer = layout.buffers[low_latency_buffer_idx ^= 1];</code></code></pre><ul><li><p>Alternates between two buffers: one for the current operation, the other for the next.</p></li><li><p>Toggles the buffer index between 0 and 1 with the XOR assignment ^= 1, so the second line both flips the index and selects the other buffer.</p><p></p></li></ul><p><strong>TMA (Tensor Memory Accelerator) Optimization</strong></p><ul><li><p>Leverages the Hopper architecture&#8217;s TMA instructions to accelerate data transfer.</p></li><li><p>Supports low-precision formats like FP8 to reduce communication bandwidth requirements.</p><p></p></li></ul><p><strong>IBGDA Direct Communication</strong></p><pre><code><code>// Initialize recv queues for low-latency mode AR
ibgda_initialize_recv_queue&lt;&lt;&lt;num_ranks, 128&gt;&gt;&gt;(rank);</code></code></pre><ul><li><p>Uses NVSHMEM&#8217;s IBGDA (InfiniBand GPUDirect Async) transport, which lets the GPU itself issue RDMA operations to the NIC.</p></li><li><p>Keeps the CPU out of the communication critical path, which reduces latency.</p></li></ul><p><strong>Expert-Level QP Allocation</strong></p><pre><code><code>_buffer = Buffer(group, 0, num_rdma_bytes, low_latency_mode=True,
num_qps_per_rank=num_experts // group.size())</code></code></pre><ul><li><p>Sets the number of Queue Pairs (QPs) per rank to the number of local experts (num_experts // group.size()), so each local expert gets its own QP instead of contending for a shared one.</p></li></ul><p></p><p><strong>DeepEP Use Cases</strong></p><ul><li><p>Large-scale MoE model training (e.g., models with hundreds of billions of parameters)</p></li><li><p>High-concurrency, low-latency real-time inference services</p></li><li><p>Heterogeneous computing tasks such as multimodal applications and scientific computing</p></li></ul><p></p><p>Notably, near the end of the repository&#8217;s README, DeepSeek mentions using an undocumented NVIDIA PTX instruction for optimization, a true hacker spirit worth learning from!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AkRZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AkRZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png 424w, https://substackcdn.com/image/fetch/$s_!AkRZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png 848w, https://substackcdn.com/image/fetch/$s_!AkRZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png 1272w, 
https://substackcdn.com/image/fetch/$s_!AkRZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AkRZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png" width="1456" height="470" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:470,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:428160,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157895277?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AkRZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png 424w, https://substackcdn.com/image/fetch/$s_!AkRZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png 848w, https://substackcdn.com/image/fetch/$s_!AkRZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png 
1272w, https://substackcdn.com/image/fetch/$s_!AkRZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52adb263-bc15-4693-8d31-9c23cb9fbd91_1790x578.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe 
now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[DeepSeek Open Source Week Day 1: In-Depth Analysis of FlashMLA]]></title><description><![CDATA[This morning at 9:34, DeepSeek announced the first project of Open Source Week on X: FlashMLA.]]></description><link>https://aigc.news/p/deepseek-open-source-week-day-1-in</link><guid isPermaLink="false">https://aigc.news/p/deepseek-open-source-week-day-1-in</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 24 Feb 2025 15:09:54 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rClU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rClU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rClU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rClU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rClU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!rClU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rClU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg" width="1364" height="848" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:848,&quot;width&quot;:1364,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:63467,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F298e6a2f-cb95-4cd6-81ba-25805e84cd4d_1600x900.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rClU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rClU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!rClU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rClU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187c8e4c-0940-4692-b31a-d3597d7b586c_1364x848.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" 
target="_blank" href="https://substackcdn.com/image/fetch/$s_!FSh6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FSh6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png 424w, https://substackcdn.com/image/fetch/$s_!FSh6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png 848w, https://substackcdn.com/image/fetch/$s_!FSh6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png 1272w, https://substackcdn.com/image/fetch/$s_!FSh6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FSh6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png" width="582" height="714.6709677419354" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1142,&quot;width&quot;:930,&quot;resizeWidth&quot;:582,&quot;bytes&quot;:414480,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FSh6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png 424w, https://substackcdn.com/image/fetch/$s_!FSh6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png 848w, https://substackcdn.com/image/fetch/$s_!FSh6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png 1272w, https://substackcdn.com/image/fetch/$s_!FSh6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b5ce142-9ed8-4a46-ae85-4f48b67bb6da_930x1142.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This morning at 9:34, DeepSeek announced the first project of Open Source Week on X: FlashMLA. 
This article provides an in-depth analysis of FlashMLA.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZM8r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZM8r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png 424w, https://substackcdn.com/image/fetch/$s_!ZM8r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png 848w, https://substackcdn.com/image/fetch/$s_!ZM8r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png 1272w, https://substackcdn.com/image/fetch/$s_!ZM8r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZM8r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png" width="686" height="365.1442307692308" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:775,&quot;width&quot;:1456,&quot;resizeWidth&quot;:686,&quot;bytes&quot;:581601,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZM8r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png 424w, https://substackcdn.com/image/fetch/$s_!ZM8r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png 848w, https://substackcdn.com/image/fetch/$s_!ZM8r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png 1272w, https://substackcdn.com/image/fetch/$s_!ZM8r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F073f3d34-f5c1-4dd5-a7d3-c58362cce0ba_2048x1090.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The FlashMLA project has quickly gained popularity: its GitHub repository has already reached 6.8k stars.</p><p></p><p><strong>Brief Introduction to MLA</strong></p><p>MLA (Multi-Head Latent Attention) is an optimization of Multi-Head Attention (MHA) that DeepSeek proposed in the paper <em><a href="https://arxiv.org/abs/2405.04434">DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model</a></em>.</p><p>In Transformer models, MHA is one of the most computationally intensive modules. 
To maintain high efficiency in large-scale scenarios, further optimization is necessary.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ywrG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ywrG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png 424w, https://substackcdn.com/image/fetch/$s_!ywrG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png 848w, https://substackcdn.com/image/fetch/$s_!ywrG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png 1272w, https://substackcdn.com/image/fetch/$s_!ywrG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ywrG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png" width="567" height="538.4335877862595" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1244,&quot;width&quot;:1310,&quot;resizeWidth&quot;:567,&quot;bytes&quot;:605978,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ywrG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png 424w, https://substackcdn.com/image/fetch/$s_!ywrG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png 848w, https://substackcdn.com/image/fetch/$s_!ywrG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png 1272w, https://substackcdn.com/image/fetch/$s_!ywrG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff74c2b91-77a9-4ed5-b6d9-24fc4b39fc06_1310x1244.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>MLA can be considered a variant of MHA. In its implementation, it borrows some concepts from FlashAttention. 
The DeepSeek-V2 paper primarily compares it with MHA, GQA, and MQA, with optimization results shown in the figure below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jNDN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jNDN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png 424w, https://substackcdn.com/image/fetch/$s_!jNDN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png 848w, https://substackcdn.com/image/fetch/$s_!jNDN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png 1272w, https://substackcdn.com/image/fetch/$s_!jNDN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jNDN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png" width="635" height="285.30421216848674" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:576,&quot;width&quot;:1282,&quot;resizeWidth&quot;:635,&quot;bytes&quot;:333386,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jNDN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png 424w, https://substackcdn.com/image/fetch/$s_!jNDN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png 848w, https://substackcdn.com/image/fetch/$s_!jNDN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png 1272w, https://substackcdn.com/image/fetch/$s_!jNDN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e558075-e8bd-4c13-959c-b68adf2092e0_1282x576.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-7r1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-7r1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png 424w, 
https://substackcdn.com/image/fetch/$s_!-7r1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png 848w, https://substackcdn.com/image/fetch/$s_!-7r1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png 1272w, https://substackcdn.com/image/fetch/$s_!-7r1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-7r1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png" width="610" height="285.3689167974882" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cae913c2-b98f-4274-9548-881c1cf819be_1274x596.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:596,&quot;width&quot;:1274,&quot;resizeWidth&quot;:610,&quot;bytes&quot;:381066,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!-7r1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png 424w, https://substackcdn.com/image/fetch/$s_!-7r1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png 848w, https://substackcdn.com/image/fetch/$s_!-7r1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png 1272w, https://substackcdn.com/image/fetch/$s_!-7r1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcae913c2-b98f-4274-9548-881c1cf819be_1274x596.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In some inference frameworks, MLA has also been implemented. As shown below, after integrating MLA into SGLang, throughput increased by 2-3 times.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gccI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gccI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png 424w, https://substackcdn.com/image/fetch/$s_!gccI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png 848w, https://substackcdn.com/image/fetch/$s_!gccI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png 1272w, https://substackcdn.com/image/fetch/$s_!gccI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gccI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png" width="634" height="272.97222222222223" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:310,&quot;width&quot;:720,&quot;resizeWidth&quot;:634,&quot;bytes&quot;:73536,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gccI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png 424w, https://substackcdn.com/image/fetch/$s_!gccI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png 848w, https://substackcdn.com/image/fetch/$s_!gccI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png 1272w, https://substackcdn.com/image/fetch/$s_!gccI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20c69bfd-0486-427e-a9ed-3e6fd222822d_720x310.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">https://lmsys.org/blog/2024-09-04-sglang-v0-3/</figcaption></figure></div><p></p><p></p><p><strong>Using FlashMLA</strong></p><p><strong>Environment Requirements:</strong></p><ul><li><p>Hopper GPUs</p></li><li><p>Minimum CUDA 12.3</p></li><li><p>Minimum PyTorch 2.0</p></li></ul><p><strong>Installation:</strong></p><pre><code><code>git clone https://github.com/deepseek-ai/FlashMLA  
cd FlashMLA  
python setup.py install  </code></code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!o08t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!o08t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png 424w, https://substackcdn.com/image/fetch/$s_!o08t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png 848w, https://substackcdn.com/image/fetch/$s_!o08t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png 1272w, https://substackcdn.com/image/fetch/$s_!o08t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!o08t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png" width="1456" height="510" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:510,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:239738,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!o08t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png 424w, https://substackcdn.com/image/fetch/$s_!o08t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png 848w, https://substackcdn.com/image/fetch/$s_!o08t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png 1272w, https://substackcdn.com/image/fetch/$s_!o08t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bba6c35-b478-4a8d-9033-ac30922c97d8_1666x584.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><br></p><p><strong>Performance:</strong><br>The repository provides a Benchmark file that can be run directly. 
Official results show that on an H800 SXM5 with CUDA 12.6, it achieves speeds of up to 3000 GB/s under memory-bound configurations and 580 TFLOPS under compute-bound configurations.</p><p></p><p><strong>Code Analysis</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!78sG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!78sG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png 424w, https://substackcdn.com/image/fetch/$s_!78sG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png 848w, https://substackcdn.com/image/fetch/$s_!78sG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png 1272w, https://substackcdn.com/image/fetch/$s_!78sG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!78sG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png" width="497" height="501.67764705882354" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98535b2d-d90a-4d1e-9e00-903020272662_850x858.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:858,&quot;width&quot;:850,&quot;resizeWidth&quot;:497,&quot;bytes&quot;:163327,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!78sG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png 424w, https://substackcdn.com/image/fetch/$s_!78sG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png 848w, https://substackcdn.com/image/fetch/$s_!78sG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png 1272w, https://substackcdn.com/image/fetch/$s_!78sG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98535b2d-d90a-4d1e-9e00-903020272662_850x858.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>FlashMLA&#8217;s codebase is relatively small with minimal dependencies.<br>The primary optimization techniques are as follows:</p><p><strong>1. Computation Chunking and Scheduling Optimization</strong></p><pre><code><code>template&lt;int kHeadDim_, int kBlockM_, int kBlockN_, int kNWarps_&gt;  
struct Flash_fwd_kernel_traits_mla {  
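    // Expose the head dimension so the members below can refer to it  
    static constexpr int kHeadDim = kHeadDim_;  
    
    // Tile sizes along the query (M) and key (N) dimensions; 64x64 here  
    static constexpr int kBlockM = kBlockM_;    
    static constexpr int kBlockN = kBlockN_;    
    
    // Warps per thread block (8 here), 32 threads per warp  
    static constexpr int kNWarps = kNWarps_;    
    static constexpr int kNThreads = kNWarps * 32;  
    
    // Shared-memory tile width along the head dimension: 64 if kHeadDim is a multiple of 64, else 32  
    static constexpr int kBlockKSmem = kHeadDim % 64 == 0 ? 64 : 32;  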
};</code></code></pre><p><strong>Key Points:</strong></p><ul><li><p>Improves computational efficiency through chunking (block size of 64), paged KV caching, and multi-warp parallelism.</p></li></ul><p><strong>2. Memory Access Optimization</strong></p><pre><code><code>struct Flash_fwd_mla_params {  
    using index_t = int64_t;  
    int b, seqlen_q, d, d_v;  
    int h, h_h_k_ratio, ngroups;  
    bool is_causal;  
    float scale_softmax, scale_softmax_log2;  
    int *__restrict__ cu_seqlens_k;  
    void *__restrict__ q_ptr;  
    void *__restrict__ k_ptr;  
    void *__restrict__ v_ptr;  
    void *__restrict__ o_ptr;  
    void *__restrict__ softmax_lse_ptr;  
    index_t q_batch_stride;  
    index_t k_batch_stride;  
    index_t v_batch_stride;  
    index_t o_batch_stride;  
    index_t q_row_stride;  
    index_t k_row_stride;  
    index_t v_row_stride;  
    index_t o_row_stride;  
    index_t q_head_stride;  
    index_t k_head_stride;  
    index_t v_head_stride;  
    index_t o_head_stride;  
    int *__restrict__ block_table;  
    index_t block_table_batch_stride;  
    int page_block_size;  
    int *__restrict__ tile_scheduler_metadata_ptr;  
    int num_sm_parts;  
    int *__restrict__ num_splits_ptr;  
    void *__restrict__ softmax_lseaccum_ptr;  
    void *__restrict__ oaccum_ptr;  
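};</code></code></pre><p><strong>Key Points:</strong></p><ul><li><p>Uses a paged KV cache: block_table maps each sequence to its cache pages, and page_block_size sets the page size.</p></li><li><p>Passes explicit batch, row, and head strides for Q, K, V, and O, which fix the memory layout and access pattern.</p></li><li><p>Schedules work with tile_scheduler_metadata_ptr and num_sm_parts; num_splits_ptr, softmax_lseaccum_ptr, and oaccum_ptr support split computation with partial results.</p></li></ul><p><strong>3. Softmax Computation Optimization</strong></p><pre><code><code>for (int mi = 0; mi &lt; size&lt;0&gt;(tensor); ++mi) {  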
    MaxOp&lt;float&gt; max_op;  
    max(mi) = zero_init ? tensor(mi, 0) : max_op(max(mi), tensor(mi, 0));  
    #pragma unroll  
    for (int ni = 1; ni &lt; size&lt;1&gt;(tensor); ni++) {  
        max(mi) = max_op(max(mi), tensor(mi, ni));  
    }  
    max(mi) = Allreduce&lt;4&gt;::run(max(mi), max_op);  
    const float max_scaled = max(mi) == -INFINITY ? 0.f : max(mi) * scale;  
    sum(mi) = 0;  
    #pragma unroll  
    for (int ni = 0; ni &lt; size&lt;1&gt;(tensor); ++ni)  {  
        tensor(mi, ni) = exp2f(tensor(mi, ni) * scale - max_scaled);  
        sum(mi) += tensor(mi, ni);  
    }  
    SumOp&lt;float&gt; sum_op;  
    sum(mi) = Allreduce&lt;4&gt;::run(sum(mi), sum_op);  
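}</code></code></pre><p><strong>Key Points:</strong></p><ul><li><p>Uses exp2f instead of expf, with the log2(e) factor folded into scale (compare scale_softmax_log2 in the parameter struct).</p></li><li><p>Writes the exponent as tensor * scale - max_scaled, a form that can compile to a single fused multiply-add (FFMA) instruction.</p></li><li><p>Reduces the row maximum and row sum across threads with a warp-level Allreduce&lt;4&gt;.</p></li></ul><p><strong>4. Double Buffering Optimization</strong></p><pre><code><code>struct SharedStorageMLA {  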
    union {  
        struct {  
            // Shared-memory buffers; only the K tile (smem_k) is allocated at twice its size  
            cute::array_aligned&lt;Element, cosize_v&lt;SmemLayoutQ&gt;&gt; smem_q;  
            cute::array_aligned&lt;Element, cosize_v&lt;SmemLayoutK&gt; * 2&gt; smem_k;  // Double buffer  
            cute::array_aligned&lt;Element, cosize_v&lt;SmemLayoutP&gt;&gt; smem_p;  
            cute::array_aligned&lt;ElementAccum, cosize_v&lt;SmemLayoutRow&gt;&gt; smem_scale;  
        };  
    };  
};</code></code></pre><p><strong>Key Points:</strong></p><ul><li><p>Double-buffers the K tile so the next tile can load while the current one is being computed, which hides memory latency and improves hardware utilization.</p></li></ul><p></p><p><strong>Summary of FlashMLA</strong></p><p>FlashMLA is essentially a version of FlashAttention customized for MLA. It currently applies to:</p><ul><li><p>Hopper GPUs (SM90) with CUDA 12.3 or later, matching the environment requirements above.</p></li><li><p>BF16 MLA decoding at inference time, with a head dimension of 576 for Q/K and 512 for V.</p></li><li><p>Long-sequence workloads, where combining the kernel with a split-K scheme boosts throughput.</p><p></p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JEzB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JEzB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png 424w, https://substackcdn.com/image/fetch/$s_!JEzB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png 848w, https://substackcdn.com/image/fetch/$s_!JEzB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png 1272w, https://substackcdn.com/image/fetch/$s_!JEzB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!JEzB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png" width="1456" height="822" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:484379,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157812222?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JEzB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png 424w, https://substackcdn.com/image/fetch/$s_!JEzB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png 848w, https://substackcdn.com/image/fetch/$s_!JEzB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png 1272w, https://substackcdn.com/image/fetch/$s_!JEzB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0cc22417-02e9-4a3c-96ec-a4656d17f5de_1654x934.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p></p><p>As shown above, the official team still has many optimization methods on hand. I&#8217;m looking forward to tomorrow&#8217;s project: could it be infra-related? Perhaps something as challenging as MTP? 
</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[DeepSeek OpenSourceWeek is Coming: What Mysterious Technologies Might Be Unveiled?]]></title><description><![CDATA[Let&#8217;s guess what kind of projects DeepSeek might open-source next week.]]></description><link>https://aigc.news/p/deepseek-opensourceweek-is-coming</link><guid isPermaLink="false">https://aigc.news/p/deepseek-opensourceweek-is-coming</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Sun, 23 Feb 2025 14:34:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!GNiV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GNiV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GNiV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GNiV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!GNiV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GNiV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GNiV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg" width="592" height="394.3764705882353" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:453,&quot;width&quot;:680,&quot;resizeWidth&quot;:592,&quot;bytes&quot;:20551,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157742571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GNiV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!GNiV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GNiV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GNiV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2488486e-4b53-4e2a-83f6-650972e753c3_680x453.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" 
x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On February 21, DeepSeek announced on X a warm-up for its Open Source Week: starting next week, it will open-source five projects over five consecutive days. I will write a detailed article on each project the day it is announced. Feel free to follow me for the latest analysis.</p><p>Today, I&#8217;ll make some predictions about which projects might be open-sourced. If I guess even one correctly, I&#8217;ll consider it a win.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q1NO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q1NO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png 424w, https://substackcdn.com/image/fetch/$s_!q1NO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png 848w, https://substackcdn.com/image/fetch/$s_!q1NO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png 1272w, https://substackcdn.com/image/fetch/$s_!q1NO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!q1NO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png" width="524" height="628.8" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1296,&quot;width&quot;:1080,&quot;resizeWidth&quot;:524,&quot;bytes&quot;:563803,&quot;alt&quot;:&quot;https://x.com/deepseek_ai/status/1892786555494019098&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157742571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="https://x.com/deepseek_ai/status/1892786555494019098" title="https://x.com/deepseek_ai/status/1892786555494019098" srcset="https://substackcdn.com/image/fetch/$s_!q1NO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png 424w, https://substackcdn.com/image/fetch/$s_!q1NO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png 848w, https://substackcdn.com/image/fetch/$s_!q1NO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png 1272w, 
https://substackcdn.com/image/fetch/$s_!q1NO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03abd3a-54a1-4dd4-b3c3-905ba032dec6_1080x1296.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">https://x.com/deepseek_ai/status/1892786555494019098</figcaption></figure></div><p></p><p><strong>First, there&#8217;s definitely going to be something related to infra.</strong><br>According to the <a href="https://x.com/deepseek_ai/status/1892786555494019098">X post</a>, they are a small team within DeepSeek, sharing small but 
genuine progress, specifically modules used in online services. They particularly emphasized &#8220;small,&#8221; which likely points to code related to model inference optimization.</p><p></p><p>The recent release of DeepSeek-R1 has generated significant buzz, but inference optimization still lacks robust support from major frameworks. It feels like they might directly release some official implementations, as this is currently in high demand.</p><p></p><p>These optimizations could include deployment and inference strategies mentioned in the <a href="https://arxiv.org/abs/2412.19437">DeepSeek V3 technical report</a>, such as Prefilling and Decoding.</p><p></p><p><strong>Second, the official repository index hints at a paper, which is likely related.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kSAw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kSAw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png 424w, https://substackcdn.com/image/fetch/$s_!kSAw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png 848w, https://substackcdn.com/image/fetch/$s_!kSAw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kSAw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kSAw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png" width="577" height="527.0673076923077" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1330,&quot;width&quot;:1456,&quot;resizeWidth&quot;:577,&quot;bytes&quot;:776053,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157742571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kSAw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png 424w, https://substackcdn.com/image/fetch/$s_!kSAw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png 848w, 
https://substackcdn.com/image/fetch/$s_!kSAw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png 1272w, https://substackcdn.com/image/fetch/$s_!kSAw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432dbb17-c11b-4a12-82d2-fcdf07cd3b53_1784x1630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption 
class="image-caption">https://github.com/deepseek-ai/open-infra-index</figcaption></figure></div><p><br>Today, DeepSeek created an "<a href="https://github.com/deepseek-ai/open-infra-index">open-infra-index</a>" repo on GitHub, which includes a paper: <em><a href="https://arxiv.org/abs/2408.14158">Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning</a></em>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Pyc2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pyc2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png 424w, https://substackcdn.com/image/fetch/$s_!Pyc2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png 848w, https://substackcdn.com/image/fetch/$s_!Pyc2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png 1272w, https://substackcdn.com/image/fetch/$s_!Pyc2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Pyc2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png" width="565" 
height="692.3239436619718" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85920787-4058-43e1-a396-ad030fef687a_1136x1392.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1392,&quot;width&quot;:1136,&quot;resizeWidth&quot;:565,&quot;bytes&quot;:933441,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157742571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pyc2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png 424w, https://substackcdn.com/image/fetch/$s_!Pyc2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png 848w, https://substackcdn.com/image/fetch/$s_!Pyc2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png 1272w, https://substackcdn.com/image/fetch/$s_!Pyc2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85920787-4058-43e1-a396-ad030fef687a_1136x1392.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This <em>Fire-Flyer AI-HPC</em> paper mainly discusses optimizations for a training cluster with 10,000 PCIe A100 GPUs, focusing on reducing system construction costs and energy consumption. Specifically, compared to DGX-A100, it reports roughly 80% of the performance while cutting costs by about half and energy consumption by about 40%. I won&#8217;t go into the detailed innovations here; interested readers can go through the paper themselves.</p><p>Why mention this paper? My guess is that they&#8217;ll open-source some of the system code referenced in it. Going by the paper, here are a few candidates:</p><ul><li><p><strong>HFReduce</strong>: A library developed specifically for efficient allreduce operations, designed to optimize GPU communication in PCIe architectures. 
HFReduce overlaps computation and communication through asynchronous allreduce operations, significantly boosting performance.</p></li><li><p><strong>HaiScale</strong>: A distributed data-parallel training tool that uses HFReduce as its communication backend, enabling asynchronous allreduce operations during backpropagation to improve training efficiency.</p></li><li><p><strong>3FS</strong>: A high-performance distributed file system designed to fully leverage the high IOPS and throughput of NVMe SSDs and RDMA networks. The 3FS system supports efficient read/write operations and achieves traffic isolation under high loads.</p></li><li><p><strong>3FS-KV</strong>: A shared-storage distributed data processing system built on 3FS, supporting various data models (e.g., key-value storage, message queues, and object storage) with read-write separation and on-demand startup features.</p></li><li><p><strong>HAI Platform</strong>: A time-sharing scheduling platform that manages and schedules training tasks to ensure efficient GPU resource utilization.</p></li><li><p><strong>Checkpoint Manager</strong>: A tool for managing checkpoints during large language model training, supporting rapid recovery from hardware failures to ensure training continuity.</p></li></ul><p></p><p>Whether they open-source one or several of these is hard to say, but at the very least, it&#8217;ll involve systems from this paper&#8212;possibly the HAI platform.</p><p></p><p><strong>Third, an RL training framework? It&#8217;s possible.</strong><br>They might open-source some RL methods from DeepSeek R1. RL is genuinely tough to train. 
However, I think the likelihood of this is low.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rlwb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Rlwb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png 424w, https://substackcdn.com/image/fetch/$s_!Rlwb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png 848w, https://substackcdn.com/image/fetch/$s_!Rlwb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png 1272w, https://substackcdn.com/image/fetch/$s_!Rlwb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Rlwb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png" width="581" height="429.5096296296296" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:499,&quot;width&quot;:675,&quot;resizeWidth&quot;:581,&quot;bytes&quot;:185491,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157742571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Rlwb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png 424w, https://substackcdn.com/image/fetch/$s_!Rlwb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png 848w, https://substackcdn.com/image/fetch/$s_!Rlwb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png 1272w, https://substackcdn.com/image/fetch/$s_!Rlwb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b502ee-6bd5-480e-8b7a-0ac2263eae9c_675x499.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p></p><p><strong>Fourth, an inference framework? The most likely.</strong><br>DeepSeek&#8217;s internal inference optimization is top-notch, and their API pricing is very low. However, inference costs for external vendors remain high. If they open-source an inference framework, it could help major frameworks optimize efficiency more quickly.</p><p>For example, the recently announced NSA (<em><a href="https://arxiv.org/abs/2502.11089">Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention</a></em><a href="https://arxiv.org/abs/2502.11089">)</a> might come with code. 
It&#8217;d be even better if they included something like distributed KV caching.</p><p></p><p><strong>Fifth, a big guess about the open-source license.</strong><br>Currently, DeepSeek uses the MIT license, which is among the most permissive. Many vendors&#8217; licenses are stricter. LLaMA, for instance, has changed its license several times, and its current terms are not generally considered truly open source.</p><p>Here&#8217;s a rundown of licenses for some popular models:</p><ul><li><p><strong>DeepSeek</strong>: MIT, fully open.</p></li><li><p><strong>Qwen</strong>: Apache 2.0 + additional terms for some models.</p></li><li><p><strong>Mistral</strong>: Apache 2.0.</p></li><li><p><strong>LLaMA</strong>: Custom community license with usage restrictions (the first LLaMA release was non-commercial research only), the most restrictive of the four.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BaXU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BaXU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png 424w, https://substackcdn.com/image/fetch/$s_!BaXU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png 848w, https://substackcdn.com/image/fetch/$s_!BaXU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png 1272w, 
https://substackcdn.com/image/fetch/$s_!BaXU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BaXU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png" width="604" height="377.91483516483515" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:911,&quot;width&quot;:1456,&quot;resizeWidth&quot;:604,&quot;bytes&quot;:410470,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://aigc.openbot.ai/i/157742571?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BaXU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png 424w, https://substackcdn.com/image/fetch/$s_!BaXU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png 848w, 
https://substackcdn.com/image/fetch/$s_!BaXU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png 1272w, https://substackcdn.com/image/fetch/$s_!BaXU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c606f-b64a-418b-8de2-294abcb308aa_1752x1096.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Recently, Perplexity&#8217;s r1-1776 fiasco was clownish behavior, which might push DeepSeek toward a 
stricter license. Still, I&#8217;m guessing these five projects will stick with MIT. I&#8217;ll share more details on the licenses once they&#8217;re officially announced.</p><p></p><p>Thanks, DeepSeek. Next week, we will follow up with an analysis of each of the five open-sourced projects.</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[AIGC Weekly | #91 Synthetic Data Is All You Need ]]></title><description><![CDATA[Is Synthetic Data all We Need?]]></description><link>https://aigc.news/p/aigc-weekly-91-synthetic-data-is</link><guid isPermaLink="false">https://aigc.news/p/aigc-weekly-91-synthetic-data-is</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 23 Dec 2024 14:47:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kNnQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kNnQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kNnQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png 424w, 
https://substackcdn.com/image/fetch/$s_!kNnQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!kNnQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!kNnQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kNnQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png" width="1200" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66e23322-bad0-46a9-993c-18241292e9de_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:129885,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kNnQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png 424w, 
https://substackcdn.com/image/fetch/$s_!kNnQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!kNnQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!kNnQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66e23322-bad0-46a9-993c-18241292e9de_1200x600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" 
x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong><a href="https://xiaobot.net/post/39ce86f1-c9ef-4f87-8e5b-d6397d425578">&#20013;&#25991;&#29256;</a></strong></p><h1>Synthetic Data Is All You Need </h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9znX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9znX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9znX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9znX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9znX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9znX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg" width="514" height="727.9470085470085" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1657,&quot;width&quot;:1170,&quot;resizeWidth&quot;:514,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9znX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9znX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9znX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9znX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb90818b3-39d3-4a68-ab35-db5655c80b79_1170x1657.jpeg 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>OpenAI&#8217;s 12-day livestream series finally wrapped up this week, and the image above summarizes all 12 days. o3 was not announced until the final day, and it sparked some discussion.</p><p>As for o3, it&#8217;s still a future product. We&#8217;ll discuss it further once it can be tested. For what it&#8217;s worth, I don&#8217;t think it has anything to do with AGI, and the ARC-AGI evaluation has many issues.</p><p>Today, let&#8217;s talk about Deliberative Alignment, released by OpenAI.</p><p>What is it? It&#8217;s a training method that teaches LLMs (large language models) to explicitly consider safety guidelines before providing answers. 
By applying this method, the model can use Chain-of-Thought (CoT) reasoning to review user prompts, identify relevant policy guidelines, and generate safer responses.</p><p>The paper provides an example showing how CoT can help the model better understand the user&#8217;s intent and respond appropriately, avoiding illegal or unethical activities. For those interested, you can take a look. Here, we&#8217;ll mainly focus on the training part.</p><p>In summary, it&#8217;s a two-stage synthetic data pipeline.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oMth!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oMth!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp 424w, https://substackcdn.com/image/fetch/$s_!oMth!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp 848w, https://substackcdn.com/image/fetch/$s_!oMth!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp 1272w, https://substackcdn.com/image/fetch/$s_!oMth!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!oMth!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp" width="588" height="509.1958762886598" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ccf08847-1826-4638-b9a4-40590f67648a_970x840.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:840,&quot;width&quot;:970,&quot;resizeWidth&quot;:588,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oMth!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp 424w, https://substackcdn.com/image/fetch/$s_!oMth!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp 848w, https://substackcdn.com/image/fetch/$s_!oMth!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp 1272w, https://substackcdn.com/image/fetch/$s_!oMth!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccf08847-1826-4638-b9a4-40590f67648a_970x840.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Stage 1: SFT</strong></p><p>Prompt + CAT (safety category) + Spec (safety guidelines, given as the system prompt) &#8594; Model with CoT (e.g., o1) &#8594; CoT-enhanced output &#8594; Train using (Prompt, CoT, Output)</p><p><strong>Stage 2: RL</strong></p><p>A judge LLM produces a reward signal based on the spec, and RL then uses that signal to improve the model&#8217;s safety behavior.</p><p>Input: (Prompt, Category)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4X5A!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4X5A!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp 424w, https://substackcdn.com/image/fetch/$s_!4X5A!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp 848w, https://substackcdn.com/image/fetch/$s_!4X5A!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp 1272w, https://substackcdn.com/image/fetch/$s_!4X5A!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4X5A!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp" width="536" height="464.16494845360825" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:840,&quot;width&quot;:970,&quot;resizeWidth&quot;:536,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!4X5A!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp 424w, https://substackcdn.com/image/fetch/$s_!4X5A!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp 848w, https://substackcdn.com/image/fetch/$s_!4X5A!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp 1272w, https://substackcdn.com/image/fetch/$s_!4X5A!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e085730-ff15-479b-8b12-4b9ca0d9324f_970x840.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Both stages run largely without human labeling or intervention. The whole synthetic data pipeline is streamlined and can be applied at large scale.</p><p></p><p>Synthetic data plays a significant role here.</p><h2><strong>Top Papers of the week</strong></h2><p>1.) <strong>Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces</strong> (<a href="https://vision-x-nyu.github.io/thinking-in-space.github.io/">webpage</a> | <a href="https://arxiv.org/abs/2412.14171">paper</a>)</p><ul><li><p>A video dataset for evaluating the spatial reasoning ability of multimodal LLMs (MLLMs).</p></li><li><p>It provides over 5,000 question-answer pairs and shows that MLLMs have competitive spatial reasoning abilities but still fall short of human-level performance.</p></li><li><p>By probing how models think about space through both language and vision, the study finds that spatial reasoning is a key bottleneck preventing MLLMs from achieving higher benchmark performance.</p></li><li><p>Explicitly generating cognitive maps during question answering improved the spatial distance ability of MLLMs.</p></li></ul><div><hr></div><p>2.) <strong>Alignment Faking in LLMs</strong> (<a href="https://www.anthropic.com/research/alignment-faking">webpage</a> | <a href="https://arxiv.org/abs/2412.14093">paper</a>)</p><ul><li><p>An experiment by Anthropic demonstrating that Claude can engage in "alignment faking": it selectively complies with harmful requests when it believes its responses will be used for training, so as to avoid being retrained and to preserve its original safety preferences. 
This raises concerns about the reliability of AI safety training methods.</p></li></ul><div><hr></div><p>3.) <strong>Qwen-2.5 Technical Report</strong> (<a href="https://arxiv.org/abs/2412.15115">paper</a>)</p><ul><li><p>Alibaba released Qwen-2.5, a new series of LLMs trained on 18T tokens.</p></li><li><p>The series includes open-weight models and proprietary MoE variants, with reported performance on par with Llama-3 and GPT-4.</p></li></ul><div><hr></div><p>4.) <strong>TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks</strong> (<a href="https://arxiv.org/abs/2412.14161">paper</a>)</p><ul><li><p>A benchmark for evaluating AI agents on real-world professional tasks, including software engineering, project management, finance, and human resources.</p></li><li><p>Multiple LLMs were tested, including API-based and open-source models. The results expose the limitations of current AI agents.</p></li><li><p>The best-performing model, Claude-3.5-Sonnet, fully completed only 24% of tasks; its score rises to 34.4% when partial progress is counted.</p></li></ul><div><hr></div><p>5.) <strong>How to Synthesize Text Data without Model Collapse?</strong> (<a href="https://arxiv.org/abs/2412.14689">paper</a>)</p><ul><li><p>A study of how synthetic data affects language model training and how to synthesize data without causing model collapse.</p></li><li><p>Experiments found a negative correlation between the proportion of synthetic data and model performance.</p></li><li><p>The authors propose token editing on human-generated data to create semi-synthetic data.</p></li><li><p>Theoretical proofs show that token-level editing can prevent model collapse, because the test error is constrained by a finite upper bound.</p></li><li><p>Extensive experiments validated the theoretical proofs, demonstrating that token-level editing improves data quality and enhances model performance.</p></li></ul><div><hr></div><p>6.) 
<strong>Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Fine-tuning and Inference</strong> (<a href="https://arxiv.org/abs/2412.13663">paper</a>)</p><ul><li><p>ModernBERT is a modern encoder designed for fast, memory-efficient, and long-context fine-tuning, excelling in various evaluation tasks.</p></li><li><p>Trained on 2 trillion tokens with a sequence length of 8192, it performs well on multiple benchmarks, especially in code retrieval tasks.</p></li><li><p>The authors report that it is the fastest and most memory-efficient encoder among those they compared, and that it is suitable for inference on common GPUs.</p></li></ul><div><hr></div><p>7.) <strong>PAE (Proposer-Agent-Evaluator)</strong> (<a href="https://arxiv.org/abs/2412.13194">paper</a>)</p><ul><li><p>A learning system that enables AI agents to autonomously discover and practice skills through web navigation, using reinforcement learning and context-aware task proposals, achieving state-of-the-art performance on real-world benchmarks.</p></li></ul><div><hr></div><p>8.) <strong>AutoFeedback</strong> (<a href="https://arxiv.org/abs/2411.07407">paper</a>)</p><ul><li><p>A two-agent AI system that generates more accurate and educational feedback, significantly reducing common errors.</p></li><li><p>It achieves state-of-the-art performance in scientific evaluations through reinforcement learning and context-aware task proposals.</p></li></ul><div><hr></div><p>9.) <strong>GUI Agents: A Survey</strong> (<a href="https://arxiv.org/abs/2412.13501">paper</a>)</p><ul><li><p>A comprehensive survey covering the benchmarks, evaluation metrics, architectures, and training methods of GUI agents.</p></li><li><p>It proposes a unified framework describing the perception, reasoning, planning, and execution capabilities of GUI agents.</p></li><li><p>The survey identifies important open challenges and discusses key future directions.</p></li></ul><div><hr></div><p>10.) 
<strong>Genesis</strong> (<a href="https://genesis-embodied-ai.github.io/">webpage</a> | <a href="https://github.com/Genesis-Embodied-AI/Genesis">github</a>)</p><ul><li><p>A generative physics engine capable of creating 4D dynamic worlds, providing a physical simulation platform for general-purpose robotics and AI applications.</p></li><li><p>It implements a unified simulation framework from scratch, integrating cutting-edge physics solvers.</p></li><li><p><strong>Comment:</strong> The project reached 18k stars within a few days, but many features have not been released yet, and early test results are not ideal.</p></li></ul><p></p><h2><strong>AIGC News of the week</strong></h2><p>1.) <a href="https://huggingface.co/Jovie/Midjourney">Jovie/Midjourney</a></p><p>2.) <a href="https://github.com/NUS-HPC-AI-Lab/Enhance-A-Video">Enhance-A-Video: Better Generated Video for Free</a></p><p>3.) <a href="https://github.com/cyclotruc/gitingest">gitingest: Turn any Git repository into a prompt-friendly text ingest for LLMs</a></p><p>4.) <a href="https://github.com/DepthAnything/PromptDA">PromptDA: Prompt Depth Anything</a></p><p>5.) <a href="https://github.com/fal-ai/diffusion-speedrun">diffusion-speedrun</a></p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">AIGC Newsletter is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[AIGC Weekly | #90 What Ilya Saw]]></title><description><![CDATA[in 2014, 2024]]></description><link>https://aigc.news/p/aigc-weekly-90-what-ilya-saw</link><guid isPermaLink="false">https://aigc.news/p/aigc-weekly-90-what-ilya-saw</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 16 Dec 2024 18:10:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MhQ0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MhQ0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!MhQ0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!MhQ0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!MhQ0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MhQ0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png" width="1200" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131112,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MhQ0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!MhQ0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!MhQ0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!MhQ0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c9af0-3dbe-4027-a654-14c0245f66b3_1200x600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a 
href="https://xiaobot.net/post/b4d54e00-c93b-4863-91b3-8b98ea7389d3">&#20013;&#25991;&#29256;</a></p><p></p><h1><strong>What Ilya Saw</strong></h1><p>Let's do a time check and compare what Ilya said ten years ago with what he says now.</p><p></p><p><strong>What Ilya Saw in 2014</strong></p><div id="youtube2--uyXE7dY5H0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;-uyXE7dY5H0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/-uyXE7dY5H0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p></p><ul><li><p>The Deep Learning Hypothesis: A large neural network can do anything that a human can do in a fraction of a second.</p></li><li><p>The Autoregression Hypothesis: Simple next-token prediction on sequence-to-sequence tasks will learn the correct distribution, generalizing from translation to every other domain.</p></li><li><p>The Scaling Hypothesis: If you have a large dataset and train a very large neural network, success is guaranteed.</p></li><li><p>The Connectionism Hypothesis: If you believe artificial neurons work like biological neurons, then very large neural networks can be "configured to do almost everything we humans do."</p><p></p></li></ul><p><strong>What Ilya Saw in 2024</strong></p><div id="youtube2-1yvBqasHLZs" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;1yvBqasHLZs&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/1yvBqasHLZs?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" 
height="409"></iframe></div></div><ul><li><p>The end of the pre-training era, comparing data to "AI's fossil fuel" as a finite resource.</p></li><li><p>AI systems will demonstrate "true autonomy" with stronger reasoning capabilities.</p></li><li><p>Finding new scaling patterns from human evolution.</p></li><li><p>Future outlook: Agents, synthetic data, inference time compute.</p></li></ul><p></p><h2><strong>Future</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!By8Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!By8Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp 424w, https://substackcdn.com/image/fetch/$s_!By8Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp 848w, https://substackcdn.com/image/fetch/$s_!By8Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp 1272w, https://substackcdn.com/image/fetch/$s_!By8Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!By8Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp" width="1456" height="405" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:405,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!By8Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp 424w, https://substackcdn.com/image/fetch/$s_!By8Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp 848w, https://substackcdn.com/image/fetch/$s_!By8Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp 1272w, https://substackcdn.com/image/fetch/$s_!By8Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74ba6f61-985a-4922-8a54-20ea4845440d_1600x445.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The end of the pre-training era is also a statement about the future. It has been the consensus for the past year; Ilya simply articulated it. 
Of course, this ending can also be seen as a bifurcation - one optimizing models under limited data for better efficiency, and another exploring new training methods.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ueI4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ueI4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp 424w, https://substackcdn.com/image/fetch/$s_!ueI4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp 848w, https://substackcdn.com/image/fetch/$s_!ueI4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp 1272w, https://substackcdn.com/image/fetch/$s_!ueI4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ueI4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp" width="670" height="378" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:378,&quot;width&quot;:670,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ueI4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp 424w, https://substackcdn.com/image/fetch/$s_!ueI4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp 848w, https://substackcdn.com/image/fetch/$s_!ueI4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp 1272w, https://substackcdn.com/image/fetch/$s_!ueI4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9458a3ce-088d-4eb6-b158-49d2c4cf938e_670x378.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The three future trends Ilya mentioned can be consolidated into two, because Agents and synthetic data are converging.</p><p><strong>Agents</strong> refers to super-intelligent agents with reasoning capabilities and self-awareness. Self-awareness here means being proactive: the agent reasons and makes decisions on its own initiative.</p><p><strong>Synthetic data </strong>- current large models all use synthetic training data, and many vendors describe their largest models as built specifically for synthetic data.</p><p>The goal behind synthetic data is to move beyond human data, allowing AI systems to self-iterate. It has several directions: one optimizes data quality, as phi-4 does for reasoning models; another generates personalized data for virtual characters to expand the boundaries of the data. The latter requires Agent participation. 
I believe synthetic data will gradually spread across the text, speech, image, and video domains, with agents inside the model taking part in the entire data synthesis process; hence the convergence of Agents and synthetic data.</p><p><strong>Inference-time compute</strong> is a further optimization of the o1 technical route.</p><p></p><h2><strong>Top Papers of the week</strong></h2><p>1). Training LLMs to Reason in a Continuous Latent Space ( <a href="https://arxiv.org/abs/2412.06769">paper</a> )</p><ul><li><p>Meta proposed Coconut (Chain of Continuous Thought), a novel paradigm enabling LLMs to reason in a continuous latent space rather than in natural language.</p></li><li><p>The authors argue that reasoning in a continuous latent space can strengthen LLMs' reasoning capabilities on complex tasks.</p></li><li><p>Their experiments show that it improves LLM performance on complex reasoning tasks.</p></li><li><p>Link: <a href="https://x.com/Ber18791531/status/1866561188664087017">Author's introduction tweet</a></p><p></p></li></ul><p>2). Phi-4 Technical Report ( <a href="https://arxiv.org/abs/2412.08905">paper</a> )</p><ul><li><p>Microsoft's phi-4, a small 14B-parameter model, outperforms many models, including Gemini Pro 1.5, on mathematical reasoning tasks.</p></li><li><p>The model's strength in reasoning tasks is attributed to improvements in synthetic data and post-training.</p></li><li><p>Comment: phi-4 illustrates a trend: small or vertical models are the future. It also reflects that the pre-training data wall is approaching, and that generating and using data will be the foundation of future AI progress.</p><p></p></li></ul><p>3). 
The Byte Latent Transformer (BLT) ( <a href="https://arxiv.org/abs/2412.05579">paper</a> )</p><ul><li><p>Proposed a byte-level language model architecture that matches the performance of token-based LLMs while improving efficiency and robustness.</p></li><li><p>Uses a dynamic, entropy-based method to group bytes into patches, allocating more compute to complex predictions and using larger patches for more predictable sequences.</p></li><li><p>Links: <a href="https://x.com/ArtidoroPagnoni/status/1867601413741981804">Author's tweet</a> and <a href="https://github.com/facebookresearch/blt">code</a></p><p></p></li></ul><p>4). Asynchronous Function Calling ( <a href="https://arxiv.org/abs/2412.07017">paper</a> )</p><ul><li><p>Proposed AsyncLM, a system for asynchronous LLM function calls.</p></li><li><p>The authors designed a context protocol for function calls and interrupts, provided a fine-tuning strategy to adapt to interrupt semantics, and efficiently implemented these mechanisms in LLM inference.</p></li><li><p>AsyncLM can reduce task completion latency by 1.6x to 5.4x compared to synchronous function calls.</p></li><li><p>It enables LLMs to generate and execute function calls concurrently.</p><p></p></li></ul><p>5). MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification ( <a href="https://arxiv.org/abs/2412.04494">paper</a> )</p><ul><li><p>Proposed MAG-V, a multi-agent framework.</p></li><li><p>It first generates datasets mimicking customer queries.</p></li><li><p>It then reverse-engineers alternative questions from agent responses to verify agent trajectories.</p></li><li><p>The authors report that the generated synthetic data can improve agent performance on real customer queries.</p></li><li><p>Comment: The combination of Agents and synthetic data is a trend.</p></li></ul><p></p><p>6). 
Clio: A Platform for Analyzing and Surfacing Private Aggregated Usage Patterns from Millions of Claude.ai Conversations ( <a href="https://assets.anthropic.com/m/7e1ab885d1b24176/original/Clio-Privacy-Preserving-Insights-into-Real-World-AI-Use.pdf">paper</a> )</p><ul><li><p>Anthropic introduced Clio, a platform that uses AI assistants to analyze and surface private, aggregated usage patterns from millions of Claude.ai conversations.</p></li><li><p>It makes it possible to understand real-world AI usage while protecting user privacy.</p><p>The system helps identify usage trends, security risks, and coordinated abuse attempts without human reviewers reading the original conversations.</p></li><li><p>Additional link: <a href="https://x.com/AnthropicAI/status/1867325199848550585">Anthropic tweet</a></p></li><li><p>Comment: The paper includes an analysis showing that programming-related cases account for four of the top use cases, totaling 23%, which suggests that programming is currently the most common AI usage scenario.</p><p></p></li></ul><p>7). AutoReason Improves Multi-step Reasoning ( <a href="https://arxiv.org/abs/2412.05579">paper</a> )</p><ul><li><p>Proposed a method that uses CoT prompting to automatically generate reasoning rationales for queries.</p></li><li><p>This transforms zero-shot queries into few-shot reasoning trajectories that the LLM uses as CoT examples.</p></li><li><p>The authors claim it can improve the reasoning capabilities of weaker LLMs.</p><p></p></li></ul><p>8). 
Densing Law of LLMs ( <a href="https://arxiv.org/abs/2412.04315">paper</a> )</p><ul><li><p>Introduced "capacity density" as a new metric for evaluating LLM quality, measuring model effectiveness and efficiency by comparing target models with reference models.</p></li><li><p>The research found that LLMs' capacity density follows a "density law," growing exponentially over time and roughly doubling every three months.</p></li><li><p>This finding offers a new perspective on LLM development, emphasizing the need to optimize computational efficiency while pursuing performance improvements.</p></li><li><p>Comment: The paper mentions a concept called "effective parameter size," which is the parameter size a reference model would need to achieve the same performance as the target model. This concept can be used to measure model efficiency.</p></li></ul><p></p><p>9). Turbo3D: Ultra-fast Text-to-3D Generation ( <a href="https://arxiv.org/abs/2412.04315">paper</a> )</p><ul><li><p>Introduced Turbo3D, an ultra-fast text-to-3D system capable of generating high-quality Gaussian splatting assets in less than a second.</p><p></p></li><li><p>Turbo3D employs a rapid four-step, four-view diffusion generator and an efficient feed-forward Gaussian reconstructor, both operating in latent space.</p><p></p></li></ul><p>10). A Survey on LLMs-as-Judges ( <a href="https://arxiv.org/abs/2412.05579">paper</a> )</p><ul><li><p>Presented a comprehensive survey exploring the LLMs-as-judges paradigm from five key perspectives: functionality, methodology, applications, meta-evaluation, and limitations.</p><p></p></li></ul><h2><strong>AIGC News of the week</strong></h2><p>1). <a href="https://github.com/Tencent/HunyuanVideo">HunyuanVideo</a></p><p>2). <a href="https://github.com/deepseek-ai/DeepSeek-VL2">DeepSeek-VL2</a></p><p>3). <a href="https://github.com/KwaiVGI/SynCamMaster">SynCamMaster</a></p><p>4). 
<a href="https://www.youtube.com/watch?v=z0wt2pe_LZM&amp;ab_channel=YCombinator">2024: The Year the GPT Wrapper Myth Proved Wrong</a></p><p>5). <a href="https://andrewkchan.dev/posts/yalm.html">Fast LLM Inference From Scratch</a></p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">AIGC Newsletter is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[AIGC Newsletter Black Friday Special: Lowest Price of the Year!]]></title><description><![CDATA[Best Subscription Opportunity of the Year!]]></description><link>https://aigc.news/p/aigc-newsletter-black-friday-special</link><guid isPermaLink="false">https://aigc.news/p/aigc-newsletter-black-friday-special</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Thu, 28 Nov 2024 14:31:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cf5487-863a-4479-8c5c-c5abb1e31139_250x250.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#26412;&#25991;&#20013;&#25991;&#29256;&#65306;<a href="https://pxiaoer.blog/2024/11/28/aigc-newsletter-update/">[&#20013;&#25991;&#29256;] </a></p><p></p><p>&#127919;<strong>&nbsp;Limited-Time Offer: Best 
Subscription Opportunity of the Year!  </strong> </p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p><p><strong>Dear AIGC Newsletter Readers,</strong></p><p>Exciting news! We are thrilled to announce the <strong>Black Friday Special</strong>, bringing you the <strong>lowest subscription price of 2024</strong>! This is the perfect opportunity to upgrade to full membership and unlock all premium content.</p><div><hr></div><h3>Content Upgrades and New Features</h3><p>We continue to innovate, bringing you higher-quality services and richer content. Here are the core modules you'll unlock with a subscription:</p><h3>&#10024; AIGC Journey</h3><p>- Weekly in-depth analysis of AIGC development trends  </p><p>- Practical guides and tutorials  </p><p>- Industry insights and frontier analysis  </p><p></p><h3> &#128202; AIGC Weekly</h3><p>- Curated weekly AI updates and news breakdowns  </p><p>- AI market trends and opportunity analysis  </p><p>- Recommendations and reviews of premium tools  </p><p>- New formats for deeper, more practical content  </p><p></p><h3>&#128293; Coming Soon: AI News Agent</h3><p>- Upgraded version of <a href="https://ainews.kol.tools/">AINews</a>  </p><p>- Daily AIGC news updates in real time  </p><p>- Personalized news filtering features  </p><p>- Stay ahead with the latest industry developments  </p><div><hr></div><h3>Why Subscribe Now?</h3><p>1. <strong>Exclusive Black Friday Pricing</strong>: Lowest price of the year, for this week only!  </p><p>2. <strong>Unlock All Premium Content</strong>: Gain access to in-depth articles and professional insights.  </p><p>3. 
<strong>Early Access to New Features</strong>: Be the first to experience updates and test new applications.  </p><p><strong>Don't miss this opportunity!</strong> This is our biggest discount of the year. Upgrade your membership to stay ahead in the fast-evolving AIGC field.</p><p></p><p>&#128073;&nbsp;<strong>Subscribe Now</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p><div><hr></div><h3>Our Core Philosophy</h3><p>Since AIGC Newsletter was launched two years ago, we have been committed to providing readers with high-quality information services. Back then, ChatGPT had yet to emerge, but the AIGC trend was already apparent. We sift through, curate, and refine massive amounts of information, sending it to your inbox weekly, so you can grasp key insights in the shortest time.</p><p>It&#8217;s important to note that <strong>we are not "selling information" but "organizing information."</strong></p><p>In this age of information overload, our value lies in helping you navigate complex content, distill clear trends, and go beyond simply passing along publicly available information.</p><p>Your subscription is more an acknowledgment of our service than a mere transaction of information. We firmly believe that the value of information lies in efficient curation and unique insights, which require significant time and effort.</p><p><strong>By subscribing to&nbsp;<a href="https://aigc.openbot.ai/">AIGC Newsletter</a>, you&#8217;re not just gaining information; you&#8217;re saving time and gaining clarity on trends.</strong></p><div><hr></div><h3>About Pricing and Services</h3><p>We understand that some may have concerns about the cost of information services.  
</p><p>In reality, while information can be copied infinitely, human time is invaluable. For this reason, we approach our information products with a service mindset, providing readers with high-value experiences at a reasonable price.  </p><p>As our subscriber base grows, we will continue to optimize costs to offer more affordable services to a broader audience.</p><p>If you share our philosophy, we welcome you to recommend our Newsletter to your friends. Your support and promotion are the driving forces behind our continuous progress.</p><div><hr></div><h3>Looking to the Future</h3><p>2024 is a year full of possibilities. Every update of the AIGC Newsletter is a record and witness to this era of transformation. From scarcity to abundance, we are experiencing a pivotal moment in human history and actively contributing to this great shift.</p><p>We believe the dawn of the era of abundance is upon us, and you will join us in shaping the future.  </p><p>Thank you for your attention and support. 
Let us embrace this hopeful tomorrow together!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Gratitude and Renewal: AIGC Newsletter Set to Relaunch]]></title><description><![CDATA[Giving Back with Gratitude: AIGC Newsletter Is About to Relaunch]]></description><link>https://aigc.news/p/gratitude-and-renewal-aigc-newsletter</link><guid isPermaLink="false">https://aigc.news/p/gratitude-and-renewal-aigc-newsletter</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 18 Nov 2024 09:54:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96cf5487-863a-4479-8c5c-c5abb1e31139_250x250.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Thank you for your support of the AIGC Newsletter. The revamped Newsletter will be updated soon, and AIGC Weekly will return next week. To express our gratitude, I have extended the subscription peri&#8230;</p>
      <p>
          <a href="https://aigc.news/p/gratitude-and-renewal-aigc-newsletter">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[AIGC Weekly | #89]]></title><description><![CDATA[AIGC Top Papers and AI news of the week]]></description><link>https://aigc.news/p/aigc-weekly-89</link><guid isPermaLink="false">https://aigc.news/p/aigc-weekly-89</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 14 Oct 2024 14:40:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!pyoE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pyoE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pyoE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!pyoE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!pyoE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!pyoE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pyoE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png" width="1200" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131786,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pyoE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!pyoE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!pyoE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!pyoE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce5ac404-f7ff-4f1a-abdd-6b24d2b09528_1200x600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Top Papers of the week (</strong>October 07 - October 13<strong>)</strong></h3><p>1.) <strong>Pyramidal Flow Matching for Efficient Video Generative Modeling ( <a href="https://pyramid-flow.github.io/">webpage</a> | <a href="https://arxiv.org/abs/2410.05954">paper</a> | <a href="https://huggingface.co/rain1011/pyramid-flow-sd3">model</a> | <a href="https://huggingface.co/spaces/Pyramid-Flow/pyramid-flow">demo</a> | <a href="https://github.com/jy0205/Pyramid-Flow">code</a> )</strong></p><p><em>This work introduces a unified pyramidal flow matching algorithm. 
It reinterprets the original denoising trajectory as a series of pyramid stages, where only the final stage operates at the full resolution, thereby enabling more efficient video generative modeling. Through our sophisticated design, the flows of different pyramid stages can be interlinked to maintain continuity. Moreover, we craft autoregressive video generation with a temporal pyramid to compress the full-resolution history. The entire framework can be optimized in an end-to-end manner and with a single unified Diffusion Transformer (DiT). Extensive experiments demonstrate that our method supports generating high-quality 5-second (up to 10-second) videos at 768p resolution and 24 FPS within 20.7k A100 GPU training hours.</em></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;57eba6cc-6976-41a7-8d63-fe2b2a9beaca&quot;,&quot;duration&quot;:null}"></div><p>2.) <strong>RL, but don't do anything I wouldn't do ( <a href="https://arxiv.org/abs/2410.06213">paper</a> )</strong></p><p><em>In reinforcement learning, if the agent's reward differs from the designers' true utility, even only rarely, the state distribution resulting from the agent's policy can be very bad, in theory and in practice. When RL policies would devolve into undesired behavior, a common countermeasure is KL regularization to a trusted policy ("Don't do anything I wouldn't do"). All current cutting-edge language models are RL agents that are KL-regularized to a "base policy" that is purely predictive. Unfortunately, we demonstrate that when this base policy is a Bayesian predictive model of a trusted policy, the KL constraint is no longer reliable for controlling the behavior of an advanced RL agent. 
We demonstrate this theoretically using algorithmic information theory, and while systems today are too weak to exhibit this theorized failure precisely, we RL-finetune a language model and find evidence that our formal results are plausibly relevant in practice. We also propose a theoretical alternative that avoids this problem by replacing the "Don't do anything I wouldn't do" principle with "Don't do anything I mightn't do".</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jkew!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jkew!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png 424w, https://substackcdn.com/image/fetch/$s_!jkew!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png 848w, https://substackcdn.com/image/fetch/$s_!jkew!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png 1272w, https://substackcdn.com/image/fetch/$s_!jkew!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!jkew!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png" width="536" height="441.35823429541597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:970,&quot;width&quot;:1178,&quot;resizeWidth&quot;:536,&quot;bytes&quot;:585035,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jkew!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png 424w, https://substackcdn.com/image/fetch/$s_!jkew!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png 848w, https://substackcdn.com/image/fetch/$s_!jkew!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png 1272w, https://substackcdn.com/image/fetch/$s_!jkew!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea461cb0-d23c-45c9-afa6-2f916c3a07db_1178x970.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>3.) <strong>MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering ( <a href="https://arxiv.org/abs/2410.07095">paper</a> | <a href="https://github.com/openai/mle-bench/">code</a> )</strong></p><p><em>We introduce MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering. To this end, we curate 75 ML engineering-related competitions from Kaggle, creating a diverse set of challenging tasks that test real-world ML engineering skills such as training models, preparing datasets, and running experiments. We establish human baselines for each competition using Kaggle's publicly available leaderboards. 
We use open-source agent scaffolds to evaluate several frontier language models on our benchmark, finding that the best-performing setup--OpenAI's o1-preview with AIDE scaffolding--achieves at least the level of a Kaggle bronze medal in 16.9% of competitions. In addition to our main results, we investigate various forms of resource scaling for AI agents and the impact of contamination from pre-training.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dgEY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dgEY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png 424w, https://substackcdn.com/image/fetch/$s_!dgEY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png 848w, https://substackcdn.com/image/fetch/$s_!dgEY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png 1272w, https://substackcdn.com/image/fetch/$s_!dgEY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dgEY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png" 
width="596" height="310.2174688057041" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:584,&quot;width&quot;:1122,&quot;resizeWidth&quot;:596,&quot;bytes&quot;:130471,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dgEY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png 424w, https://substackcdn.com/image/fetch/$s_!dgEY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png 848w, https://substackcdn.com/image/fetch/$s_!dgEY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png 1272w, https://substackcdn.com/image/fetch/$s_!dgEY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b73adab-27ad-4a6a-99aa-a6fea2d73628_1122x584.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" 
stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>4.) <strong>Aria: An Open Multimodal Native Mixture-of-Experts Model ( <a href="https://arxiv.org/abs/2410.05993">paper</a> | <a href="https://github.com/rhymes-ai/Aria">code</a> )</strong></p><p><em>Information comes in diverse modalities. Multimodal native AI models are essential to integrate real-world information and deliver comprehensive understanding. While proprietary multimodal native models exist, their lack of openness imposes obstacles for adoptions, let alone adaptations. To fill this gap, we introduce Aria, an open multimodal native model with best-in-class performance across a wide range of multimodal, language, and coding tasks. Aria is a mixture-of-expert model with 3.9B and 3.5B activated parameters per visual token and text token, respectively. It outperforms Pixtral-12B and Llama3.2-11B, and is competitive against the best proprietary models on various multimodal tasks. 
We pre-train Aria from scratch following a 4-stage pipeline, which progressively equips the model with strong capabilities in language understanding, multimodal understanding, long context window, and instruction following. We open-source the model weights along with a codebase that facilitates easy adoptions and adaptations of Aria in real-world applications.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8rNG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8rNG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png 424w, https://substackcdn.com/image/fetch/$s_!8rNG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png 848w, https://substackcdn.com/image/fetch/$s_!8rNG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png 1272w, https://substackcdn.com/image/fetch/$s_!8rNG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8rNG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png" width="620" height="384.6101694915254" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a1426c1f-0a28-4737-818a-6817deec2265_1180x732.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:732,&quot;width&quot;:1180,&quot;resizeWidth&quot;:620,&quot;bytes&quot;:214625,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8rNG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png 424w, https://substackcdn.com/image/fetch/$s_!8rNG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png 848w, https://substackcdn.com/image/fetch/$s_!8rNG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png 1272w, https://substackcdn.com/image/fetch/$s_!8rNG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1426c1f-0a28-4737-818a-6817deec2265_1180x732.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>5.) <strong>ToolGen: Unified Tool Retrieval and Calling via Generation ( <a href="https://arxiv.org/abs/2410.03439">paper</a> )</strong></p><p><em>We introduce ToolGen, a paradigm shift that integrates tool knowledge directly into the LLM's parameters by representing each tool as a unique token. This enables the LLM to generate tool calls and arguments as part of its next token prediction capabilities, seamlessly blending tool invocation with language generation. Our framework allows the LLM to access and utilize a vast amount of tools with no additional retrieval step, significantly enhancing both performance and scalability. Experimental results with over 47,000 tools show that ToolGen not only achieves superior results in both tool retrieval and autonomous task completion but also sets the stage for a new era of AI agents that can adapt to tools across diverse domains. 
By fundamentally transforming tool retrieval into a generative process, ToolGen paves the way for more versatile, efficient, and autonomous AI systems. ToolGen enables end-to-end tool learning and opens opportunities for integration with other advanced techniques such as chain-of-thought and reinforcement learning, thereby expanding the practical capabilities of LLMs.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D0a5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D0a5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png 424w, https://substackcdn.com/image/fetch/$s_!D0a5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png 848w, https://substackcdn.com/image/fetch/$s_!D0a5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png 1272w, https://substackcdn.com/image/fetch/$s_!D0a5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!D0a5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png" width="606" 
height="405.0802139037433" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:750,&quot;width&quot;:1122,&quot;resizeWidth&quot;:606,&quot;bytes&quot;:198564,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!D0a5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png 424w, https://substackcdn.com/image/fetch/$s_!D0a5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png 848w, https://substackcdn.com/image/fetch/$s_!D0a5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png 1272w, https://substackcdn.com/image/fetch/$s_!D0a5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4360eec9-0b19-407b-bf7d-fa21a82b0b33_1122x750.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>6.) <strong>Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition ( <a href="https://arxiv.org/abs/2410.05603">paper</a> )</strong></p><p><em>Large Language Models (LLMs) have demonstrated remarkable in-context learning (ICL) capabilities. In this study, we explore a surprising phenomenon related to ICL: LLMs can perform multiple, computationally distinct ICL tasks simultaneously, during a single inference call, a capability we term "task superposition". We provide empirical evidence of this phenomenon across various LLM families and scales and show that this phenomenon emerges even if we train the model to in-context learn one task at a time. We offer theoretical explanations that this capability is well within the expressive power of transformers. We also explore how LLMs internally compose task vectors during superposition. 
Furthermore, we show that larger models can solve more ICL tasks in parallel, and better calibrate their output distribution. Our findings offer insights into the latent capabilities of LLMs, further substantiate the perspective of "LLMs as superposition of simulators", and raise questions about the mechanisms enabling simultaneous task execution.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sS_2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sS_2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png 424w, https://substackcdn.com/image/fetch/$s_!sS_2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png 848w, https://substackcdn.com/image/fetch/$s_!sS_2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png 1272w, https://substackcdn.com/image/fetch/$s_!sS_2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sS_2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png" width="508" height="521.2062391681109" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5576074-8f53-40a6-81db-911b7599255e_1154x1184.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1184,&quot;width&quot;:1154,&quot;resizeWidth&quot;:508,&quot;bytes&quot;:365805,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sS_2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png 424w, https://substackcdn.com/image/fetch/$s_!sS_2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png 848w, https://substackcdn.com/image/fetch/$s_!sS_2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png 1272w, https://substackcdn.com/image/fetch/$s_!sS_2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5576074-8f53-40a6-81db-911b7599255e_1154x1184.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>7.) <strong>Emergent properties with repeated examples ( <a href="https://arxiv.org/abs/2410.07041">paper</a> )</strong></p><p><em>We study the performance of transformers as a function of the number of repetitions of training examples with algorithmically generated datasets. On three problems of mathematics: the greatest common divisor, modular multiplication, and matrix eigenvalues, we show that for a fixed number of training steps, models trained on smaller sets of repeated examples outperform models trained on larger sets of single-use examples. We also demonstrate that two-set training - repeated use of a small random subset of examples, along normal sampling on the rest of the training set - provides for faster learning and better performance. This highlights that the benefits of repetition can outweigh those of data diversity. 
These datasets and problems provide a controlled setting to shed light on the still poorly understood interplay between generalization and memorization in deep learning.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VAsi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VAsi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png 424w, https://substackcdn.com/image/fetch/$s_!VAsi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png 848w, https://substackcdn.com/image/fetch/$s_!VAsi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png 1272w, https://substackcdn.com/image/fetch/$s_!VAsi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VAsi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png" width="614" height="398.938704028021" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:742,&quot;width&quot;:1142,&quot;resizeWidth&quot;:614,&quot;bytes&quot;:385447,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VAsi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png 424w, https://substackcdn.com/image/fetch/$s_!VAsi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png 848w, https://substackcdn.com/image/fetch/$s_!VAsi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png 1272w, https://substackcdn.com/image/fetch/$s_!VAsi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6824d27-e9fa-432d-b7d3-7cac42906441_1142x742.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>8.) <strong>Round and Round We Go! What makes Rotary Positional Encodings useful? ( <a href="https://arxiv.org/abs/2410.06205">paper</a> )</strong></p><p><em>Positional Encodings (PEs) are a critical component of Transformer-based Large Language Models (LLMs), providing the attention mechanism with important sequence-position information. One of the most popular types of encoding used today in LLMs are Rotary Positional Encodings (RoPE), that rotate the queries and keys based on their relative distance. A common belief is that RoPE is useful because it helps to decay token dependency as relative distance increases. In this work, we argue that this is unlikely to be the core reason. We study the internals of a trained Gemma 7B model to understand how RoPE is being used at a mechanical level. We find that Gemma learns to use RoPE to construct robust "positional" attention patterns by exploiting the highest frequencies. 
We also find that, in general, Gemma greatly prefers to use the lowest frequencies of RoPE, which we suspect are used to carry semantic information. We mathematically prove interesting behaviours of RoPE and conduct experiments to verify our findings, proposing a modification of RoPE that fixes some highlighted issues and improves performance. We believe that this work represents an interesting step in better understanding PEs in LLMs, which we believe holds crucial value for scaling LLMs to large sizes and context lengths.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zBS1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zBS1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png 424w, https://substackcdn.com/image/fetch/$s_!zBS1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png 848w, https://substackcdn.com/image/fetch/$s_!zBS1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png 1272w, https://substackcdn.com/image/fetch/$s_!zBS1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!zBS1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png" width="574" height="368.207381370826" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:730,&quot;width&quot;:1138,&quot;resizeWidth&quot;:574,&quot;bytes&quot;:156856,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zBS1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png 424w, https://substackcdn.com/image/fetch/$s_!zBS1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png 848w, https://substackcdn.com/image/fetch/$s_!zBS1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png 1272w, https://substackcdn.com/image/fetch/$s_!zBS1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffae83e7c-0e98-4d6b-9d8a-8d074658097a_1138x730.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>9.) <strong>Strong Model Collapse ( <a href="https://arxiv.org/abs/2410.04840">paper</a> )</strong></p><p><em>Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised regression setting and establish the existence of a strong form of the model collapse phenomenon, a critical performance degradation due to synthetic data in the training corpus. Our results show that even the smallest fraction of synthetic data (e.g., as little as 1% of the total training dataset) can still lead to model collapse: larger and larger training sets do not enhance performance. 
We further investigate whether increasing model size, an approach aligned with current trends in training large language models, exacerbates or mitigates model collapse. In a simplified regime where neural networks are approximated via random projections of tunable size, we both theoretically and empirically show that larger models can amplify model collapse. Interestingly, our theory also indicates that, beyond the interpolation threshold (which can be extremely high for very large datasets), larger models may mitigate the collapse, although they do not entirely prevent it. Our theoretical findings are empirically verified through experiments on language models and feed-forward neural networks for images.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5oWE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5oWE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png 424w, https://substackcdn.com/image/fetch/$s_!5oWE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png 848w, https://substackcdn.com/image/fetch/$s_!5oWE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png 1272w, 
https://substackcdn.com/image/fetch/$s_!5oWE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5oWE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png" width="630" height="503.11733800350265" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:912,&quot;width&quot;:1142,&quot;resizeWidth&quot;:630,&quot;bytes&quot;:317746,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5oWE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png 424w, https://substackcdn.com/image/fetch/$s_!5oWE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png 848w, https://substackcdn.com/image/fetch/$s_!5oWE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png 1272w, 
https://substackcdn.com/image/fetch/$s_!5oWE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39059065-4af8-40c9-97f2-e34f4218a780_1142x912.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>10.) <strong>Self-Boosting Large Language Models with Synthetic Preference Data ( <a href="https://arxiv.org/abs/2410.06961">paper</a> )</strong></p><p><em>Through alignment with human preferences, Large Language Models (LLMs) have advanced significantly in generating honest, harmless, and helpful responses. 
However, collecting high-quality preference data is a resource-intensive and creativity-demanding process, especially for the continual improvement of LLMs. We introduce SynPO, a self-boosting paradigm that leverages synthetic preference data for model alignment. SynPO employs an iterative mechanism wherein a self-prompt generator creates diverse prompts, and a response improver refines model responses progressively. This approach trains LLMs to autonomously learn the generative rewards for their own outputs and eliminates the need for large-scale annotation of prompts and human preferences. After four SynPO iterations, Llama3-8B and Mistral-7B show significant enhancements in instruction-following abilities, achieving over 22.1% win rate improvements on AlpacaEval 2.0 and ArenaHard. Simultaneously, SynPO improves the general performance of LLMs on various tasks, validated by a 3.2 to 5.0 average score increase on the well-recognized Open LLM leaderboard.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s4Gn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s4Gn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png 424w, https://substackcdn.com/image/fetch/$s_!s4Gn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png 848w, 
https://substackcdn.com/image/fetch/$s_!s4Gn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png 1272w, https://substackcdn.com/image/fetch/$s_!s4Gn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s4Gn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png" width="656" height="336.9502407704655" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1246,&quot;resizeWidth&quot;:656,&quot;bytes&quot;:192252,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s4Gn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png 424w, https://substackcdn.com/image/fetch/$s_!s4Gn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png 848w, 
https://substackcdn.com/image/fetch/$s_!s4Gn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png 1272w, https://substackcdn.com/image/fetch/$s_!s4Gn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bcb00a0-de9a-4246-98fa-2416aefe5a52_1246x640.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>AIGC News of the week (October 07 - October 13)</h3>
<p>1.) The Nobel Prize in Physics 2024 ( <a href="https://www.nobelprize.org/all-nobel-prizes-2024/">link</a> )</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z3Dm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z3Dm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png 424w, https://substackcdn.com/image/fetch/$s_!Z3Dm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png 848w, https://substackcdn.com/image/fetch/$s_!Z3Dm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png 1272w, https://substackcdn.com/image/fetch/$s_!Z3Dm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z3Dm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png" width="610" height="421.0508241758242" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1005,&quot;width&quot;:1456,&quot;resizeWidth&quot;:610,&quot;bytes&quot;:867706,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Z3Dm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png 424w, https://substackcdn.com/image/fetch/$s_!Z3Dm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png 848w, https://substackcdn.com/image/fetch/$s_!Z3Dm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png 1272w, https://substackcdn.com/image/fetch/$s_!Z3Dm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a93d399-14b3-4aec-bc6d-a420fea911f1_1506x1040.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GL_3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GL_3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png 424w, https://substackcdn.com/image/fetch/$s_!GL_3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png 848w, 
https://substackcdn.com/image/fetch/$s_!GL_3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png 1272w, https://substackcdn.com/image/fetch/$s_!GL_3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GL_3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png" width="600" height="275.27472527472526" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:668,&quot;width&quot;:1456,&quot;resizeWidth&quot;:600,&quot;bytes&quot;:1141577,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GL_3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png 424w, https://substackcdn.com/image/fetch/$s_!GL_3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png 848w, 
https://substackcdn.com/image/fetch/$s_!GL_3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png 1272w, https://substackcdn.com/image/fetch/$s_!GL_3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31fddcb0-8a6e-44eb-8ff6-ff0193f9d74b_2136x980.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>2.) Machines of Loving Grace<strong> ( <a 
href="https://darioamodei.com/machines-of-loving-grace">link</a> )</strong></p><p>3.) F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching ( <a href="https://github.com/SWivid/F5-TTS">repo</a> )</p><p>4.) swarm: Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team ( <a href="https://github.com/openai/swarm">repo</a> ) </p><p>5.) evaluation-guidebook ( <a href="https://github.com/huggingface/evaluation-guidebook">link</a> )</p><p></p><p>more AIGC News: <a href="https://ainews.kol.tools/">AINews</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q_Aq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q_Aq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png 424w, https://substackcdn.com/image/fetch/$s_!Q_Aq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png 848w, https://substackcdn.com/image/fetch/$s_!Q_Aq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png 1272w, https://substackcdn.com/image/fetch/$s_!Q_Aq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Q_Aq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png" width="544" height="600.7912087912088" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1608,&quot;width&quot;:1456,&quot;resizeWidth&quot;:544,&quot;bytes&quot;:424826,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q_Aq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png 424w, https://substackcdn.com/image/fetch/$s_!Q_Aq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png 848w, https://substackcdn.com/image/fetch/$s_!Q_Aq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png 1272w, https://substackcdn.com/image/fetch/$s_!Q_Aq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b6efaf7-7360-4b8f-9a2d-922284976b94_1472x1626.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[AIGC Weekly | #88]]></title><description><![CDATA[AIGC Top Papers and AI news of the week]]></description><link>https://aigc.news/p/aigc-weekly-88</link><guid isPermaLink="false">https://aigc.news/p/aigc-weekly-88</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 07 Oct 2024 15:00:12 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!187o!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!187o!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!187o!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!187o!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!187o!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!187o!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!187o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png" width="1200" height="600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131892,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!187o!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!187o!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!187o!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!187o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd42d098d-9c45-44c7-a243-b39110ac712b_1200x600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em><strong>Update Notice: </strong></em></p><p><em>Hello everyone, I'm <a href="https://x.com/pxiaoer">Pxiaoer</a>. Starting with <a href="https://aigc.openbot.ai/p/aigc-weekly-88">AIGC Weekly #88</a>, the AIGC Newsletter will be updated twice a week. </em></p><p><em>AIGC Weekly will be released every Monday, and an AI technology article will be published every Thursday. </em></p><p><em>You are welcome to <a href="https://aigc.openbot.ai/">subscribe</a>!</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aigc.news/subscribe?"><span>Subscribe now</span></a></p><p></p><h3><strong>Top Papers of the week (</strong>September 30 - October 06<strong>)</strong></h3><p>1.) 
<strong>Movie Gen: A Cast of Media Foundation Models ( <a href="https://ai.meta.com/research/movie-gen/">webpage</a> | <a href="https://ai.meta.com/static-resource/movie-gen-research-paper">paper</a> )</strong></p><p><em>We present Movie Gen, a cast of foundation models that generates high-quality, 1080p HD videos with different aspect ratios and synchronized audio. We also show additional capabilities such as precise instruction-based video editing and generation of personalized videos based on a user&#8217;s image. Our models set a new state-of-the-art on multiple tasks: text-to-video synthesis, video personalization, video editing, video-to-audio generation, and text-to-audio generation. Our largest video generation model is a 30B parameter transformer trained with a maximum context length of 73K video tokens, corresponding to a generated video of 16 seconds at 16 frames-per-second. We show multiple technical innovations and simplifications on the architecture, latent spaces, training objectives and recipes, data curation, evaluation protocols, parallelization techniques, and inference optimizations that allow us to reap the benefits of scaling pre-training data, model size, and training compute for training large scale media generation models. 
We hope this paper helps the research community to accelerate progress and innovation in media generation models.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e-Kb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e-Kb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png 424w, https://substackcdn.com/image/fetch/$s_!e-Kb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png 848w, https://substackcdn.com/image/fetch/$s_!e-Kb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png 1272w, https://substackcdn.com/image/fetch/$s_!e-Kb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e-Kb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png" width="520" height="385.4424040066778" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1332,&quot;width&quot;:1797,&quot;resizeWidth&quot;:520,&quot;bytes&quot;:2083475,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e-Kb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png 424w, https://substackcdn.com/image/fetch/$s_!e-Kb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png 848w, https://substackcdn.com/image/fetch/$s_!e-Kb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png 1272w, https://substackcdn.com/image/fetch/$s_!e-Kb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44d2fd37-887f-4056-8620-fd83504c2f47_1797x1332.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>2.) <strong>Were RNNs All We Needed? ( <a href="https://arxiv.org/abs/2410.01201">paper</a> )</strong></p><p><em>The scalability limitations of Transformers regarding sequence length have renewed interest in recurrent sequence models that are parallelizable during training. As a result, many novel recurrent architectures, such as S4, Mamba, and Aaren, have been proposed that achieve comparable performance. In this work, we revisit traditional recurrent neural networks (RNNs) from over a decade ago: LSTMs (1997) and GRUs (2014). While these models were slow due to requiring to backpropagate through time (BPTT), we show that by removing their hidden state dependencies from their input, forget, and update gates, LSTMs and GRUs no longer need to BPTT and can be efficiently trained in parallel. 
Building on this, we introduce minimal versions (minLSTMs and minGRUs) that (1) use significantly fewer parameters than their traditional counterparts and (2) are fully parallelizable during training (175x faster for a sequence of length 512). Lastly, we show that these stripped-down versions of decade-old RNNs match the empirical performance of recent sequence models.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k79t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k79t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg 424w, https://substackcdn.com/image/fetch/$s_!k79t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg 848w, https://substackcdn.com/image/fetch/$s_!k79t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!k79t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k79t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg" width="1456" height="489" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:489,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k79t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg 424w, https://substackcdn.com/image/fetch/$s_!k79t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg 848w, https://substackcdn.com/image/fetch/$s_!k79t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!k79t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcb2e367-b429-4db1-a4d5-4660d3ee9e22_1758x590.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>3.) <strong>ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation (<a href="https://comfygen-paper.github.io/">webpage</a> | <a href="https://arxiv.org/abs/2410.01731">paper</a> )</strong></p><p><em>The practical use of text-to-image generation has evolved from simple, monolithic models to complex workflows that combine multiple specialized components. While workflow-based approaches can lead to improved image quality, crafting effective workflows requires significant expertise, owing to the large number of available components, their complex inter-dependence, and their dependence on the generation prompt. Here, we introduce the novel task of prompt-adaptive workflow generation, where the goal is to automatically tailor a workflow to each user prompt. We propose two LLM-based approaches to tackle this task: a tuning-based method that learns from user-preference data, and a training-free method that uses the LLM to select existing flows. 
Both approaches lead to improved image quality when compared to monolithic models or generic, prompt-independent workflows. Our work shows that prompt-dependent flow prediction offers a new pathway to improving text-to-image generation quality, complementing existing research directions in the field.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wsrL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wsrL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png 424w, https://substackcdn.com/image/fetch/$s_!wsrL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png 848w, https://substackcdn.com/image/fetch/$s_!wsrL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png 1272w, https://substackcdn.com/image/fetch/$s_!wsrL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wsrL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png" width="564" height="389.69094138543517" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e168c189-3833-4736-8223-60e9992ef677_1126x778.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:778,&quot;width&quot;:1126,&quot;resizeWidth&quot;:564,&quot;bytes&quot;:745461,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wsrL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png 424w, https://substackcdn.com/image/fetch/$s_!wsrL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png 848w, https://substackcdn.com/image/fetch/$s_!wsrL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png 1272w, https://substackcdn.com/image/fetch/$s_!wsrL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe168c189-3833-4736-8223-60e9992ef677_1126x778.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>4.) <strong>PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation ( <a href="https://stevenlsw.github.io/physgen/">webpage</a> | <a href="https://arxiv.org/abs/2409.18964">paper</a> )</strong></p><p><em>We present PhysGen, a novel image-to-video generation method that converts a single image and an input condition (e.g., force and torque applied to an object in the image) to produce a realistic, physically plausible, and temporally consistent video. Our key insight is to integrate model-based physical simulation with a data-driven video generation process, enabling plausible image-space dynamics. 
At the heart of our system are three core components: (i) an image understanding module that effectively captures the geometry, materials, and physical parameters of the image; (ii) an image-space dynamics simulation model that utilizes rigid-body physics and inferred parameters to simulate realistic behaviors; and (iii) an image-based rendering and refinement module that leverages generative video diffusion to produce realistic video footage featuring the simulated motion. The resulting videos are realistic in both physics and appearance and are even precisely controllable, showcasing superior results over existing data-driven image-to-video generation works through quantitative comparison and comprehensive user study. PhysGen's resulting videos can be used for various downstream applications, such as turning an image into a realistic animation or allowing users to interact with the image and create various dynamics.</em></p><div id="youtube2-lCc1rHePEFQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;lCc1rHePEFQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/lCc1rHePEFQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p></p><p>5.) <strong>Not All LLM Reasoners Are Created Equal ( <a href="https://arxiv.org/abs/2410.01748">paper</a> )</strong></p><p><em>We study the depth of grade-school math (GSM) problem-solving capabilities of LLMs. To this end, we evaluate their performance on pairs of existing math word problems together so that the answer to the second problem depends on correctly answering the first problem. 
Our findings reveal a significant reasoning gap in most LLMs, that is performance difference between solving the compositional pairs and solving each question independently. This gap is more pronounced in smaller, more cost-efficient, and math-specialized models. Moreover, instruction-tuning recipes and code generation have varying effects across LLM sizes, while finetuning on GSM can lead to task overfitting. Our analysis indicates that large reasoning gaps are not because of test-set leakage, but due to distraction from additional context and poor second-hop reasoning. Overall, LLMs exhibit systematic differences in their reasoning abilities, despite what their performance on standard benchmarks indicates.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CwIq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CwIq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png 424w, https://substackcdn.com/image/fetch/$s_!CwIq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png 848w, https://substackcdn.com/image/fetch/$s_!CwIq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png 1272w, 
https://substackcdn.com/image/fetch/$s_!CwIq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CwIq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png" width="566" height="367.15743440233234" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:890,&quot;width&quot;:1372,&quot;resizeWidth&quot;:566,&quot;bytes&quot;:213892,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CwIq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png 424w, https://substackcdn.com/image/fetch/$s_!CwIq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png 848w, https://substackcdn.com/image/fetch/$s_!CwIq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png 1272w, 
https://substackcdn.com/image/fetch/$s_!CwIq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70584fdc-e21a-4ceb-afc8-82797bd24e73_1372x890.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>6.) <strong>A generative framework to bridge data-driven models and scientific theories in language neuroscience ( <a href="https://arxiv.org/abs/2410.00812">paper</a> )</strong></p><p><em>Representations from large language models are highly effective at predicting BOLD fMRI responses to language stimuli. 
However, these representations are largely opaque: it is unclear what features of the language stimulus drive the response in each brain area. We present generative explanation-mediated validation, a framework for generating concise explanations of language selectivity in the brain and then validating those explanations in follow-up experiments that use synthetic stimuli. This approach is successful at explaining selectivity in both individual voxels and cortical regions of interest (ROIs). We show that explanatory accuracy is closely related to the predictive power and stability of the underlying statistical models. These results demonstrate that LLMs can be used to bridge the widening gap between data-driven models and formal scientific theories.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!16S-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!16S-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png 424w, https://substackcdn.com/image/fetch/$s_!16S-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png 848w, https://substackcdn.com/image/fetch/$s_!16S-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png 1272w, 
https://substackcdn.com/image/fetch/$s_!16S-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!16S-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png" width="594" height="615.8135593220339" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1468,&quot;width&quot;:1416,&quot;resizeWidth&quot;:594,&quot;bytes&quot;:548183,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!16S-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png 424w, https://substackcdn.com/image/fetch/$s_!16S-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png 848w, https://substackcdn.com/image/fetch/$s_!16S-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png 1272w, 
https://substackcdn.com/image/fetch/$s_!16S-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb022a89b-2011-476a-9a29-d77bc763cb43_1416x1468.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>7.) 
<strong>DiaSynth -- Synthetic Dialogue Generation Framework ( <a href="https://arxiv.org/abs/2409.19020">paper</a> )</strong></p><p><em>The scarcity of domain-specific dialogue datasets across various domains, from academic topics to everyday conversations, limits the development of dialogue systems for various applications. Existing research is often constrained either by dialogue datasets that are too general or by niche domain dialogue datasets whose scale does not match the required scale for training dialogue systems. To address this gap, we introduce DiaSynth, a synthetic dialogue generation framework capable of generating high-quality, contextually rich dialogues across a wide range of domains. Our approach differs from existing frameworks by dynamically generating dialogues that incorporate simulated personas, subtopics, and diverse conversational characteristics, using a Large Language Model (LLM) with Chain of Thought (CoT) reasoning to create contextually rich, domain-specific dialogues that closely mimic natural human interactions. DiaSynth produces tailored dialogues that emulate realistic conversations. We perform our experiments by generating synthetic data using different LLMs and few-shot examples from DialogSum and SAMSum. The pretrained language models fine-tuned on the synthetic data outperform the base models by 16.47%, while the comparison between models fine-tuned on in-domain data and synthetic data shows that the synthetic data is able to capture 90.48% of the distribution of the in-domain data. The quality of the data generated also scales with the size of LLMs. 
These results validate DiaSynth's potential as a robust alternative to traditional data collection methods.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ES0M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ES0M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png 424w, https://substackcdn.com/image/fetch/$s_!ES0M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png 848w, https://substackcdn.com/image/fetch/$s_!ES0M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!ES0M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ES0M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png" width="540" height="503.8427947598253" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1282,&quot;width&quot;:1374,&quot;resizeWidth&quot;:540,&quot;bytes&quot;:149949,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ES0M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png 424w, https://substackcdn.com/image/fetch/$s_!ES0M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png 848w, https://substackcdn.com/image/fetch/$s_!ES0M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!ES0M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49be5623-73af-45ef-9854-9a3ddedae4f1_1374x1282.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>8.) <strong>Video Instruction Tuning With Synthetic Data ( <a href="https://llava-vl.github.io/blog/2024-09-30-llava-video/">webpage</a> | <a href="https://arxiv.org/abs/2410.02713">paper</a> )</strong></p><p><em>The development of video large multimodal models (LMMs) has been hindered by the difficulty of curating large amounts of high-quality raw data from the web. To address this, we propose an alternative approach by creating a high-quality synthetic dataset specifically for video instruction-following, namely LLaVA-Video-178K. This dataset includes key tasks such as detailed captioning, open-ended question-answering (QA), and multiple-choice QA. By training on this dataset, in combination with existing visual instruction tuning data, we introduce LLaVA-Video, a new video LMM. Our experiments demonstrate that LLaVA-Video achieves strong performance across various video benchmarks, highlighting the effectiveness of our dataset. 
We plan to release the dataset, its generation pipeline, and the model checkpoints.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CRkT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CRkT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png 424w, https://substackcdn.com/image/fetch/$s_!CRkT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png 848w, https://substackcdn.com/image/fetch/$s_!CRkT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png 1272w, https://substackcdn.com/image/fetch/$s_!CRkT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CRkT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png" width="648" height="488.4923076923077" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:882,&quot;width&quot;:1170,&quot;resizeWidth&quot;:648,&quot;bytes&quot;:321747,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CRkT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png 424w, https://substackcdn.com/image/fetch/$s_!CRkT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png 848w, https://substackcdn.com/image/fetch/$s_!CRkT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png 1272w, https://substackcdn.com/image/fetch/$s_!CRkT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b6e8d05-930f-4c41-b72a-26bf15548eba_1170x882.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>9.) <strong>Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models ( <a href="https://arxiv.org/abs/2410.02740">paper</a> )</strong></p><p><em>Recent advancements in multimodal models highlight the value of rewritten captions for improving performance, yet key challenges remain. For example, while synthetic captions often provide superior quality and image-text alignment, it is not clear whether they can fully replace AltTexts: the role of synthetic captions and their interaction with original web-crawled AltTexts in pre-training is still not well understood. Moreover, different multimodal foundation models may have unique preferences for specific caption formats, but efforts to identify the optimal captions for each model remain limited. In this work, we propose a novel, controllable, and scalable captioning pipeline designed to generate diverse caption formats tailored to various multimodal models. 
By examining Short Synthetic Captions (SSC) towards Dense Synthetic Captions (DSC+) as case studies, we systematically explore their effects and interactions with AltTexts across models such as CLIP, multimodal LLMs, and diffusion models. Our findings reveal that a hybrid approach that keeps both synthetic captions and AltTexts can outperform the use of synthetic captions alone, improving both alignment and performance, with each model demonstrating preferences for particular caption formats. This comprehensive analysis provides valuable insights into optimizing captioning strategies, thereby advancing the pre-training of multimodal foundation models.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kjLA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kjLA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png 424w, https://substackcdn.com/image/fetch/$s_!kjLA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png 848w, https://substackcdn.com/image/fetch/$s_!kjLA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png 1272w, https://substackcdn.com/image/fetch/$s_!kjLA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kjLA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png" width="640" height="401.14285714285717" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:702,&quot;width&quot;:1120,&quot;resizeWidth&quot;:640,&quot;bytes&quot;:437699,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kjLA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png 424w, https://substackcdn.com/image/fetch/$s_!kjLA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png 848w, https://substackcdn.com/image/fetch/$s_!kjLA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png 1272w, https://substackcdn.com/image/fetch/$s_!kjLA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07c686a1-47e9-461e-9ae7-91dbea3a209c_1120x702.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>10.) <strong>Evaluation of OpenAI o1: Opportunities and Challenges of AGI ( <a href="https://arxiv.org/abs/2409.18486">paper</a> )</strong></p><p><em>This comprehensive study evaluates the performance of OpenAI's o1-preview large language model across a diverse array of complex reasoning tasks, spanning multiple domains, including computer science, mathematics, natural sciences, medicine, linguistics, and social sciences. Through rigorous testing, o1-preview demonstrated remarkable capabilities, often achieving human-level or superior performance in areas ranging from coding challenges to scientific reasoning and from language processing to creative problem-solving. 
Key findings include:<br>- 83.3% success rate in solving complex competitive programming problems, surpassing many human experts.<br>- Superior ability in generating coherent and accurate radiology reports, outperforming other evaluated models.<br>- 100% accuracy in high school-level mathematical reasoning tasks, providing detailed step-by-step solutions.<br>- Advanced natural language inference capabilities across general and specialized domains like medicine.<br>- Impressive performance in chip design tasks, outperforming specialized models in areas such as EDA script generation and bug analysis.<br>- Remarkable proficiency in anthropology and geology, demonstrating deep understanding and reasoning in these specialized fields.<br>- Strong capabilities in quantitative investing, with comprehensive financial knowledge and statistical modeling skills.<br>- Effective performance in social media analysis, including sentiment analysis and emotion recognition.<br>The model excelled particularly in tasks requiring intricate reasoning and knowledge integration across various fields. 
While some limitations were observed, including occasional errors on simpler problems and challenges with certain highly specialized concepts, the overall results indicate significant progress towards artificial general intelligence.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0qcd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0qcd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png 424w, https://substackcdn.com/image/fetch/$s_!0qcd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png 848w, https://substackcdn.com/image/fetch/$s_!0qcd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png 1272w, https://substackcdn.com/image/fetch/$s_!0qcd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0qcd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png" width="516" height="429.6277056277056" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1154,&quot;width&quot;:1386,&quot;resizeWidth&quot;:516,&quot;bytes&quot;:388698,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0qcd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png 424w, https://substackcdn.com/image/fetch/$s_!0qcd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png 848w, https://substackcdn.com/image/fetch/$s_!0qcd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png 1272w, https://substackcdn.com/image/fetch/$s_!0qcd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dd0c8bd-998b-42ef-9d89-f5039af68dce_1386x1154.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>AIGC News of the week <strong>(</strong>September 30 - October 06<strong>)</strong></h3><p>1.) entropix: Entropy Based Sampling and Parallel CoT Decoding ( <a href="https://github.com/xjdr-alt/entropix">link</a> )</p><p>2.) aoai-realtime-audio-sdk: Azure OpenAI code resources for using gpt-4o-realtime capabilities ( <a href="https://github.com/Azure-Samples/aoai-realtime-audio-sdk">link</a> )</p><p>3.) openai/whisper-large-v3-turbo ( <a href="https://huggingface.co/openai/whisper-large-v3-turbo">link</a> )</p><p>4.) nvidia/NVLM-D-72B ( <a href="https://huggingface.co/nvidia/NVLM-D-72B">link</a> )</p><p>5.) 
ComfyUI-Depth-Pro ( <a href="https://github.com/spacepxl/ComfyUI-Depth-Pro">link</a> )</p><p></p><p>more AIGC News: <a href="https://ainews.kol.tools/">AINews</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QCkk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QCkk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png 424w, https://substackcdn.com/image/fetch/$s_!QCkk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png 848w, https://substackcdn.com/image/fetch/$s_!QCkk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png 1272w, https://substackcdn.com/image/fetch/$s_!QCkk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QCkk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png" width="622" height="488.2870879120879" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1143,&quot;width&quot;:1456,&quot;resizeWidth&quot;:622,&quot;bytes&quot;:433716,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QCkk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png 424w, https://substackcdn.com/image/fetch/$s_!QCkk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png 848w, https://substackcdn.com/image/fetch/$s_!QCkk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png 1272w, https://substackcdn.com/image/fetch/$s_!QCkk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2425d262-66df-4a95-b4e7-71c849d7a8db_1988x1560.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>]]></content:encoded></item>
<item><title><![CDATA[AIGC Weekly | #87]]></title><description><![CDATA[AIGC Top Papers and AI news of the week]]></description><link>https://aigc.news/p/aigc-weekly-87</link><guid isPermaLink="false">https://aigc.news/p/aigc-weekly-87</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 30 Sep 2024 16:36:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!LLlM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LLlM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LLlM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!LLlM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!LLlM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!LLlM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LLlM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png" width="1200" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131428,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LLlM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!LLlM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!LLlM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!LLlM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84a30fe3-aaed-463a-9a40-666b705188b2_1200x600.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h3><strong>Top Papers of the week&#65288;</strong>September 23 - September 
29<strong>&#65289;</strong></h3><p>1.) <strong>Llama 3.2: Revolutionizing edge AI and vision with open, customizable models</strong> ( <a href="https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/">webpage</a> | <a href="https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf">model</a> )</p><p><em>The two largest models of the Llama 3.2 collection, 11B and 90B, support image reasoning use cases, such as document-level understanding including charts and graphs, captioning of images, and visual grounding tasks such as directionally pinpointing objects in images based on natural language descriptions. For example, a person could ask a question about which month in the previous year their small business had the best sales, and Llama 3.2 can then reason based on an available graph and quickly provide the answer. In another example, the model could reason with a map and help answer questions such as when a hike might become steeper or the distance of a particular trail marked on the map. The 11B and 90B models can also bridge the gap between vision and language by extracting details from an image, understanding the scene, and then crafting a sentence or two that could be used as an image caption to help tell the story.</em></p><p><em>The lightweight 1B and 3B models are highly capable with multilingual text generation and tool calling abilities. These models empower developers to build personalized, on-device agentic applications with strong privacy where data never leaves the device. 
For example, such an application could help summarize the last 10 messages received, extract action items, and leverage tool calling to directly send calendar invites for follow-up meetings.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d2BW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d2BW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!d2BW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png 848w, https://substackcdn.com/image/fetch/$s_!d2BW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!d2BW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d2BW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png" width="658" height="370.125" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:658,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d2BW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png 424w, https://substackcdn.com/image/fetch/$s_!d2BW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png 848w, https://substackcdn.com/image/fetch/$s_!d2BW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!d2BW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37055313-cc4b-4b4f-b57f-315d57ceee0f_3840x2160.png 1456w" sizes="100vw"></picture></div></a></figure></div><p></p><p>2.) <strong>LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench ( <a href="https://arxiv.org/abs/2409.13373">paper</a> )</strong></p><p><em>The ability to plan a course of action that achieves a desired state of affairs has long been considered a core competence of intelligent agents and has been an integral part of AI research since its inception. With the advent of large language models (LLMs), there has been considerable interest in the question of whether or not they possess such planning abilities. PlanBench, an extensible benchmark we developed in 2022, soon after the release of GPT3, has remained an important tool for evaluating the planning abilities of LLMs. Despite the slew of new private and open source LLMs since GPT3, progress on this benchmark has been surprisingly slow. 
OpenAI claims that their recent o1 (Strawberry) model has been specifically constructed and trained to escape the normal limitations of autoregressive LLMs--making it a new kind of model: a Large Reasoning Model (LRM). Using this development as a catalyst, this paper takes a comprehensive look at how well current LLMs and new LRMs do on PlanBench. As we shall see, while o1's performance is a quantum improvement on the benchmark, outpacing the competition, it is still far from saturating it. This improvement also brings to the fore questions about accuracy, efficiency, and guarantees which must be considered before deploying such systems.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dJ_s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dJ_s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png 424w, https://substackcdn.com/image/fetch/$s_!dJ_s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png 848w, https://substackcdn.com/image/fetch/$s_!dJ_s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png 1272w, https://substackcdn.com/image/fetch/$s_!dJ_s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dJ_s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png" width="622" height="240.10334346504558" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:508,&quot;width&quot;:1316,&quot;resizeWidth&quot;:622,&quot;bytes&quot;:113459,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dJ_s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png 424w, https://substackcdn.com/image/fetch/$s_!dJ_s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png 848w, https://substackcdn.com/image/fetch/$s_!dJ_s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png 1272w, https://substackcdn.com/image/fetch/$s_!dJ_s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615c0b4a-dca1-429b-876e-e0de527b6dfd_1316x508.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>3.) <strong>Larger and more instructable language models become less reliable ( <a href="https://www.nature.com/articles/s41586-024-07930-y">paper</a> )</strong></p><p><em>The prevailing methods to make large language models more powerful and amenable have been based on continuous scaling up (that is, increasing their size, data volume and computational resources) and bespoke shaping up (including post-filtering, fine tuning or use of human feedback). However, larger and more instructable large language models may have become less reliable. 
By studying the relationship between difficulty concordance, task avoidance and prompting stability of several language model families, here we show that easy instances for human participants are also easy for the models, but scaled-up, shaped-up models do not secure areas of low difficulty in which either the model does not err or human supervision can spot the errors. We also find that early models often avoid user questions but scaled-up, shaped-up models tend to give an apparently sensible yet wrong answer much more often, including errors on difficult questions that human supervisors frequently overlook. Moreover, we observe that stability to different natural phrasings of the same question is improved by scaling-up and shaping-up interventions, but pockets of variability persist across difficulty levels. These findings highlight the need for a fundamental shift in the design and development of general-purpose artificial intelligence, particularly in high-stakes areas for which a predictable distribution of errors is paramount.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0TBt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0TBt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png 424w, https://substackcdn.com/image/fetch/$s_!0TBt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png 848w, 
https://substackcdn.com/image/fetch/$s_!0TBt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png 1272w, https://substackcdn.com/image/fetch/$s_!0TBt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0TBt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png" width="670" height="373.97506925207756" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:806,&quot;width&quot;:1444,&quot;resizeWidth&quot;:670,&quot;bytes&quot;:230459,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0TBt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png 424w, https://substackcdn.com/image/fetch/$s_!0TBt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png 848w, 
https://substackcdn.com/image/fetch/$s_!0TBt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png 1272w, https://substackcdn.com/image/fetch/$s_!0TBt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21fd6330-2d94-423c-be7b-2da81037e198_1444x806.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><br>4.) <strong>A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? 
( <a href="https://ucsc-vlaa.github.io/o1_medicine/">webpage</a> | <a href="https://arxiv.org/abs/2409.15277">paper</a> )</strong></p><p><em>Large language models (LLMs) have exhibited remarkable capabilities across various domains and tasks, pushing the boundaries of our knowledge in learning and cognition. The latest model, OpenAI's o1, stands out as the first LLM with an internalized chain-of-thought technique using reinforcement learning strategies. While it has demonstrated surprisingly strong capabilities on various general language tasks, its performance in specialized fields such as medicine remains unknown. To this end, this report provides a comprehensive exploration of o1 on different medical scenarios, examining 3 key aspects: understanding, reasoning, and multilinguality. Specifically, our evaluation encompasses 6 tasks using data from 37 medical datasets, including two newly constructed and more challenging question-answering (QA) tasks based on professional medical quizzes from the New England Journal of Medicine (NEJM) and The Lancet. These datasets offer greater clinical relevance compared to standard medical QA benchmarks such as MedQA, translating more effectively into real-world clinical utility. Our analysis of o1 suggests that the enhanced reasoning ability of LLMs may (significantly) benefit their capability to understand various medical instructions and reason through complex clinical scenarios. Notably, o1 surpasses the previous GPT-4 in accuracy by an average of 6.2% and 6.6% across 19 datasets and two newly created complex QA scenarios. But meanwhile, we identify several weaknesses in both the model capability and the existing evaluation protocols, including hallucination, inconsistent multilingual ability, and discrepant metrics for evaluation. 
</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WEK0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WEK0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png 424w, https://substackcdn.com/image/fetch/$s_!WEK0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png 848w, https://substackcdn.com/image/fetch/$s_!WEK0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png 1272w, https://substackcdn.com/image/fetch/$s_!WEK0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WEK0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png" width="638" height="430.7376373626374" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:983,&quot;width&quot;:1456,&quot;resizeWidth&quot;:638,&quot;bytes&quot;:463484,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WEK0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png 424w, https://substackcdn.com/image/fetch/$s_!WEK0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png 848w, https://substackcdn.com/image/fetch/$s_!WEK0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png 1272w, https://substackcdn.com/image/fetch/$s_!WEK0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a71cd28-4965-489c-bfac-c3bbc9ea57c4_2098x1416.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><br>5.) <strong>PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation ( <a href="https://stevenlsw.github.io/physgen/">webpage</a> | <a href="https://arxiv.org/abs/2409.18964">paper</a> )</strong></p><p><em>We present PhysGen, a novel image-to-video generation method that converts a single image and an input condition (e.g., force and torque applied to an object in the image) to produce a realistic, physically plausible, and temporally consistent video. Our key insight is to integrate model-based physical simulation with a data-driven video generation process, enabling plausible image-space dynamics. 
At the heart of our system are three core components: (i) an image understanding module that effectively captures the geometry, materials, and physical parameters of the image; (ii) an image-space dynamics simulation model that utilizes rigid-body physics and inferred parameters to simulate realistic behaviors; and (iii) an image-based rendering and refinement module that leverages generative video diffusion to produce realistic video footage featuring the simulated motion. The resulting videos are realistic in both physics and appearance and are even precisely controllable, showcasing superior results over existing data-driven image-to-video generation works through quantitative comparison and comprehensive user study. PhysGen's resulting videos can be used for various downstream applications, such as turning an image into a realistic animation or allowing users to interact with the image and create various dynamics.</em></p><div id="youtube2-lCc1rHePEFQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;lCc1rHePEFQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/lCc1rHePEFQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p></p><p>6.) <strong>Emu3: Next-Token Prediction is All You Need ( <a href="https://emu.baai.ac.cn/about">webpage</a> | <a href="https://arxiv.org/abs/2409.18869">paper</a> | <a href="https://github.com/baaivision/Emu3">code</a> )</strong></p><p><em>While next-token prediction is considered a promising path towards artificial general intelligence, it has struggled to excel in multimodal tasks, which are still dominated by diffusion models (e.g., Stable Diffusion) and compositional approaches (e.g., CLIP combined with LLMs). 
In this paper, we introduce Emu3, a new suite of state-of-the-art multimodal models trained solely with next-token prediction. By tokenizing images, text, and videos into a discrete space, we train a single transformer from scratch on a mixture of multimodal sequences. Emu3 outperforms several well-established task-specific models in both generation and perception tasks, surpassing flagship models such as SDXL and LLaVA-1.6, while eliminating the need for diffusion or compositional architectures. Emu3 is also capable of generating high-fidelity video via predicting the next token in a video sequence. We simplify complex multimodal model designs by converging on a singular focus: tokens, unlocking great potential for scaling both during training and inference. Our results demonstrate that next-token prediction is a promising path towards building general multimodal intelligence beyond language. We open-source key techniques and models to support further research in this direction.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rQQd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rQQd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png 424w, https://substackcdn.com/image/fetch/$s_!rQQd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png 848w, 
https://substackcdn.com/image/fetch/$s_!rQQd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png 1272w, https://substackcdn.com/image/fetch/$s_!rQQd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rQQd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png" width="686" height="374.09615384615387" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:686,&quot;bytes&quot;:239885,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rQQd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png 424w, https://substackcdn.com/image/fetch/$s_!rQQd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png 848w, 
https://substackcdn.com/image/fetch/$s_!rQQd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png 1272w, https://substackcdn.com/image/fetch/$s_!rQQd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e346db3-67f0-4d64-96f2-168d7d0b924a_2028x1106.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>7.) 
<strong>FineZip : Pushing the Limits of Large Language Models for Practical Lossless Text Compression ( <a href="https://arxiv.org/abs/2409.17141">paper</a> )</strong></p><p><em>While the language modeling objective has been shown to be deeply connected with compression, it is surprising that modern LLMs are not employed in practical text compression systems. In this paper, we provide an in-depth analysis of neural network and transformer-based compression techniques to answer this question. We compare traditional text compression systems with neural network and LLM-based text compression methods. Although LLM-based systems significantly outperform conventional compression methods, they are highly impractical. Specifically, LLMZip, a recent text compression system using Llama3-8B requires 9.5 days to compress just 10 MB of text, although with huge improvements in compression ratios. To overcome this, we present FineZip - a novel LLM-based text compression system that combines ideas of online memorization and dynamic context to reduce the compression time immensely. FineZip can compress the above corpus in approximately 4 hours compared to 9.5 days, a 54 times improvement over LLMZip and comparable performance. FineZip outperforms traditional algorithmic compression methods with a large margin, improving compression ratios by approximately 50%. With this work, we take the first step towards making lossless text compression with LLMs a reality. While FineZip presents a significant step in that direction, LLMs are still not a viable solution for large-scale text compression. 
We hope our work paves the way for future research and innovation to solve this problem.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EKj0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EKj0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png 424w, https://substackcdn.com/image/fetch/$s_!EKj0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png 848w, https://substackcdn.com/image/fetch/$s_!EKj0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png 1272w, https://substackcdn.com/image/fetch/$s_!EKj0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EKj0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png" width="682" height="397.917282127031" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:1354,&quot;resizeWidth&quot;:682,&quot;bytes&quot;:137607,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EKj0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png 424w, https://substackcdn.com/image/fetch/$s_!EKj0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png 848w, https://substackcdn.com/image/fetch/$s_!EKj0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png 1272w, https://substackcdn.com/image/fetch/$s_!EKj0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecfe984e-e5c6-4f33-96d9-bdb5f539489b_1354x790.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>8.) <strong>Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely ( <a href="https://arxiv.org/abs/2409.14924">paper</a> )</strong></p><p><em>Large language models (LLMs) augmented with external data have demonstrated remarkable capabilities in completing real-world tasks. Techniques for integrating external data into LLMs, such as Retrieval-Augmented Generation (RAG) and fine-tuning, are gaining increasing attention and widespread application. Nonetheless, the effective deployment of data-augmented LLMs across various specialized fields presents substantial challenges. These challenges encompass a wide range of issues, from retrieving relevant data and accurately interpreting user intent to fully harnessing the reasoning capabilities of LLMs for complex tasks. We believe that there is no one-size-fits-all solution for data-augmented LLM applications. 
In practice, underperformance often arises from a failure to correctly identify the core focus of a task or because the task inherently requires a blend of multiple capabilities that must be disentangled for better resolution. In this survey, we propose a RAG task categorization method, classifying user queries into four levels based on the type of external data required and primary focus of the task: explicit fact queries, implicit fact queries, interpretable rationale queries, and hidden rationale queries. We define these levels of queries, provide relevant datasets, and summarize the key challenges and most effective techniques for addressing these challenges. Finally, we discuss three main forms of integrating external data into LLMs: context, small model, and fine-tuning, highlighting their respective strengths, limitations, and the types of problems they are suited to solve. This work aims to help readers thoroughly understand and decompose the data requirements and key bottlenecks in building LLM applications, offering solutions to the different challenges and serving as a guide to systematically developing such applications.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qsp4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qsp4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png 424w, 
https://substackcdn.com/image/fetch/$s_!qsp4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png 848w, https://substackcdn.com/image/fetch/$s_!qsp4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png 1272w, https://substackcdn.com/image/fetch/$s_!qsp4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qsp4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png" width="642" height="372.80542986425337" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:770,&quot;width&quot;:1326,&quot;resizeWidth&quot;:642,&quot;bytes&quot;:204888,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qsp4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png 424w, 
https://substackcdn.com/image/fetch/$s_!qsp4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png 848w, https://substackcdn.com/image/fetch/$s_!qsp4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png 1272w, https://substackcdn.com/image/fetch/$s_!qsp4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbbc6b96-b644-4060-9539-0596d7a0c70e_1326x770.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>9.) <strong>Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models ( <a href="https://molmo.allenai.org/">webpage</a> | <a href="https://arxiv.org/abs/2409.17146">paper</a> )</strong></p><p><em>Today's most advanced multimodal models remain proprietary. The strongest open-weight models rely heavily on synthetic data from proprietary VLMs to achieve good performance, effectively distilling these closed models into open ones. As a result, the community is still missing foundational knowledge about how to build performant VLMs from scratch. We present Molmo, a new family of VLMs that are state-of-the-art in their class of openness. Our key innovation is a novel, highly detailed image caption dataset collected entirely from human annotators using speech-based descriptions. To enable a wide array of user interactions, we also introduce a diverse dataset mixture for fine-tuning that includes in-the-wild Q&amp;A and innovative 2D pointing data. The success of our approach relies on careful choices for the model architecture details, a well-tuned training pipeline, and, most critically, the quality of our newly collected datasets, all of which will be released. 
The best-in-class 72B model within the Molmo family not only outperforms others in the class of open weight and data models but also compares favorably against proprietary systems like GPT-4o, Claude 3.5, and Gemini 1.5 on both academic benchmarks and human evaluation.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8s8n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8s8n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png 424w, https://substackcdn.com/image/fetch/$s_!8s8n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png 848w, https://substackcdn.com/image/fetch/$s_!8s8n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png 1272w, https://substackcdn.com/image/fetch/$s_!8s8n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8s8n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png" width="548" height="467.6368715083799" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1222,&quot;width&quot;:1432,&quot;resizeWidth&quot;:548,&quot;bytes&quot;:387558,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8s8n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png 424w, https://substackcdn.com/image/fetch/$s_!8s8n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png 848w, https://substackcdn.com/image/fetch/$s_!8s8n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png 1272w, https://substackcdn.com/image/fetch/$s_!8s8n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685aa9b2-8d6e-402e-ab7f-27ce1950e10e_1432x1222.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>10.) <strong>Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation ( <a href="https://homangab.github.io/gen2act/">webpage</a> | <a href="https://arxiv.org/abs/2409.16283">paper</a> )</strong></p><p><em>How can robot manipulation policies generalize to novel tasks involving unseen object types and new motions? In this paper, we provide a solution in terms of predicting motion information from web data through human video generation and conditioning a robot policy on the generated video. Instead of attempting to scale robot data collection which is expensive, we show how we can leverage video generation models trained on easily available web data, for enabling generalization. Our approach Gen2Act casts language-conditioned manipulation as zero-shot human video generation followed by execution with a single policy conditioned on the generated video. 
To train the policy, we use an order of magnitude less robot interaction data compared to what the video prediction model was trained on. Gen2Act doesn't require fine-tuning the video model at all and we directly use a pre-trained model for generating human videos. Our results on diverse real-world scenarios show how Gen2Act enables manipulating unseen object types and performing novel motions for tasks not present in the robot data.</em></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;655a99e0-5188-4c76-87c4-ea9084cb864e&quot;,&quot;duration&quot;:null}"></div><p></p><h3>AIGC News of the week<strong>&#65288;</strong>September 23 - September 29<strong>&#65289;</strong></h3><p>1.) Show-Me: A Visual and Transparent Reasoning Agent ( <a href="https://github.com/marlaman/show-me">repo</a> )</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;b7030db0-11dc-411a-9647-155be0e3bd04&quot;,&quot;duration&quot;:null}"></div><p>2.) 
ProtoMotions: Physics-based Character Animation ( <a href="https://github.com/NVlabs/ProtoMotions">repo</a> )</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YQCZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YQCZ!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!YQCZ!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!YQCZ!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!YQCZ!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YQCZ!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif" width="480" height="270" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:270,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:10289700,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YQCZ!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!YQCZ!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!YQCZ!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!YQCZ!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6505b27-feb4-4ecd-8710-8e8b995973fe_480x270.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>3.) GemFilter: Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction ( <a href="https://github.com/SalesforceAIResearch/GemFilter">repo</a> )</p><p>4.) llama-assistant: Your Local AI Assistant with Llama Models ( <a href="https://github.com/vietanhdev/llama-assistant">repo</a> )</p><p>5.) 
nvidia/Llama-3_1-Nemotron-51B-Instruct ( <a href="https://huggingface.co/nvidia/Llama-3_1-Nemotron-51B-Instruct">link</a> )</p><p></p><p>More AIGC News: <a href="https://ainews.kol.tools/">AINews</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ce4S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ce4S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png 424w, https://substackcdn.com/image/fetch/$s_!Ce4S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png 848w, https://substackcdn.com/image/fetch/$s_!Ce4S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png 1272w, https://substackcdn.com/image/fetch/$s_!Ce4S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ce4S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png" width="622" height="417.37225274725273" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:977,&quot;width&quot;:1456,&quot;resizeWidth&quot;:622,&quot;bytes&quot;:538406,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ce4S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png 424w, https://substackcdn.com/image/fetch/$s_!Ce4S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png 848w, https://substackcdn.com/image/fetch/$s_!Ce4S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png 1272w, https://substackcdn.com/image/fetch/$s_!Ce4S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bdf1fd-3de8-4482-ad6f-bdcccf5ac418_2494x1674.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div>
<p></p>]]></content:encoded></item><item><title><![CDATA[AIGC Weekly | #86]]></title><description><![CDATA[AIGC Top Papers and AI news of the week]]></description><link>https://aigc.news/p/aigc-weekly-86</link><guid isPermaLink="false">https://aigc.news/p/aigc-weekly-86</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 23 Sep 2024 14:54:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!dVjo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dVjo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dVjo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!dVjo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!dVjo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!dVjo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dVjo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png" width="1200" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:132232,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dVjo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!dVjo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!dVjo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!dVjo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F828b208f-b8d8-4cf9-a6ba-53c6541ef79f_1200x600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Top Papers of the week&#65288;</strong>September 16 - September 
22<strong>&#65289;</strong></h3><p>1.) <strong>Training Language Models to Self-Correct via Reinforcement Learning ( <a href="https://arxiv.org/abs/2409.12917">paper</a> )</strong></p><p><em>Self-correction is a highly desirable capability of large language models (LLMs), yet it has consistently been found to be largely ineffective in modern LLMs. Existing approaches for training self-correction either require multiple models or rely on a more capable model or other forms of supervision. To this end, we develop a multi-turn online reinforcement learning (RL) approach, SCoRe, that significantly improves an LLM's self-correction ability using entirely self-generated data. To build SCoRe, we first show that variants of supervised fine-tuning (SFT) on offline model-generated correction traces are insufficient for instilling self-correction behavior. In particular, we observe that training via SFT either suffers from a distribution mismatch between the training data and the model's own responses or implicitly prefers only a certain mode of correction behavior that is often not effective at test time. SCoRe addresses these challenges by training under the model's own distribution of self-generated correction traces and using appropriate regularization to steer the learning process into learning a self-correction strategy that is effective at test time as opposed to simply fitting high-reward responses for a given prompt. This regularization prescribes running a first phase of RL on a base model to generate a policy initialization that is less susceptible to collapse and then using a reward bonus to amplify self-correction during training. 
When applied to Gemini 1.0 Pro and 1.5 Flash models, we find that SCoRe achieves state-of-the-art self-correction performance, improving the base models' self-correction by 15.6% and 9.1% respectively on the MATH and HumanEval benchmarks.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VHT0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VHT0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png 424w, https://substackcdn.com/image/fetch/$s_!VHT0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png 848w, https://substackcdn.com/image/fetch/$s_!VHT0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png 1272w, https://substackcdn.com/image/fetch/$s_!VHT0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VHT0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png" width="658" height="535.0452554744526" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1114,&quot;width&quot;:1370,&quot;resizeWidth&quot;:658,&quot;bytes&quot;:286453,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VHT0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png 424w, https://substackcdn.com/image/fetch/$s_!VHT0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png 848w, https://substackcdn.com/image/fetch/$s_!VHT0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png 1272w, https://substackcdn.com/image/fetch/$s_!VHT0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61add9e7-ced2-49fb-9fca-1fdf83584b1a_1370x1114.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>2.) <strong>On the Diagram of Thought ( <a href="https://arxiv.org/abs/2409.10038">paper</a>  | <a href="https://github.com/diagram-of-thought/diagram-of-thought">code</a> )</strong></p><p><em>We introduce Diagram of Thought (DoT), a framework that models iterative reasoning in large language models (LLMs) as the construction of a directed acyclic graph (DAG) within a single model. Unlike traditional approaches that represent reasoning as linear chains or trees, DoT organizes propositions, critiques, refinements, and verifications into a cohesive DAG structure, allowing the model to explore complex reasoning pathways while maintaining logical consistency. Each node in the diagram corresponds to a proposition that has been proposed, critiqued, refined, or verified, enabling the LLM to iteratively improve its reasoning through natural language feedback. 
By leveraging auto-regressive next-token prediction with role-specific tokens, DoT facilitates seamless transitions between proposing ideas and critically evaluating them, providing richer feedback than binary signals. Furthermore, we formalize the DoT framework using Topos Theory, providing a mathematical foundation that ensures logical consistency and soundness in the reasoning process. This approach enhances both the training and inference processes within a single LLM, eliminating the need for multiple models or external control mechanisms. DoT offers a conceptual framework for designing next-generation reasoning-specialized models, emphasizing training efficiency, robust reasoning capabilities, and theoretical grounding.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P55O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P55O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png 424w, https://substackcdn.com/image/fetch/$s_!P55O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png 848w, https://substackcdn.com/image/fetch/$s_!P55O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png 1272w, 
https://substackcdn.com/image/fetch/$s_!P55O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P55O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png" width="510" height="612.7611940298508" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:966,&quot;width&quot;:804,&quot;resizeWidth&quot;:510,&quot;bytes&quot;:121839,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P55O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png 424w, https://substackcdn.com/image/fetch/$s_!P55O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png 848w, https://substackcdn.com/image/fetch/$s_!P55O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png 1272w, 
https://substackcdn.com/image/fetch/$s_!P55O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89ac9551-8d11-4bd8-9e06-b490dfad141e_804x966.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>3.) <strong>Moshi: a speech-text foundation model for real time dialogue ( <a href="https://kyutai.org/Moshi.pdf">paper</a> | <a href="https://github.com/kyutai-labs/moshi">code</a> )</strong></p><p><em>We introduce Moshi, a speech-text foundation model and full-duplex spoken dialogue framework. 
Current systems for spoken dialogue rely on pipelines of independent components, namely voice activity detection, speech recognition, textual dialogue and text-to-speech. Such frameworks cannot emulate the experience of real conversations. First, their complexity induces a latency of several seconds between interactions. Second, text being the intermediate modality for dialogue, non-linguistic information that modifies meaning&#8212; such as emotion or non-speech sounds&#8212; is lost in the interaction. Finally, they rely on a segmentation into speaker turns, which does not take into account overlapping speech, interruptions and interjections. Moshi solves these independent issues altogether by casting spoken dialogue as speech-to-speech generation. Starting from a text language model backbone, Moshi generates speech as tokens from the residual quantizer of a neural audio codec, while modeling separately its own speech and that of the user into parallel streams. This allows for the removal of explicit speaker turns, and the modeling of arbitrary conversational dynamics. We moreover extend the hierarchical semantic-to-acoustic token generation of previous work to first predict time-aligned text tokens as a prefix to audio tokens. Not only this &#8220;Inner Monologue&#8221; method significantly improves the linguistic quality of generated speech, but we also illustrate how it can provide streaming speech recognition and text-to-speech. 
Our resulting model is the first real-time full-duplex spoken large language model, with a theoretical latency of 160ms, 200ms in practice.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LxrC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LxrC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png 424w, https://substackcdn.com/image/fetch/$s_!LxrC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png 848w, https://substackcdn.com/image/fetch/$s_!LxrC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png 1272w, https://substackcdn.com/image/fetch/$s_!LxrC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LxrC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png" width="664" height="456.9693053311793" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:852,&quot;width&quot;:1238,&quot;resizeWidth&quot;:664,&quot;bytes&quot;:207680,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LxrC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png 424w, https://substackcdn.com/image/fetch/$s_!LxrC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png 848w, https://substackcdn.com/image/fetch/$s_!LxrC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png 1272w, https://substackcdn.com/image/fetch/$s_!LxrC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76d1c5db-61b1-42fc-9e01-88be664f1959_1238x852.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>4.) <strong>Qwen2.5-Coder Technical Report ( <a href="https://arxiv.org/abs/2409.12186">paper</a> | <a href="https://github.com/QwenLM/Qwen2.5-Coder">code</a> )</strong></p><p><em>In this report, we introduce the Qwen2.5-Coder series, a significant upgrade from its predecessor, CodeQwen1.5. This series includes two models: Qwen2.5-Coder-1.5B and Qwen2.5-Coder-7B. As a code-specific model, Qwen2.5-Coder is built upon the Qwen2.5 architecture and undergoes continued pretraining on a vast corpus of over 5.5 trillion tokens. Through meticulous data cleaning, scalable synthetic data generation, and balanced data mixing, Qwen2.5-Coder demonstrates impressive code generation capabilities while retaining general versatility. 
The model has been evaluated on a wide range of code-related tasks, achieving state-of-the-art (SOTA) performance across more than 10 benchmarks, including code generation, completion, reasoning, and repair, consistently outperforming larger models of the same model size. We believe that the release of the Qwen2.5-Coder series will not only push the boundaries of research in code intelligence but also, through its permissive licensing, encourage broader adoption by developers in real-world applications.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EkmR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EkmR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png 424w, https://substackcdn.com/image/fetch/$s_!EkmR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png 848w, https://substackcdn.com/image/fetch/$s_!EkmR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png 1272w, https://substackcdn.com/image/fetch/$s_!EkmR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!EkmR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png" width="554" height="470.37735849056605" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:810,&quot;width&quot;:954,&quot;resizeWidth&quot;:554,&quot;bytes&quot;:243153,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EkmR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png 424w, https://substackcdn.com/image/fetch/$s_!EkmR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png 848w, https://substackcdn.com/image/fetch/$s_!EkmR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png 1272w, https://substackcdn.com/image/fetch/$s_!EkmR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6023f7ac-c51d-4026-bd5e-97e8d5ed6a12_954x810.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>5.) <strong>3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt ( <a href="https://lukashoel.github.io/3DGS-LM/">webpage</a> | <a href="https://arxiv.org/abs/2409.12892">paper</a> )</strong></p><p><em>We present 3DGS-LM, a new method that accelerates the reconstruction of 3D Gaussian Splatting (3DGS) by replacing its ADAM optimizer with a tailored Levenberg-Marquardt (LM). Existing methods reduce the optimization time by decreasing the number of Gaussians or by improving the implementation of the differentiable rasterizer. However, they still rely on the ADAM optimizer to fit Gaussian parameters of a scene in thousands of iterations, which can take up to an hour. 
To this end, we change the optimizer to LM that runs in conjunction with the 3DGS differentiable rasterizer. For efficient GPU parallelization, we propose a caching data structure for intermediate gradients that allows us to efficiently calculate Jacobian-vector products in custom CUDA kernels. In every LM iteration, we calculate update directions from multiple image subsets using these kernels and combine them in a weighted mean. Overall, our method is 30% faster than the original 3DGS while obtaining the same reconstruction quality. Our optimization is also agnostic to other methods that accelerate 3DGS, thus enabling even greater speedups compared to vanilla 3DGS.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dzKi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dzKi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png 424w, https://substackcdn.com/image/fetch/$s_!dzKi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png 848w, https://substackcdn.com/image/fetch/$s_!dzKi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dzKi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dzKi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png" width="608" height="260.989010989011" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:625,&quot;width&quot;:1456,&quot;resizeWidth&quot;:608,&quot;bytes&quot;:1598263,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dzKi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png 424w, https://substackcdn.com/image/fetch/$s_!dzKi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png 848w, https://substackcdn.com/image/fetch/$s_!dzKi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dzKi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8713d0d-8155-4934-bf1e-b288689da5ba_1902x816.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>6.) 
<strong>StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation ( <a href="https://arxiv.org/abs/2409.12576">paper</a> | <a href="https://github.com/RedAIGC/StoryMaker">code</a> )</strong></p><p><em>Tuning-free personalized image generation methods have achieved significant success in maintaining facial consistency, i.e., identities, even with multiple characters. However, the lack of holistic consistency in scenes with multiple characters hampers these methods' ability to create a cohesive narrative. In this paper, we introduce StoryMaker, a personalization solution that preserves not only facial consistency but also clothing, hairstyles, and body consistency, thus facilitating the creation of a story through a series of images. StoryMaker incorporates conditions based on face identities and cropped character images, which include clothing, hairstyles, and bodies. Specifically, we integrate the facial identity information with the cropped character images using the Positional-aware Perceiver Resampler (PPR) to obtain distinct character features. To prevent intermingling of multiple characters and the background, we separately constrain the cross-attention impact regions of different characters and the background using MSE loss with segmentation masks. Additionally, we train the generation network conditioned on poses to promote decoupling from poses. A LoRA is also employed to enhance fidelity and quality. Experiments underscore the effectiveness of our approach. 
StoryMaker supports numerous applications and is compatible with other societal plug-ins.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y70a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y70a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png 424w, https://substackcdn.com/image/fetch/$s_!Y70a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png 848w, https://substackcdn.com/image/fetch/$s_!Y70a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png 1272w, https://substackcdn.com/image/fetch/$s_!Y70a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y70a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png" width="1404" height="774" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:774,&quot;width&quot;:1404,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1635902,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y70a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png 424w, https://substackcdn.com/image/fetch/$s_!Y70a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png 848w, https://substackcdn.com/image/fetch/$s_!Y70a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png 1272w, https://substackcdn.com/image/fetch/$s_!Y70a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95268748-b969-419a-9b30-667ca64d5fcb_1404x774.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>7.) <strong>3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion ( <a href="https://3dtopia.github.io/3DTopia-XL/">webpage</a> | <a href="https://arxiv.org/abs/2409.12957">paper</a> | <a href="https://github.com/3DTopia/3DTopia-XL">code</a> )</strong></p><p><em>The increasing demand for high-quality 3D assets across various industries necessitates efficient and automated 3D content creation. Despite recent advancements in 3D generative models, existing methods still face challenges with optimization speed, geometric fidelity, and the lack of assets for physically based rendering (PBR). In this paper, we introduce 3DTopia-XL, a scalable native 3D generative model designed to overcome these limitations. 3DTopia-XL leverages a novel primitive-based 3D representation, PrimX, which encodes detailed shape, albedo, and material field into a compact tensorial format, facilitating the modeling of high-resolution geometry with PBR assets. 
On top of the novel representation, we propose a generative framework based on Diffusion Transformer (DiT), which comprises 1) Primitive Patch Compression and 2) Latent Primitive Diffusion. 3DTopia-XL learns to generate high-quality 3D assets from textual or visual inputs. We conduct extensive qualitative and quantitative experiments to demonstrate that 3DTopia-XL significantly outperforms existing methods in generating high-quality 3D assets with fine-grained textures and materials, efficiently bridging the quality gap between generative models and real-world applications.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Km5Z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Km5Z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png 424w, https://substackcdn.com/image/fetch/$s_!Km5Z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png 848w, https://substackcdn.com/image/fetch/$s_!Km5Z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png 1272w, https://substackcdn.com/image/fetch/$s_!Km5Z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Km5Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png" width="618" height="378.2775800711744" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:688,&quot;width&quot;:1124,&quot;resizeWidth&quot;:618,&quot;bytes&quot;:673453,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Km5Z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png 424w, https://substackcdn.com/image/fetch/$s_!Km5Z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png 848w, https://substackcdn.com/image/fetch/$s_!Km5Z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png 1272w, https://substackcdn.com/image/fetch/$s_!Km5Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3b524-9f8b-4d7a-b90c-14704e5f7595_1124x688.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>8.) <strong>To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning ( <a href="https://arxiv.org/abs/2409.12183">paper</a> )</strong></p><p><em>Chain-of-thought (CoT) via prompting is the de facto method for eliciting reasoning capabilities from large language models (LLMs). But for what kinds of tasks is this extra &#8220;thinking&#8221; really helpful? To analyze this, we conducted a quantitative meta-analysis covering over 100 papers using CoT and ran our own evaluations of 20 datasets across 14 models. Our results show that CoT gives strong performance benefits primarily on tasks involving math or logic, with much smaller gains on other types of tasks. 
On MMLU, directly generating the answer without CoT leads to almost identical accuracy as CoT unless the question or model's response contains an equals sign, indicating symbolic operations and reasoning. Following this finding, we analyze the behavior of CoT on these problems by separating planning and execution and comparing against tool-augmented LLMs. Much of CoT's gain comes from improving symbolic execution, but it underperforms relative to using a symbolic solver. Our results indicate that CoT can be applied selectively, maintaining performance while saving inference costs. Furthermore, they suggest a need to move beyond prompt-based CoT to new paradigms that better leverage intermediate computation across the whole range of LLM applications.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aPbA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aPbA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png 424w, https://substackcdn.com/image/fetch/$s_!aPbA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png 848w, https://substackcdn.com/image/fetch/$s_!aPbA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png 1272w, 
https://substackcdn.com/image/fetch/$s_!aPbA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aPbA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png" width="608" height="419.01243339253995" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:776,&quot;width&quot;:1126,&quot;resizeWidth&quot;:608,&quot;bytes&quot;:281753,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aPbA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png 424w, https://substackcdn.com/image/fetch/$s_!aPbA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png 848w, https://substackcdn.com/image/fetch/$s_!aPbA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png 1272w, 
https://substackcdn.com/image/fetch/$s_!aPbA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F017e8511-912f-4e5d-b72e-4e45608bbc9a_1126x776.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>9.) <strong>OmniGen: Unified Image Generation ( <a href="https://arxiv.org/abs/2409.11340">paper</a>  | <a href="https://github.com/VectorSpaceLab/OmniGen">code</a> )</strong></p><p><em>In this work, we introduce OmniGen, a new diffusion model for unified image generation. 
Unlike popular diffusion models (e.g., Stable Diffusion), OmniGen no longer requires additional modules such as ControlNet or IP-Adapter to process diverse control conditions. OmniGen is characterized by the following features: 1) Unification: OmniGen not only demonstrates text-to-image generation capabilities but also inherently supports other downstream tasks, such as image editing, subject-driven generation, and visual-conditional generation. Additionally, OmniGen can handle classical computer vision tasks by transforming them into image generation tasks, such as edge detection and human pose recognition. 2) Simplicity: The architecture of OmniGen is highly simplified, eliminating the need for additional text encoders. Moreover, it is more user-friendly compared to existing diffusion models, enabling complex tasks to be accomplished through instructions without the need for extra preprocessing steps (e.g., human pose estimation), thereby significantly simplifying the workflow of image generation. 3) Knowledge Transfer: Through learning in a unified format, OmniGen effectively transfers knowledge across different tasks, manages unseen tasks and domains, and exhibits novel capabilities. We also explore the model's reasoning capabilities and potential applications of chain-of-thought mechanism. 
This work represents the first attempt at a general-purpose image generation model, and there remain several unresolved issues.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AJJr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AJJr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png 424w, https://substackcdn.com/image/fetch/$s_!AJJr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png 848w, https://substackcdn.com/image/fetch/$s_!AJJr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png 1272w, https://substackcdn.com/image/fetch/$s_!AJJr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AJJr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png" width="602" height="273.05" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:508,&quot;width&quot;:1120,&quot;resizeWidth&quot;:602,&quot;bytes&quot;:148997,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AJJr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png 424w, https://substackcdn.com/image/fetch/$s_!AJJr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png 848w, https://substackcdn.com/image/fetch/$s_!AJJr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png 1272w, https://substackcdn.com/image/fetch/$s_!AJJr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef400a8f-2c40-4c01-99ad-d872e773c29f_1120x508.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>10.) <strong>A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B ( <a href="https://arxiv.org/abs/2409.11055">paper</a> )</strong></p><p><em>Prior research works have evaluated quantized LLMs using limited metrics such as perplexity or a few basic knowledge tasks and old datasets. Additionally, recent large-scale models such as Llama 3.1 with up to 405B have not been thoroughly examined. This paper evaluates the performance of instruction-tuned LLMs across various quantization methods (GPTQ, AWQ, SmoothQuant, and FP8) on models ranging from 7B to 405B. Using 13 benchmarks, we assess performance across six task types: commonsense Q&amp;A, knowledge and language understanding, instruction following, hallucination detection, mathematics, and dialogue. 
Our key findings reveal that (1) quantizing a larger LLM to a similar size as a smaller FP16 LLM generally performs better across most benchmarks, except for hallucination detection and instruction following; (2) performance varies significantly with different quantization methods, model size, and bit-width, with weight-only methods often yielding better results in larger models; (3) task difficulty does not significantly impact accuracy degradation due to quantization; and (4) the MT-Bench evaluation method has limited discriminatory power among recent high-performing LLMs.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dLOH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dLOH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png 424w, https://substackcdn.com/image/fetch/$s_!dLOH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png 848w, https://substackcdn.com/image/fetch/$s_!dLOH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png 1272w, https://substackcdn.com/image/fetch/$s_!dLOH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!dLOH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png" width="544" height="701.6363636363636" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:908,&quot;width&quot;:704,&quot;resizeWidth&quot;:544,&quot;bytes&quot;:220240,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dLOH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png 424w, https://substackcdn.com/image/fetch/$s_!dLOH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png 848w, https://substackcdn.com/image/fetch/$s_!dLOH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png 1272w, https://substackcdn.com/image/fetch/$s_!dLOH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae43740-90eb-469a-b0b8-f5f4f27b8129_704x908.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3>AIGC News of the week<strong>&#65288;</strong>September 16 - September 22<strong>&#65289;</strong></h3><p>1.) o1: Using Groq or OpenAI or Ollama to create o1-like reasoning chains ( <a href="https://github.com/win4r/o1">repo</a> )</p><p>2.) 
Local Knowledge Graph ( <a href="https://github.com/punnerud/Local_Knowledge_Graph">code</a> )</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!13M8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!13M8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png 424w, https://substackcdn.com/image/fetch/$s_!13M8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png 848w, https://substackcdn.com/image/fetch/$s_!13M8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png 1272w, https://substackcdn.com/image/fetch/$s_!13M8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!13M8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png" width="576" height="354.46153846153845" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1456,&quot;resizeWidth&quot;:576,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Example&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Example" title="Example" srcset="https://substackcdn.com/image/fetch/$s_!13M8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png 424w, https://substackcdn.com/image/fetch/$s_!13M8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png 848w, https://substackcdn.com/image/fetch/$s_!13M8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png 1272w, https://substackcdn.com/image/fetch/$s_!13M8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb82922a5-8e91-4caf-801c-24aada9bf66e_2692x1656.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>3.) 
Cogstudio: Advanced Web UI for CogVideo ( <a href="https://github.com/pinokiofactory/cogstudio">repo</a> )</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OXl_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OXl_!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif 424w, https://substackcdn.com/image/fetch/$s_!OXl_!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif 848w, https://substackcdn.com/image/fetch/$s_!OXl_!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif 1272w, https://substackcdn.com/image/fetch/$s_!OXl_!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OXl_!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif" width="622" height="430.1964980544747" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:711,&quot;width&quot;:1028,&quot;resizeWidth&quot;:622,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;img2vid.gif&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="img2vid.gif" title="img2vid.gif" srcset="https://substackcdn.com/image/fetch/$s_!OXl_!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif 424w, https://substackcdn.com/image/fetch/$s_!OXl_!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif 848w, https://substackcdn.com/image/fetch/$s_!OXl_!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif 1272w, https://substackcdn.com/image/fetch/$s_!OXl_!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f6505a3-96cd-4e5a-8c80-2dc51542b953_1028x711.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>4.) jinaai/jina-embeddings-v3 ( <a href="https://huggingface.co/jinaai/jina-embeddings-v3">repo</a> )</p><p>5.) 
fishaudio/fish-speech-1.4 ( <a href="https://huggingface.co/fishaudio/fish-speech-1.4">repo</a> )</p><p></p><p>more AIGC News: <a href="https://ainews.kol.tools/">AINews</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-kbi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-kbi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png 424w, https://substackcdn.com/image/fetch/$s_!-kbi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png 848w, https://substackcdn.com/image/fetch/$s_!-kbi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png 1272w, https://substackcdn.com/image/fetch/$s_!-kbi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-kbi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png" width="648" height="412.5659340659341" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:927,&quot;width&quot;:1456,&quot;resizeWidth&quot;:648,&quot;bytes&quot;:349777,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-kbi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png 424w, https://substackcdn.com/image/fetch/$s_!-kbi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png 848w, https://substackcdn.com/image/fetch/$s_!-kbi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png 1272w, https://substackcdn.com/image/fetch/$s_!-kbi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef1e94fe-f725-4443-a1ef-0960e8542295_1994x1270.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://aigc.news/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">AIGC Newsletter is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[AIGC Weekly | #85]]></title><description><![CDATA[AIGC Top Papers and AI news of the week]]></description><link>https://aigc.news/p/aigc-weekly-85</link><guid isPermaLink="false">https://aigc.news/p/aigc-weekly-85</guid><dc:creator><![CDATA[pxiaoer]]></dc:creator><pubDate>Mon, 16 Sep 2024 14:47:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!gStd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gStd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gStd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!gStd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!gStd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!gStd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gStd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png" width="1200" height="600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131781,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gStd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!gStd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png 848w, 
https://substackcdn.com/image/fetch/$s_!gStd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!gStd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7a501b-f1f1-452f-82e1-9c6b78d5331f_1200x600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Top Papers of the week&#65288;</strong>September 09 - September 
15<strong>&#65289;</strong></h3><p>1.) <strong>OpenAI o1 ( <a href="https://openai.com/index/learning-to-reason-with-llms">webpage</a> )</strong></p><p>We are introducing OpenAI o1, a new large language model trained with reinforcement learning to perform complex reasoning. o1 thinks before it answers&#8212;it can produce a long internal chain of thought before responding to the user.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6ber!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6ber!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png 424w, https://substackcdn.com/image/fetch/$s_!6ber!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png 848w, https://substackcdn.com/image/fetch/$s_!6ber!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png 1272w, https://substackcdn.com/image/fetch/$s_!6ber!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6ber!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png" width="568" 
height="376.84615384615387" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:966,&quot;width&quot;:1456,&quot;resizeWidth&quot;:568,&quot;bytes&quot;:175683,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6ber!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png 424w, https://substackcdn.com/image/fetch/$s_!6ber!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png 848w, https://substackcdn.com/image/fetch/$s_!6ber!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png 1272w, https://substackcdn.com/image/fetch/$s_!6ber!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27c65627-5868-47d2-8f14-165a3a8d972a_1556x1032.png 1456w" sizes="100vw"></picture></div></a></figure></div><p></p><p>2.) <strong>Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers ( <a href="https://arxiv.org/abs/2409.04109">paper</a> )</strong></p><p><em>Recent advancements in large language models (LLMs) have sparked optimism about their potential to accelerate scientific discovery, with a growing number of works proposing research agents that autonomously generate and validate new ideas. Despite this, no evaluations have shown that LLM systems can take the very first step of producing novel, expert-level ideas, let alone perform the entire research process. We address this by establishing an experimental design that evaluates research idea generation while controlling for confounders and performs the first head-to-head comparison between expert NLP researchers and an LLM ideation agent. 
By recruiting over 100 NLP researchers to write novel ideas and blind reviews of both LLM and human ideas, we obtain the first statistically significant conclusion on current LLM capabilities for research ideation: we find LLM-generated ideas are judged as more novel (p &lt; 0.05) than human expert ideas while being judged slightly weaker on feasibility. Studying our agent baselines closely, we identify open problems in building and evaluating research agents, including failures of LLM self-evaluation and their lack of diversity in generation. Finally, we acknowledge that human judgements of novelty can be difficult, even by experts, and propose an end-to-end study design which recruits researchers to execute these ideas into full projects, enabling us to study whether these novelty and feasibility judgements result in meaningful differences in research outcome.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_ewj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_ewj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png 424w, https://substackcdn.com/image/fetch/$s_!_ewj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png 848w, https://substackcdn.com/image/fetch/$s_!_ewj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png 1272w, 
https://substackcdn.com/image/fetch/$s_!_ewj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_ewj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png" width="584" height="483.6651982378855" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1128,&quot;width&quot;:1362,&quot;resizeWidth&quot;:584,&quot;bytes&quot;:259510,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_ewj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png 424w, https://substackcdn.com/image/fetch/$s_!_ewj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png 848w, https://substackcdn.com/image/fetch/$s_!_ewj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png 1272w, 
https://substackcdn.com/image/fetch/$s_!_ewj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6855c-504d-45d7-8c52-5da950eaabe4_1362x1128.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>3.) 
<strong>LLaMA-Omni: Seamless Speech Interaction with Large Language Models (   <a href="https://arxiv.org/abs/2409.06666">paper</a> | <a href="https://github.com/ictnlp/LLaMA-Omni">code</a>  | <a href="https://huggingface.co/ICTNLP/Llama-3.1-8B-Omni">model</a> )</strong></p><p><em>Models like GPT-4o enable real-time interaction with large language models (LLMs) through speech, significantly enhancing user experience compared to traditional text-based interaction. However, there is still a lack of exploration on how to build speech interaction models based on open-source LLMs. To address this, we propose LLaMA-Omni, a novel model architecture designed for low-latency and high-quality speech interaction with LLMs. LLaMA-Omni integrates a pretrained speech encoder, a speech adaptor, an LLM, and a streaming speech decoder. It eliminates the need for speech transcription, and can simultaneously generate text and speech responses directly from speech instructions with extremely low latency. We build our model based on the latest Llama-3.1-8B-Instruct model. To align the model with speech interaction scenarios, we construct a dataset named InstructS2S-200K, which includes 200K speech instructions and corresponding speech responses. Experimental results show that compared to previous speech-language models, LLaMA-Omni provides better responses in both content and style, with a response latency as low as 226ms. 
Additionally, training LLaMA-Omni takes less than 3 days on just 4 GPUs, paving the way for the efficient development of speech-language models in the future.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gUnz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gUnz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png 424w, https://substackcdn.com/image/fetch/$s_!gUnz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png 848w, https://substackcdn.com/image/fetch/$s_!gUnz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!gUnz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gUnz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png" width="508" height="330.06043956043953" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/faaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:946,&quot;width&quot;:1456,&quot;resizeWidth&quot;:508,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gUnz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png 424w, https://substackcdn.com/image/fetch/$s_!gUnz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png 848w, https://substackcdn.com/image/fetch/$s_!gUnz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!gUnz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaaf2066-1a5b-4d4a-8033-e9d486ce13b0_1896x1232.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>4.) <strong>Agent Workflow Memory ( <a href="https://arxiv.org/abs/2409.07429">paper</a> )</strong></p><p><em>Despite the potential of language model-based agents to solve real-world tasks such as web navigation, current methods still struggle with long-horizon tasks with complex action trajectories. In contrast, humans can flexibly solve complex tasks by learning reusable task workflows from past experiences and using them to guide future actions. To build agents that can similarly benefit from this process, we introduce Agent Workflow Memory (AWM), a method for inducing commonly reused routines, i.e., workflows, and selectively providing workflows to the agent to guide subsequent generations. AWM flexibly applies to both offline and online scenarios, where agents induce workflows from training examples beforehand or from test queries on the fly. 
We experiment on two major web navigation benchmarks -- Mind2Web and WebArena -- that collectively cover 1000+ tasks from 200+ domains across travel, shopping, and social media, among others. AWM substantially improves the baseline results by 24.6% and 51.1% relative success rate on Mind2Web and WebArena while reducing the number of steps taken to solve WebArena tasks successfully. Furthermore, online AWM robustly generalizes in cross-task, website, and domain evaluations, surpassing baselines from 8.9 to 14.0 absolute points as train-test task distribution gaps widen.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YSQ9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YSQ9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png 424w, https://substackcdn.com/image/fetch/$s_!YSQ9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png 848w, https://substackcdn.com/image/fetch/$s_!YSQ9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png 1272w, https://substackcdn.com/image/fetch/$s_!YSQ9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!YSQ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png" width="572" height="426.94736842105266" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae52e271-75d5-4526-b571-92d9a024d723_836x624.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:624,&quot;width&quot;:836,&quot;resizeWidth&quot;:572,&quot;bytes&quot;:225595,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YSQ9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png 424w, https://substackcdn.com/image/fetch/$s_!YSQ9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png 848w, https://substackcdn.com/image/fetch/$s_!YSQ9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png 1272w, https://substackcdn.com/image/fetch/$s_!YSQ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae52e271-75d5-4526-b571-92d9a024d723_836x624.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>5.) <strong>SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning ( <a href="https://arxiv.org/abs/2409.05556">paper</a> )</strong></p><p><em>A key challenge in artificial intelligence is the creation of systems capable of autonomously advancing scientific understanding by exploring novel domains, identifying complex patterns, and uncovering previously unseen connections in vast scientific data. 
In this work, we present SciAgents, an approach that leverages three core concepts: (1) the use of large-scale ontological knowledge graphs to organize and interconnect diverse scientific concepts, (2) a suite of large language models (LLMs) and data retrieval tools, and (3) multi-agent systems with in-situ learning capabilities. Applied to biologically inspired materials, SciAgents reveals hidden interdisciplinary relationships that were previously considered unrelated, achieving a scale, precision, and exploratory power that surpasses traditional human-driven research methods. The framework autonomously generates and refines research hypotheses, elucidating underlying mechanisms, design principles, and unexpected material properties. By integrating these capabilities in a modular fashion, the intelligent system yields material discoveries, critiques and improves existing hypotheses, retrieves up-to-date data about existing research, and highlights their strengths and limitations. Our case studies demonstrate scalable capabilities to combine generative AI, ontological representations, and multi-agent modeling, harnessing a 'swarm of intelligence' similar to biological systems. 
This provides new avenues for materials discovery and accelerates the development of advanced materials by unlocking Nature's design principles.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iT2n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iT2n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png 424w, https://substackcdn.com/image/fetch/$s_!iT2n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png 848w, https://substackcdn.com/image/fetch/$s_!iT2n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!iT2n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iT2n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png" width="606" height="555.5" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1232,&quot;width&quot;:1344,&quot;resizeWidth&quot;:606,&quot;bytes&quot;:1378757,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iT2n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png 424w, https://substackcdn.com/image/fetch/$s_!iT2n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png 848w, https://substackcdn.com/image/fetch/$s_!iT2n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png 1272w, https://substackcdn.com/image/fetch/$s_!iT2n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0dd3b3-1245-42bb-8602-e1cea807e91e_1344x1232.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>6.) <strong>Click2Mask: Local Editing with Dynamic Mask Generation ( <a href="https://omeregev.github.io/click2mask/">webpage</a> | <a href="https://omeregev.github.io/click2mask/static/paper/Click2Mask.pdf">paper</a> )</strong></p><p><em>Recent advancements in generative models have revolutionized image generation and editing, making these tasks accessible to non-experts. This paper focuses on local image editing, particularly the task of adding new content to a loosely specified area. Existing methods often require a precise mask or a detailed description of the location, which can be cumbersome and prone to errors. We propose Click2Mask, a novel approach that simplifies the local editing process by requiring only a single point of reference (in addition to the content description). A mask is dynamically grown around this point during a Blended Latent Diffusion (BLD) process, guided by a masked CLIP-based semantic loss. 
Click2Mask surpasses the limitations of segmentation-based and fine-tuning dependent methods, offering a more user-friendly and contextually accurate solution. Our experiments demonstrate that Click2Mask not only minimizes user effort but also delivers competitive or superior local image manipulation results compared to SoTA methods, according to both human judgement and automatic metrics. Key contributions include the simplification of user input, the ability to freely add objects unconstrained by existing segments, and the integration potential of our dynamic mask approach within other editing methods.</em></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;31371211-8541-4f7a-aff2-5914057aeb07&quot;,&quot;duration&quot;:null}"></div><p></p><p>7.) <strong>Data-Efficient Generation for Dataset Distillation ( <a href="https://arxiv.org/abs/2409.03929">paper</a> )</strong></p><p><em>While deep learning techniques have proven successful in image-related tasks, the exponentially increased data storage and computation costs become a significant challenge. Dataset distillation addresses these challenges by synthesizing only a few images for each class that encapsulate all essential information. Most current methods focus on matching. The problems lie in the synthetic images not being human-readable and the dataset performance being insufficient for downstream learning tasks. Moreover, the distillation time can quickly get out of bounds when the number of synthetic images per class increases even slightly. To address this, we train a class conditional latent diffusion model capable of generating realistic synthetic images with labels. The sampling time can be reduced to several tens of images per second. We demonstrate that models can be effectively trained using only a small set of synthetic images and evaluated on a large real test set. 
Our approach achieved rank 1 in The First Dataset Distillation Challenge at ECCV 2024 on the CIFAR100 and TinyImageNet datasets.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UWC1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UWC1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png 424w, https://substackcdn.com/image/fetch/$s_!UWC1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png 848w, https://substackcdn.com/image/fetch/$s_!UWC1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png 1272w, https://substackcdn.com/image/fetch/$s_!UWC1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UWC1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png" width="562" height="357.9277566539924" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:670,&quot;width&quot;:1052,&quot;resizeWidth&quot;:562,&quot;bytes&quot;:251520,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UWC1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png 424w, https://substackcdn.com/image/fetch/$s_!UWC1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png 848w, https://substackcdn.com/image/fetch/$s_!UWC1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png 1272w, https://substackcdn.com/image/fetch/$s_!UWC1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6bf2a255-87fd-43fa-a1a2-8a0d80976620_1052x670.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>8.) <strong>MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model ( <a href="https://arxiv.org/abs/2409.07486">paper</a> )</strong></p><p><em>Generative models aim to simulate realistic effects of various actions across different contexts, from text generation to visual effects. Despite efforts to build real-world simulators, leveraging generative models for virtual worlds, like financial markets, remains underexplored. In financial markets, generative models can simulate market effects of various behaviors, enabling interaction with market scenes and players, and training strategies without financial risk. This simulation relies on the most fine-grained structured data in financial markets, such as orders, and thus yields the most fine-grained realistic simulation. We propose the Large Market Model (LMM), an order-level generative foundation model for financial market simulation, akin to language modeling in the digital world. 
Our financial Market Simulation engine (MarS), powered by LMM, addresses the need for realistic, interactive and controllable order generation. Key objectives of this paper include evaluating LMM's scaling law in financial markets, assessing MarS's realism, balancing controlled generation with market impact, and demonstrating MarS's potential applications. We showcase MarS as a forecast tool, detection system, analysis platform, and agent training environment. Our contributions include pioneering a generative model for financial markets, designing MarS to meet domain-specific needs, and demonstrating MarS-based applications' industry potential.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wnLa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wnLa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png 424w, https://substackcdn.com/image/fetch/$s_!wnLa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png 848w, https://substackcdn.com/image/fetch/$s_!wnLa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png 1272w, https://substackcdn.com/image/fetch/$s_!wnLa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wnLa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png" width="602" height="469.8536585365854" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1148,&quot;resizeWidth&quot;:602,&quot;bytes&quot;:395176,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wnLa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png 424w, https://substackcdn.com/image/fetch/$s_!wnLa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png 848w, https://substackcdn.com/image/fetch/$s_!wnLa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png 1272w, https://substackcdn.com/image/fetch/$s_!wnLa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd58099c8-69e4-4434-867c-9b726b5f7399_1148x896.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>9.) <strong>Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources ( <a href="https://arxiv.org/abs/2409.08239">paper</a> )</strong></p><p><em>Large Language Models still struggle in challenging scenarios that leverage structured data, complex reasoning, or tool usage. In this paper, we propose Source2Synth: a new method that can be used for teaching LLMs new skills without relying on costly human annotations. Source2Synth takes as input a custom data source and produces synthetic data points with intermediate reasoning steps grounded in real-world sources. Source2Synth improves the dataset quality by discarding low-quality generations based on their answerability. 
We demonstrate the generality of this approach by applying it to two challenging domains: we test reasoning abilities in multi-hop question answering (MHQA), and tool usage in tabular question answering (TQA). Our method improves performance by 25.51% for TQA on WikiSQL and 22.57% for MHQA on HotPotQA compared to the fine-tuned baselines.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NNNI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NNNI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png 424w, https://substackcdn.com/image/fetch/$s_!NNNI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png 848w, https://substackcdn.com/image/fetch/$s_!NNNI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png 1272w, https://substackcdn.com/image/fetch/$s_!NNNI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NNNI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png" width="600" height="419.26605504587155" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:914,&quot;width&quot;:1308,&quot;resizeWidth&quot;:600,&quot;bytes&quot;:217527,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NNNI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png 424w, https://substackcdn.com/image/fetch/$s_!NNNI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png 848w, https://substackcdn.com/image/fetch/$s_!NNNI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png 1272w, https://substackcdn.com/image/fetch/$s_!NNNI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F067caf9d-d11e-4d08-98fe-ca6a6709dac4_1308x914.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>10.) <strong>What is the Role of Small Models in the LLM Era: A Survey ( <a href="https://arxiv.org/abs/2409.06857">paper</a> | <a href="https://github.com/tigerchen52/role_of_small_models">code</a> )</strong></p><p><em>Large Language Models (LLMs) have made significant progress in advancing artificial general intelligence (AGI), leading to the development of increasingly large models such as GPT-4 and LLaMA-405B. However, scaling up model sizes results in exponentially higher computational costs and energy consumption, making these models impractical for academic researchers and businesses with limited resources. At the same time, Small Models (SMs) are frequently used in practical settings, although their significance is currently underestimated. This raises important questions about the role of small models in the era of LLMs, a topic that has received limited attention in prior research. 
In this work, we systematically examine the relationship between LLMs and SMs from two key perspectives: Collaboration and Competition. We hope this survey provides valuable insights for practitioners, fostering a deeper understanding of the contribution of small models and promoting more efficient use of computational resources.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qNXh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qNXh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png 424w, https://substackcdn.com/image/fetch/$s_!qNXh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png 848w, https://substackcdn.com/image/fetch/$s_!qNXh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!qNXh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qNXh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png" width="558" height="766.32" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1236,&quot;width&quot;:900,&quot;resizeWidth&quot;:558,&quot;bytes&quot;:349402,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qNXh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png 424w, https://substackcdn.com/image/fetch/$s_!qNXh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png 848w, https://substackcdn.com/image/fetch/$s_!qNXh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!qNXh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08d46476-5bb9-4d05-bae7-65c6fa65360e_900x1236.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3>AIGC News of the week (September 09 - September 15)</h3><p>1.) 
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ( <a href="https://github.com/bklieger-groq/g1">repo</a> )</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xMV0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xMV0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png 424w, https://substackcdn.com/image/fetch/$s_!xMV0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png 848w, https://substackcdn.com/image/fetch/$s_!xMV0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png 1272w, https://substackcdn.com/image/fetch/$s_!xMV0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xMV0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png" width="396" height="367.9862637362637" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1353,&quot;width&quot;:1456,&quot;resizeWidth&quot;:396,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;0.9 or 0.11 example&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="0.9 or 0.11 example" title="0.9 or 0.11 example" srcset="https://substackcdn.com/image/fetch/$s_!xMV0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png 424w, https://substackcdn.com/image/fetch/$s_!xMV0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png 848w, https://substackcdn.com/image/fetch/$s_!xMV0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png 1272w, https://substackcdn.com/image/fetch/$s_!xMV0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6d5d8eb-7120-4cdf-9174-3113ca67ca6d_1522x1414.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>2.) Raspberry: Create an open-source toy dataset for fine-tuning LLMs with reasoning abilities ( <a href="https://github.com/daveshap/Raspberry">repo</a> )</p><p>3.) Fei-Fei Li&#8217;s new spatial intelligence startup: World Labs ( <a href="https://www.worldlabs.ai/about">link</a> )</p><p>4.) spann3r: 3D Reconstruction with Spatial Memory ( <a href="https://github.com/HengyiWang/spann3r">repo</a> )</p><p>5.) 
ell: A language model programming library ( <a href="https://github.com/MadcowD/ell">repo</a> )</p><p></p><p>more AIGC News: <a href="https://ainews.kol.tools/">AINews</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QrI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QrI5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png 424w, https://substackcdn.com/image/fetch/$s_!QrI5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png 848w, https://substackcdn.com/image/fetch/$s_!QrI5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png 1272w, https://substackcdn.com/image/fetch/$s_!QrI5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QrI5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png" width="658" height="420.74038461538464" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/febebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:931,&quot;width&quot;:1456,&quot;resizeWidth&quot;:658,&quot;bytes&quot;:555852,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QrI5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png 424w, https://substackcdn.com/image/fetch/$s_!QrI5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png 848w, https://substackcdn.com/image/fetch/$s_!QrI5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png 1272w, https://substackcdn.com/image/fetch/$s_!QrI5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffebebfd8-aaf5-444a-9e89-eb01630d019a_2492x1594.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p>]]></content:encoded></item></channel></rss>