Commit 5f4168d

Site updated: 2025-06-14 13:07:24
1 parent b9f07c0 commit 5f4168d

7 files changed, 34 additions and 40 deletions

2025/06/14/A1-matmul/index.html

Lines changed: 10 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -28,8 +28,8 @@
2828
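<meta property="og:description" content="Task 1: MatMul with multi-head variant. In Task 1 we implement the multiplication of two matrices. We have the following two matrices: A1: a 3D input tensor of shape [batch_size, seq_len, hidden_size], where batch_size is the number of sequences, seq_len is the maximum length of a sequence, and hidden_size is the dimension of each token in a sequence">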
2929
<meta property="og:locale" content="en_US">
3030
<meta property="article:published_time" content="2025-06-14T04:57:11.000Z">
31-
<meta property="article:modified_time" content="2025-06-14T04:57:57.809Z">
32-
<meta property="article:author" content="John Doe">
31+
<meta property="article:modified_time" content="2025-06-14T05:07:13.993Z">
32+
<meta property="article:author" content="DeepEngine">
3333
<meta name="twitter:card" content="summary">
3434

3535
<link rel="canonical" href="https://big-trex.github.io/2025/06/14/A1-matmul/">
@@ -154,7 +154,7 @@ <h1 class="site-title">LLM-Assignment-Doc</h1>
154154

155155
<span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
156156
<meta itemprop="image" content="/LLM-Blog/images/avatar.gif">
157-
<meta itemprop="name" content="John Doe">
157+
<meta itemprop="name" content="DeepEngine">
158158
<meta itemprop="description" content="">
159159
</span>
160160

@@ -174,7 +174,7 @@ <h1 class="post-title" itemprop="name headline">
174174
<span class="post-meta-item-text">Posted on</span>
175175

176176

177-
<time title="Created: 2025-06-14 12:57:11 / Modified: 12:57:57" itemprop="dateCreated datePublished" datetime="2025-06-14T12:57:11+08:00">2025-06-14</time>
177+
<time title="Created: 2025-06-14 12:57:11 / Modified: 13:07:13" itemprop="dateCreated datePublished" datetime="2025-06-14T12:57:11+08:00">2025-06-14</time>
178178
</span>
179179

180180

@@ -196,13 +196,10 @@ <h4 id="Task-1-MalMul-with-multi-head-variant"><a href="#Task-1-MalMul-with-mult
196196
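<p>Naive matrix multiplication traverses only the <code>batch_size</code> dimension of <code>A1</code>: for each sequence index i, it computes <code>O1[i] = A1[i] @ W1</code>, producing a tensor <code>O1</code> of shape <code>[b, s, e]</code>.</p>
197197
<p>In multi-head matrix multiplication, we first split the <code>h</code> dimension of the input tensor <code>A1</code> and the weight tensor <code>W1</code> evenly into <code>num_heads</code> sub-dimensions (denoted <code>nh</code>, the number of heads), which gives a 4D tensor <code>A2</code> of shape <code>[b, s, nh, hd]</code> and a 3D tensor <code>W2</code> of shape <code>[nh, hd, e]</code>. Then, for each sequence along the <code>batch_size</code> dimension of <code>A2</code>, we iterate over every <code>[s, hd]</code> matrix along its <code>num_heads</code> dimension and multiply it by the corresponding <code>[hd, e]</code> matrix along the <code>num_heads</code> dimension of W2. Computing the heads in parallel yields a 4D output tensor <code>O2</code> of shape <code>[b, s, nh, e]</code>.</p>
198198
<h5 id="TODO"><a href="#TODO" class="headerlink" title="TODO"></a>TODO</h5><p>Complete the <strong>Task1</strong> part of <code>matmul_with_importance</code>: implement the multi-head matrix multiplication described above, taking the input tensors <code>A1</code> and <code>W1</code> and returning the result <code>O2</code>.</p>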
199-
<blockquote>
200-
<p>[!IMPORTANT]</p>
201-
<ol>
202-
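<li>The input tensors are A1 and W1; you need to convert them into A2 and W2 yourself before computing. Pay attention to the usage of, and the differences between, torch functions such as <code>reshape</code>, <code>view</code>, <code>transpose</code>, and <code>permute</code>.</li>
203-
<li>Although the matrix multiplication is logically a traversal, do not implement it with for loops. Look up PyTorch's own operations, such as <code>@</code>, <code>torch.bmm</code>, <code>torch.mm</code>, <code>torch.matmul</code>, and <code>torch.einsum</code>.</li>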
204-
</ol>
205-
</blockquote>
199+
<div class="note info">
200+
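<blockquote><p>[!IMPORTANT]</p><ol><li>The input tensors are A1 and W1; you need to convert them into A2 and W2 yourself before computing. Pay attention to the usage of, and the differences between, torch functions such as <code>reshape</code>, <code>view</code>, <code>transpose</code>, and <code>permute</code>.</li><li>Although the matrix multiplication is logically a traversal, do not implement it with for loops. Look up PyTorch's own operations, such as <code>@</code>, <code>torch.bmm</code>, <code>torch.mm</code>, <code>torch.matmul</code>, and <code>torch.einsum</code>.</li></ol></blockquote>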
201+
</div>
202+
206203
<blockquote>
207204
<p>[!NOTE]</p>
208205
<ol>
@@ -318,7 +315,7 @@ <h5 id="TODO-2"><a href="#TODO-2" class="headerlink" title="TODO"></a>TODO</h5><
318315

319316
<div class="site-overview-wrap sidebar-panel">
320317
<div class="site-author motion-element" itemprop="author" itemscope itemtype="http://schema.org/Person">
321-
<p class="site-author-name" itemprop="name">John Doe</p>
318+
<p class="site-author-name" itemprop="name">DeepEngine</p>
322319
<div class="site-description" itemprop="description"></div>
323320
</div>
324321
<div class="site-state-wrap motion-element">
@@ -358,7 +355,7 @@ <h5 id="TODO-2"><a href="#TODO-2" class="headerlink" title="TODO"></a>TODO</h5><
358355
<span class="with-love">
359356
<i class="fa fa-heart"></i>
360357
</span>
361-
<span class="author" itemprop="copyrightHolder">John Doe</span>
358+
<span class="author" itemprop="copyrightHolder">DeepEngine</span>
362359
</div>
363360
<div class="powered-by">Powered by <a href="https://hexo.io/" class="theme-link" rel="noopener" target="_blank">Hexo</a> & <a href="https://muse.theme-next.org/" class="theme-link" rel="noopener" target="_blank">NexT.Muse</a>
364361
</div>
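
Reviewer note on the Task 1 text shown in the hunk above: the multi-head multiplication it describes can be sketched in PyTorch as follows. This is only an illustrative sketch, not the assignment's reference solution. The function name `multi_head_matmul` is hypothetical, and the shape `[h, e]` for `W1` is an assumption inferred from `O1[i] = A1[i] @ W1` producing `[b, s, e]`.

```python
import torch


def multi_head_matmul(A1: torch.Tensor, W1: torch.Tensor, num_heads: int) -> torch.Tensor:
    """Hypothetical helper: A1 is [b, s, h], W1 is assumed to be [h, e]."""
    b, s, h = A1.shape
    h_w, e = W1.shape
    assert h == h_w, "A1 and W1 must share the hidden dimension h"
    assert h % num_heads == 0, "h must split evenly into num_heads"
    hd = h // num_heads

    # Split the h dimension into (nh, hd) on both operands.
    A2 = A1.reshape(b, s, num_heads, hd)  # [b, s, nh, hd]
    W2 = W1.reshape(num_heads, hd, e)     # [nh, hd, e]

    # For every batch index and head: [s, hd] @ [hd, e] -> [s, e],
    # expressed as one einsum instead of a Python loop.
    return torch.einsum("bsnd,nde->bsne", A2, W2)  # [b, s, nh, e]
```

One way to sanity-check such a sketch: because both operands split `h` in the same order, summing `O2` over the head dimension should reproduce the naive product `A1 @ W1` up to floating-point error.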

2025/06/14/hello-world/index.html

Lines changed: 4 additions & 4 deletions
@@ -29,7 +29,7 @@
2929
<meta property="og:locale" content="en_US">
3030
<meta property="article:published_time" content="2025-06-14T04:45:15.042Z">
3131
<meta property="article:modified_time" content="2025-06-14T04:45:15.042Z">
32-
<meta property="article:author" content="John Doe">
32+
<meta property="article:author" content="DeepEngine">
3333
<meta name="twitter:card" content="summary">
3434

3535
<link rel="canonical" href="https://big-trex.github.io/2025/06/14/hello-world/">
@@ -154,7 +154,7 @@ <h1 class="site-title">LLM-Assignment-Doc</h1>
154154

155155
<span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
156156
<meta itemprop="image" content="/LLM-Blog/images/avatar.gif">
157-
<meta itemprop="name" content="John Doe">
157+
<meta itemprop="name" content="DeepEngine">
158158
<meta itemprop="description" content="">
159159
</span>
160160

@@ -283,7 +283,7 @@ <h3 id="Deploy-to-remote-sites"><a href="#Deploy-to-remote-sites" class="headerl
283283

284284
<div class="site-overview-wrap sidebar-panel">
285285
<div class="site-author motion-element" itemprop="author" itemscope itemtype="http://schema.org/Person">
286-
<p class="site-author-name" itemprop="name">John Doe</p>
286+
<p class="site-author-name" itemprop="name">DeepEngine</p>
287287
<div class="site-description" itemprop="description"></div>
288288
</div>
289289
<div class="site-state-wrap motion-element">
@@ -323,7 +323,7 @@ <h3 id="Deploy-to-remote-sites"><a href="#Deploy-to-remote-sites" class="headerl
323323
<span class="with-love">
324324
<i class="fa fa-heart"></i>
325325
</span>
326-
<span class="author" itemprop="copyrightHolder">John Doe</span>
326+
<span class="author" itemprop="copyrightHolder">DeepEngine</span>
327327
</div>
328328
<div class="powered-by">Powered by <a href="https://hexo.io/" class="theme-link" rel="noopener" target="_blank">Hexo</a> & <a href="https://muse.theme-next.org/" class="theme-link" rel="noopener" target="_blank">NexT.Muse</a>
329329
</div>

archives/2025/06/index.html

Lines changed: 3 additions & 3 deletions
@@ -25,7 +25,7 @@
2525
<meta property="og:url" content="https://big-trex.github.io/archives/2025/06/index.html">
2626
<meta property="og:site_name" content="LLM-Assignment-Doc">
2727
<meta property="og:locale" content="en_US">
28-
<meta property="article:author" content="John Doe">
28+
<meta property="article:author" content="DeepEngine">
2929
<meta name="twitter:card" content="summary">
3030

3131
<link rel="canonical" href="https://big-trex.github.io/archives/2025/06/">
@@ -260,7 +260,7 @@ <h1 class="site-title">LLM-Assignment-Doc</h1>
260260

261261
<div class="site-overview-wrap sidebar-panel">
262262
<div class="site-author motion-element" itemprop="author" itemscope itemtype="http://schema.org/Person">
263-
<p class="site-author-name" itemprop="name">John Doe</p>
263+
<p class="site-author-name" itemprop="name">DeepEngine</p>
264264
<div class="site-description" itemprop="description"></div>
265265
</div>
266266
<div class="site-state-wrap motion-element">
@@ -300,7 +300,7 @@ <h1 class="site-title">LLM-Assignment-Doc</h1>
300300
<span class="with-love">
301301
<i class="fa fa-heart"></i>
302302
</span>
303-
<span class="author" itemprop="copyrightHolder">John Doe</span>
303+
<span class="author" itemprop="copyrightHolder">DeepEngine</span>
304304
</div>
305305
<div class="powered-by">Powered by <a href="https://hexo.io/" class="theme-link" rel="noopener" target="_blank">Hexo</a> & <a href="https://muse.theme-next.org/" class="theme-link" rel="noopener" target="_blank">NexT.Muse</a>
306306
</div>

archives/2025/index.html

Lines changed: 3 additions & 3 deletions
@@ -25,7 +25,7 @@
2525
<meta property="og:url" content="https://big-trex.github.io/archives/2025/index.html">
2626
<meta property="og:site_name" content="LLM-Assignment-Doc">
2727
<meta property="og:locale" content="en_US">
28-
<meta property="article:author" content="John Doe">
28+
<meta property="article:author" content="DeepEngine">
2929
<meta name="twitter:card" content="summary">
3030

3131
<link rel="canonical" href="https://big-trex.github.io/archives/2025/">
@@ -260,7 +260,7 @@ <h1 class="site-title">LLM-Assignment-Doc</h1>
260260

261261
<div class="site-overview-wrap sidebar-panel">
262262
<div class="site-author motion-element" itemprop="author" itemscope itemtype="http://schema.org/Person">
263-
<p class="site-author-name" itemprop="name">John Doe</p>
263+
<p class="site-author-name" itemprop="name">DeepEngine</p>
264264
<div class="site-description" itemprop="description"></div>
265265
</div>
266266
<div class="site-state-wrap motion-element">
@@ -300,7 +300,7 @@ <h1 class="site-title">LLM-Assignment-Doc</h1>
300300
<span class="with-love">
301301
<i class="fa fa-heart"></i>
302302
</span>
303-
<span class="author" itemprop="copyrightHolder">John Doe</span>
303+
<span class="author" itemprop="copyrightHolder">DeepEngine</span>
304304
</div>
305305
<div class="powered-by">Powered by <a href="https://hexo.io/" class="theme-link" rel="noopener" target="_blank">Hexo</a> & <a href="https://muse.theme-next.org/" class="theme-link" rel="noopener" target="_blank">NexT.Muse</a>
306306
</div>

archives/index.html

Lines changed: 3 additions & 3 deletions
@@ -25,7 +25,7 @@
2525
<meta property="og:url" content="https://big-trex.github.io/archives/index.html">
2626
<meta property="og:site_name" content="LLM-Assignment-Doc">
2727
<meta property="og:locale" content="en_US">
28-
<meta property="article:author" content="John Doe">
28+
<meta property="article:author" content="DeepEngine">
2929
<meta name="twitter:card" content="summary">
3030

3131
<link rel="canonical" href="https://big-trex.github.io/archives/">
@@ -260,7 +260,7 @@ <h1 class="site-title">LLM-Assignment-Doc</h1>
260260

261261
<div class="site-overview-wrap sidebar-panel">
262262
<div class="site-author motion-element" itemprop="author" itemscope itemtype="http://schema.org/Person">
263-
<p class="site-author-name" itemprop="name">John Doe</p>
263+
<p class="site-author-name" itemprop="name">DeepEngine</p>
264264
<div class="site-description" itemprop="description"></div>
265265
</div>
266266
<div class="site-state-wrap motion-element">
@@ -300,7 +300,7 @@ <h1 class="site-title">LLM-Assignment-Doc</h1>
300300
<span class="with-love">
301301
<i class="fa fa-heart"></i>
302302
</span>
303-
<span class="author" itemprop="copyrightHolder">John Doe</span>
303+
<span class="author" itemprop="copyrightHolder">DeepEngine</span>
304304
</div>
305305
<div class="powered-by">Powered by <a href="https://hexo.io/" class="theme-link" rel="noopener" target="_blank">Hexo</a> & <a href="https://muse.theme-next.org/" class="theme-link" rel="noopener" target="_blank">NexT.Muse</a>
306306
</div>

css/main.css

Lines changed: 1 addition & 1 deletion
@@ -1168,7 +1168,7 @@ pre .javascript .function {
11681168
}
11691169
.links-of-author a::before,
11701170
.links-of-author span.exturl::before {
1171-
background: #00b32b;
1171+
background: #f574b0;
11721172
border-radius: 50%;
11731173
content: ' ';
11741174
display: inline-block;

index.html

Lines changed: 10 additions & 13 deletions
@@ -25,7 +25,7 @@
2525
<meta property="og:url" content="https://big-trex.github.io/index.html">
2626
<meta property="og:site_name" content="LLM-Assignment-Doc">
2727
<meta property="og:locale" content="en_US">
28-
<meta property="article:author" content="John Doe">
28+
<meta property="article:author" content="DeepEngine">
2929
<meta name="twitter:card" content="summary">
3030

3131
<link rel="canonical" href="https://big-trex.github.io/">
@@ -149,7 +149,7 @@ <h1 class="site-title">LLM-Assignment-Doc</h1>
149149

150150
<span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
151151
<meta itemprop="image" content="/LLM-Blog/images/avatar.gif">
152-
<meta itemprop="name" content="John Doe">
152+
<meta itemprop="name" content="DeepEngine">
153153
<meta itemprop="description" content="">
154154
</span>
155155

@@ -170,7 +170,7 @@ <h2 class="post-title" itemprop="name headline">
170170
<span class="post-meta-item-text">Posted on</span>
171171

172172

173-
<time title="Created: 2025-06-14 12:57:11 / Modified: 12:57:57" itemprop="dateCreated datePublished" datetime="2025-06-14T12:57:11+08:00">2025-06-14</time>
173+
<time title="Created: 2025-06-14 12:57:11 / Modified: 13:07:13" itemprop="dateCreated datePublished" datetime="2025-06-14T12:57:11+08:00">2025-06-14</time>
174174
</span>
175175

176176

@@ -192,13 +192,10 @@ <h4 id="Task-1-MalMul-with-multi-head-variant"><a href="#Task-1-MalMul-with-mult
192192
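<p>Naive matrix multiplication traverses only the <code>batch_size</code> dimension of <code>A1</code>: for each sequence index i, it computes <code>O1[i] = A1[i] @ W1</code>, producing a tensor <code>O1</code> of shape <code>[b, s, e]</code>.</p>
193193
<p>In multi-head matrix multiplication, we first split the <code>h</code> dimension of the input tensor <code>A1</code> and the weight tensor <code>W1</code> evenly into <code>num_heads</code> sub-dimensions (denoted <code>nh</code>, the number of heads), which gives a 4D tensor <code>A2</code> of shape <code>[b, s, nh, hd]</code> and a 3D tensor <code>W2</code> of shape <code>[nh, hd, e]</code>. Then, for each sequence along the <code>batch_size</code> dimension of <code>A2</code>, we iterate over every <code>[s, hd]</code> matrix along its <code>num_heads</code> dimension and multiply it by the corresponding <code>[hd, e]</code> matrix along the <code>num_heads</code> dimension of W2. Computing the heads in parallel yields a 4D output tensor <code>O2</code> of shape <code>[b, s, nh, e]</code>.</p>
194194
<h5 id="TODO"><a href="#TODO" class="headerlink" title="TODO"></a>TODO</h5><p>Complete the <strong>Task1</strong> part of <code>matmul_with_importance</code>: implement the multi-head matrix multiplication described above, taking the input tensors <code>A1</code> and <code>W1</code> and returning the result <code>O2</code>.</p>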
195-
<blockquote>
196-
<p>[!IMPORTANT]</p>
197-
<ol>
198-
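<li>The input tensors are A1 and W1; you need to convert them into A2 and W2 yourself before computing. Pay attention to the usage of, and the differences between, torch functions such as <code>reshape</code>, <code>view</code>, <code>transpose</code>, and <code>permute</code>.</li>
199-
<li>Although the matrix multiplication is logically a traversal, do not implement it with for loops. Look up PyTorch's own operations, such as <code>@</code>, <code>torch.bmm</code>, <code>torch.mm</code>, <code>torch.matmul</code>, and <code>torch.einsum</code>.</li>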
200-
</ol>
201-
</blockquote>
195+
<div class="note info">
196+
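<blockquote><p>[!IMPORTANT]</p><ol><li>The input tensors are A1 and W1; you need to convert them into A2 and W2 yourself before computing. Pay attention to the usage of, and the differences between, torch functions such as <code>reshape</code>, <code>view</code>, <code>transpose</code>, and <code>permute</code>.</li><li>Although the matrix multiplication is logically a traversal, do not implement it with for loops. Look up PyTorch's own operations, such as <code>@</code>, <code>torch.bmm</code>, <code>torch.mm</code>, <code>torch.matmul</code>, and <code>torch.einsum</code>.</li></ol></blockquote>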
197+
</div>
198+
202199
<blockquote>
203200
<p>[!NOTE]</p>
204201
<ol>
@@ -254,7 +251,7 @@ <h5 id="TODO-2"><a href="#TODO-2" class="headerlink" title="TODO"></a>TODO</h5><
254251

255252
<span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
256253
<meta itemprop="image" content="/LLM-Blog/images/avatar.gif">
257-
<meta itemprop="name" content="John Doe">
254+
<meta itemprop="name" content="DeepEngine">
258255
<meta itemprop="description" content="">
259256
</span>
260257

@@ -374,7 +371,7 @@ <h3 id="Deploy-to-remote-sites"><a href="#Deploy-to-remote-sites" class="headerl
374371

375372
<div class="site-overview-wrap sidebar-panel">
376373
<div class="site-author motion-element" itemprop="author" itemscope itemtype="http://schema.org/Person">
377-
<p class="site-author-name" itemprop="name">John Doe</p>
374+
<p class="site-author-name" itemprop="name">DeepEngine</p>
378375
<div class="site-description" itemprop="description"></div>
379376
</div>
380377
<div class="site-state-wrap motion-element">
@@ -414,7 +411,7 @@ <h3 id="Deploy-to-remote-sites"><a href="#Deploy-to-remote-sites" class="headerl
414411
<span class="with-love">
415412
<i class="fa fa-heart"></i>
416413
</span>
417-
<span class="author" itemprop="copyrightHolder">John Doe</span>
414+
<span class="author" itemprop="copyrightHolder">DeepEngine</span>
418415
</div>
419416
<div class="powered-by">Powered by <a href="https://hexo.io/" class="theme-link" rel="noopener" target="_blank">Hexo</a> & <a href="https://muse.theme-next.org/" class="theme-link" rel="noopener" target="_blank">NexT.Muse</a>
420417
</div>
