
Commit de0ec75

committed
Site updated: 2025-06-17 16:01:01
1 parent 8f2c14f commit de0ec75


3 files changed: 10 additions and 22 deletions


2025/06/17/A2-norm-emb/index.html

Lines changed: 5 additions & 11 deletions
@@ -28,7 +28,7 @@
 <meta property="og:description" content="Task 1: 均方根层归一化 (RMS Norm) 均方根层归一化(RMS Norm)是深度学习中应用最广泛的归一化模块,尤其在自然语言处理(NLP)和大语言模型(LLM)领域。该模块以形状为 [batch_size, seqlen, hidden_size] 的张量为输入(记为 X,形状为 [b, s, h]),并沿着隐藏层 h 维度,执行带可学习缩放变换的均方根归一化操作,得到输出 Y,形状">
 <meta property="og:locale" content="zh_CN">
 <meta property="article:published_time" content="2025-06-17T06:17:05.000Z">
-<meta property="article:modified_time" content="2025-06-17T07:59:19.357Z">
+<meta property="article:modified_time" content="2025-06-17T07:59:55.884Z">
 <meta property="article:author" content="DeepEngine">
 <meta property="article:tag" content="RMSNorm">
 <meta property="article:tag" content="Vocab Embedding">
@@ -181,7 +181,7 @@ <h1 class="post-title" itemprop="name headline">
 <span class="post-meta-item-text">发表于</span>
 
 
-<time title="创建时间:2025-06-17 14:17:05 / 修改时间:15:59:19" itemprop="dateCreated datePublished" datetime="2025-06-17T14:17:05+08:00">2025-06-17</time>
+<time title="创建时间:2025-06-17 14:17:05 / 修改时间:15:59:55" itemprop="dateCreated datePublished" datetime="2025-06-17T14:17:05+08:00">2025-06-17</time>
 </span>
 <span class="post-meta-item">
 <span class="post-meta-item-icon">
@@ -213,9 +213,7 @@ <h4 id="task-1-均方根层归一化-rms-norm">Task 1: 均方根层归一化 (RM
 <code>h</code> 维度,执行带可学习缩放变换的均方根归一化操作,得到输出
 <code>Y</code>,形状为 <code>[b, s, h]</code>。具体公式如下所示:</p>
 <p>$$ Y = $$</p>
-<p><span class="math display">$$
-RMS[X]=\sqrt{\frac{1}{h} \sum_{i=1}^{h}x_i^2 + \epsilon}
-$$</span></p>
+<p>$$ RMS[X]= $$</p>
 <p>其中,<span
 class="math inline"><em>R</em><em>M</em><em>S</em>[<em>X</em>]</span>
 表示 <code>X</code> 的均方根,对于 <code>i in batch_size</code>
@@ -236,12 +234,8 @@ <h4 id="task-1-均方根层归一化-rms-norm">Task 1: 均方根层归一化 (RM
 class="math inline"><em>γ</em></span> 的隐藏层维度 <code>h</code>
 均匀划分为 <code>Xg</code> 组,并对第 <code>i</code> 组分别应用 <span
 class="math inline">(1)(2)</span> 式中的 <em>RMS Norm</em>
-操作,具体公式如下: <span class="math display">$$
-Y_{g_i}=\frac{X_{g_i}}{RMS[X_{g_i}]} \odot \gamma_{g_i}
-$$</span></p>
-<p><span class="math display">$$
-RMS[X_{g_i}]=\sqrt{\frac{1}{gz} \sum_{j=1}^{gz}x_{g_i, j}^2 + \epsilon}
-$$</span></p>
+操作,具体公式如下: $$ Y_{g_i}= _{g_i} $$</p>
+<p>$$ RMS[X_{g_i}]= $$</p>
 <p>此外,我们还应该为该 <em>Group RMS Norm</em> 模块实现一个名为
 <code>reset_parameters</code> 的参数初始化方法,用于为可学习的参数矩阵
 <span class="math inline"><em>γ</em></span>

css/main.css

Lines changed: 1 addition & 1 deletion
@@ -1168,7 +1168,7 @@ pre .javascript .function {
 }
 .links-of-author a::before,
 .links-of-author span.exturl::before {
-background: #85ff61;
+background: #1163da;
 border-radius: 50%;
 content: ' ';
 display: inline-block;

default-index/index.html

Lines changed: 4 additions & 10 deletions
@@ -175,7 +175,7 @@ <h2 class="post-title" itemprop="name headline">
 <span class="post-meta-item-text">发表于</span>
 
 
-<time title="创建时间:2025-06-17 14:17:05 / 修改时间:15:59:19" itemprop="dateCreated datePublished" datetime="2025-06-17T14:17:05+08:00">2025-06-17</time>
+<time title="创建时间:2025-06-17 14:17:05 / 修改时间:15:59:55" itemprop="dateCreated datePublished" datetime="2025-06-17T14:17:05+08:00">2025-06-17</time>
 </span>
 <span class="post-meta-item">
 <span class="post-meta-item-icon">
@@ -207,9 +207,7 @@ <h4 id="task-1-均方根层归一化-rms-norm">Task 1: 均方根层归一化 (RM
 <code>h</code> 维度,执行带可学习缩放变换的均方根归一化操作,得到输出
 <code>Y</code>,形状为 <code>[b, s, h]</code>。具体公式如下所示:</p>
 <p>$$ Y = $$</p>
-<p><span class="math display">$$
-RMS[X]=\sqrt{\frac{1}{h} \sum_{i=1}^{h}x_i^2 + \epsilon}
-$$</span></p>
+<p>$$ RMS[X]= $$</p>
 <p>其中,<span
 class="math inline"><em>R</em><em>M</em><em>S</em>[<em>X</em>]</span>
 表示 <code>X</code> 的均方根,对于 <code>i in batch_size</code>
@@ -230,12 +228,8 @@ <h4 id="task-1-均方根层归一化-rms-norm">Task 1: 均方根层归一化 (RM
 class="math inline"><em>γ</em></span> 的隐藏层维度 <code>h</code>
 均匀划分为 <code>Xg</code> 组,并对第 <code>i</code> 组分别应用 <span
 class="math inline">(1)(2)</span> 式中的 <em>RMS Norm</em>
-操作,具体公式如下: <span class="math display">$$
-Y_{g_i}=\frac{X_{g_i}}{RMS[X_{g_i}]} \odot \gamma_{g_i}
-$$</span></p>
-<p><span class="math display">$$
-RMS[X_{g_i}]=\sqrt{\frac{1}{gz} \sum_{j=1}^{gz}x_{g_i, j}^2 + \epsilon}
-$$</span></p>
+操作,具体公式如下: $$ Y_{g_i}= _{g_i} $$</p>
+<p>$$ RMS[X_{g_i}]= $$</p>
 <p>此外,我们还应该为该 <em>Group RMS Norm</em> 模块实现一个名为
 <code>reset_parameters</code> 的参数初始化方法,用于为可学习的参数矩阵
 <span class="math inline"><em>γ</em></span>
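
Note on the formulas touched by this diff: the deleted lines define Group RMS Norm as Y_{g_i} = X_{g_i} / RMS[X_{g_i}] ⊙ γ_{g_i}, with RMS[X_{g_i}] = sqrt((1/gz) Σ_j x_{g_i,j}² + ε). As a rough illustration only, the sketch below applies that formula to a single hidden vector in plain Python. The function name `group_rms_norm`, the parameter name `group_size` (standing in for `gz`), and the default `eps` value are assumptions made for this example, not taken from the post or from any library.

```python
import math


def group_rms_norm(x, gamma, group_size, eps=1e-5):
    """Apply group-wise RMS normalization to one hidden vector.

    x and gamma are equal-length lists of floats (length h).
    group_size plays the role of gz in the formula; eps sits
    inside the square root, as in the deleted lines of the diff.
    All names here are hypothetical, chosen for illustration.
    """
    h = len(x)
    if len(gamma) != h:
        raise ValueError("x and gamma must have the same length")
    if group_size <= 0 or h % group_size != 0:
        raise ValueError("hidden size must be divisible by group_size")

    y = []
    for start in range(0, h, group_size):
        group = x[start:start + group_size]
        scale = gamma[start:start + group_size]
        # RMS of this group: sqrt(mean of squares + eps)
        rms = math.sqrt(sum(v * v for v in group) / group_size + eps)
        # Normalize, then apply the learnable per-element scale
        y.extend(v / rms * g for v, g in zip(group, scale))
    return y


if __name__ == "__main__":
    print(group_rms_norm([3.0, 4.0, 1.0, 1.0], [1.0, 1.0, 2.0, 2.0], group_size=2))
```

Setting `group_size` equal to the full hidden size reduces this to the ungrouped RMS Norm from the first deleted formula. A batched `[b, s, h]` version would apply the same computation along the last dimension.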
