|
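28 | 28 | <meta property="og:description" content="Task 1: Root Mean Square Layer Normalization (RMS Norm). RMS Norm is the most widely used normalization module in deep learning, especially in natural language processing (NLP) and large language models (LLMs). The module takes a tensor of shape [batch_size, seqlen, hidden_size] as input (denoted X, with shape [b, s, h]) and, along the hidden dimension h, performs root-mean-square normalization with a learnable scaling transform, producing the output Y, with shape"> |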
29 | 29 | <meta property="og:locale" content="zh_CN"> |
30 | 30 | <meta property="article:published_time" content="2025-06-17T06:17:05.000Z"> |
31 | | -<meta property="article:modified_time" content="2025-06-17T06:52:51.886Z"> |
| 31 | +<meta property="article:modified_time" content="2025-06-17T07:03:44.066Z"> |
32 | 32 | <meta property="article:author" content="DeepEngine"> |
33 | 33 | <meta property="article:tag" content="RMSNorm"> |
34 | 34 | <meta property="article:tag" content="Vocab Embedding"> |
@@ -181,7 +181,7 @@ <h1 class="post-title" itemprop="name headline"> |
181 | 181 | <span class="post-meta-item-text">Posted on</span> |
182 | 182 |
|
183 | 183 |
|
184 | | - <time title="Created: 2025-06-17 14:17:05 / Modified: 14:52:51" itemprop="dateCreated datePublished" datetime="2025-06-17T14:17:05+08:00">2025-06-17</time> |
| 184 | + <time title="Created: 2025-06-17 14:17:05 / Modified: 15:03:44" itemprop="dateCreated datePublished" datetime="2025-06-17T14:17:05+08:00">2025-06-17</time> |
185 | 185 | </span> |
186 | 186 | <span class="post-meta-item"> |
187 | 187 | <span class="post-meta-item-icon"> |
@@ -212,9 +212,9 @@ <h4 id="task-1-均方根层归一化-rms-norm">Task 1: Root Mean Square Layer Normalization (RM |
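212 | 212 | <code>X</code>, with shape <code>[b, s, h]</code>) and, along the hidden |
213 | 213 | <code>h</code> dimension, performs root-mean-square normalization with a learnable scaling transform, producing the output |
214 | 214 | <code>Y</code>, with shape <code>[b, s, h]</code>. The formulas are as follows: <span |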
215 | | -class="math display">$$ |
| 215 | +class="math display">$$\begin{equation} |
216 | 216 | Y=\frac{X}{RMS[X]} \odot \gamma |
217 | | -$$</span></p> |
| 217 | +\end{equation}$$</span></p> |
218 | 218 | <p><span class="math display">$$ |
219 | 219 | RMS[X]=\sqrt{\frac{1}{h} \sum_{i=1}^{h}x_i^2 + \epsilon} |
220 | 220 | $$</span></p> |
|
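The two formulas in this hunk can be checked with a small pure-Python sketch. It is illustrative only and not taken from the post or the commit: the names `rms_norm` and `rms_norm_3d`, the default `eps=1e-6`, and the nested-list representation of the `[b, s, h]` tensor are all assumptions made here.

```python
import math
from typing import List


def rms_norm(x: List[float], gamma: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize one hidden vector x (length h) by its RMS, then scale by gamma.

    eps is added inside the square root, matching RMS[X] as written in the post.
    The default value 1e-6 is an assumption; the post does not state one.
    """
    h = len(x)
    if len(gamma) != h:
        raise ValueError("gamma must have the same length as x")
    # RMS[X] = sqrt((1/h) * sum(x_i^2) + eps)
    rms = math.sqrt(sum(v * v for v in x) / h + eps)
    # Y = X / RMS[X], multiplied element-wise by gamma
    return [v / rms * g for v, g in zip(x, gamma)]


def rms_norm_3d(X: List[List[List[float]]], gamma: List[float],
                eps: float = 1e-6) -> List[List[List[float]]]:
    """Apply rms_norm along the last (h) axis of a [b, s, h] nested list."""
    return [[rms_norm(row, gamma, eps) for row in seq] for seq in X]


if __name__ == "__main__":
    # Hypothetical input with b=1, s=2, h=2 and gamma set to all ones.
    X = [[[3.0, 4.0], [1.0, 1.0]]]
    print(rms_norm_3d(X, gamma=[1.0, 1.0]))
```

Each position `[b, s]` is normalized independently, so the output keeps the shape `[b, s, h]`. With `gamma` set to all ones and a negligible `eps`, the mean of the squared outputs along `h` is approximately 1.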