The Positional Encoding is not using sin / cos? #551

mw66 · 2024-09-05T04:12:41Z

Line 178 in 9755682

pos_emb = self.transformer.wpe(pos) # position embeddings of shape (t, n_embd)

E.g., compared with the Positional Encoding section of the following article:

So, did I miss something?

Is this an oversight or a simplification? And how does it affect the training result?

Can anyone help explain?

Thanks.

jhauret · 2024-10-04T22:34:51Z

We used learned position embeddings instead of the sinusoidal version proposed in the original work.
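
To make the difference concrete, here is a minimal sketch in plain Python (no PyTorch) that puts the two approaches side by side. It is an illustration, not the repository's code: the helper names `sinusoidal_table`, `learned_table`, and `lookup`, the toy sizes, and the 0.02 init scale are all assumptions made for the example. The sinusoidal table is a fixed function of the position, while the learned table is just a parameter matrix indexed by position, which is what the `wpe(pos)` lookup amounts to.

```python
import math
import random

def sinusoidal_table(block_size, n_embd):
    """Fixed sin/cos encoding: computed from the position, nothing to train."""
    table = []
    for pos in range(block_size):
        row = []
        for i in range(n_embd):
            # Dimensions 2k and 2k+1 share the frequency 1 / 10000^(2k / n_embd).
            angle = pos / (10000 ** ((i - i % 2) / n_embd))
            # Even dimensions use sin, odd dimensions use cos.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def learned_table(block_size, n_embd, seed=0):
    """Stand-in for a learned table: random init (assumed std 0.02).
    In a real model these rows are parameters updated by the optimizer."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(n_embd)]
            for _ in range(block_size)]

def lookup(table, positions):
    """Embedding lookup: pick one row per position index."""
    return [table[p] for p in positions]

block_size, n_embd = 8, 4            # toy sizes, chosen for the example
sin_pe = sinusoidal_table(block_size, n_embd)
wpe = learned_table(block_size, n_embd)

pos = list(range(5))                 # positions 0..t-1 for a sequence of length t=5
pos_emb = lookup(wpe, pos)           # t rows of n_embd values, like wpe(pos)
```

Either table is consumed the same way: the row for each position is added to the token embedding. The practical difference is that the learned table only has rows for positions below `block_size`, whereas the sinusoidal formula can be evaluated at any position. On the effect on training, the original Transformer paper notes that the authors also tried learned positional embeddings and found nearly identical results in their experiments, so this is a common design choice rather than an oversight.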
