Hi, I'm reading this code to learn from it, and it has helped me a lot. I'm confused by this line:
torch-light/BERT/model.py
Line 74 in 254c133
In the original BERT paper, I haven't found any mention of using a conv1d layer in the transformer instead of a linear transformation.
And according to http://nlp.seas.harvard.edu/2018/04/03/attention.html#position-wise-feed-forward-networks, the position-wise feed-forward network is implemented as an MLP.
Could anyone kindly help me with this question?
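For context on what I'm comparing, here is a minimal sketch (not the repository's actual code) of the two formulations: a `nn.Conv1d` with `kernel_size=1` applies the same affine map independently at every position, which is what the `nn.Linear`-based MLP in the annotated-transformer post computes. The dimensions below are made up for illustration.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (not taken from the repo).
d_model, d_ff, seq_len, batch = 8, 32, 5, 2

linear = nn.Linear(d_model, d_ff)
conv = nn.Conv1d(d_model, d_ff, kernel_size=1)

# Copy the same weights into both layers so the outputs can be compared directly.
with torch.no_grad():
    conv.weight.copy_(linear.weight.unsqueeze(-1))  # (d_ff, d_model) -> (d_ff, d_model, 1)
    conv.bias.copy_(linear.bias)

x = torch.randn(batch, seq_len, d_model)
out_linear = linear(x)                               # (batch, seq_len, d_ff)
out_conv = conv(x.transpose(1, 2)).transpose(1, 2)   # Conv1d expects (batch, channels, seq_len)

print(torch.allclose(out_linear, out_conv, atol=1e-6))  # True: same per-position transform
```

So my question is really whether the conv1d here is just this kind of equivalent rewrite, or something intentionally different from the paper.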