As an algorithm engineer, I’ve heard about the Transformer for years and even studied it several times, but never had a chance to use it.
This post is part of my notes on learning the Transformer, prompted by a blog post I found that introduces it very well.
The Transformer is famous enough that it needs no introduction here. These are some materials for learning it:
- [1] Transformer Code by Harvard
- [2] Attention Is All You Need
- [3] Transformer learning blog: This post is all you need
When we need to do a classification or regression task, we can use just the TransformerEncoder as the model rather than the full encoder-decoder architecture.
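
To make the encoder-only idea concrete, here is a minimal sketch, assuming PyTorch's `nn.TransformerEncoder` is the module meant here. The class name `EncoderClassifier`, the learned positional embedding, the mean pooling, and all sizes (vocabulary, `d_model`, number of layers, number of classes) are my own illustrative choices, not something taken from the materials above. Padding masks are left out to keep it short.

```python
import torch
import torch.nn as nn


class EncoderClassifier(nn.Module):
    """Hypothetical encoder-only model; every size here is illustrative."""

    def __init__(self, vocab_size=1000, d_model=64, nhead=4,
                 num_layers=2, num_outputs=3, max_len=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Learned positional embedding (an assumption made for brevity;
        # the original paper uses fixed sinusoidal encodings).
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead,
            dim_feedforward=128, batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # num_outputs = number of classes, or 1 for a regression target.
        self.head = nn.Linear(d_model, num_outputs)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.embed(tokens) + self.pos(positions)  # (batch, seq_len, d_model)
        x = self.encoder(x)                           # same shape as the input
        pooled = x.mean(dim=1)                        # average over the sequence
        return self.head(pooled)                      # (batch, num_outputs)


model = EncoderClassifier()
tokens = torch.randint(0, 1000, (8, 16))  # 8 random sequences of length 16
logits = model(tokens)                    # shape (8, 3)
```

For regression, the same sketch should work with `num_outputs=1` and a loss such as `nn.MSELoss` in place of cross-entropy. No decoder is involved at any point, which is why this setup is lighter than the full model.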