
Transformer

Recently, OpenAI introduced the third version of its language prediction model: GPT-3. The astonishing performance of this autocomplete tool brought excitement and shock to the AI industry. The model, built with 175 billion parameters, is impressive in that it produces remarkably human-like output. With GPT-3, people have built a question-based search engine, an HTML generator driven by text descriptions, a medical question-answering system, and even image autocompletion.

Two years ago, a new language representation model shocked the world: BERT (Bidirectional Encoder Representations from Transformers) achieved state-of-the-art performance on numerous NLP tasks.

The two NLP models have one thing in common (hint: the "T" in GPT and BERT): they are both built upon the Transformer network.
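As a quick illustration of that shared foundation, here is a minimal sketch using the Hugging Face transformers library (my choice for illustration; the post does not depend on it). It loads small pretrained checkpoints of BERT and GPT-2 (GPT-3's weights are not publicly available) and shows that each model is, at its core, a stack of Transformer blocks.

```python
# A minimal sketch; assumes the Hugging Face `transformers` package is
# installed (pip install transformers).
from transformers import BertModel, GPT2Model

bert = BertModel.from_pretrained("bert-base-uncased")
gpt2 = GPT2Model.from_pretrained("gpt2")

# BERT keeps its Transformer encoder blocks in `encoder.layer`;
# GPT-2 keeps its decoder-style Transformer blocks in `h`.
print(len(bert.encoder.layer))  # 12 Transformer encoder layers
print(len(gpt2.h))              # 12 Transformer decoder layers
```

Strip away the pretraining objectives and output heads, and what remains in both cases is the same building block: a stack of Transformer layers.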


Transformer

To understand how the Transformer works,