Optimization
Adam
Batch Normalization
How Does Batch Normalization Help Optimization?
Self-Attention
Self-Attention vs Convolutional Layers
On the Relationship between Self-Attention and Convolutional Layers
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
GAN
Circle GAN
StarGAN
https://openai.com/blog/generative-models/