Reformer: The Efficient Transformer
Google recently introduced a new model called "Reformer", the next generation of Transformer models. It tackles one of the most important issues in NLP: the context window. Reformer can handle sequences of up to 1 million words using only 16 GB of memory, while it was almost impossible to handle even 100k words with traditional attention-based Transformer models! Reformer achieves this by using locality-sensitive hashing (LSH) to reduce the complexity of attending over long sequences, and reversible residual layers to use the available memory more efficiently.
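To make those two ideas a bit more concrete, here is a minimal, illustrative Python sketch, not Reformer's actual implementation: function names like `lsh_buckets` and `reversible_block` are hypothetical. Angular LSH hashes similar query vectors into the same bucket, so attention only has to be computed within each bucket instead of over the whole sequence; reversible residual layers let activations be recomputed exactly from a layer's outputs, so they don't have to be stored for backpropagation.

```python
import numpy as np

def lsh_buckets(vectors, n_buckets, seed=0):
    """Angular LSH: similar vectors tend to land in the same bucket,
    so attention only needs to be computed within each bucket."""
    rng = np.random.default_rng(seed)
    # Project onto n_buckets // 2 random directions.
    projections = rng.normal(size=(vectors.shape[-1], n_buckets // 2))
    rotated = vectors @ projections
    # Concatenate [x, -x]; argmax picks a bucket id in [0, n_buckets).
    return np.argmax(np.concatenate([rotated, -rotated], axis=-1), axis=-1)

def reversible_block(x1, x2, f, g):
    """Reversible residual (RevNet-style): the inputs are recoverable
    from the outputs, so activations need not be stored for backprop."""
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def invert_reversible_block(y1, y2, f, g):
    """Recompute the inputs exactly from the outputs."""
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

# Tiny demo: bucket 8 random 64-dim query vectors into 4 buckets,
# then round-trip a reversible block to show the inputs are recovered.
queries = np.random.randn(8, 64)
print(lsh_buckets(queries, n_buckets=4))

f = lambda x: np.tanh(x)
g = lambda x: 0.5 * x
x1, x2 = np.random.randn(4), np.random.randn(4)
y1, y2 = reversible_block(x1, x2, f, g)
r1, r2 = invert_reversible_block(y1, y2, f, g)
print(np.allclose(r1, x1) and np.allclose(r2, x2))  # True
```

Because the inverse is exact, a reversible network only keeps the final layer's activations in memory during training and recomputes the rest on the backward pass, which is the main source of Reformer's memory savings.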
It sounds like we are in for another big leap in NLP!