In a paper published this week on the preprint server arXiv.org, scientists at Google AI, DeepMind, the Turing Institute, and the University of Cambridge propose Performer, an AI model architecture that scales linearly and performs well on tasks like protein sequence modeling. They claim it has the potential to impact research on biological sequence analysis while lowering compute costs and complexity, at the same time reducing energy consumption and, subsequently, carbon emissions.
Performer is an offshoot of Transformer, an architecture proposed by Google AI researchers in 2017. Transformers rely on a trainable attention mechanism that specifies dependencies between elements of every input sequence (for instance, amino acids within a protein). It’s this that allows them to achieve state-of-the-art results in areas of machine learning including natural language processing, neural machine translation, document generation and summarization, and image and music generation. But Transformers scale quadratically with the number of tokens (i.e., the units of the character sequence) in an input sequence, which is prohibitively expensive for long sequences.
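To see where the quadratic cost comes from, here is a minimal NumPy sketch (an illustration, not the paper’s code): standard softmax attention materializes an n × n score matrix between every pair of tokens, so doubling the sequence length quadruples the work.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: builds an (n x n) score matrix, so time and
    memory grow quadratically with the sequence length n."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])         # (n, n): the quadratic bottleneck
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V                              # convex combinations of value rows

n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, n, d))                # toy queries, keys, values
out = softmax_attention(Q, K, V)
print(out.shape)  # (8, 4)
```

At 8,192 tokens, the score matrix alone holds 8,192² (roughly 67 million) entries per attention head, which is why long sequences quickly exhaust accelerator memory.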
By contrast, Performers scale linearly with the number of tokens in an input sequence. Their backbone is fast attention via orthogonal random features (FAVOR), a method that maintains marginal distributions of inputs while treating different inputs as statistically independent. This allows Performers to handle long sequences and remain backward-compatible with pretrained regular Transformers, letting them be used beyond the scope of Transformers as a more scalable replacement for attention in computer vision, reinforcement learning, and other AI applications.
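The trick can be sketched as follows (a simplified illustration using plain Gaussian rather than orthogonal random features, and omitting the paper’s other refinements): map queries and keys through a positive random-feature map whose inner products approximate the softmax kernel, then reorder the matrix products so that no n × n matrix is ever formed.

```python
import numpy as np

def favor_attention(Q, K, V, m=256, seed=0):
    """Linear-time attention in the spirit of FAVOR: approximate the
    softmax kernel exp(q.k / sqrt(d)) with positive random features,
    then compute phi(Q) @ (phi(K).T @ V), costing O(n*m*d) rather
    than O(n^2 * d). Illustrative sketch only."""
    n, d = Q.shape
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(m, d))                   # random projection directions
    def phi(X):
        X = X * d ** -0.25                        # fold in the 1/sqrt(d) scaling
        proj = X @ W.T                            # (n, m)
        return np.exp(proj - 0.5 * (X ** 2).sum(-1, keepdims=True)) / np.sqrt(m)
    Qp, Kp = phi(Q), phi(K)                       # (n, m) each, all entries positive
    KV = Kp.T @ V                                 # (m, d): linear in n
    norm = Qp @ Kp.sum(axis=0)                    # per-row normalizer, shape (n,)
    return (Qp @ KV) / norm[:, None]

n, d = 16, 8
rng = np.random.default_rng(1)
Q, K, V = rng.normal(size=(3, n, d))
out = favor_attention(Q, K, V)
print(out.shape)  # (16, 8)
```

Because the feature map is strictly positive, the implicit attention weights stay non-negative and normalized, which helps keep the approximation stable and backward-compatible with regular softmax attention.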
To evaluate the architecture, the researchers implemented a Performer on top of pre-existing Transformer training code designed to model protein interactions. (Performer replaced only the attention component, while all other components remained the same.) Both Performer- and Transformer-based baseline models were fed concatenated protein sequences 8,192 tokens long from the open-source database TrEMBL, and they were trained on Google-designed third-generation tensor processing units (TPUs) containing 16GB of RAM per chip.
The researchers report that the Transformer-based models overloaded the chips’ memory even at a batch size of 1 per chip. The Performer, on the other hand, trained efficiently at a batch size of 16 per chip, and its accuracy continued to improve as training progressed.
The results show Performer could benefit modern bioinformatics “immensely” by scaling up methods to train faster, more accurate AI models, the co-authors say. “[This] opens the door to the ability to design sets of molecules with pre-specified interaction properties. These approaches could be used to augment existing physics-based design strategies that are of critical importance, for example, in the development of new nanoparticle vaccines,” they wrote.
Notably, Performer follows the introduction of Reformer, an evolution of the Transformer that Google AI designed to handle context windows of up to 1 million words. By leveraging techniques like locality-sensitive hashing (LSH) and reversible residual layers to use memory efficiently and reduce complexity over long sequences, Reformer is able to run on a single AI accelerator chip using only 16GB of memory.
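As a rough illustration of the LSH idea (not Reformer’s exact scheme, which uses random rotations): hash each vector by the signs of a few random projections, so that similar vectors tend to share a bucket and attention can be restricted to within-bucket pairs instead of all n² pairs.

```python
import numpy as np

def lsh_buckets(X, n_planes=8, seed=0):
    """Random-hyperplane LSH: each vector's bucket id is the bit pattern
    of its signs against random hyperplanes. Vectors pointing the same
    way collide, so attention need only compare within-bucket pairs."""
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(n_planes, X.shape[-1]))
    bits = (X @ planes.T) > 0                           # sign pattern, (n, n_planes)
    return (bits * (1 << np.arange(n_planes))).sum(-1)  # pack bits into an integer id

x = np.array([1.0, 0.2])
codes = lsh_buckets(np.stack([x, 1.5 * x, -x]))
# x and a positive rescaling of it share a bucket; the opposite vector does not
```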
For its part, OpenAI recently debuted Sparse Transformers, an open-source machine learning system that can predict what comes next in text, image, and sound sequences 30 times longer than what’s possible with Transformers. Sparse Transformers form the foundation of Jukebox, a machine learning framework that generates music, including rudimentary songs, as raw audio in a range of genres and musical styles.