Transformer networks, colloquially known to deep-learning practitioners and computer engineers as "transformers," are all the rage in AI. Over the last few years, these models, known for their massive size, large amount of data inputs, big scale of parameters - and, by extension, high carbon footprint and cost - have grown in favor over other types of neural network architectures.

Some transformers, particularly some open-source, large natural-language-processing transformer models, even have names that are recognizable to people outside AI, such as GPT-3 and BERT. They're used across audio-, video- and computer-vision-related tasks, drug discovery and more. Now chipmakers and researchers want to make them speedier and more nimble.

"It's interesting how fast technology for neural networks changes. Four years ago, everybody was using these recurrent neural networks for these language models and then the attention paper was introduced, and all of a sudden, everybody is using transformers," said Bill Dally, chief scientist at Nvidia, during an AI conference held last week by Stanford's HAI.

Dally was referring to an influential 2017 Google research paper presenting an innovative architecture that forms the backbone of transformer networks and relies on "attention mechanisms," or "self-attention," a new way to process the data inputs and outputs of models. "The world pivoted in a matter of a few months and everything changed," Dally said.

To meet the growing interest in transformer use, in March the AI chip giant introduced its Hopper H100 transformer engine to streamline transformer model workloads.

Designing transformer tech for the edge

But some researchers are pushing for even more. There's talk not only of making compute- and energy-hungry transformers more efficient, but of eventually upgrading their design so they can process fresh data in edge devices without having to make the round trip to process the data in the cloud.

A group of researchers from Notre Dame and China's Zhejiang University presented a way to reduce memory-processing bottlenecks and computational and energy consumption requirements in an April paper. The "iMTransformer" approach is a transformer accelerator, which works to decrease memory transfer needs by computing in-memory, and reduces the number of operations required by caching reusable model parameters.

Right now the trend is to bulk up transformers so the models get large enough to take on increasingly complex tasks, said Ana Franchesca Laguna, a computer science and engineering PhD at Notre Dame. When it comes to large natural-language-processing models, she said, "It's the difference between a sentence or a paragraph and a book." But, she added, "The bigger the transformers are, your energy footprint also increases."

Using an accelerator like the iMTransformer could help to pare down that footprint, and, in the future, create transformer models that could ingest, process and learn from new data in edge devices. "Having the model closer to you would be really helpful," Laguna said.
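For readers curious what the "self-attention" Dally describes actually computes, here is a minimal sketch of scaled dot-product self-attention in NumPy. The shapes, dimensions and weight matrices below are illustrative toy values, not taken from any production model.

```python
# Toy sketch of scaled dot-product self-attention. All sizes are illustrative.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    """x: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) learned projections."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])  # every token scores every token
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v                       # weighted mix of value vectors

rng = np.random.default_rng(0)
d_model, d_k, seq_len = 8, 4, 5
x = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = [rng.normal(size=(d_model, d_k)) for _ in range(3)]
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (5, 4): one attended vector per input token
```

The key contrast with the recurrent networks Dally mentions is visible in the `scores` line: all token pairs are compared at once in a single matrix product, rather than one step at a time.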
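To give a rough feel for why caching reusable parameters cuts memory traffic, here is a back-of-the-envelope sketch. It is not the iMTransformer design itself; the function name and all numbers are hypothetical, and it only counts how many times layer weights must be fetched from off-chip memory.

```python
# Hypothetical illustration of the benefit of caching reusable parameters.
# Not the iMTransformer's actual accounting; numbers are made up.

def weight_fetches(num_tokens, num_layers, cache_weights):
    # Without a cache, each token's pass re-reads every layer's weights
    # from off-chip memory; with a cache, each layer is read only once.
    if cache_weights:
        return num_layers
    return num_tokens * num_layers

print(weight_fetches(512, 12, cache_weights=False))  # 6144 fetches
print(weight_fetches(512, 12, cache_weights=True))   # 12 fetches
```

Computing in-memory pushes in the same direction: if the weights already sit inside the memory array doing the arithmetic, they never need to move at all.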