Toward Software-Equivalent Accuracy on Transformer-Based Deep Neural Networks With Analog Memory Devices

2021-11-18 08:24:08 By : Mr. William Yue

Noise-aware training, used to combat inherent PCM drift and noise sources, opens a path to software-equivalent accuracy for the GLUE benchmark on BERT (Bidirectional Encoder Representations from Transformers).

Recent advances in deep learning have been driven by ever-increasing model sizes, with networks growing to millions or even billions of parameters. Such enormous models call for fast and energy-efficient hardware accelerators. We study the potential of analog AI accelerators based on non-volatile memory, in particular phase change memory (PCM), for software-equivalent accurate inference in natural language processing applications. We demonstrate a path to software-equivalent accuracy for the GLUE benchmark on BERT (Bidirectional Encoder Representations from Transformers) by combining noise-aware training, which combats inherent PCM drift and noise sources, with reduced-precision digital attention-block computation down to INT6.
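The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the additive Gaussian weight noise stands in for real PCM noise and drift, the `noise_frac` parameter and all function names are hypothetical, and the symmetric rounding scheme is only one common way to reach a 6-bit integer range. The first function perturbs the weights the way a noise-aware forward pass might during training; the second rounds attention-block values onto an INT6 grid.

```python
import random

def noisy_weights(weights, noise_frac, rng):
    """Return a copy of `weights` with Gaussian noise added to every entry.

    noise_frac is a hypothetical knob: noise std as a fraction of max|w|.
    This is a stand-in for device noise, not a PCM model.
    """
    w_max = max(abs(w) for row in weights for w in row)
    sigma = noise_frac * w_max
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]

def matvec(weights, x):
    """Plain matrix-vector product: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def quantize_symmetric(values, bits=6):
    """Round values onto a signed integer grid (range -31..31 for 6 bits)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to floating-point values."""
    return [qi * scale for qi in q]

rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
x = [rng.uniform(-1, 1) for _ in range(8)]

y_clean = matvec(weights, x)                             # ideal digital result
y_noisy = matvec(noisy_weights(weights, 0.02, rng), x)   # perturbed forward pass
q, scale = quantize_symmetric(y_clean, bits=6)           # INT6 codes plus scale
y_deq = dequantize(q, scale)                             # values after rounding
```

In a training loop, fresh noise would be drawn on every forward pass so that the learned weights tolerate perturbation at inference time. The rounding error of the INT6 step is bounded by half of `scale` per value.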

The technical paper was published in July 2021; the full citation follows.

Spoon K, Tsai H, Chen A, Rasch MJ, Ambrogio S, Mackin C, Fasoli A, Friz AM, Narayanan P, Stanisavljevic M, and Burr GW (2021) Toward Software-Equivalent Accuracy on Transformer-Based Deep Neural Networks With Analog Memory Devices. Front. Comput. Neurosci. 15:675741. doi: 10.3389/fncom.2021.675741
