To build an architecture purpose-designed for deep learning, Intel launched the Intel Nervana Neural Network Processor (NNP). The new architecture provides the flexibility to support all deep learning primitives while keeping the core hardware components as efficient as possible. Intel also said it was working with Facebook; together, the two aim to produce a chip that is “highly tuned” and “a leap” in handling inference workloads.
“Facebook is pleased to be partnering with Intel on a new generation of power-optimized, highly tuned AI inference chip that will be a leap in inference workload acceleration,” Facebook said in a statement.
The Intel Nervana Neural Network Processor features high-speed on- and off-chip interconnects that let multiple processors connect card to card and chassis to chassis, acting almost as a single efficient chip and scaling to accommodate larger models for deeper insights. Intel announced the Nervana Neural Network Processor for Inference (NNP-I) as an AI chip for inference workloads that fits into a GPU-like form factor.
With the Intel Nervana Neural Network Processor, Intel puts memory first, pairing a large amount of HBM with local SRAM placed much closer to where compute actually happens. This means more of the model's parameters can be stored on-die, saving significant power and increasing performance. Built for the future, not from the past.
The Intel Nervana Neural Network Processor is aimed at workloads such as media search, content filtering, and malware detection, among others. Intel Nervana also uses a new numeric format, Flexpoint, which allows the scalar computations central to neural networks to be implemented as fixed-point multiplications and additions, yielding an even greater increase in parallelism while improving power efficiency.
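The core idea behind Flexpoint is a shared-exponent fixed-point format: every value in a tensor is stored as an integer mantissa, and a single exponent is shared across the whole tensor, so arithmetic on the tensor reduces to integer multiplies and adds. The sketch below illustrates that idea in plain NumPy; the function names and the 16-bit mantissa width are illustrative assumptions, not Intel's actual implementation.

```python
import numpy as np

def shared_exp_quantize(x, mantissa_bits=16):
    """Illustrative shared-exponent quantization (Flexpoint-style).

    All elements of `x` share one exponent, chosen so the largest
    magnitude fits in a signed mantissa of `mantissa_bits` bits.
    """
    max_mag = float(np.max(np.abs(x)))
    if max_mag == 0.0:
        exp = 0
    else:
        # Shift so that max_mag maps to roughly 2**(mantissa_bits - 1).
        exp = int(np.ceil(np.log2(max_mag))) - (mantissa_bits - 1)
    # Integer mantissas: these are what the hardware would multiply/add.
    mant = np.round(x / 2.0 ** exp).astype(np.int32)
    return mant, exp

def shared_exp_dequantize(mant, exp):
    """Recover approximate floating-point values from mantissas + exponent."""
    return mant.astype(np.float64) * 2.0 ** exp

# Values that are exact powers of two round-trip losslessly here.
x = np.array([1.0, 0.5, -0.25])
mant, exp = shared_exp_quantize(x)
x_hat = shared_exp_dequantize(mant, exp)
```

Because the exponent is amortized over the whole tensor, each element needs only an integer datapath, which is where the parallelism and power-efficiency gains come from.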