The annual engineering and technical conference known as Hot Chips kicked off yesterday, and Qualcomm was out in front to detail its new DSP, the Hexagon 680. Digital Signal Processors (DSPs) aren't something we've discussed much at ExtremeTech, and Qualcomm is putting a major marketing push behind its DSP technology for the first time. How does the chip work, what makes it an integral part of Snapdragon 820, and how does it advance heterogeneous computing?

DSPs are specialized processors dedicated to digital signal processing. Like GPUs, DSPs are designed to exploit parallelism. Like CPUs, they frequently make use of SIMD (single instruction, multiple data) and VLIW processing to boost throughput and total performance per watt. Also like GPUs, DSPs are designed to perform a very specific subset of tasks. CPUs can handle these tasks (and sometimes do), but DSPs offer better performance than general processors, and more flexibility than a traditional ASIC. This relationship is captured in the slide below:

Comparison of FPGAs, ASICs, CPUs, and DSPs
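
To make the SIMD idea mentioned above concrete, here is a minimal conceptual sketch in Python. It is not Hexagon code: the function names and the lane width of 4 are assumptions chosen purely for illustration. It only models how many "instructions" are issued when one operation covers several data elements at once.

```python
def scalar_add(a, b):
    """Add two equal-length lists one element at a time.

    Returns the result and the number of modeled 'instructions' issued.
    """
    out = []
    issued = 0
    for x, y in zip(a, b):
        out.append(x + y)
        issued += 1  # one instruction per element
    return out, issued


def simd_add(a, b, lanes):
    """Add two equal-length lists `lanes` elements at a time.

    `lanes` is a hypothetical vector width; each chunk counts as a
    single modeled 'instruction', which is the point of SIMD.
    """
    out = []
    issued = 0
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        out.extend(x + y for x, y in zip(chunk_a, chunk_b))
        issued += 1  # one instruction per chunk of up to `lanes` elements
    return out, issued


if __name__ == "__main__":
    a = list(range(10))
    b = [1] * 10
    print(scalar_add(a, b))
    print(simd_add(a, b, lanes=4))
```

Both functions produce the same sums; the difference is only in how many issue slots the model charges, which is where the throughput and performance-per-watt gains come from in real hardware.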

Qualcomm's Hexagon 680 DSP

Qualcomm's Hexagon 680 is designed to accelerate certain workloads at performance efficiencies well above anything a modern CPU can offer. The Hexagon 680 is a VLIW (Very Long Instruction Word) processor, meaning it's designed to extract maximum parallelism per clock cycle and to spread workloads across a wide set of execution units.

The Hexagon 680 DSP

The 680 DSP offers four parallel scalar threads, each with four-way VLIW support and a shared L1/L2. Each of these scalar threads is clocked at 500MHz for a maximum throughput of 2GHz-equivalent worth of processing. On the vector side of the equation, the 680 has 32 1024-bit vector registers. Each instruction can address up to four of these per cycle, for a maximum output of 4096 bits per cycle per instruction. It also includes support for Qualcomm's new Hexagon Vector eXtensions, or HVX. The HVX registers can be controlled by any two of the scalar threads.
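
The headline numbers in that paragraph are simple products, and a short Python sketch shows where they come from. The constants are the figures quoted above; the function names are hypothetical, and treating the four threads as scaling perfectly linearly is an assumption that matches the "2GHz-equivalent" framing rather than a measured result.

```python
# Figures quoted in the article.
SCALAR_THREADS = 4            # parallel scalar threads
THREAD_CLOCK_MHZ = 500        # clock per scalar thread
VECTOR_REGISTERS = 32         # number of vector registers
VECTOR_REGISTER_BITS = 1024   # width of each vector register
REGISTERS_PER_INSTRUCTION = 4 # registers one instruction can address


def aggregate_scalar_mhz(threads=SCALAR_THREADS, clock_mhz=THREAD_CLOCK_MHZ):
    """Combined scalar clock, assuming perfect linear scaling (an assumption)."""
    return threads * clock_mhz


def bits_per_instruction(regs=REGISTERS_PER_INSTRUCTION,
                         width_bits=VECTOR_REGISTER_BITS):
    """Maximum bits a single vector instruction can touch per cycle."""
    return regs * width_bits


def vector_register_file_bytes(count=VECTOR_REGISTERS,
                               width_bits=VECTOR_REGISTER_BITS):
    """Total size of the vector register set, in bytes."""
    return count * width_bits // 8


if __name__ == "__main__":
    print(aggregate_scalar_mhz(), "MHz aggregate scalar clock")
    print(bits_per_instruction(), "bits per cycle per instruction")
    print(vector_register_file_bytes(), "bytes of vector registers")
```

Four threads at 500MHz give the 2GHz-equivalent figure, and four 1024-bit registers give the 4096 bits per cycle per instruction.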

Here's what this means in aggregate: The Hexagon 680 is designed to allow for extensive threading and to share data across the L1 and L2 caches. There's no penalty to using the HVX units and the scalar units simultaneously, provided that the workload is designed for it. The vector processors don't have access to L1, but treat L2 as their first level of memory. L1 and L2 are kept coherent and data can be streamed into L2 from DDR memory at up to 1.2Gpixels/s. This supports some of the advanced capabilities of the Hexagon 680 (we'll talk about these below).

According to Qualcomm, the performance advantages of these new features are enormous. While this data is provided by the company and should be taken with a grain of salt, there's nothing outlandish here. These kinds of accelerations are typical when moving to a high-end dedicated chip as opposed to executing code on a general-purpose CPU.

Hexagon 680 DSP performance

Qualcomm believes that the programming model for the Hexagon 680 is similar enough to CPU models to allow programmers to use the hardware effectively, but with significant overall improvements.

DSP vs. CPU power consumption

Power consumption should also be much reduced, thanks to the simpler nature of the VLIW model and use of L2 for vector processing rather than both the L1 and L2. The company also notes that by designing its DSP for low clock frequencies, it can cut leakage current and reduce overall power consumption.

Applications and heterogeneous computing

The best application processor on Earth isn't worth much without applications to run on it, but the Hexagon 680 DSP delivers on this front as well. Qualcomm claims that the new chip is fully heterogeneous, meaning it can share data between CPU, GPU, and the DSP. Qualcomm is also a founding member of the AMD-led HSA Foundation, and while it isn't calling its heterogeneous compute model by that name, we expect the two to be similar on a conceptual level. The DSP inside the Snapdragon 820 can be used to render AR or VR, tapped for better video playback and encoding, or used by the camera for extensive improvements in low-light photography. Alternately, HVX can be used to enhance detail in standard photos, as shown below.

Enhance. Enhance. Enhance.

Qualcomm has stated that the Hexagon 680 can perform low-light enhancement 3x faster than a Krait SoC, while using 1/10 as much power. Programmers will be able to use the DSP and write applications to run on it, which could give the Snapdragon 820 platform a substantial leg up over the competition. DSPs have shipped on SoCs for a long time, but few companies spend as much time talking up their solutions as part of a heterogeneous compute platform as Qualcomm has.
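
It is worth working out what those two claims would imply together. The sketch below assumes that "1/10 as much power" refers to average power draw while the task runs (Qualcomm's claim as reported here does not specify this), in which case energy per task is power multiplied by time. The function name is hypothetical.

```python
from fractions import Fraction


def relative_energy(speedup, relative_power):
    """Energy per task relative to the baseline.

    energy = power * time, and time scales as 1 / speedup.
    Assumes `relative_power` is average power during the task.
    """
    relative_time = 1 / Fraction(speedup)
    return Fraction(relative_power) * relative_time


if __name__ == "__main__":
    # Qualcomm's claimed figures: 3x faster, 1/10 the power.
    print(relative_energy(3, Fraction(1, 10)))
```

Under that reading, each low-light enhancement would cost roughly one thirtieth of the energy it costs on the Krait CPU. If the 1/10 figure already describes energy per task, the saving is simply tenfold.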

In the past, a component like the DSP would be invisible, buried under interest in the CPU and GPU. Qualcomm's decision to talk about the chip is a sign of the times. As visual processing, augmented reality, and virtual reality take the stage, more and more consumers expect advanced capabilities from their smartphones. For lower-tech users, that means high quality photos and video, while gamers and enthusiasts want cutting-edge performance and better battery life. The Hexagon 680 DSP is meant to speak to all these needs, with power efficiency that will beat even the upcoming Kryo CPU, flexibility and heterogeneous compute capability to whet the appetites of programmers and application developers, and performance that appeals to enthusiasts, gamers, and the general public.

After these disclosures, the Kryo is the last piece of the puzzle still to drop into place. Hopefully we'll have details on the CPU core sooner rather than later.