Geek In Parallel
So what’s the best way to inspire a bunch of software geeks? Tell them that programming can be made simple, and then provide a massively complex explanation of how simple it is.
At the annual Hot Chips symposium on high-performance chippery on Sunday, the assembled chipheads were led through a four-hour deep dive into the latest developments on marrying the power of CPUs, GPUs, DSPs, DMA engines, codecs, and other accelerators through the development of an open source programming model.
The tutorial was conducted by members of the HSA – heterogeneous system architecture – Foundation, a consortium of SoC vendors and IP designers, software companies, academics, and others including such heavyweights as ARM, AMD, and Samsung. The mission of the Foundation, founded last June, is “to make it dramatically easier to program heterogeneous parallel devices.”
Honestly, the next three pages of the article are fascinating: they dodge any real detail, yet give people who follow this stuff enough of an idea that the Foundation is serious about fixing what’s becoming a massive problem in modern technology: how do you take full advantage of all the resources available to a program?
I’ve had a hand in looking at this problem for a good ten years, and it was probably seven years ago when some of us first started discussing the implications of the future that led to this. Many others have been looking at the problem for much longer… I’d bet there are some smart people at companies like Google who have had this on their minds since the company’s inception, since Google’s infrastructure is an easy example of a huge variety of different processing cores that need to execute as quickly as possible.
Why so fast? Beyond the simple answer of, “because,” there’s a much bigger point. Most of these systems don’t scale power down easily while code is running. When idle, it’s relatively easy to lower silicon power; when running, scaling power requires dedicating a lot of silicon area to the problem. Finishing a task as fast as possible lets the system either move on to the next task or race to an idle state.
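That “race to idle” argument can be made concrete with a bit of back-of-the-envelope arithmetic. The numbers below are purely hypothetical, chosen only to illustrate the shape of the trade-off: a core that burns more power but finishes quickly and drops to a low-power idle state can spend less total energy over a time window than one that runs slowly the whole time.

```python
# Hypothetical, illustrative numbers -- not measurements from any real chip.
P_FAST, T_FAST = 4.0, 1.0   # watts, seconds: run fast, then go idle
P_SLOW, T_SLOW = 1.5, 3.0   # watts, seconds: run slowly the whole window
P_IDLE = 0.1                # watts while idle
WINDOW = 3.0                # seconds under comparison

# Energy = power x time for each phase of the window.
energy_fast = P_FAST * T_FAST + P_IDLE * (WINDOW - T_FAST)  # 4.0 + 0.2 = 4.2 J
energy_slow = P_SLOW * T_SLOW                               # 4.5 J

print(f"race-to-idle: {energy_fast:.1f} J vs slow-and-steady: {energy_slow:.1f} J")
```

With these made-up figures the fast-then-idle strategy wins on total energy, which is exactly why finishing quickly matters even when peak power is higher.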
And why is this so hard? Honestly, it’s hard because it is. It’s hard because most processes are difficult to break down into component tasks that are independent of each other in a way that scales. Finding simple answers to that has bedeviled efficiency experts for centuries, since the first efforts at scalable processing… Mr. Whitney’s mass-production efforts.
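For the rare workloads that do decompose cleanly, the pattern is easy to sketch. This is a minimal, assumed example using Python’s standard `concurrent.futures` module: summing a list splits into fully independent chunks, so it parallelizes trivially. The hard part, which the HSA effort is chasing, is that most real code doesn’t factor this neatly.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_sum(chunk):
    """Each worker's task depends only on its own slice of the data."""
    return sum(chunk)

data = list(range(1_000_000))
# Four strided slices; no chunk needs to see any other chunk's data.
chunks = [data[i::4] for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(chunk_sum, chunks))

print(total)  # same answer as sum(data)
```

The independence of the chunks is what makes this scale; introduce any shared state between them and the simple picture collapses.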
So I applaud any effort to solve these issues, though I don’t expect that a “simple” answer to a problem this hard is anywhere on the horizon. But the geek in me will keep reading.