Cooler brains

Reconfigurable networks-on-chip (NOC) capable of delivering high-performance fault-tolerant computing while achieving low energy consumption are being developed by researchers at Bristol and Manchester universities.

Dr Jose Nunez-Yanez, senior lecturer in digital systems at Bristol, who leads the EU-funded project, said developments in field programmable gate arrays (FPGAs) had resulted in performance problems. 'FPGAs have grown in size and complexity, with many logic elements, tools and connections spread all over the chip. The wires required to move data around decrease the performance,' he said.

'Our idea was to introduce another level of hierarchy in the communications to make one powerful FPGA consisting of multiple FPGAs connected through a network — this is the concept of a network on chip. We will have a grid connected through an asynchronous network used to communicate among all these elements, where the computing element is an FPGA.'

Multi-core FPGAs with an NOC as the main communication fabric could be used as multiprocessors for high-performance computing within 10 years. In what is known as asymmetric multiprocessing, each computing element has its own memory and communicates with others using packets and messages.

Future uses include video-based scientific applications such as drug discovery and high-definition video calling, which require a high data throughput rate. They could also be used in neural simulation, where a computer models in real time the biological processes that take place in the brain's neurons.

Current FPGAs can already partially reconfigure themselves. If software makes a heavy demand, the processor can load a bitstream containing the configuration data needed to meet it. The NOC design takes this a step further.

'If you imagine all the processing elements connecting to this network, every processing element can decide by itself what is needed at a particular time and load a bitstream that makes the processing element deliver that function,' said Nunez-Yanez.

The energy saving is achieved through the globally asynchronous, locally synchronous concept. Every processing element has its own clock and power domain. When it reconfigures itself to do a job, it determines how fast it needs to work and when the job must be complete. It then sets its own voltage and frequency so that the job finishes before that deadline passes. By raising or lowering the voltage or the frequency, it can optimise its energy consumption.
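The deadline-driven choice of voltage and frequency can be illustrated with a minimal Python sketch. It is not the project's implementation: the table of operating points, the capacitance value and the function names are all assumptions made for illustration, and the energy estimate uses the common approximation that dynamic energy per clock cycle scales with capacitance times voltage squared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    freq_hz: float  # clock frequency of the processing element
    volts: float    # supply voltage needed to run at that frequency


# Hypothetical operating points; a real device would publish its own table.
POINTS = [
    OperatingPoint(50e6, 0.8),
    OperatingPoint(100e6, 1.0),
    OperatingPoint(200e6, 1.2),
]


def dynamic_energy(cycles, point, capacitance=1e-9):
    """Approximate dynamic energy in joules: C * V^2 per cycle (assumed model)."""
    return capacitance * point.volts ** 2 * cycles


def pick_point(cycles, deadline_s, points=POINTS):
    """Return the lowest-energy operating point that still meets the deadline."""
    feasible = [p for p in points if cycles / p.freq_hz <= deadline_s]
    if not feasible:
        return None  # no setting is fast enough for this job
    return min(feasible, key=lambda p: dynamic_energy(cycles, p))


if __name__ == "__main__":
    # A job of one million cycles that must finish within 15 ms.
    print(pick_point(1_000_000, 0.015))
```

Under these assumed figures the slowest setting would miss the 15 ms deadline and the fastest would waste energy, so the element settles on the middle one.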

Fault tolerance is introduced through stochastic communications, in which a processing element, instead of sending a packet with a specific address, sends a packet with a request for a job. That packet will move around until it finds a processor that can do the job, but it does not have to go to a particular position, or a particular element of that computing grid. So long as one element can carry out a task, there is adaptability in the movement of data. If one path is broken, the data can still move around, as it does not have to follow a particular predetermined path.
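A rough Python sketch shows why a request that names a job rather than an address survives a broken path. The grid layout, the `services` table, the representation of broken links and the function names are all hypothetical, and a plain random walk stands in for whatever stochastic scheme the project actually uses.

```python
import random


def neighbours(node, width, height, broken_links):
    """Yield grid neighbours reachable over links that are not broken."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nxt = (x + dx, y + dy)
        inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
        if inside and frozenset((node, nxt)) not in broken_links:
            yield nxt


def find_provider(start, job, services, width, height,
                  broken_links=frozenset(), max_hops=1000, rng=None):
    """Walk the grid at random until any element offering `job` is reached.

    Returns (node, hops) on success, or None if the request gets stranded
    or runs out of hops.
    """
    rng = rng or random.Random()
    node = start
    for hops in range(max_hops + 1):
        if job in services.get(node, ()):
            return node, hops
        options = list(neighbours(node, width, height, broken_links))
        if not options:
            return None  # every link out of this element is broken
        node = rng.choice(options)
    return None


if __name__ == "__main__":
    # Two elements can do the job, and one link next to the sender is broken.
    services = {(3, 3): {"fft"}, (0, 3): {"fft"}}
    broken = {frozenset(((0, 0), (1, 0)))}
    print(find_provider((0, 0), "fft", services, 4, 4, broken))
```

Because the request carries no destination, either provider will do, and the broken link simply removes one option at each step rather than blocking delivery.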

Dr Doug Edwards, senior lecturer at the School of Computer Science at Manchester, explained: 'The request weaves its way around like a drunken sailor and you give it a push to make it go in roughly the right direction rather than a totally inappropriate one.

'We'll do this for the initial discovery of the service as advertised and, once this is done, we'll revert to a circuit switched routing quality. The initial request finds where the service is provided on the chip, establishes what the route is and thereafter uses that route for the rest of the data payload. If things need to be reconfigured, you send out another request.'
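The two phases Edwards describes, a nudged random search followed by reuse of the route it found, might be sketched as follows. This is an illustration under stated assumptions rather than the team's design: the `hint` coordinate, the `bias` probability, the loop-erasing step and both function names are invented for the example.

```python
import random

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def discover_route(start, job, services, size, hint,
                   bias=0.7, max_hops=10_000, rng=None):
    """Biased random walk on a size x size grid.

    Returns a loop-free list of nodes from `start` to an element offering
    `job`, or None if nothing is found within `max_hops`.
    """
    rng = rng or random.Random()
    node, route = start, [start]
    for _ in range(max_hops + 1):
        if job in services.get(node, ()):
            return route
        options = [(node[0] + dx, node[1] + dy) for dx, dy in STEPS
                   if 0 <= node[0] + dx < size and 0 <= node[1] + dy < size]
        if rng.random() < bias:
            # The "push": step towards the hinted region of the chip.
            node = min(options,
                       key=lambda n: abs(n[0] - hint[0]) + abs(n[1] - hint[1]))
        else:
            # The "drunken sailor": step in any direction.
            node = rng.choice(options)
        if node in route:
            route = route[:route.index(node) + 1]  # drop the loop just walked
        else:
            route.append(node)
    return None


def send_payload(route, words):
    """Circuit-switched phase: every word reuses the route found earlier."""
    for word in words:
        yield route[-1], word, len(route) - 1  # destination, data, hop count


if __name__ == "__main__":
    services = {(3, 3): {"fft"}}
    route = discover_route((0, 0), "fft", services, 4, hint=(3, 2))
    print(route)
    print(list(send_payload(route, ["w0", "w1"])))
```

If the chip is reconfigured and the stored route no longer leads to the service, the sender would simply call the discovery step again, which matches the "send out another request" behaviour in the quote.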

The high communication bandwidth of the new solution comes through the network itself. 'In current systems, many processes compete for the same element,' said Nunez-Yanez. 'In a network, you introduce more elements so you have more wires available to move data around, giving the high bandwidth.'

In the course of the project Manchester will contribute expertise in asynchronous communications to the design of the network. Bristol specialises in the reconfigurable computing processing elements. Industrial partners Xilinx, an FPGA manufacturer, and NOC developers Silistix and ST Microelectronics also support the project.

The work runs until 2010, by which time the team aims to have a system on an FPGA device. Otherwise, they will have a prototype multi-processing board with which they can evaluate energy consumption, bandwidth and fault tolerance and on which they can map applications such as high-definition video calling.