16th meeting





A dynamic synchronization mechanism and its implementation on Cell Broadband Engine.

Title: A dynamic synchronization mechanism and its implementation on Cell Broadband Engine.


Background: Multicore-on-a-chip designs are now a common trend in microprocessor architecture. One successful example is the Cell Broadband Engine, which was designed as the main engine of a game console. For computer graphics calculations on multicore processors, the whole job can be divided and allocated statically to the cores, and the tasks can be synchronized statically, because the execution times are estimated beforehand. The cost performance of a multicore processor like the Cell Broadband Engine is very high, so many researchers are working on how to apply multicore processors to more general application fields. In general, when a big task is divided into many subtasks, the execution time of each subtask cannot be estimated beforehand, so dynamic task allocation and dynamic synchronization are needed to execute parallel tasks efficiently. The data-driven principle is a very simple and formal method for this kind of dynamic synchronization. This research therefore aims to evaluate the effectiveness of a dynamic synchronization mechanism on the Cell Broadband Engine.
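To make the data-driven idea concrete, here is a minimal sketch in Python, not the Cell BE implementation from this research. It assumes a hypothetical `run_data_driven` helper and a toy task graph: a task "fires" only when all of its input data have arrived, so no execution times need to be known beforehand. The sketch runs on one thread; on a real multicore machine each fired task would presumably be handed to a free core.

```python
from collections import deque

def run_data_driven(tasks, deps):
    """Run tasks in data-driven order (single-threaded illustration).

    tasks: dict mapping task name -> function taking a list of input values
    deps:  dict mapping task name -> list of task names whose results it needs
    Both names and structure are hypothetical, chosen for this sketch.
    """
    # Number of inputs each task is still waiting for.
    missing = {name: len(deps.get(name, [])) for name in tasks}
    # Reverse edges: which tasks consume each task's result.
    consumers = {name: [] for name in tasks}
    for name, inputs in deps.items():
        for src in inputs:
            consumers[src].append(name)

    # Tasks with no inputs can fire immediately.
    ready = deque(name for name in tasks if missing[name] == 0)
    results, order = {}, []
    while ready:
        name = ready.popleft()
        inputs = [results[src] for src in deps.get(name, [])]
        results[name] = tasks[name](inputs)
        order.append(name)
        # Deliver the result; a consumer fires once its last input arrives.
        for nxt in consumers[name]:
            missing[nxt] -= 1
            if missing[nxt] == 0:
                ready.append(nxt)
    return results, order

# Toy graph: "sum" waits for "a" and "b"; "double" waits for "sum".
tasks = {
    "a": lambda _: 2,
    "b": lambda _: 3,
    "sum": lambda xs: xs[0] + xs[1],
    "double": lambda xs: 2 * xs[0],
}
deps = {"sum": ["a", "b"], "double": ["sum"]}
results, order = run_data_driven(tasks, deps)
```

The synchronization here is implicit in the `missing` counters: nothing has to poll or wait on a fixed schedule, which is the property that makes the scheme attractive when subtask times are unpredictable.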



14,15th meeting





11th meeting

This week I implemented the waiting mechanism I described last week and ran it on the Cell BE machine.

I chose 8*8 data and divided it into 4 blocks. The single (non-parallel) execution time is about 0.005 s and the parallel execution time is about 0.010 s, so the parallel version is currently about twice as slow. It means I still have a lot of work to do.

Going through the code I wrote again, I found a lot of “for” loops and replacement operations, which may cause the delay in execution.
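As a rough illustration of the experiment described above, the sketch below splits an 8*8 array into 4 blocks and computes the speedup from the two reported times. The 4*4 block shape, the sample data, and the `split_into_blocks` helper are assumptions made for the example; the post does not say how the blocks were actually laid out.

```python
def split_into_blocks(matrix, block_rows, block_cols):
    """Cut a 2-D list into equally sized blocks, row-major order."""
    blocks = []
    for r in range(0, len(matrix), block_rows):
        for c in range(0, len(matrix[0]), block_cols):
            blocks.append(
                [row[c:c + block_cols] for row in matrix[r:r + block_rows]]
            )
    return blocks

# Hypothetical 8*8 input: cell (r, c) holds r * 8 + c.
data = [[r * 8 + c for c in range(8)] for r in range(8)]
blocks = split_into_blocks(data, 4, 4)  # assumed 4*4 blocks -> 4 blocks

# Times reported in the post, in seconds.
single_time = 0.005
parallel_time = 0.010
speedup = single_time / parallel_time  # below 1 means a slowdown
```

A speedup below 1 on such a small input suggests that the fixed cost of distributing and synchronizing the blocks dominates the actual computation.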


10th meeting

1. First, analyze the parts and steps of the algorithm. Ex: the point-of-interest algorithm consists of five parts, which we divide into 9 steps.

2. Program the SPE so that each SPE can do the whole job, depending on the worknum sent to it. Ex: a single CPU can then do everything, as in the x86 model.

3. Define the partition and size of the work. Ex: here we partition the work into 8 parts, so 8*9 steps (72 in total) need to be processed.
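The three steps above can be sketched as follows. This is a plain Python illustration, not SPE code: the flat `worknum` encoding (part * 9 + step), the `STEP_TABLE`, and the placeholder step functions are all assumptions made for the example. The point is only that every worker carries the code for all 9 steps and picks one based on the work number it receives.

```python
NUM_PARTS = 8   # the work is partitioned into 8 parts
NUM_STEPS = 9   # the algorithm is divided into 9 steps

def decode_worknum(worknum):
    """Map a flat work number to (part, step). Encoding is an assumption."""
    return divmod(worknum, NUM_STEPS)

def make_step(step):
    """Build a placeholder for one algorithm step (hypothetical body)."""
    def run(part, value):
        # A real step would process the data of `part`; here we only count.
        return value + 1
    return run

# Every worker holds the full table, so it can do any step of the job.
STEP_TABLE = [make_step(s) for s in range(NUM_STEPS)]

def spe_worker(worknum, value):
    """What one worker does when it receives a work number."""
    part, step = decode_worknum(worknum)
    return STEP_TABLE[step](part, value)

# A single processor can run all 8 * 9 work units one after another.
state = [0] * NUM_PARTS
log = []
for part in range(NUM_PARTS):
    for step in range(NUM_STEPS):
        worknum = part * NUM_STEPS + step
        state[part] = spe_worker(worknum, state[part])
        log.append(worknum)
```

Because a work unit is identified by a single number, the same dispatch loop could hand units to several workers instead of one without changing the worker code.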
