Mapping Control

Just as data objects in a Sequoia program must be mapped to physical storage locations in the target machine, so must control statements be mapped to the processors that will execute them. Sequoia draws a distinction between control and computation in a program, though the distinction is one of the intent of the code rather than its syntax.
All inner task control constructs must be mapped to a control processor in the target machine. This is the processor that will execute the statements.

Current Mapping Interface

It would be unwieldy and impractical to require a mapping directive for every operation in an inner task, such as the addition and assignment statement "int x = y + z;", and yet statements such as this must be executed somewhere. A general method that would allow the programmer to conveniently map control operations as a group is under development; in the meantime, the supported mapping interface provides a working solution.

The current mapping interface handles the mapping of control by associating a control level with every Sequoia iteration construct (mappar, mapseq, and mapreduce), and then implicitly placing every other type of control operation in the same control level as its "closest" parent iteration construct. The mapping interface assigns a memory level in the target machine to each such iteration construct, which causes the Sequoia compiler to emit the code for that loop into a file that will be executed by a control processor at that memory level. The exception to this rule is a leaf task call; leaf tasks are always "controlled" by the level-0 controller.

If a Sequoia loop running on a control processor in level L contains a nested Sequoia loop whose control level is set to level L-1, then the level-L processor calls into the L-1 processor to execute the contained loop; the Sequoia compiler emits this call as the appropriate communication operation on the target machine. For example, on a Cell machine, if a mapseq running on the PPE has a nested mappar that is set to run at the SPE level, then on each iteration of the PPE mapseq a thread call is made from the PPE to the SPE to execute the nested mappar construct.
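The placement rule described above can be modeled in a few lines. The following is a minimal Python sketch, not the Sequoia compiler's implementation: the `Node` class, the node kinds "stmt" and "leaf_call", the labels, and the choice of level 1 for the PPE are all assumptions made for illustration. It resolves the control level of every operation (iteration constructs carry an explicit level, leaf task calls go to level 0, everything else inherits from its closest enclosing iteration construct) and lists the places where a loop calls into a loop at a different level.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

ITERATION_CONSTRUCTS = {"mappar", "mapseq", "mapreduce"}


@dataclass
class Node:
    """Hypothetical model of one control operation in an inner task."""
    kind: str                             # "mappar", "mapseq", "mapreduce", "stmt", "leaf_call"
    label: str                            # unique name, used as the result key
    control_level: Optional[int] = None   # explicit only for iteration constructs
    children: List["Node"] = field(default_factory=list)


def resolve_control_levels(node: Node, enclosing_level: int,
                           out: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Return {label: control level} for node and everything nested in it."""
    if out is None:
        out = {}
    if node.kind in ITERATION_CONSTRUCTS:
        if node.control_level is None:
            raise ValueError(f"{node.label}: iteration construct has no control level")
        level = node.control_level        # explicitly mapped
        child_enclosing = level           # children inherit from this loop
    elif node.kind == "leaf_call":
        level = 0                         # leaf tasks: always the level-0 controller
        child_enclosing = enclosing_level
    else:
        level = enclosing_level           # closest parent iteration construct
        child_enclosing = enclosing_level
    out[node.label] = level
    for child in node.children:
        resolve_control_levels(child, child_enclosing, out)
    return out


def cross_level_calls(node: Node, parent_loop: Optional[Node] = None,
                      out: Optional[List[Tuple[str, str]]] = None) -> List[Tuple[str, str]]:
    """List (outer loop, nested loop) pairs whose control levels differ."""
    if out is None:
        out = []
    if node.kind in ITERATION_CONSTRUCTS:
        if parent_loop is not None and node.control_level != parent_loop.control_level:
            out.append((parent_loop.label, node.label))
        parent_loop = node
    for child in node.children:
        cross_level_calls(child, parent_loop, out)
    return out


# Assumed Cell-like example: PPE controls level 1, SPEs control level 0.
program = Node("mapseq", "outer_mapseq", control_level=1, children=[
    Node("stmt", "int x = y + z;"),
    Node("mappar", "inner_mappar", control_level=0, children=[
        Node("stmt", "index arithmetic"),
        Node("leaf_call", "leaf task call"),
    ]),
])

print(resolve_control_levels(program, enclosing_level=1))
print(cross_level_calls(program))
```

In this example the assignment statement is placed at level 1 with its enclosing mapseq, everything inside the mappar is placed at level 0, and the single cross-level pair corresponds to the PPE-to-SPE call made on each mapseq iteration.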
In the matrix multiplication running example, the control level of the Sequoia iteration constructs would be mapped as in the following updated task instance call graph diagram.
Sequoia Loop Transformations

In addition to assigning it to a memory level, mapping a Sequoia loop (mappar, mapseq, or mapreduce) to a machine involves specifying any (or none) of the following set of loop transformations that will be performed on the loop.
Note that if a loop is software-pipelined, the amount of space that each iteration has to work with is reduced, since multiple iterations will be running in parallel.

In the matrix multiplication running example, these transformations would be applied as follows, yielding a fully mapped/instantiated program with all mapping details filled in. This mapped program represents a concrete implementation of the abstract MatMul Sequoia code.
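To make the note on software pipelining concrete, here is a rough Python calculation of how the per-iteration working space shrinks. It is a back-of-the-envelope sketch under stated assumptions, not how the Sequoia compiler sizes buffers: it assumes the level's memory is split evenly among the iterations in flight, that a matrix multiplication iteration holds three square single-precision tiles (A, B, and C), and that the whole 256 KiB SPE local store is available for data (in practice code and stack also live there). The function names are hypothetical.

```python
from math import isqrt


def per_iteration_budget(capacity_bytes: int, iterations_in_flight: int) -> int:
    """Bytes available to one iteration, assuming an even split (assumption)."""
    if iterations_in_flight < 1:
        raise ValueError("at least one iteration must be in flight")
    return capacity_bytes // iterations_in_flight


def max_square_tile(budget_bytes: int, element_bytes: int = 4, tiles: int = 3) -> int:
    """Largest n such that `tiles` n-by-n tiles of `element_bytes` fit in the budget."""
    return isqrt(budget_bytes // (tiles * element_bytes))


LS_BYTES = 256 * 1024  # SPE local store size

for depth in (1, 2):   # 1 = not pipelined, 2 = two iterations overlapped
    budget = per_iteration_budget(LS_BYTES, depth)
    print(f"in flight: {depth}  budget: {budget} bytes  max tile: {max_square_tile(budget)}")
```

Under these assumptions, overlapping two iterations halves the budget from 262144 to 131072 bytes and reduces the largest square tile edge from 147 to 104 elements.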
There are 8 parallel SPEs in level 0 of the Cell machine model, and thus the program is parallelized 8 ways over these processors, though this is achieved via a 2D parallelization mapping. Further, at both the disk-to-memory and memory-to-LS interfaces the underlying hardware/OS provides asynchronous transfer operations, and so at each interface a loop in the program was software-pipelined to achieve overlap between computation and communication.
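One way to picture an 8-way parallelization expressed as a 2D mapping is to arrange the 8 processors as a small grid and distribute the blocks of a 2D iteration space over it. The Python sketch below assumes a 4-by-2 grid and a cyclic distribution; both are illustrative choices that the text does not specify, and `spe_for_block` is a hypothetical helper rather than part of Sequoia.

```python
from collections import Counter


def spe_for_block(i: int, j: int, grid_rows: int = 4, grid_cols: int = 2) -> int:
    """Map block (i, j) of a 2D iteration space to a processor id.

    Assumes a grid_rows x grid_cols processor grid with cyclic distribution.
    """
    return (i % grid_rows) * grid_cols + (j % grid_cols)


# Distribute an 8 x 8 space of blocks over the assumed 4 x 2 grid of SPEs.
load = Counter(spe_for_block(i, j) for i in range(8) for j in range(8))
print(sorted(load.items()))
```

With these assumptions every one of the 8 processor ids is used and each receives the same number of blocks, which is the property an 8-way parallelization over identical SPEs needs.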