
Programmable and Customizable Hardware Accelerators for Self-adaptive Virtual Processors in FPGA

Jaroslav Sýkora
Defense type
Ph.D.
Defense date
Venue
ČVUT
Mail
The root of all evil is latency that is statically unpredictable. Overcoming it requires a dynamic schedule of operations, constructed on the fly in data-driven machines. Microthreading is a unified data-driven and dynamically scheduled model for efficient programming of many-core general-purpose processors. It overcomes unpredictable latencies in off-chip memories (DRAMs) and in the on-chip shared interconnect. As silicon chips became power-limited, causing the shift from frequency scaling to many-core scaling, previous work envisioned large-scale homogeneous many-core chips because it assumed that low-clock-frequency silicon is easily scalable in space. However, contemporary and future power constraints will favour heterogeneous (specialized) rather than homogeneous (general-purpose) many-cores, because the thermal design power of a chip could be so low that not all cores can be powered up simultaneously. Besides the power issues, the other negative side effect of silicon scaling is an increase in the latency of the interconnect (metal wires) relative to that of gates: new designs are becoming limited by interconnect delays. Because interconnect delays depend on the details of the physical placement of modules in a chip or in a reconfigurable array, they are difficult to predict accurately early in the design process. Consequently, future hardware will be special-purpose and customized because of the power issues, and it will be data-driven to overcome on-chip interconnect latencies. This dissertation explores latency-tolerant dataflow techniques with a focus on customized hardware design using reconfigurable hardware arrays. Dataflow is studied at the gate and chip levels: gate-level dataflow overcomes on-chip interconnect delays, and chip-level dataflow allows for the composition of scalable heterogeneous many-cores.
The first contribution is an analysis of a contemporary statically scheduled, instruction-driven architecture for customized computing realized in an FPGA. Contrary to the original design assumptions of the architecture, it is shown here that high-frequency instruction issue is needed even in an architecture with batch (vector-based) data processing. The second contribution is a method to achieve high-frequency instruction issue by using dictionary tables of instruction fragments. A statically scheduled data path used to be preferred because all latencies (including interconnect) were assumed to be fully known early in the design process. The third contribution is a new structured and extensible approach to the synthesis of hardware controllers from synchronous Petri nets. The fourth contribution is a new technique for dataflow hardware synthesis from Petri nets. The technique is based on augmented synchronous Petri nets with optimal throughput. The fifth contribution is a technique that combines the data-driven microthreaded procedural computation model with special-purpose data-driven hardware in structurally programmed reconfigurable arrays. Adaptive transparent migration of microthreads between the general-purpose and special-purpose hardware is demonstrated.