The large number of embedded soft core processors available today makes it tedious and time-consuming to select the best processor for a given application. The task is even more challenging because a single soft core processor exposes numerous configuration options, which must be tuned against conflicting design requirements such as performance and area. In this paper, we propose a generic framework for rapid performance estimation of applications on soft core processors. The proposed technique scales to the large number of configuration options available in modern soft core processors by relying on rapid and accurate estimation models instead of time-consuming FPGA synthesis and execution-based techniques. Experimental results on two leading commercial soft core processors executing applications from the widely used CHStone benchmark suite show an average error of less than 6%, with runtimes on the order of minutes compared to the hours taken by synthesis-based techniques.
FPGA routing architectures consist of routing wires and programmable switches, which together account for the majority of the fabric delay and area, making evaluation and optimization of an FPGA's routing architecture very important. Routing architectures have traditionally been evaluated using a full synthesize, pack, place and route CAD flow over a suite of benchmark circuits. While the results are accurate, a full CAD flow has a long runtime and is often tuned to a specific FPGA architecture type, which limits exploration of different architecture options early in the design process. In this paper we present Wotan, a tool to quickly estimate routability for a wide range of architectures without the use of benchmark circuits. At its core, our routability predictor uses efficient path enumeration through the FPGA's routing graph to 1) estimate the probability of node congestion and 2) estimate the probability of successfully routing each of a randomized subset of (source, sink) pairs; these estimates are then combined into an overall routability metric. We describe our predictor and present routability estimates for a range of 6-LUT and 4-LUT architectures using mixes of wire types connected in complex ways, showing a rank correlation of 0.91 with routability results from the full VPR CAD flow while requiring 18x less CPU effort.
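The idea of using path enumeration to gauge how heavily a routing node is likely to be used can be illustrated with a toy example. The sketch below is not Wotan's algorithm: it assumes a small acyclic graph, a single (source, sink) pair, and equally likely paths, and all names (`topological_order`, `path_share`, `toy_graph`) are hypothetical. It counts the paths entering and leaving each node and reports the fraction of all source-to-sink paths that pass through that node, a crude stand-in for a node-demand estimate.

```python
from collections import defaultdict


def topological_order(graph):
    """Return the nodes of a DAG (dict: node -> list of successors) in topological order."""
    indegree = defaultdict(int)
    nodes = set(graph)
    for successors in graph.values():
        for s in successors:
            nodes.add(s)
            indegree[s] += 1
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for s in graph.get(n, []):
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    return order


def path_share(graph, source, sink):
    """Fraction of source->sink paths that pass through each node (toy demand estimate)."""
    order = topological_order(graph)

    # Number of distinct paths from the source to each node.
    from_src = defaultdict(int)
    from_src[source] = 1
    for n in order:
        for s in graph.get(n, []):
            from_src[s] += from_src[n]

    # Number of distinct paths from each node to the sink.
    to_sink = defaultdict(int)
    to_sink[sink] = 1
    for n in reversed(order):
        for s in graph.get(n, []):
            to_sink[n] += to_sink[s]

    total = from_src[sink]
    if total == 0:
        return {}  # the sink is unreachable from the source
    return {
        n: from_src[n] * to_sink[n] / total
        for n in order
        if from_src[n] * to_sink[n] > 0
    }


# Hypothetical routing graph: two wires (a, b) feeding two switch points (c, d).
toy_graph = {
    "src": ["a", "b"],
    "a": ["c", "d"],
    "b": ["d"],
    "c": ["sink"],
    "d": ["sink"],
}
shares = path_share(toy_graph, "src", "sink")
```

In this toy graph, node `d` lies on two of the three possible paths, so it receives a higher share than `c`; a real predictor would additionally weight paths, bound their length, and aggregate over many (source, sink) pairs.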
To improve computing performance in real-time applications, modern embedded platforms include hardware accelerators that speed up the most compute-intensive parts of tasks. A recent trend in the design of real-time embedded systems is to integrate FPGAs that are reconfigured with accelerators at runtime to cope with timing-constrained, dynamic workloads. One of the major limitations of partial FPGA reconfiguration in real-time systems is that the reconfiguration port can perform only one reconfiguration at a time, which can lead to scheduling problems such as starvation and priority inversion. This paper shows how priority inversion and starvation can be solved by making the reconfiguration process preemptive, i.e., allowing it to be interrupted at any time and resumed later without restarting from scratch. Such a feature is crucial for the design of runtime-reconfigurable real-time systems but is not yet available in today's platforms. Furthermore, the paper identifies and analyzes the tradeoff between achieving a guaranteed bound on the reconfiguration delay for low-priority tasks and the maximum delay induced on high-priority tasks when an ongoing reconfiguration is preempted. Experimental results on the Xilinx Zynq-7000 platform show that the proposed implementation of preemptive reconfiguration introduces a low runtime overhead, thus effectively solving priority inversion and starvation.
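The effect of a resumable, preemptive reconfiguration port on priority inversion can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions, not the paper's implementation: time advances in integer ticks, preemption is free (the real mechanism has a runtime overhead, which the paper measures), and the request names, priorities, and lengths are hypothetical. A lower priority number means a more urgent request.

```python
def simulate(requests, preemptive):
    """Simulate a single reconfiguration port.

    requests: list of (name, arrival_tick, priority, length_in_ticks), length >= 1.
    Returns a dict mapping each request name to its completion tick.
    """
    remaining = {name: length for name, _, _, length in requests}
    info = {name: (arrival, prio) for name, arrival, prio, _ in requests}
    done = {}
    current = None
    t = 0
    while len(done) < len(requests):
        # Requests that have arrived and are not finished yet.
        ready = [n for n in remaining if n not in done and info[n][0] <= t]
        if ready:
            best = min(ready, key=lambda n: info[n][1])
            # Non-preemptive: pick a request only when the port is idle.
            # Preemptive: always serve the most urgent ready request.
            if current is None or preemptive:
                current = best
        if current is not None:
            # Progress is kept across preemptions, so a resumed
            # reconfiguration continues instead of restarting.
            remaining[current] -= 1
            if remaining[current] == 0:
                done[current] = t + 1
                current = None
        t += 1
    return done


# Hypothetical workload: a long low-priority bitstream, then a short urgent one.
workload = [
    ("low", 0, 2, 10),   # arrives at tick 0, needs 10 ticks
    ("high", 2, 1, 3),   # arrives at tick 2, needs 3 ticks
]
blocking = simulate(workload, preemptive=False)
resumable = simulate(workload, preemptive=True)
```

Without preemption the urgent request waits for the entire low-priority reconfiguration to finish; with preemption it is served as soon as it arrives, and the low-priority request is delayed only by the length of the urgent one. Bounding that delay for low-priority requests under repeated preemption is the tradeoff the abstract refers to, and is not modeled here.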