

# Estimation of Real Dynamic Power on Field Programmable Gate Array

CHALBI Najoua, BOUBAKER Mohamed, BEDOUI Mohamed Hedi


Abstract This paper presents register transfer level (RTL) power models for Field Programmable Gate Arrays (FPGAs). The dynamic power models are derived from a real measurement bench based on the Spartan6. The models are functions of the activity rate and the precision. The results are compared with the Xilinx Xpower tool. We have validated our operator power models using a FIR filter application on a Spartan6 FPGA. The experimental results show that the model achieves good average accuracy, with a maximum error of 5.5%.

# 1. Introduction and related works

The increase in operating clock frequency and integration density makes power consumption a central concern for Field Programmable Gate Arrays (FPGAs). The battery life of embedded systems and circuit reliability are both strongly affected by increased power dissipation. FPGAs offer flexibility, which enables us to prototype a wide range of embedded applications, but their performance is penalized by higher power consumption. It is therefore important to control power consumption and to develop accurate estimation methodologies and techniques. This work falls within this context. Many estimation approaches have been applied at each design level, from the physical level to the system level. It is widely known that the greatest power gains can be achieved at high level.

High level estimation approaches can be divided into two categories: probabilistic and statistical. Probabilistic techniques [1], [2], [3] use the input stream to estimate the switching activity of the circuit. Probabilities were first used in [4], where zero delay and temporal independence were assumed, so transition probabilities are computed from the signal probabilities, which are supplied by the user at the inputs and propagated from the inputs to the outputs of the circuit. Another probabilistic method was developed in [5], [6], where the transition density was used as the measure of circuit activity and the inputs were assumed to be spatially independent. These techniques are accurate but cannot estimate glitch power. Statistical techniques [7], [8], [9] apply random input patterns and monitor the power dissipation with a simulator. Obtaining an accurate measure requires a large number of simulated vectors, which increases the estimation run time. A Monte Carlo simulation technique, which uses randomly generated input vectors, was studied in [10] to overcome this problem. A survey sampling perspective was addressed in [11]: a vector sequence is provided to estimate the power dissipation of a given circuit under statistical constraints such as confidence level and error. This technique divides the vector sequence into consecutive vectors that constitute the population of the survey. The average power is estimated by simulating the circuit with a large number of samples drawn from the population [12]. In addition, power macro modeling techniques have been introduced in [13], [14]. In [15], the authors used an analytical approach that considers only the spatial correlation, not the temporal correlation. Other register transfer level (RTL) power macro models that try to exploit low level characteristics have been explored in [16], [17], [18], [19]. These models depend on the probability, the transition density, and the spatial and temporal correlations, taking into account the spatial independence between signals.

In this work, we present a power estimation methodology based on a real measurement bench built around the Spartan 6 environment, because accurate estimates start from real measurements. The measured power was compared with the results of the Xilinx Xpower tool. The methodology was applied to arithmetic operators such as adders and multipliers. The validation was performed on FIR filter applications.

This paper is organized as follows. In the first part, we introduce the problem and describe some related works. In the second, we describe the dynamic power estimation methodology. The operator power models are presented in the third part. The results are reported in the fourth part. Finally, we conclude and outline our future work.

# 2. Dynamic power estimation methodology

We have developed a power measurement bench based on the Atlys board environment [20], using the Spartan6 LX45 FPGA (45nm technology). This device is optimized for high performance logic. The board lets us control the implemented application through the USB2 system provided by Digilent Adept. We can also configure the FPGA and monitor in real time the currents and the power consumed by its different parts, including the core and the I/O. The models are obtained from our bench by varying the input activity factor. All inputs and outputs are registered, and the applications are placed near the FPGA I/O to avoid additional routing power and to minimize glitch power. The models depend on the input precision and the activity rate at a fixed frequency of 100MHz. The following figure describes the power estimation methodology (Fig1.). A comparison was made between our methodology and the Xilinx Xpower tool.



Fig1. Estimation methodology description

For each sequence of input vectors, we calculate its activity\_rate (activity\_inputs) as the average number of $0 \rightarrow 1$ and $1 \rightarrow 0$ transitions of each bit $b_i$ of the vector I. We then propagate these sequences from the input to the output in order to evaluate the activity\_rate of the application. The average activity ($\alpha$) is calculated as follows:

$$average\_activity = \frac{1}{w} \sum_{i=1}^{w} \left( \frac{1}{L-1} \sum_{l=1}^{L-1} \left( tr_{b_i(l) \to b_i(l+1)}(0 \to 1) + tr_{b_i(l) \to b_i(l+1)}(1 \to 0) \right) \right)$$

(1)

Where L is the number of vectors in the sequence, w is the precision (bit width) of each vector I, and $b_i$ is bit i of the vector, with $1 \le i \le w$.
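Equation (1) can be illustrated with a minimal Python sketch. The function name `average_activity` and the representation of each vector as an unsigned integer are assumptions made for this illustration; they do not come from the measurement bench itself.

```python
from typing import Sequence


def average_activity(vectors: Sequence[int], width: int) -> float:
    """Average per-bit toggle rate of a vector sequence, in the spirit of equation (1).

    Hypothetical helper: each vector is an unsigned integer of `width` bits.
    """
    if len(vectors) < 2:
        raise ValueError("need at least two vectors to observe a transition")
    mask = (1 << width) - 1
    toggles = 0
    for prev, curr in zip(vectors, vectors[1:]):
        # XOR marks every bit that changed (0->1 or 1->0) between consecutive vectors.
        toggles += bin((prev ^ curr) & mask).count("1")
    # Normalize by the bit width w and by the L-1 consecutive pairs.
    return toggles / (width * (len(vectors) - 1))


if __name__ == "__main__":
    # Every bit toggles on every cycle: activity rate 1.0 (100%).
    print(average_activity([0x00, 0xFF, 0x00, 0xFF], width=8))
    # Only the four low bits toggle: activity rate 0.5 (50%).
    print(average_activity([0x00, 0x0F, 0x00, 0x0F], width=8))
```

Counting the set bits of the XOR of two consecutive vectors covers both transition directions at once, which is why the two terms of equation (1) collapse into a single count here.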

This experimental bench allows us to:

- Measure the static power when no application is implemented in FPGA circuit.
- Measure the global power for different input precision and different activity rates.

The measurement of the different dynamic power components is controlled by two signals, reset and start. In the first step, we measure the power consumed by the input vector registers. In the second step, the inputs are sent to the application under test (start='1' and reset='0'), and we measure the global consumed power

 $P_{dyn \ global} = P_{dyn}(Register) + P_{dyn}(Application)$, from which we deduce the power consumed by the application as well as the individual components of the dynamic power (Pdyn(clock), Pdyn(signal), Pdyn(logic) and Pdyn(I/O)).

# 3. Basic Operator models

The static power is assumed to be independent of the design activity because the implemented circuits are small and the static power increase is negligible. In this part, we report the power measured with our bench and the power estimated by the Xpower tool.

# 3.1. Adder model

The table below (Table1) compares the Xpower estimated power with the power measured using our methodology for the 8 bit adder. The first column gives the activity rate, varied from 0% to 100%. The next columns report the Xpower estimated dynamic power: its components (Pclock, Plogic, Psignal and PI/O) and their total. The following column reports the real measured dynamic power, and the last one gives the error between the two. An average error of 54% outlines the inaccuracy of the Xpower tool, as indicated in [21], and justifies relying on our real power measurement bench.

Tab1. Measured dynamic power of an adder (8 bit) on Spartan6 at F=100MHz

| α (%) | Pclock Xpower (mW) | Plogic Xpower (mW) | Psignal Xpower (mW) | PI/O Xpower (mW) | Pdyn total Xpower (mW) | Pdyn real (mW) | Error (%) |
|-------|--------------------|--------------------|---------------------|------------------|------------------------|----------------|-----------|
| 0     | 1.83               | 0.0                | 0.0                 | 0.0              | 1.83                   | 0.84           | 54.0      |
| 25    | 1.83               | 1.62               | 6.67                | 3.9              | 14.02                  | 6.23           | 55.5      |
| 50    | 1.83               | 4.005              | 12.80               | 8.86             | 27.5                   | 12.9           | 52.8      |
| 100   | 1.83               | 7.009              | 18.41               | 19.96            | 47.2                   | 21.2           | 55.0      |

The variation of the 8 bit adder dynamic power and its components at F=100MHz is shown in the following figure (Fig2). The analytical model of the real measured power as a function of the activity\_rate is given in Fig3 and equation (2).



Fig2. Variation of the adder (8 bit) dynamic power and its components at F=100MHz



Fig3. Analytical model of the 8 bit adder as a function of the activity\_rate

$$Pdyn(Adder)(\alpha) = 0.204 \times \alpha + 1.366$$

(2)
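A linear model of this form can be obtained by an ordinary least-squares fit of the measured power against the activity rate. The sketch below applies such a fit to the measured column of Table 1. The choice of ordinary least squares and the helper name `fit_line` are assumptions for illustration, since the fitting procedure is not specified here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Measured 8 bit adder dynamic power from Table 1 (activity in %, power in mW).
ACTIVITY = [0, 25, 50, 100]
P_MEASURED = [0.84, 6.23, 12.9, 21.2]

slope, intercept = fit_line(ACTIVITY, P_MEASURED)
print(f"Pdyn(alpha) ~ {slope:.3f} * alpha + {intercept:.3f}")
```

On these four points the fitted slope comes out close to 0.20 mW per percent of activity, in line with equation (2). The fitted intercept can differ slightly from the published constant, depending on rounding and on the exact data used for the original fit.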

In order to extend the model to a generic adder, we varied the input precision (8, 16, 32, 64 and 128 bits) and measured the dynamic power consumption (Tab2). Fig4 describes the variation of the dynamic power of the generic adder as a function of the activity rate at a fixed frequency of 100 MHz.

Tab2. Dynamic power of the generic adder with varied input precision (w) and activity\_rate at frequency F=100MHz

| W (bits) | Activity_rate (%) | Pdyn (measured) (mW) |
|----------|-------------------|----------------------|
| 8        | 0                 | 0.84                 |
| 16       | 25                | 14.5                 |
| 32       | 50                | 41.65                |
| 64       | 100               | 88.98                |



Fig4. Linear variation of the generic adder dynamic power as a function of the activity\_rate at F=100MHz

The variation of the dynamic power of the generic adder can be approximated by the linear model (Fig4) described by equation (3), at a fixed frequency of 100MHz:

$$Pdyn(Adder)(\alpha) = 0.906 \times \alpha - 3.162$$

(3)

# 3.2. LUT-based multiplier power model

The same measurement was done for the LUT-based (8×8) bit multiplier. The following table describes the variation of the total dynamic power and its components as the activity\_rate varies. The dynamic power estimated with the Xpower tool, broken down into its components, is given in the columns following the activity rate. The next column reports the dynamic power measured on our experimental bench, whereas the last column gives the error between the two. We observe an average error of 33% between the two results.

Tab3. Measured dynamic power versus the estimated one using the Xilinx Xpower tool for the LUT-based $(8\times8)$ bit multiplier

| α (%) | Pclk Xpower (mW) | Plogic Xpower (mW) | Psignal Xpower (mW) | PI/O Xpower (mW) | Pdyn total Xpower (mW) | Pdyn real (mW) | Error (%) |
|-------|------------------|--------------------|---------------------|------------------|------------------------|----------------|-----------|
| 0     | 2.16             | 0.0                | 0.0                 | 0.0              | 2.16                   | 1.8            | 16.6      |
| 25    | 2.16             | 5.44               | 5.68                | 15.5             | 28.78                  | 14.65          | 49.0      |
| 50    | 2.16             | 10.89              | 7.21                | 22.58            | 42.84                  | 26.47          | 38.2      |
| 75    | 2.16             | 13.52              | 9.77                | 34.53            | 59.98                  | 43.65          | 27.2      |
| 100   | 2.16             | 20.25              | 14.11               | 41.29            | 77.78                  | 51.52          | 33.7      |
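The error column of Tab3 can be reproduced with a short Python sketch. Normalizing the gap by the Xpower estimate, rather than by the measured value, is an assumption inferred from the tabulated numbers, and the helper name `relative_error` is introduced only for this illustration.

```python
# (activity %, Xpower total Pdyn in mW, measured Pdyn in mW), copied from Tab3.
ROWS = [
    (0, 2.16, 1.8),
    (25, 28.78, 14.65),
    (50, 42.84, 26.47),
    (75, 59.98, 43.65),
    (100, 77.78, 51.52),
]


def relative_error(estimated: float, measured: float) -> float:
    """Gap between estimate and measurement, as a percentage of the estimate.

    Assumed convention: the denominator is the Xpower estimate.
    """
    return abs(estimated - measured) / estimated * 100.0


errors = [relative_error(xpower, real) for _alpha, xpower, real in ROWS]
average_error = sum(errors) / len(errors)

for (alpha, _xpower, _real), err in zip(ROWS, errors):
    print(f"alpha = {alpha:3d}%  error = {err:.1f}%")
print(f"average error = {average_error:.1f}%")
```

With this convention the average over the five rows comes out at roughly 33%, consistent with the figure quoted for the multiplier.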

In order to extend the model to a generic multiplier, we varied the input precision (8, 16, 32 and 64 bits) and then measured the dynamic power consumption. Fig5 describes the variation of the dynamic power of the generic multiplier as a function of the activity rate at a fixed frequency of 100 MHz.

Tab4. Dynamic power of the generic multiplier at frequency F=100MHz

| W(bits) | Activity_rate(%) | Pdyn(measured) (mW) |
|---------|------------------|---------------------|
| 8       | 0                | 1.8                 |
| 16      | 25               | 42.65               |
| 32      | 50               | 104.12              |
| 64      | 100              | 223.45              |



Fig5. Variation of the generic multiplier dynamic power as a function of the activity rate at a fixed frequency of 100MHz

The variation of the dynamic power of the generic multiplier can be approximated by the linear model (Fig5) described by equation (4), at a fixed frequency of 100MHz:

$$Pdyn(Multiplier)(\alpha) = 2.256 \times \alpha - 5.726$$

(4)
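As a quick sanity check of equation (4), evaluating the model at $\alpha = 50\%$ gives a value close to the 104.12 mW reported in Tab4 for that activity rate (a deviation of about 3%). This is plain arithmetic on the published coefficients, not an additional measurement:

$$Pdyn(Multiplier)(50) = 2.256 \times 50 - 5.726 = 112.8 - 5.726 = 107.074 \text{ mW}$$

At $\alpha = 0\%$ the same expression gives $-5.726$ mW, which is not physical, so the linear fit is presumably best trusted away from the zero-activity point.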

# 4. Results

To validate the operator power models, we have chosen two applications: a FIR filter with 4 stages (a) and another with 20 stages (b). The FIR filter architecture is shown in Fig6, and the area results are reported in table5. The architecture is pipelined: we placed registers after every block made up of one multiplier and one adder in order to minimize glitch power.



Fig6. Description of the FIR filter architecture

Tab5. Area results of the FIR filters

| Architecture  | Slice LUTs | Slice Registers | IOBs | Occupation_rate (%) |
|---------------|------------|-----------------|------|---------------------|
| FIR 4 stages  | 2310       | 1450            | 35   | 8.5                 |
| FIR 20 stages | 8927       | 3875            | 35   | 33                  |

The estimated dynamic powers for the 4 and 20 stage FIR filters are summarized in table 6. The third column reports the Xpower dynamic power. The dynamic power measured on our bench is given in the fourth column. The fifth column gives the dynamic power predicted by the model developed from real measurements, whereas the last one gives the error between the real measured dynamic power and its corresponding model.

Tab6. FIR filters power consumption obtained from Xpower and real measures at a fixed frequency of 100MHz

| Architecture  | α (%) | Pdyn Xpower (mW) | Pdyn real (mW) | Pdyn model (mW) | Error (%) |
|---------------|-------|------------------|----------------|-----------------|-----------|
| FIR 20 stages | 25    | 702              | 354.23         | 368.89          | 3.97      |
| FIR 20 stages | 50    | 921              | 448.32         | 468.64          | 4.33      |
| FIR 20 stages | 75    | 1097             | 663.12         | 702.23          | 5.56      |
| FIR 20 stages | 100   | 1362             | 790.15         | 820.46          | 3.69      |
| FIR 4 stages  | 25    | 175.23           | 87.23          | 91.46           | 4.62      |
| FIR 4 stages  | 50    | 210.32           | 132.10         | 138.44          | 4.57      |
| FIR 4 stages  | 75    | 276.42           | 174.85         | 181.24          | 3.52      |
| FIR 4 stages  | 100   | 321.21           | 200.12         | 211.52          | 5.38      |

Table6 shows an average error of 4.38% between the measured power and the model estimate, which supports the accuracy of our measurement methodology and our architectural power models.
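The model errors of Table 6 can be summarized per filter with a small Python sketch. Expressing the gap as a percentage of the model value is an assumption inferred from the tabulated error column, and the names `TABLE6` and `model_error` are introduced only for this illustration.

```python
from collections import defaultdict

# (architecture, activity %, measured Pdyn in mW, model Pdyn in mW), copied from Table 6.
TABLE6 = [
    ("FIR 20", 25, 354.23, 368.89),
    ("FIR 20", 50, 448.32, 468.64),
    ("FIR 20", 75, 663.12, 702.23),
    ("FIR 20", 100, 790.15, 820.46),
    ("FIR 4", 25, 87.23, 91.46),
    ("FIR 4", 50, 132.10, 138.44),
    ("FIR 4", 75, 174.85, 181.24),
    ("FIR 4", 100, 200.12, 211.52),
]


def model_error(measured: float, model: float) -> float:
    """Gap between model and measurement, as a percentage of the model value.

    Assumed convention: the denominator is the model prediction.
    """
    return abs(model - measured) / model * 100.0


# Group the per-row errors by filter architecture.
errors_by_arch = defaultdict(list)
for arch, _alpha, measured, model in TABLE6:
    errors_by_arch[arch].append(model_error(measured, model))

mean_error = {arch: sum(errs) / len(errs) for arch, errs in errors_by_arch.items()}
max_error = max(err for errs in errors_by_arch.values() for err in errs)

for arch, mean in mean_error.items():
    print(f"{arch} stages: mean error = {mean:.2f}%")
print(f"maximum error = {max_error:.2f}%")
```

On the tabulated values this gives per-filter means of roughly 4.4% (20 stages) and 4.5% (4 stages), with a maximum just under 5.6% at the 75% activity point of the 20 stage filter.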

# 5. Conclusion

In this paper, we have presented a real power measurement methodology based on the Spartan6. Analytical power models for adders and multipliers were developed as functions of the activity rate at a fixed frequency. Our results were compared with those from the Xilinx Xpower tool. We have validated our models and approach on 4 and 20 stage FIR filters. We observe an average error of 4.38% between the measured power and the model estimate, which supports the accuracy of our measurement methodology and our architectural power models. As future work, we plan to extend our applications library and to develop other power models at system level.

# REFERENCES

- [1] A.Ghosh, S.Devadas, K.Keutzer and J.White, "Estimation of average switching activity in combinational and sequential circuits", In Proceedings of the 29th Design Automation Conference, June 1992, pp. 253-259.
- [2] F.Najm, R. Burch, P. Yang, and I. N. Hajj, "Probabilistic simulation for reliability analysis of CMOS circuits", IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, vol.9(4), pp.439-450, April 1990.
- [3] R.Marculescu, D.Marculescu, and M. Pedram, "Logic level power estimation considering spatiotemporal correlation", Proceedings of the IEEE International Conference on Computer Aided Design, Nov. 1994, pp.224-228.
- [4] F.Najm, "A survey of power estimation techniques in VLSI circuits", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.2, pp.446 – 455, 1994.
- [5] F.Najm,"Transition density: a new measure of activity in digital circuits", IEEE Transactions on Computer aided design, vol.12, 1993, pp.310-323.
- [6] F.Najm,"Low pass filter for computing the transition density in digital circuits", IEEE Transactions on Computer Aided design, vol.13, pp.1123-1131, 1994.
- [7] G.Y.Yacoub, and W.H.Ku, "An accurate simulation technique for short-circuit power dissipation", In Proceedings of International Symposium on Circuits and Systems, 1989, pp.1157-1161.
- [8] C.M.Huizer, "Power dissipation analysis of CMOS VLSI circuits by means of switch level simulation", In IEEE European Solid State Circuits Conf, 1990, pp.61-64.
- [9] C.Deng, "Power analysis for CMOS/BiCMOS circuits", Proceedings of 1994 International Workshop on Low Power Design, April 1994, pp.3-8.
- [10] R.Burch, F.Najm, P.Yang, and T.Trick, "A Monte Carlo approach for power estimation", IEEE Transactions on VLSI Systems, vol.1, N. 1, pp.63-71, March 1993.

- [11] C.Ding, C.Hsieh, Q.Wu, and M.Pedram, "Stratified random sampling for power estimation", Proceedings Int'l Conf. on Computer Aided Design, Nov. 1996, pp. 577-582.
- [12] Y.A.Durrani, T.Riesgo, and F.Machado, "Power estimation for register transfer level by genetic algorithm", Proceedings for International Conference on Informatics in Control Automation and Robotics, August 2006, pp.1103-1107.
- [13] S.Gupta, and F.Najm, "Power Macromodeling for High Level Power Estimation", Proceedings IEEE Transactions on VLSI, 1999, pp.110-114.
- [14] X.Liu, and M.C.Papaefthymiou, "Incorporation of input glitches into power macromodeling", In Proceedings IEEE Inter. Symp. On Circuits and Systems, May 2002, pp.105-108.
- [15] X.Liu, and M.C.Papaefthymiou,"HyPE: Hybrid power estimation for IP-Based Systems-on-Chip", Proceedings for IEEE Trans on CAD of integrated circuits and systems, Vol. 24, No.7, July 2005, pp.1089-1103.
- [16] T.Jiang, X.Tang, and P.Banerjee, "Macro-models for high level area and power estimation on FPGA", Int.J.Simulation and Process Modelling, vol 2, No.1/2, 2006, pp.45-49.
- [17] G.Bernacchia, and M.C.Papaefthymiou, "Analytical Macromodelling for High Level Power Estimation", IEEE/ACM International Conference on Computer Aided Design, 1999, pp.280-283.
- [18] L.Shang, and N.K.Jha, "High Level Power Modeling of CPLDs and FPGA," IEEE International Conference on Computer Design, Sep.2001, pp.46-51.
- [19] L.Shang, A.S.Kaviani, and K.Bathala, "Dynamic Power Consumption in Virtex II FPGA Family", Proceedings of the 2002 ACM/SIGDA tenth international symposium on Field-programmable gate arrays, California USA, 2002, pp.157-164.
- [20] Atlys<sup>™</sup> Board, Reference Manual, December 19, 2011.
- [21] R.Jevtic, and C.Carreras, "Power Estimation of Embedded Multiplier Blocks in FPGAs", IEEE transactions on Very large scale integration (VLSI) systems, vol.18, N. 5, pp.835-839, 2010.



# **BIOGRAPHIES**



Chalbi Najoua received the Ph.D. degree in Electrical Engineering from the National Institute of Engineering, Monastir, Tunisia. Research interests include VLSI design and FPGA power estimation and optimization.



Boubaker Mohamed received the Diploma degree in Electrical Engineering from ENI of Tunis, Tunisia, in 1991. In 2002, he received the Master degree in Computer Sciences from ENI of Sfax, Tunisia. His research interests include connectionist systems able to discriminate vigilance states.



Bedoui Mohamed Hedi is a Biophysics Professor. He is currently the Head of the Technology Medical Image (TIM) laboratory. His research interests are image and signal processing and VLSI design.