

## Optimization of energy consumption in a NoC link by using a novel data encoding technique

Asha J.<sup>1</sup>, Rohith P.<sup>2</sup>

<sup>1</sup>M.Tech, VLSI Design and Embedded Systems, RIT, Hassan, Karnataka, India <sup>2</sup>Assistant Professor, Dept. of E&C, RIT, Hassan, Karnataka, India

Abstract - As advances in VLSI process technology raise the level of integration in SoCs, the density of on-chip wires increases and circuit performance degrades. The network-on-chip (NoC) emerged to address this scalability problem, but it has its own limitation: the power dissipated by the network links, as well as by the other elements of the communication system such as the routers and the network interfaces. In this paper, a set of data encoding schemes is designed to decrease the power dissipated by the network links of a NoC, which optimizes the on-chip communication system in terms of both performance and power. The schemes are general and transparent with respect to the underlying NoC fabric; that is, applying them does not require any change to the routers or the link architecture.

Key Words: coupling switching activity, data encoding, interconnection on chip, low power, network-on-chip (NoC), and power analysis.

#### **1. INTRODUCTION**

As VLSI technologies continue to scale, wire densities increase to support ever-smaller transistor geometries, and on-chip wires present growing latency and energy problems. The high latency of cross-chip communication can limit overall performance by increasing the delay between on-chip units. The resulting demand for scalable bandwidth can be satisfied by an on-chip packet-switched micro-network of interconnects, generally known as the Network-on-Chip (NoC) architecture. The basic idea came from traditional large-scale multiprocessors and distributed computing networks. The scalable and modular nature of NoCs and their support for efficient on-chip communication have led to NoC-based system implementations.

In this paper, we focus on techniques aimed at reducing the power dissipated by the network links. In fact, the power dissipated by the network links is as relevant as that dissipated by the routers and network interfaces (NIs), and its contribution is expected to increase as technology scales. In particular, we present a set of data encoding schemes operating at flit level and on an end-to-end basis, which allows us to minimize both the switching activity and the coupling switching activity on the links of the routing paths traversed by the packets.


#### 2. ENCODING SCHEMES

In this paper, encoding schemes are implemented whose goal is to reduce power dissipation by minimizing the coupling transition activity on the links of the interconnection network. Let us first describe the power model, which contains the different components of the power dissipation of a link. The dynamic power dissipated by the interconnects and drivers is

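$$P = \left[T_{0 \rightarrow 1}\,(C_s + C_l) + T_c\,C_c\right] V_{DD}^{2}\, F_{CK}$$
(1)

where $T_{0 \rightarrow 1}$ is the number of 0 → 1 transitions on the link in two consecutive transmissions, $T_c$ is the number of correlated switchings between physically adjacent lines, $C_s$ is the line-to-substrate capacitance, $C_l$ is the load capacitance, $C_c$ is the coupling capacitance, $V_{DD}$ is the supply voltage, and $F_{CK}$ is the clock frequency.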
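As a quick numerical illustration of (1), the sketch below counts the 0 → 1 transitions between two consecutive flits and evaluates the power expression. The function names and the capacitance, voltage, and frequency values are illustrative assumptions, not figures from this work; the coupling count Tc is simply passed in.

```python
def count_rising(prev_flit, curr_flit):
    """Number of lines that switch from 0 to 1 between two consecutive flits."""
    return sum(1 for p, c in zip(prev_flit, curr_flit) if p == 0 and c == 1)


def link_power(t01, tc, cs, cl, cc, vdd, fck):
    """Dynamic link power following (1): [T01*(Cs + Cl) + Tc*Cc] * Vdd^2 * Fck."""
    return (t01 * (cs + cl) + tc * cc) * vdd ** 2 * fck


# Hypothetical values chosen only to exercise the formula.
prev_flit = [0, 0, 1, 1]
curr_flit = [1, 0, 1, 0]
t01 = count_rising(prev_flit, curr_flit)  # only line 0 rises
power = link_power(t01, tc=2, cs=1e-15, cl=1e-15, cc=2e-15, vdd=1.0, fck=1e9)
print(t01, power)
```

Because the coupling term is weighted by its own capacitance, lowering Tc can matter as much as lowering the plain switching count, which is the motivation for the schemes that follow.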



Fig 1: Encoding block diagram.

Fig. 1 shows the general block diagram of the encoding architecture. Let us consider a link width of *w* bits. If no encoding is used, the body flits are grouped in *w* bits by the NI and are transmitted via the link. In our approach, one bit of the link is used for the inversion bit, which indicates whether the flit traversing the link has been inverted or not. More specifically, the NI packs the body flits in *w* - 1 bits. The encoding logic *E*, which is integrated into the NI, is responsible for deciding whether the inversion should take place and for performing the inversion if needed. The generic block diagram shown in Fig. 1 is the same for all three encoding schemes proposed in this paper; only the block *E* differs between the schemes. To make the decision, the previously encoded flit is compared with the current flit being transmitted.

#### Table I: EFFECT OF ODD INVERSION ON CHANGE OF TRANSITION TYPES

| Normal type    | Time  | Normal         | Odd inverted   | Type after odd inversion |
|----------------|-------|----------------|----------------|--------------------------|
| Type I (T1*)   | t - 1 | 00, 11         | 00, 11         | Type III                 |
|                | t     | 10, 01         | 11, 00         |                          |
| Type I (T1**)  | t - 1 | 00, 11, 01, 10 | 00, 11, 01, 10 | Type IV                  |
|                | t     | 01, 10, 00, 11 | 00, 11, 01, 10 |                          |
| Type I (T1***) | t - 1 | 01, 10         | 01, 10         | Type II                  |
|                | t     | 11, 00         | 10, 01         |                          |
| Type II        | t - 1 | 01, 10         | 01, 10         | Type I                   |
|                | t     | 10, 01         | 11, 00         |                          |
| Type III       | t - 1 | 00, 11         | 00, 11         | Type I                   |
|                | t     | 11, 00         | 10, 01         |                          |
| Type IV        | t - 1 | 00, 11, 01, 10 | 00, 11, 01, 10 | Type I                   |
|                | t     | 00, 11, 01, 10 | 01, 10, 00, 11 |                          |

Table I shows the effect of odd inversion on the transition type. Coupling transitions can be classified into four types. A Type I transition occurs when one of the lines switches while the other remains unchanged. In a Type II transition, one line switches from low to high while the other switches from high to low. A Type III transition corresponds to the case where both lines switch simultaneously in the same direction. Finally, in a Type IV transition neither line changes.
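The four coupling transition types can be checked mechanically. The following is a minimal sketch, assuming each pair is given as a tuple of two bit values and that odd inversion flips the second line of the pair, as in Table I; the function names are ours.

```python
def classify(prev_pair, curr_pair):
    """Return the coupling transition type ('I'..'IV') of two adjacent lines."""
    a_switches = prev_pair[0] != curr_pair[0]
    b_switches = prev_pair[1] != curr_pair[1]
    if not a_switches and not b_switches:
        return "IV"
    if a_switches != b_switches:
        return "I"
    # Both lines switch: opposite directions give Type II, same direction Type III.
    return "II" if curr_pair[0] != curr_pair[1] else "III"


def odd_invert(pair):
    """Invert the second line of the pair (taken here as the odd line)."""
    return (pair[0], 1 - pair[1])


print(classify((0, 1), (1, 0)))              # II
print(classify((0, 1), odd_invert((1, 0))))  # I
```

The two printed lines reproduce the Type II row of Table I: the transition 01 → 10 is Type II, and odd-inverting the current pair turns it into a Type I transition.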

#### Scheme I

In scheme I, we focus on reducing the number of Type I and Type II transitions. The scheme compares the current data with the previous data to decide whether odd inversion or no inversion of the current data leads to a reduction in link power.



Fig 2: Encoder internal architecture of scheme I.

In the encoding logic, each Ty block takes two adjacent bits of the input flits (e.g., X1X2Y1Y2, X2X3Y2Y3, X3X4Y3Y4, etc.) and sets its output to "1" if any of the transition types of Ty is detected. This means that odd inversion of this pair of bits reduces the link power dissipation (Table I). The *Ty* block may be implemented using a simple circuit. The second stage of the encoder, which is a majority voter block, determines whether the condition given in (2) is satisfied (a higher number of 1s than 0s at the input of the block). If this condition is satisfied, the inversion is performed on the odd bits in the last stage. The decoder circuit simply inverts the received flit when the inversion bit is high.
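The three stages described above (Ty detection, majority vote, odd inversion) can be sketched in software as follows. This is a simplified model rather than the hardware design: it assumes that the Ty set consists of Type II, T1*, and T1** (the rows of Table I that odd inversion improves), that bit positions are 0-based, and that the inv bit is the last bit of a flit of even width. All function names are ours.

```python
def ty_flag(prev, curr, i):
    """1 if odd inversion helps the line pair (i, i+1), else 0.

    Assumption: the Ty set is Type II plus the Type I cases that Table I
    maps to Type III or Type IV (T1* and T1**).
    """
    odd, even = (i, i + 1) if i % 2 == 1 else (i + 1, i)
    odd_sw = prev[odd] != curr[odd]
    even_sw = prev[even] != curr[even]
    differ = curr[odd] != curr[even]
    type_2 = odd_sw and even_sw and differ
    t1_star = even_sw and not odd_sw and differ
    t1_2star = odd_sw and not even_sw
    return int(type_2 or t1_star or t1_2star)


def invert_odd(bits):
    """Invert the bits at odd (0-based) positions."""
    return [b ^ 1 if i % 2 == 1 else b for i, b in enumerate(bits)]


def encode(prev_encoded, data):
    """Encode w-1 data bits into a w-bit flit whose last bit is inv."""
    candidate = data + [0]                      # inv = 0 before the decision
    pairs = len(candidate) - 1                  # w - 1 adjacent pairs
    ty = sum(ty_flag(prev_encoded, candidate, i) for i in range(pairs))
    if 2 * ty > pairs:                          # majority vote, condition (2)
        return invert_odd(data) + [1]
    return candidate


def decode(flit):
    """Undo the odd inversion when the inv bit is set."""
    data, inv = flit[:-1], flit[-1]
    return invert_odd(data) if inv else data


prev_encoded = [0, 1, 0, 1, 0, 1, 0, 0]
data = [1, 0, 1, 0, 1, 0, 1]
flit = encode(prev_encoded, data)
print(flit, decode(flit) == data)
```

In this example every adjacent pair would otherwise make a Type II or T1* transition, so the majority vote selects odd inversion, and the decoder recovers the original data from the inv bit alone.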

#### Scheme II

In scheme II, both Type I and Type II transitions are taken into account in deciding between half (odd) and full inversion, depending on the amount of switching reduction. The scheme compares the current data with the previous data to decide whether odd, full, or no inversion of the current data reduces the link power.



Fig 3: Encoder internal architecture of scheme II.

Fig. 3 shows the internal architecture of scheme II. Scheme II compares the current flit with the previously encoded flit to reduce the link power dissipation. The first stage of scheme II consists of the Ty, T2, and T4\*\* blocks. Each of these blocks takes two adjacent bits of the input flits (e.g., X1X2Y1Y2, X2X3Y2Y3, X3X4Y3Y4, etc.) and sets its output to "1" if any of the transition types of Ty, T2, or T4\*\*, respectively, is detected. For the Ty block, a "1" means that odd inversion of this pair of bits reduces the link power dissipation (Table I). The Ty, T2, and T4\*\* blocks may be implemented using simple circuits. The second stage is formed by a set of 1s blocks, which count the number of 1s at their inputs. The output of each of these blocks has a width of $\log_2 w$. The output of the top 1s block gives the number of transitions for which odd inversion of the pair of bits reduces the link power. The middle 1s block gives the number of transitions for which full inversion of the pair of bits reduces the link power. Finally, the bottom 1s block gives the number of transitions for which full inversion of the pair of bits increases the link power. Based on the number of 1s for each transition type, Module A decides which invert action to perform, using conditions (2) and (3) below.

$$Ty > \frac{(w-1)}{2}$$
(2)

$$T2 > T4^{**}$$
 (3)
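Condition (3) compares the pairs that full inversion improves with the pairs that it makes worse. The sketch below counts both quantities for a pair of flits; it assumes that T4\*\* denotes a Type IV pair whose two lines hold different values (01 → 01 or 10 → 10), which is our reading rather than a definition given in the text, and the function names are ours.

```python
def count_t2_t4ss(prev, curr):
    """Count Type II pairs (T2) and T4** pairs over all adjacent lines.

    Assumption: T4** is a Type IV pair whose two lines hold different
    values (01 -> 01 or 10 -> 10).
    """
    t2 = t4ss = 0
    for i in range(len(curr) - 1):
        a_sw = prev[i] != curr[i]
        b_sw = prev[i + 1] != curr[i + 1]
        differ = curr[i] != curr[i + 1]
        if a_sw and b_sw and differ:
            t2 += 1
        elif not a_sw and not b_sw and differ:
            t4ss += 1
    return t2, t4ss


def full_invert(bits):
    """Invert every bit of the flit."""
    return [b ^ 1 for b in bits]


prev = [0, 1, 0, 1, 1, 0]
curr = [1, 0, 1, 0, 1, 0]
t2, t4ss = count_t2_t4ss(prev, curr)
print(t2, t4ss, t2 > t4ss)                     # 3 1 True
print(count_t2_t4ss(prev, full_invert(curr)))  # (1, 3)
```

Under this reading, full inversion swaps the two counts: every Type II pair becomes a T4\*\* pair and vice versa, so full inversion pays off on coupling grounds only when T2 exceeds T4\*\*.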

#### Decoder circuit



Fig 4: General block diagram of decoder.



Fig 5: internal architecture of decoder circuit.

The circuit diagrams of the decoder are shown in Figs. 4 and 5. The *w* bits of the incoming (previous) body flit are indicated by Zi (Ri), i = 0, 1, ..., w - 1. The *w*th bit of the body flit is indicated by inv, which shows whether the flit was inverted (inv = 1) or left as it was (inv = 0). In the decoder, only the Ty blocks are needed to determine which action was taken in the encoder. Based on the outputs of these blocks, the majority voter block checks the validity of the inequality given by (2). If its output is "0" ("1") and inv = 1, half (full) inversion of the bits has been performed. Using this output and the logic gates, the inversion action is determined. If two inversion bits were used, the overhead of the decoder hardware could be substantially reduced.

#### Scheme III

In scheme III, we note that odd inversion converts some of the Type I (T1\*\*\*) transitions to Type II transitions. As can be observed from Table II, if the flit is even inverted, the transitions indicated as T1\*\*/T1\*\*\* in the table are converted to Type IV/Type III transitions. Therefore, even inversion may reduce the link power dissipation as well. The scheme compares the current data with the previous data to decide whether odd, even, full, or no inversion of the current data reduces the link power.

#### Table II: EFFECT OF EVEN INVERSION ON CHANGE OF TRANSITION TYPES

| Normal type    | Time  | Normal         | Even inverted  | Type after even inversion |
|----------------|-------|----------------|----------------|---------------------------|
| Type I (T1*)   | t - 1 | 01, 10         | 01, 10         | Type II                   |
|                | t     | 00, 11         | 10, 01         |                           |
| Type I (T1**)  | t - 1 | 00, 11, 01, 10 | 00, 11, 01, 10 | Type IV                   |
|                | t     | 10, 01, 11, 00 | 00, 11, 01, 10 |                           |
| Type I (T1***) | t - 1 | 00, 11         | 00, 11         | Type III                  |
|                | t     | 01, 10         | 11, 00         |                           |
| Type II        | t - 1 | 01, 10         | 01, 10         | Type I                    |
|                | t     | 10, 01         | 00, 11         |                           |
| Type III       | t - 1 | 00, 11         | 00, 11         | Type I                    |
|                | t     | 11, 00         | 01, 10         |                           |
| Type IV        | t - 1 | 00, 11, 01, 10 | 00, 11, 01, 10 | Type I                    |
|                | t     | 00, 11, 01, 10 | 10, 01, 11, 00 |                           |



Fig 6: Encoder internal architecture of scheme III.

Fig. 6 shows the internal architecture of scheme III. The operating principles of this encoder are similar to those of the encoders implementing Schemes I and II. The proposed encoding architecture, shown in Fig. 6, is based on the even invert condition of (4), the odd invert condition of (5), and the full invert condition of (6). The *w*th bit of the previously encoded body flit is indicated by inv, which shows whether the flit was even, odd, or full inverted (inv = 1) or left as it was (inv = 0). The first stage of the encoder determines the transition types, while the second stage is formed by a set of 1s blocks which count the number of ones at their inputs. In the first stage, we have added the *Te* blocks, which determine whether any of the transition types T2, T1\*\*, and T1\*\*\* is detected for each pair of bits at their inputs. For these transition types, the even invert action reduces the link power. Again, we have four Ones blocks to determine the number of detected transitions for each type of block. The outputs of the Ones blocks are the inputs of Module C. This module determines whether the odd, even, full, or no invert action, corresponding to the outputs "10," "01," "11," or "00," respectively, should be performed. Module C was designed based on conditions (4)-(6) below.

$$Te > \frac{(w-1)}{2}$$
(4)

$$Ty > \frac{(w-1)}{2}$$
(5)

$$T2 > T4^{**}$$
 (6)
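The four actions selected by Module C can be modelled with one small function. This is a minimal sketch, assuming 0-based bit positions and taking the two-character code strings directly from the Module C outputs listed above; the function name is ours, and the decision logic that produces the code is not modelled.

```python
def apply_inversion(data, code):
    """Apply the action selected by a 2-bit code to a list of bits.

    '10' = odd invert, '01' = even invert, '11' = full invert, '00' = none.
    Positions are 0-based here, which is an assumption of this sketch.
    """
    flip_odd = code[0] == "1"
    flip_even = code[1] == "1"
    return [
        b ^ 1 if (flip_odd if i % 2 == 1 else flip_even) else b
        for i, b in enumerate(data)
    ]


data = [1, 0, 1, 1, 0]
for code in ("10", "01", "11", "00"):
    print(code, apply_inversion(data, code))
```

Each action is its own inverse, so a decoder that knows which action was taken can recover the data by applying the same action again; full inversion is simply odd inversion followed by even inversion.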

#### **3. RESULTS**

The proposed data encoding and decoding schemes are simulated and verified using Verilog HDL in Xilinx ISE 14.7.

#### Scheme I

Fig 7 below shows the simulation output of encoding and decoding logic.

[Waveform screenshot: signals out[31:0], decoder_out[31:0], X[31:0], and clk, captured near 3,000,000 ps.]

Fig 7: simulated output of scheme I encoding and decoding.

#### Scheme II

Fig 8 below shows the simulation output of the encoding and decoding logic and the Ty, T2, and T4\*\* transition outputs.





#### Scheme III

Fig 9 below shows the simulation output of the encoding and decoding logic and the Ty, T2, T4\*\*, and Te transition outputs.

[Waveform screenshot: signals X[31:0], Clk, out[31:0], TY[30:0], T2[30:0], T4[30:0], Te[30:0], temp[31:0], TYcount[4:0], T2count[4:0], T4count[4:0], Tecount[4:0], half_invert, full_invert, Z[31:0], and decoder_out[31:0], captured between 1,999,995 ps and 1,999,998 ps. Displayed values: TYcount = 00000, T2count = 00000, T4count = 01110, Tecount = 00000, half_invert = 0, full_invert = 0.]

Fig 9: simulated output of scheme III encoding and decoding.

#### 4. CONCLUSION

In this paper, a set of data encoding schemes is implemented to reduce the power dissipated by the links of a NoC. In fact, links are responsible for a significant fraction of the overall power dissipated by the communication system. Compared with previous encoding schemes, the proposed schemes minimize not only the switching activity but also the coupling switching activity, which is mainly responsible for link power dissipation in the deep submicrometer technology regime. The proposed encoding schemes are agnostic with respect to the underlying NoC architecture, in the sense that their application does not require any modification of either the routers or the links.



#### ACKNOWLEDGEMENT

I express my sincere thanks to Mr. Rohith P., Assistant Professor, Department of Electronics and Communication Engineering, Rajeev Institute of Technology, Hassan, for his valuable guidance and continuous encouragement in the course of my work. I would also like to thank Mrs. Ambika K., Head of the Department, assignment professor, Department of Electronics and Communication, Rajeev Institute of Technology.


#### **BIOGRAPHIES**



Asha J., pursuing M.Tech in VLSI designing and embedded systems in Rajeev institute of technology, Hassan, Karnataka, India. Email: asha.dsh@gmail.com



Rohith P. assistant professor, dept. of Electronics and communication, Rajeev institute of technology, Hassan, Karnataka, India. Email:rohithgowda1985@gmail.com