

# **IMPLEMENTATION ON FPGA OF RELIABLE NETWORK-ON-CHIP**

# **G.ELANAGAI**

PG Student, Dept. of ECE, A V C College of Engineering Mayiladuthurai, Tamilnadu, India.

\_\_\_\_\_\*\*\*\_\_\_\_\_\*\*\*

**Abstract** - The communication inside NOC in a dynamic routing environment makes reliable communication difficult. Communication between the intellectual properties through other nodes in a dynamic environment needs to be performed efficiently. To enhance reliable communication we use routing algorithms like XY routing algorithm and deflection routing algorithm. To make this project innovative here we propose a new model of reliable NOC reducing the area by removing the LOOP BACK MODULE and to overcome the deficiency centralized buffer is used. This centralized buffer helps us to retrieve the data which lost its path by resending it against to the source itself. The Hamming code is being replaced by CRC in order to reduce the computational complexity.

#### Key Words: Reliable NOC, Loop Back Module, Centralized buffer, Hamming Code, CRC Code.

## **1. INTRODUCTION**

As indicated by several researchers and the International Technology Roadmap for semiconductors (ITRS), nanometre System-on-Chip (SOCs) will most likely not have an economic yield if all transistors must be functional. Besides, it is expected that Moore's law will continue to hold for another five to fifteen years where billion gates can be integrated in a chip. This capacity will allow integration of several tens to hundred resources like processor cores, DSP cores, Interface circuits, FPGA blocks and Memory blocks. Thereby, it is possible to integrate more than one processing element (PE) in a SOC, being known as Multi-processor System-on-chip (MPSoC). MPSoCs have been widely used in high performance embedded systems, such as web servers, network processors, and parallel media processors. They combine the advantages of data processing parallelism of multi-processors, and parallel media processors. The Network-on-chip (NoC) architecture paradigm, based on a modular packet-switched mechanism, can address many of the on-chip communication design issues such as performance limitations of long interconnects, and integration of high number of PE on a chip. A tiled-based 2D-mesh NoC based system, where one or more cores and other resources are encapsulated into a tile. It consists of Routers (R), PE and Network interfaces (NI). PEs may be intellectual property (IP) blocks or embedded memories. Each PE is connected to the corresponding router port using the network interface. This enables to use packets for transferring information between PEs without requiring

dedicated wirings for point to point connection. In brief, NoCs not only offer a scalable performance needed by systems which grow with each new generation, but also allow to mitigate the energy consumption by avoiding the use of long global wires. Since all links in the NoC can operate simultaneously on different data packets, a high level of parallelism is making it attractive for replacing previous communication architectures like dedicated pointto point signal wires, shared buses or segmented buses with bridges. Furthermore, NoCs are reusable templates and aid to reduce the so called design productivity gap. Finally, none of the current on-chip interconnect approaches will meet all the requirements of future SoCs, as Nocs could potentially fulfil.

#### 2. OVERVIEW OF PREVIOUS WORK

## **2.1 Traffic Models**

Traffic models refer to the mathematical characterization of workloads generated by various classes of applications. With network performance being highly dependent on the actual traffic, it is obvious that accurate traffic models are needed for a thorough understanding of the huge design space of network topologies, protocols and implementations. Since implementing real applications is time consuming and lacks flexibility, such analytical models can be used instead to evaluate the time consuming and lacks flexibility, such analytical models can be used instead to evaluate network performance early in the design process. It should be noted, however, that the research in this area is still behind due to lack of widely accepted set of NoC benchmarks. This situation has two primary reasons. First, the applications suitable for NoC platform are typically very complex. For instance, it is common for applications to be partitioned among tens of processors in order to allow for evaluations of scheduling, mapping, etc. For general purpose chip multiprocessors (CMPs), benchmarks can actually stress the NoCs. Second, compared to traditional research areas like physical design where the design constraints are static the NoC research require detailed information about the dynamic behavior of the system; this is hard to obtain even using detailed simulation or prototyping. As a result, most researchers and designers still rely on the synthetic traffic patterns such as stress-test a network design. Similarly, there have been initial steps towards releasing parallel benchmarking targeting the future CMPs.

## 2.2 Application Scheduling

Another important problem in NoC design is communication and task scheduling. Although scheduling is a traditional topic in computer science, most previous work focuses on maximizing performance. More recently, energyaware scheduling techniques for real-time and distributed systems have also been introduced, but they address only the bus based or P2P communication. We note that mapping and scheduling problems can be considered jointly.

Communication and task scheduling for NoCs is addressed in where the authors present a scheduling algorithm which minimizes the overall energy consumption of the system while guaranteeing the real-time deadlines imposed on tasks. Likewise, Scheduling and arbitration policy for NoCs that use case scenarios, here communication access protocol is presented. Capturing the dynamic system behaviour, i.e., the change system behaviour due to incoming and completed applications is also important. It should be also noted that dynamic voltage frequency scaling (DVFS) can be used in conjunction with scheduling in order to minimize the overall energy consumption. However, since they are both very hard problems.

## 2.3 Packet Routing

Routing has been extensively studied in classical interconnection networks, many of which have been leveraged in on-chip networks. One example is the dimension-ordered routing which routes packets in one dimension, then moves on to the next dimension, until the final destination is reached. While such a technique is very popular due to its simplicity, adaptive routing techniques, model routing, planar adaptive routing can provide better throughput and fault tolerance by allowing alternative paths depending on the network congestion and run-time faults. Oblivious routing algorithms which generate routes without any knowledge of traffic have also been extensively studied in the context of classical interconnections networks and can be relevant to on-chip networks due to their low overhead. New routing protocols have also been investigated specifically for NoCs. For instance, deflective routing in routes packets to one of the free output channels belonging to a minimal path; if this is not possible, then packets are misrouted. Techniques have been also proposed to dynamically switch between deterministic and adaptive routing to exploit the trade-off between them.

Application-specific customization of routing protocols and techniques to provide low overhead routing algorithms with high path diversity have been explored. With NoCs being increasingly concerned with power, thermal and reliability issues, there exists recent work proposing thermal and reliability aware routing algorithms.

# 2.4 NoC Evaluation and Validation

Fast and accurate approaches for analyzing critical metrics such as performance, power consumption or system fault-tolerance are important to guide the design process. However, in order to be used within an optimization loop or make early design choices, the analysis techniques need to be tractable and provide meaningful feedback to designers.

Communication latency and network bandwidth are common performance metrics of interest. While it is relatively easier to find the communication latency for guaranteed service traffic, analyzing the average latency for best effort traffic is a challenging task.

Performance analysis largely depends on various simplifying assumptions on the network or traffic characteristics and typically assumes deterministic routing due to the difficulty in handling the more general problem. Approaches that relax the Markovian assumption and analytical power consumption models are highly needed.

## **3. PROPOSED SYSTEM**

## **3.1 Routing in NoC**

NoC is architecture inspired by data-communication networks, such as internet, communication and wireless networks with interprocessor communication supported by packet-switched and circuit-switched networks. The basic idea of NoC is to communicate across the chip in a way similar to that of messages transmitted over the internet as the methods and architectures from the computer network could be borrowed and adopted to the on-chip communication and can potentially resolve the interconnect scaling challenges.

Routing strategies can be categorized into deterministic and adaptive schemes. In a deterministic routing strategy, source and destination determine the traversal path. Popular deterministic routing schemes for NoC are source routing and XY routing, which are also referred to as 2-D dimensionorder routing. In source routing, the source core specifies the route to the destination. In XY routing, the packet follows the rows first then moves along the columns towards destination.

In an adaptive-routing strategy, the traversal path is decided on a per-hop basis. Adaptive schemes involve dynamic arbitration and next-hop selection mechanisms, i.e., based on local link congestion. There is a great potential to improve communication local traffic or conditions of neighbors.

Minimal-cost (shortest path) computation is fundamental among different dynamic-routing strategies. The basic idea is that the routing algorithm always chooses the least congested path towards the destination through optimal path planning.

## 3.2 Shortest Path Computation

DP is a powerful mathematical technique for making a sequence of interrelated decisions. Bellman formalized the

term DP and used it to describe the process of solving the problems where one needs to find the best decision one after another. It doesn't provide a standard mathematical formulation of the algorithm. Rather DP is a general type of approach to problem solving, and it restates an optimization problem in recursive form, which is known as Bellman equation. The Bellman equation becomes

 $V(K)(v,w) = \min \{V(k-1)(u,w)+Cv,u\}$ 

#### 3.3 Cyclic Redundancy Codes

CRC stands for Cyclic Redundancy Check which means that is based on cyclic algorithm that generates redundant information. The CRC performs mathematical calculation on a block of data and returns information about the contents and organization of the data. So the resultant number uniquely identifies the block of data. This unique number can be used to check the validity of data or to compare two blocks. So this approach is used in many communication and computer systems to ensure the validity or stored data.

In general CRC codes are able to detect:

- All single-and double-bit errors.
- All odd numbers of errors.

• All burst errors less than or equal to the degree of the polynomial used.

• Most burst errors greater than the degree of the polynomial used.

## 3.3.1 CRC idea

The main idea of CRC is to treat the message as binary numbers, and divide it by fixed binary number. The remainder from the division is considered check sum.

## 3.3.2 Theory of Operation

As stated in the previous section, the CRC is a simple binary division and subtraction. The only difference is that operations are done.

## 3.3.3 Polynomial Concept

The CRC algorithm uses the term polynomial to perform all of its calculations. This polynomial is the same concept as the traditional arithmetic polynomials. The divisor, dividend, quotient and remainder that are represented by numbers are represented by numbers are represented as polynomials with binary coefficients. CRC has two main implementation technique they are Straight forward and look-up table based

# **3.4 ROUTING**

## 3.4.1 XY Routing

XY Routing is a dimension order routing which routes packets first in X- or horizontal direction to correct column then in Y- or vertical direction to the receiver. XY routing suits well on a network using mesh or torus topology. Addresses of the routers are their xy-coordinates. XY routing never runs into deadlock or live lock.



Figure 3.1 XY routing from router A(1) to router B(10).

There are some problems in the traditional XY routing. The traffic does not extend regularly over the whole network because the algorithm causes the biggest load in the middle of the network. There is a need for algorithms which equalize the traffic load over the whole network.

#### 3.4.1.1 Pseudo Adaptive XY Routing

This routing works in deterministic or adaptive mode, depending on the state of the network. Algorithm works in deterministic mode when the network is not or only slightly congested. When network becomes blocked, the algorithm switches to the adaptive XY routing works on mesh network which consist of routers, wires and IP-blocks. Every router has five bidirectional ports: North, south, east, west and local. Local Port connects router to its local core while the other ports are connected to the neighboring route.

#### 3.4.1.2 Surrounding XY Routing

This routing has three different routing modes. N-XY (Normal XY) mode works just like basic XY routing. It routes the Packet first along X-axis and then Y-axis. The SH-XY mode (Surround vertical XY) routes packet to the correct column on the grounds of coordinates of the destination.

#### 3.4.2 Turn Models

This algorithm determines a turn or turns which are not allowed while routing packets through a network. Turn models are live lock free.

#### 3.4.2.1 West-first Routing

A West-first routing algorithm prevents all turns to west. So the packets going to west must be first transmitted as far to west as necessary. Routing packets to west is not possible later.

#### **3.4.2.2 North-last Routing**

Turns away from north are not possible in a north-last routing algorithm. Thus the packets which need to be routed to north must be transferred there at last.

International Research Journal of Engineering and Technology (IRJET) IRIET Volume: 03 Issue: 12 | Dec -2016 www.irjet.net

e-ISSN: 2395 -0056 p-ISSN: 2395-0072

## 3.4.2.3 Negative-first Routing

This routing algorithm allows all other turns except turns from positive direction to negative direction. Packet routings to negative directions must be done before anything else.



Figure 3.2 Turn models for (a) X-Y Routing, (b) West-first, (c) North-last, (d) Negative last (Solid lines for Permitted turns and dashed line for prohibited turns).

# 3.5 Dead lock avoidance

Routing and message dependent deadlock avoidance is achieved by extending the bubble flow control technique to flit level using central buffers. This technique keeps the routing minimal plus ensuring minimum no latency as well. Before explaining our scheme, we like to reiterate 3 conditions that are required for bubble control to work. 1) Every ring or cycle must have bubble. 2) If there is a bubble in the ring, Packets within the ring i.e. they have to make progress. 3) External packets entering the ring are not allowed to destroy the bubble.

#### 3.5.1 Avoiding Deadlock

The idea of avoiding routing deadlock is simple. For every ring, even having a single flit bubble is enough to ensure forward progress of flits. For flits entering the ring, we need to ensure that the whole packet will be allowed to enter the ring along with maintaining the original bubble of the ring. This is the necessary condition Because of the following reason. If the whole packet is not allowed to enter the ring, even having a multi flit bubble in the ring. Furthermore, since packets in the central buffer of the current router are part of the overall ring, looking at the space of the next router's CB is not required. Thus the condition to avoid deadlock only requires looking at empty buffer space of itself and no credit information of the downstream router is needed which makes our technique perfectly suitable for EB-based channels.

## 4. HARDWARE DESCRIPTION

Altera DE2 board is the hardware used here it has many features that allow the user to implement a wide range of the designed circuits, from simple circuits to various multimedia projects.









Figure 5.1 XY Routing algorithm result in Altera DE2 board





## **6. CONCLUSION AND FUTURE WORK**

Now we had implemented NoC in 2D and hence here we will propose a challenging task of implementing reliable network on chip in 3D in future perspective. Next is finding a new technique to further reduce the area and the errors respectively.

#### REFERENCES

- [1] Abad P, Puente V, Gregorio JA (2007) Rotary router: An efficient archietecture for CMP interconnection networks. In: Proceedings of the international symposium on computer architecture.
- [2] Balfour J , Dally WJ (2006) Design tradeoffs for titled CMP on-chip networks. In: Proceedings of international conference on supercomputing.
- [3] Bjerregaard T , Mahadevan S (2006) A Survey of research and practices of NoC. ACM comput Surv 38.
- [4] Chan SC, Shepard KL , Restle PJ (2003) Design of Resonant global clock distributions. In: Proceedings of the international conference on Computer design.
- [5] Dally WJ (1992) Virtual-channel flow control. IEEE Trans Parallel Distrib Syst 3(2).
- [6] Draper J , Ghosh J (1994) A Comprehensive analytical model for wormhole routing inmulticomputer systems. J Parallel Distrib Comput 23(2).
- [7] Duato J , Yalamanchilli S , Ni L (2002) Intreconnection networks : an engineering approach. Morgan Kaufmann, San Mateo, CA.
- [8] Hung W et al (2004) Thermal-aware IP Virtualization and placement for NoC architecture, In: Proceedings of ICCD.
- [9] Lu Z, Jantsch A (2007) Layered switching for networks on chip. In: Proceedings of design automation conference.
- [10] Marescaux T (2007) Introducing the SuperGT NoC In: Proceedings of design automation Conference.