Computer Organisation and Architecture: Unit 2

Reduced Instruction Set Computer (RISC)

An important aspect of computer architecture is the design of the instruction set for the processor. The instruction set chosen for a particular computer determines the way that machine language programs are constructed. As digital hardware became cheaper with the advent of integrated circuits, computer instructions tended to increase both in number and complexity. The trend toward hardware complexity was influenced by various factors, such as upgrading existing models to provide more customer applications, adding instructions that facilitate the translation from high-level language into machine language programs, and striving to develop machines that move functions from software implementation into hardware implementation.

A computer with a large number of instructions is classified as a complex instruction set computer, abbreviated CISC. In the early 1980s, a number of computer designers recommended that computers use fewer instructions with simple constructs so that they can be executed much faster within the CPU without having to use memory as often. This type of computer is classified as a reduced instruction set computer, or RISC.

CISC Characteristics:

The design of an instruction set for a computer must take into consideration not only machine language constructs, but also the requirements imposed by the use of high-level programming languages. The translation from high-level to machine language programs is done by means of a compiler program.

Another characteristic of CISC architecture is the incorporation of variable-length instruction formats. Instructions that require register operands may be only two bytes in length, but instructions that need two memory addresses may need five bytes to include the entire instruction code.

The major characteristics of CISC architecture are:

A large number of instructions – typically from 100 to 250 instructions.
Some instructions that perform specialized tasks and are used infrequently.
A large variety of addressing modes – typically from 5 to 20 different modes.
Variable-length instruction formats.
Instructions that manipulate operands in memory.

RISC Characteristics:

The small set of instructions of a typical RISC processor consists mostly of register-to-register operations, with only simple load and store operations for memory access. Thus each operand is brought into a processor register with a load instruction.
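The load/store style described above can be sketched with a toy register machine in Python. This is only an illustration: the instruction names (LOAD, ADD, STORE), the register names and the dictionary used as memory are assumptions made for this example and do not correspond to any real instruction set.

```python
def run(program, memory):
    """Execute a toy load/store program and return the final memory."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":        # LOAD Rd, addr : memory -> register
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "ADD":       # ADD Rd, Rs1, Rs2 : operates on registers only
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "STORE":     # STORE Rs, addr : register -> memory
            rs, addr = args
            memory[addr] = regs[rs]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

# X = A + B, where A, B and X are memory locations (hypothetical names).
memory = {"A": 5, "B": 7, "X": 0}
program = [
    ("LOAD", "R1", "A"),
    ("LOAD", "R2", "B"),
    ("ADD", "R3", "R1", "R2"),
    ("STORE", "R3", "X"),
]
result = run(program, memory)
print(result["X"])  # 12
```

Only LOAD and STORE touch memory in this sketch; the ADD works purely on registers. A CISC machine could instead provide a single ADD instruction that reads its operands from memory directly.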

A characteristic of RISC processors is their ability to execute one instruction per clock cycle. This is done by overlapping the fetch, decode and execute phases of two or three instructions by using a procedure referred to as pipelining. A load or store instruction may require two clock cycles because access to memory takes more time than register operations.

The major characteristics of RISC architecture are:

Relatively few instructions and addressing modes.
Memory access limited to load and store instructions.
All operations done within the registers of the CPU.
Fixed-length, easily decoded instruction format.
Single-cycle instruction execution.
Hardwired rather than microprogrammed control.
A relatively large number of registers in the processor unit.
Use of overlapped register windows to speed up procedure call and return.
Efficient instruction pipeline.
Compiler support for efficient translation of high-level language programs into machine language programs.

PIPELINING:

Pipelining is a technique of decomposing a sequential process into suboperations, with each subprocess being executed in a special dedicated segment that operates concurrently with all other segments. A pipeline can be visualized as a collection of processing segments through which binary information flows. Each segment performs partial processing dictated by the way the task is partitioned. The result obtained from the computations in each segment is transferred to the next segment in the pipeline. The name "pipeline" implies a flow of information analogous to an industrial assembly line.

The simplest way of viewing the pipeline structure is to imagine that each segment consists of an input register followed by a combinational circuit. The register holds the data and the combinational circuit performs the suboperation in the particular segment. The output of the combinational circuit in a given segment is applied to the input register of the next segment.

Example:

The pipeline organization will be demonstrated by means of a simple example. Suppose that we want to perform the combined multiply and add operations with a stream of numbers.

A_i * B_i + C_i for i = 1, 2, 3, ..., 7

Each suboperation is to be implemented in a segment within a pipeline. Each segment has one or two registers and a combinational circuit, as shown below.

Diagram to be placed

R1 through R5 are registers that receive new data with every clock pulse. The multiplier and adder are combinational circuits. The suboperations performed in each segment of the pipeline are as follows:

R1 ← A_i, R2 ← B_i (input A_i and B_i)

R3 ← R1 * R2, R4 ← C_i (multiply and input C_i)

R5 ← R3 + R4 (add C_i to product)

The five registers are loaded with new data every clock pulse. The effect of each clock is shown below:

Diagram to be placed

The first clock pulse transfers A1 and B1 into R1 and R2. The second clock pulse transfers the product of R1 and R2 into R3 and C1 into R4. The same clock pulse transfers A2 and B2 into R1 and R2. The third clock pulse operates on all three segments simultaneously. It places A3 and B3 into R1 and R2, transfers the product of R1 and R2 into R3, transfers C2 into R4 and places the sum of R3 and R4 into R5. It takes three clock pulses to fill up the pipe and retrieve the first output from R5. From there on, each clock produces a new output and moves the data one step down the pipeline. This happens as long as new input data flow into the system. When no more input data are available, the clock must continue until the last output emerges out of the pipeline.
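The clock-by-clock behaviour described above can be checked with a short Python simulation. This is a minimal sketch: the function name and the use of None to mark a register that does not yet hold valid data are assumptions of the example, and the input values are arbitrary.

```python
def multiply_add_pipeline(A, B, C):
    """Simulate the three-segment pipeline computing A[i]*B[i] + C[i].

    Returns a list of (clock, value) pairs, one per result leaving R5.
    """
    n = len(A)
    R1 = R2 = R3 = R4 = R5 = None   # None means "no valid data yet"
    outputs = []
    # n inputs need n + 2 clock pulses: two extra pulses drain the pipe.
    for clock in range(1, n + 3):
        # All registers load on the same pulse, so compute every new
        # value from the old register contents before overwriting.
        new_R5 = R3 + R4 if R3 is not None else None      # segment 3
        new_R3 = R1 * R2 if R1 is not None else None      # segment 2
        new_R4 = C[clock - 2] if R1 is not None else None
        if clock <= n:                                    # segment 1
            new_R1, new_R2 = A[clock - 1], B[clock - 1]
        else:
            new_R1 = new_R2 = None
        R1, R2, R3, R4, R5 = new_R1, new_R2, new_R3, new_R4, new_R5
        if R5 is not None:
            outputs.append((clock, R5))
    return outputs

# Seven input triples, as in the example (the values themselves are arbitrary).
A = [1, 2, 3, 4, 5, 6, 7]
B = [2, 2, 2, 2, 2, 2, 2]
C = [10, 20, 30, 40, 50, 60, 70]
for clock, value in multiply_add_pipeline(A, B, C):
    print(f"clock {clock}: R5 = {value}")
```

With seven inputs the first result appears in R5 at clock 3 and the last at clock 9, that is, two extra clock pulses are needed after the last input has entered the pipeline.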

Arithmetic Pipeline:

Pipeline arithmetic units are usually found in very high speed computers. They are used to implement floating-point operations, multiplication of fixed-point numbers and similar computations encountered in scientific problems. A pipeline multiplier is essentially an array multiplier as described below, with special adders designed to minimize the carry propagation time through the partial products.

The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers.

X = A * 2^a

Y = B * 2^b

A and B are two fractions that represent the mantissas, and a and b are the exponents. The floating-point addition and subtraction can be performed in the four segments given below.

Diagram to be placed

The registers labeled R are placed between the segments to store intermediate results. The suboperations that are performed in the four segments are:

Compare the exponents.
Align the mantissas.
Add or subtract the mantissas.
Normalize the result.

Example:

Consider the two normalized floating-point numbers:

X=0.9504*10³

Y=0.8200*10²

The two exponents are subtracted in the first segment to obtain 3-2=1. The larger exponent 3 is chosen as the exponent of the result. The next segment shifts the mantissa of Y to the right to obtain

X=0.9504*10³

Y=0.0820*10³

This aligns the two mantissas under the same exponent. The addition of the two mantissas in segment 3 produces the sum

Z=1.0324*10³

The sum is adjusted by normalizing the result so that it has a fraction with a nonzero first digit. This is done by shifting the mantissa once to the right and incrementing the exponent by one to obtain the normalized sum.

Z=0.10324*10⁴
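The four suboperations can be traced in software with the same decimal numbers. The following Python sketch is an illustration only: it represents each number as a hypothetical (mantissa, exponent) pair, uses the standard decimal module so that the decimal fractions stay exact, and keeps every digit when shifting, whereas a hardware mantissa register has a fixed width.

```python
from decimal import Decimal

def fp_add(x, y):
    """Add two base-10 floating-point numbers given as (mantissa, exponent)."""
    (a_m, a_e), (b_m, b_e) = x, y

    # Segment 1: compare the exponents and keep the larger one.
    diff = a_e - b_e
    exp = max(a_e, b_e)

    # Segment 2: align the mantissas by shifting the one that belongs
    # to the smaller exponent to the right.
    if diff > 0:
        b_m = b_m.scaleb(-diff)
    elif diff < 0:
        a_m = a_m.scaleb(diff)

    # Segment 3: add the mantissas (a negative mantissa gives subtraction).
    m = a_m + b_m

    # Segment 4: normalize so that 0.1 <= |mantissa| < 1.
    while abs(m) >= 1:                      # overflow: shift right
        m = m.scaleb(-1)
        exp += 1
    while m != 0 and abs(m) < Decimal("0.1"):   # underflow: shift left
        m = m.scaleb(1)
        exp -= 1
    return m, exp

X = (Decimal("0.9504"), 3)
Y = (Decimal("0.8200"), 2)
mantissa, exponent = fp_add(X, Y)
print(mantissa, exponent)
```

For the two numbers of the example, the function returns a mantissa equal to 0.10324 with exponent 4, in agreement with the normalized sum Z.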

The comparator, shifter, adder-subtractor, incrementer and decrementer in the floating-point pipeline are implemented with combinational circuits.

Instruction Pipeline

Pipeline processing can occur not only in the data stream but in the instruction stream as well. An instruction pipeline reads consecutive instructions from memory while previous instructions are being executed in other segments. This causes the instruction fetch and execute phases to overlap and perform simultaneous operations.

Consider a computer with an instruction fetch unit and an instruction execution unit designed to provide a two-segment pipeline. The instruction fetch segment can be implemented by means of a first-in, first-out (FIFO) buffer. This is a type of unit that forms a queue rather than a stack. Whenever the execution unit is not using memory, the control increments the program counter and uses its address value to read consecutive instructions from memory. The instructions are inserted into the FIFO buffer so that they can be executed on a first-in, first-out basis. Thus an instruction stream can be placed in a queue, waiting for decoding and processing by the execution segment. The instruction stream queuing mechanism provides an efficient way of reducing the average access time to memory for reading instructions. Whenever there is space in the FIFO buffer, the control unit initiates the next instruction fetch phase. The buffer acts as a queue from which control then extracts the instructions for the execution unit.
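The two-segment organization just described can be modelled with a queue. The Python sketch below is a simplification under stated assumptions: each segment takes exactly one clock, memory is always free for the fetch unit, and the function name and the buffer size of 2 are hypothetical choices for the example.

```python
from collections import deque

def run_two_segment(program, buffer_size=2):
    """Trace a fetch/execute pipeline that is coupled by a FIFO buffer.

    Returns a list of (clock, fetched, executed) tuples; None marks an
    idle segment.
    """
    fifo = deque()
    pc = 0          # program counter: index of the next instruction to fetch
    clock = 0
    trace = []
    while pc < len(program) or fifo:
        clock += 1
        # Execute segment: take the oldest instruction already in the queue.
        executed = fifo.popleft() if fifo else None
        # Fetch segment: works in the same clock if the buffer has space.
        fetched = None
        if pc < len(program) and len(fifo) < buffer_size:
            fetched = program[pc]
            fifo.append(fetched)
            pc += 1
        trace.append((clock, fetched, executed))
    return trace

for clock, fetched, executed in run_two_segment(["I1", "I2", "I3"]):
    print(clock, fetched, executed)
```

In the trace, instruction I2 is fetched during the same clock in which I1 is executed, which is the overlap of the fetch and execute phases that the FIFO buffer makes possible.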

Computers with complex instructions require other phases in addition to the fetch and execute to process an instruction completely. In the most general case, the computer needs to process each instruction with the following sequence of steps.

Fetch the instruction from memory.
Decode the instruction.
Calculate the effective address.
Fetch the operands from memory.
Execute the instruction.
Store the result in the proper place.

There are certain difficulties that will prevent the instruction pipeline from operating at its maximum rate. Different segments may take different times to operate on the incoming information. Some segments are skipped for certain operations.

Example:

A register mode instruction does not need an effective address calculation. Two or more segments may require memory access at the same time, causing one segment to wait until another is finished with the memory. Memory access conflicts are sometimes resolved by using two memory buses for accessing instructions and data in separate modules. In this way, an instruction word and a data word can be read simultaneously from two different modules.

The design of an instruction pipeline will be most efficient if the instruction cycle is divided into segments of equal duration. The time that each step takes to fulfill its function depends on the instruction and the way it is executed.
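For the ideal case, in which each of the six steps listed earlier takes one clock and no segment is skipped or made to wait, the overlap can be shown as a space-time diagram. The Python sketch below builds such a diagram. The two-letter segment abbreviations are hypothetical labels chosen for this example, and the model deliberately ignores the difficulties mentioned above.

```python
# Hypothetical abbreviations for the six steps:
# fetch instruction, decode, calculate address, fetch operands,
# execute, store result.
SEGMENTS = ["FI", "DA", "CA", "FO", "EX", "ST"]

def space_time(n_instructions, segments=SEGMENTS):
    """Return one row per instruction of an ideal space-time diagram."""
    k = len(segments)
    # The first instruction needs k clocks; each later one adds one clock.
    total_clocks = n_instructions + k - 1
    rows = []
    for i in range(n_instructions):
        row = ["--"] * total_clocks
        for s, name in enumerate(segments):
            row[i + s] = name   # instruction i is in segment s at clock i+s+1
        rows.append(row)
    return rows

for i, row in enumerate(space_time(4), start=1):
    print(f"I{i}: " + " ".join(row))
```

With k segments and n instructions the ideal pipeline finishes in k + n - 1 clocks, compared with k * n clocks when the instructions are processed one after another. Any segment that takes longer than the others, or that has to wait for memory, lengthens this time.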

Friday, December 25, 2009
