Computer Architecture Essay, Research Paper
1. There have been a lot of developments in microprocessors since the 286 chip. The 286 CPU is no longer sold and is very rarely found in commercial use today because of its low running speed, which is between 6MHz and 25MHz. This processor has a 24-bit address bus and is able to address up to 16 million different address locations. It also has two operating modes, real mode and protected mode. Real mode is basically for normal DOS operations and uses only 8086 code (the 8086 was the previous CPU). In protected mode the CPU is able to access memory beyond the 1MB address limit and to employ its added features, which were intended for multi-tasking environments such as Windows, but this CPU is not powerful enough to carry out these multi-tasking operations effectively. The 286 came with a bus width of 16-bit internal, 24-bit address and 16-bit external, with both its external and internal speeds between 6MHz and 25MHz.
The next CPU was the 386. This is also no longer produced. It had slightly faster running speeds, between 16MHz and 40MHz, and could carry out effective multi-tasking operations. It also had a substantial improvement in memory management and an enlarged instruction set. It is also the minimum CPU for running Windows. It came in two types, the 386SX and the 386DX, both of which had an internal bus width of 32 bits. The SX had a 32-bit internal data path but only a 16-bit path between the CPU and the computer memory, together with a 24-bit address bus; its internal and external speed was between 16MHz and 33MHz. The DX, on the other hand, had a 32-bit data bus between the CPU and the memory chips and a 32-bit address bus, allowing larger data transfers and so faster throughput; its internal and external speed was between 33MHz and 40MHz. The DX was also able to use external cache memory, usually about 64k, which further improved performance.
The 486 was the next CPU, and this is still produced. There was little change to the 386 instruction set, but the 486 ran at speeds between 20MHz and 120MHz. More emphasis was placed on enhancements to improve performance. It was also available in DX and SX varieties. The difference between them was that the DX had a maths co-processor and the SX did not; motherboards that used the 486SX chip had a spare maths co-processor socket to upgrade to a DX. Because the 486 was designed to carry out the most common instructions in a single clock cycle, it was faster than the previous CPUs. It also had 8k of built-in cache memory, and its new burst mode allowed memory transfers from consecutive memory locations to be carried out at one transfer per clock cycle. The 486 came in four different types. They all had a bus width of 32 bits (internal, address and external), with an external speed of between 20MHz and 50MHz. The differences between them were in the internal speeds of the CPU. The SX had an internal speed of between 20MHz and 50MHz, the DX between 25MHz and 50MHz, and the DX2 between 50MHz and 66MHz. The DX4 had an internal speed of between 100MHz and 120MHz, which was actually faster than the bottom of the Pentium range.
The Pentium CPU came next and is the current entry-level standard for computers. This CPU is effectively two processors in one chip, which allows two instructions to be executed in parallel and greatly speeds up throughput. It also has the main mathematical operations hard-wired into the chip, which means that it can be up to ten times faster than the 486DX maths co-processor. All the Pentium models are superscalar. The basic chip has two integer processing pipelines. It also has a branch prediction facility which, about 90% of the time, correctly predicts the flow of the program and fetches the instruction from the buffer area. This type of CPU has a specially designed high-performance Floating Point Unit and a 16k internal cache. The Pentium CPUs have a bus width of 2×32-bit internal, 32-bit address and 64-bit external, with an external speed of between 33MHz and 83MHz, and an internal speed of between 60MHz and 233MHz.
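The address-bus widths quoted for these processors determine how much memory each can reach, since an n-bit address bus can select 2^n locations. The following is a minimal sketch of that arithmetic; the function name is made up for illustration, and the 20-bit entry is the 8086 address width that lies behind the 1MB real-mode limit.

```python
def addressable_locations(address_bits: int) -> int:
    """Number of distinct addresses an address bus of this width can select."""
    return 2 ** address_bits

# Address-bus widths taken from the text; the 20-bit entry is the 8086
# width that gives the 1MB real-mode limit.
widths = {
    "8086 / real mode": 20,
    "286 and 386SX": 24,
    "386DX, 486 and Pentium": 32,
}

for name, bits in widths.items():
    locations = addressable_locations(bits)
    # Assumes one byte per location when converting to megabytes.
    print(f"{name}: {bits}-bit address bus -> {locations:,} locations "
          f"({locations // 2**20:,} MB)")
```

The 24-bit case gives 16,777,216 locations, which is the "16 million" figure for the 286.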
2. The term RAM stands for random access memory, which is a storage device made up of silicon chips. A computer has two types of RAM. Both use arrays of transistor switches to store binary data; in other words, the switches on the chips can be changed by passing an electrical current through them. This type of memory is volatile, which means that any information in it when the computer is switched off is lost. The program being run at that time is unaffected, as the version in RAM is only a copy of the stored program. This means that the user’s created data has to be saved before the computer is switched off.
The term ROM stands for read only memory, which can be read but cannot be written to. This type of memory is not volatile: the switches on the silicon chips are already set, so any information in it is kept when the computer is switched off. The computer BIOS is stored in this type of chip so that the basic computer control programs are available as soon as the computer is switched on. Unlike RAM chips, these ROM chips cannot be changed. The patterns on these chips are set from the commands, information or programs that the computer needs to operate. This means the data is “hard-wired” into the ROM chip. You can store the chip forever and the data will always be there. Besides, the data is very secure. The BIOS is stored on ROM because the user cannot disrupt the information. There are different types of ROM, too:
1. Programmable ROM (PROM). This is basically a blank ROM chip that can be written to, but only once. It is much like a CD-R, where the drive burns the data onto the disc. Some companies use special machinery to write PROMs for special purposes.
2. Erasable Programmable ROM (EPROM). This is just like PROM, except that you can erase the ROM by shining a special ultra-violet light through a window on top of the ROM chip for a certain amount of time. Doing this wipes the data out, allowing it to be rewritten.
3. Electrically Erasable Programmable ROM (EEPROM). This is the type used for flash BIOS. This ROM can be rewritten through the use of a special software program, which is how a flash BIOS allows users to upgrade their BIOS.
ROM is slower than RAM, which is why some systems shadow it, copying its contents into RAM, to increase speed.
Static memory, or SRAM, is a type of RAM and is one of the fastest, because it does not use the capacitive method. Instead, each cell represents a single bit, and the value is held by a more complex set of transistors configured as a bistable, commonly called a flip-flop. This type of memory will maintain its value until it is either altered by a new value or the power is switched off.
Static memory therefore holds information for as long as power flows through the circuit, and it does not need to be constantly refreshed.
Dynamic RAM, or DRAM, is a type of RAM that is not as fast as SRAM but whose cells are smaller in size. It is mainly used for the computer’s main memory, and it is also volatile. It stores the data using a capacitor that holds the transistor in its switched state, but the capacitor loses its charge quickly and has to be recharged on a regular basis, roughly every two milliseconds.
The term memory addressing refers to storing an instruction or data item at a location in memory so that it can be retrieved again later. Each location is unique, and its contents may be part of an application program or system program, or may be data. This is important, as the machine must be able to distinguish one program instruction from the next. It does so by holding machine instructions in different memory locations. The memory address and its contents are totally different things: the address is a particular location in memory, and its contents are part of an application or system program, or data.
The computer’s memory, both RAM and ROM, is regarded as a contiguous list of locations. Each location is identified by its unique memory address. In fact, the memory is organised as a matrix of storage cells. For simplicity, we will take as an example a matrix of 16 rows and 16 columns, providing 256 addressable locations or cells. Any cell in the matrix can be accessed by specifying its row and column co-ordinates. The memory chip circuitry has to translate any memory address into the corresponding co-ordinates.
For example, the CPU requests access to address 227 (i.e. 11100011 in binary). This binary pattern is placed on the address bus. The four least significant bits (0011) are used by the column decoder to determine the column co-ordinate, known as the Column Address Select (CAS) line. The four most significant bits (1110) are used by the row decoder to determine the row co-ordinate, known as the Row Address Select (RAS) line. The row and column address lines then access only the single unique cell which corresponds to the address supplied. Note that the convention is to number address bus lines and data bus lines commencing with line 0. So, a 16-bit address bus would number its lines from A0 to A15 and an 8-bit data bus would number from D0 to D7. As can be seen, any cell in the matrix can be individually accessed, hence the description as a random access device. The cells accessed may contain a program instruction or program data. The circuitry can’t distinguish between instructions and data, and the programmer has to ensure that the correct addresses are being accessed.
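The row and column decoding described for the 16 by 16 matrix can be sketched as follows. This is only an illustration of the bit-splitting, not a model of real memory chip circuitry; the function and constant names are made up for the example.

```python
ROWS = 16     # matrix size used in the example above
COLUMNS = 16

def decode_address(address: int) -> tuple[int, int]:
    """Split an 8-bit address into (row, column) co-ordinates."""
    if not 0 <= address < ROWS * COLUMNS:
        raise ValueError("address outside the 256-cell matrix")
    row = address >> 4         # four most significant bits -> row decoder
    column = address & 0b1111  # four least significant bits -> column decoder
    return row, column

row, column = decode_address(227)
print(f"227 = {227:08b} -> row {row} ({row:04b}), column {column} ({column:04b})")
```

For address 227 this yields row 14 (1110) and column 3 (0011), matching the bit patterns given above.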
The term memory access speed refers to how long, in nanoseconds, the memory takes to retrieve data. Standard memory speeds have not progressed at the same rate as processor speeds. As a result, the CPU can process data faster than the data can be fetched from memory or placed in memory. The Pentium motherboard operates at no more than 66MHz while CPUs can run at up to 266MHz. Consider that a 133MHz CPU cycles about every 7.5ns while the access time for main memory is usually 70ns.
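The gap between CPU speed and memory speed can be worked through numerically. The sketch below converts a clock rate into a cycle time and then counts how many CPU cycles pass during one memory access; the function names are made up for illustration, and it ignores caches and any other overlap.

```python
def cycle_time_ns(clock_mhz: float) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz

def cycles_per_access(clock_mhz: float, access_time_ns: float) -> float:
    """CPU cycles that elapse while one memory access completes."""
    return access_time_ns / cycle_time_ns(clock_mhz)

cpu_mhz = 133        # CPU clock from the text
memory_ns = 70       # main memory access time from the text

print(f"Cycle time: {cycle_time_ns(cpu_mhz):.2f} ns")
print(f"Cycles per memory access: {cycles_per_access(cpu_mhz, memory_ns):.1f}")
```

With these figures the CPU could complete roughly nine cycles in the time one main memory access takes.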
Accessing each cell will incur the same circuit switching time overhead. This is the chip’s access time. There will be address lines and also data lines to transfer data in and out of the cells to the CPU. A data transfer will either be a read operation (the cell’s contents are copied on to the data bus) or a write operation (the contents of the data bus are copied into the cell). To instruct the chip on which operation is required, it is fed read/write information on its control lines.
If the CPU requires data from memory, it issues a read instruction along with the address to be read. To write data to memory, the CPU places the data on the data bus and issues a write instruction along with the address location.
3. The term ALU stands for Arithmetic Logic Unit. This carries out all the arithmetic and logical operations within the CPU, or Central Processing Unit.
The term register covers the areas of transient storage which hold information, keep track of instructions and retain the position and results of these operations. Each register has a specific function to carry out, and they are located in the Execution Unit. There are a number of registers, e.g. the memory address register, memory buffer register, stack pointer, program counter and process status register.
The control circuit is used to control many of the computer’s other components, such as the memory and the peripheral devices, and tells them what it wants them to do. It has an interrupt unit that indicates the order in which particular operations use the CPU and limits the amount of CPU time each operation may take. There is also an instruction decoder that reads the pattern of information in a specific register and decodes the pattern into an operation.
Data travels between the CPU and memory on sets of parallel wires, each called a “bus”. On the control bus, one line carries the control signals that are generated from within the CPU and another line senses the input signals. Every bus operation begins with a new clock tick.
The address bus is used to locate information in memory. It is a one-way line from the processor. Each memory location has an individual address, and the CPU accesses a particular location by putting its address, in binary format, on to the address bus.
The data bus is used to transfer data between the CPU and memory. It is a two-way bus that can read information from memory or write new information into memory once the correct memory location has been found, although new information can only be written to RAM.
1 - Memory address (address bus): knocking on the door to open it.
2 - Control signal (control bus): telling it what it wants to do.
3 - Data (data bus): transferring the data.
4. The diagram on the last page shows how the CPU registers are used. The diagram is called the Fetch / Execute Cycle, and there are two main parts, the fetch cycle and the execute cycle. These cycles can be divided into a more specific description of how the registers are used. The fetch part of the cycle is the same regardless of the instruction, but the execute part of the cycle changes with the instruction.
This cycle can be broken down into a more detailed account of how the various registers are used, as set out below.
Registers are specialized storage areas used to hold information temporarily while it is being decoded or processed. Each register has a defined purpose so that the computer can operate effectively. General-purpose registers are used for performing arithmetic functions. The Current Instruction Register contains both the operator and the operand of the current instruction. The Program Counter is the register that holds the address of the next instruction to be carried out; it is automatically incremented to point to the next instruction. When the current instruction is a branch or jump instruction, however, the target address is copied from the instruction to the Program Counter. The Program Counter is copied to the Memory Address Register, which holds the address of the memory location from which information will be read or to which data will be written. It holds the address of the instruction in the fetch cycle and the address of the information to be used by the instruction in the execute cycle.
Memory Data Registers are used to temporarily store information read from or written to the memory. An instruction goes here before it goes to the Current Instruction Register, where it is decoded. Once the instruction has been decoded, the operand of the instruction is put in the MAR and the data at that address is then copied to the MDR. Any transfers of data from memory go via the MDR. The MDR and the MAR serve the system as buffer registers, which allows for the difference in speed between the CPU and the memory.
The CIR, or Current Instruction Register, is where the instruction is copied to. It holds both the operator and the operand of the current instruction.
If the Fetch / Execute cycle is interrupted by more information, the CPU will “stack” the state of the cycle, deal with the new data and then return to the interrupted cycle. The test for interrupts is only carried out at the end of each instruction cycle.
When the item in the MDR has been added to the Accumulator, the operation is complete and the processor returns to the fetch cycle. The Accumulator is the register that holds the results of arithmetical functions.
The status registers contain bits that are set based on the result of an instruction. They also contain information on interrupts, so that more important information can be given priority over less important information.
After each of these steps the program counter moves on to the next instruction. Between each stage of this cycle the data is carried on buses that take it to the address part or the data part of the cycle. There are different types of bus; here are two examples.
The Address Bus carries addresses so that the required locations can be accessed for reading or writing data. The Data Bus transfers the information to or from the correct memory location.
In the diagram, this means that the fetch part of the cycle delivers each instruction to the appropriate execute part of the cycle.
The fetch part of the cycle is common while the execute part of the cycle varies. The fetch-execute cycle is as follows:
The address of the instruction is copied from the PC and held in the MAR.
The instruction (e.g., add x) is fetched from that address and placed into the MDR, where it is temporarily stored.
The instruction (add x) is then copied to the CIR.
The PC now moves on to the next instruction (e.g., add y).
While in the CIR the instruction is decoded; this determines what the instruction has to do (add).
The operand part of the instruction (x) is then copied to the MAR.
The data item (e.g., 3), whose address is still stored in the MAR, is copied to the MDR.
The item held in the MDR (3) is then added to the accumulator.
The process is then repeated for the next instruction (add y).
The accumulator works as follows:
For example, value x = 3, y = 4, z = 7
Instructions – add x, add y, add z
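Accumulator value = 0 (initially)
Accumulator value = 3 (after add x)
Accumulator value = 7 (after add y)
Accumulator value = 14 (after add z)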
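The fetch-execute steps and the accumulator example can be traced with a short simulation. This is a sketch under stated assumptions: the memory layout (instructions at addresses 0 to 2, data for x, y and z at addresses 10 to 12) and the function name are invented for illustration, and only the add instruction is modelled.

```python
# Hypothetical memory layout: instructions at 0-2, data items at 10-12.
memory = {
    0: ("ADD", 10),  # add x
    1: ("ADD", 11),  # add y
    2: ("ADD", 12),  # add z
    10: 3,           # x
    11: 4,           # y
    12: 7,           # z
}

def run(memory: dict, instruction_count: int) -> list[int]:
    """Run the cycle and return the accumulator value after each step."""
    pc = 0           # Program Counter
    acc = 0          # Accumulator
    history = [acc]
    for _ in range(instruction_count):
        mar = pc                 # address of the instruction: PC -> MAR
        mdr = memory[mar]        # instruction is placed in the MDR
        cir = mdr                # instruction is copied to the CIR
        pc += 1                  # PC moves on to the next instruction
        operator, operand = cir  # instruction is decoded in the CIR
        if operator != "ADD":
            raise ValueError(f"unsupported instruction: {operator}")
        mar = operand            # operand (address of the data) -> MAR
        mdr = memory[mar]        # data item is copied to the MDR
        acc += mdr               # item in the MDR is added to the accumulator
        history.append(acc)
    return history

print(run(memory, 3))  # [0, 3, 7, 14]
```

The printed list matches the accumulator values in the example: 0, then 3, 7 and 14.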
The root of the single cycle processor’s problems:
The cycle time has to be long enough for the slowest instruction (load)
Break the instruction into smaller steps
Execute each step (instead of the entire instruction) in one cycle
Cycle time: time it takes to execute the longest step
Keep all the steps to have similar length
Use a register to save a signal’s value whenever a signal is generated in one clock cycle and used in another cycle later
The advantages of the multiple cycle processor:
Cycle time is much shorter
Different instructions take different number of cycles to complete
Load takes five cycles
Jump only takes three cycles
Allows a functional unit to be used more than once per instruction (though requires more muxes, registers)
Well, the root of these problems of course is the fact that the Single Cycle Processor’s cycle time has to be long enough for the slowest instruction.
The solution is simple. Just break the instruction into smaller steps and instead of executing an entire instruction in one cycle, we will execute each of these steps in one cycle.
Since the cycle time in this case will then be the time it takes to execute the longest step, our goal should be to keep all the steps a similar length when we break up the instruction.
Well the last two bullets pretty much summarise what a multiple cycle processor is all about.
The first advantage of the multiple cycle processor is of course a shorter cycle time than the single cycle processor. The cycle time now only has to be long enough to execute part of the instruction.
But maybe more importantly, different instructions can now take different numbers of cycles to complete. For example:
(1) The load instruction will take five cycles to complete.
(2) But the Jump instruction will only take three cycles.
This feature greatly reduces the idle time inside the processor.
Finally, the multiple cycle implementation allows a functional unit to be used more than once per instruction as long as it is used on different clock cycles. For example, we can use the ALU to increment the Program Counter as well as doing address calculation.
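The trade-off between the two designs can be put into numbers. In the sketch below the cycle counts for load (five) and jump (three) come from the text, but the five step names and their latencies in nanoseconds are purely hypothetical values chosen for illustration, as is the assumption that the single cycle clock must cover all five steps of a load.

```python
# Hypothetical step latencies in nanoseconds (not taken from any real design).
STEP_NS = {
    "fetch": 2.0,
    "decode": 1.0,
    "execute": 2.0,
    "memory": 2.0,
    "write_back": 1.0,
}

# Cycle counts per instruction, as given in the text.
CYCLES = {"load": 5, "jump": 3}

# Single cycle: the clock must be long enough for the slowest instruction,
# assumed here to be a load that passes through every step.
single_cycle_time = sum(STEP_NS.values())

# Multiple cycle: the clock only has to cover the longest single step.
multi_cycle_time = max(STEP_NS.values())

def multi_cycle_latency(instruction: str) -> float:
    """Time one instruction takes on the multiple cycle design."""
    return CYCLES[instruction] * multi_cycle_time

print(f"Single cycle clock: {single_cycle_time} ns for every instruction")
print(f"Multiple cycle clock: {multi_cycle_time} ns per step")
for name in CYCLES:
    print(f"{name}: {multi_cycle_latency(name)} ns on the multiple cycle design")
```

With these assumed latencies the jump finishes sooner than it would in a single long cycle, while the load takes slightly longer because the shorter steps are padded out to the longest step. That padding is the reason for keeping all the steps a similar length.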