6-1-1992

The Design and Implementation of an 8 bit CMOS Microprocessor

Jeffrey Correll

This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact ritscholarworks@rit.edu.
The Design and Implementation of an
8 bit CMOS Microprocessor

by

Jeffrey Correll

A Thesis Submitted
in
Partial Fulfillment of the
Requirements for the Degree of
MASTER OF SCIENCE
in
Computer Engineering

Approved by:  George A. Brown (Thesis Advisor)

Jong D. Chong

Robert E. Pearsen

Roy S. Gemikowski (Department Head)

DEPARTMENT OF COMPUTER ENGINEERING
COLLEGE OF ENGINEERING
ROCHESTER INSTITUTE OF TECHNOLOGY
ROCHESTER, NEW YORK
JUNE 1992
THESIS RELEASE PERMISSION FORM

ROCHESTER INSTITUTE OF TECHNOLOGY
COLLEGE OF ENGINEERING

Title of Thesis: The Design and Implementation of an 8 bit CMOS Microprocessor

I, Jeffrey A. Correll, hereby refuse permission to the Wallace Memorial Library of RIT to reproduce my thesis in whole or in part.

Date: 6 - 30 - 92
# TABLE OF CONTENTS

Abstract

Introduction

1.0 Theory

1.1 Organization and Behavioral Modeling
1.1.1 Control Unit
1.1.2 Instruction Register
1.1.3 General Purpose Registers
1.1.4 Stack Pointer
1.1.5 Flags Register
1.1.6 ALU
1.1.7 Multiplier
1.1.8 Memory Data Register
1.1.9 Shifter
1.1.10 Bus Unit

2.0 Logic Circuit Construction

2.1 Control Unit
2.2 Instruction Register
2.3 General Purpose Registers
2.4 Stack Pointer
2.5 Flags Register
2.6 ALU
2.7 Multiplier
2.8 Memory Data Register
2.9 Shifter
2.10 Bus Unit

3.0 Microprocessor Layout

4.0 Conclusions

5.0 References

Appendix A: Control Unit Control Flow Chart

Appendix B: Bus Unit Control Flow Chart

Appendix C: Processor BLM Code

Appendix D: Quicksim Logical Simulation Listings (Instruction Register, ALU)

Appendix E: New Library Components

Appendix F: Program Simulation Results

# LIST OF FIGURES

1. Block Diagram of Processor
2. Control Unit Block Symbol
3. Instruction Register Block Symbol
4. General Purpose Register Block Symbol
5. Stack Pointer Register Block Symbol
6. Flags Register Block Symbol
7. ALU Block Symbol
8. Multiplier Block Symbol
9. Memory Data Register Block Symbol
10. Shifter Block Symbol
11. Bus Unit Block Symbol
12. Logic Circuit of Control Unit
13. Logic Circuit of Control Unit Positive Clock PLA
14. Logic Circuit of Control Unit Negative Clock PLA
15. Logic Circuit of Register Control Block
16. Logic Circuit of Control Unit Register Decode Circuitry
17. Logic Circuit for ALU Control
18. Logic Circuit for Jump Control
19. Logic Circuit of Instruction Register
20. Logic Circuit of General Purpose Registers
21. Logic Simulation Results of General Register
22. Logic Circuit of Stack Pointer Register
23. Results of Stack Pointer Simulation
24. Logic Circuit of Flags Register
25. Simulation Results of Flags Register
26. Logic Circuit of ALU
27. Logic Circuit of Highest Multiplier Hierarchy Level
28. Logic Circuit of Multiplier Core
29. Simulation Results of Multiplier
30. Logic Circuit of Memory Data Register
31. Simulation Results of Memory Data Register
32. Logic Circuit of Shifter
33. Simulation Results of Shifter
34. Read Cycle Timing Diagram
35. Write Cycle Timing Diagram
36. Logic Circuit for Bus Unit
37. Logic Circuit for Bus Unit PLA
38. Logic Circuit for Bus Unit Timer
39. Logic for Parallel Load 8-bit Counter
40. Logic for 8-bit Tri-State Latch
41. Logic for Reset State Latch
42. Logic for Prefetch Queue
43. Logic for Prefetch Queue Counter
44. Logic for 8-bit Tri-State Register
45. Microprocessor Layout Floorplan

A-1. Control Flow Chart of Control Unit - Pt 1
A-2. Control Flow Chart of Control Unit - Pt 2
A-3. Control Flow Chart of Control Unit - Pt 3
A-4. Control Flow Chart of Control Unit - Pt 4
A-5. Control Flow Chart of Control Unit - Pt 5
A-6. Control Flow Chart of Control Unit - Pt 6
A-7. Control Flow Chart of Control Unit - Pt 7
B-1. Control Flow Chart of Bus Unit - Pt 1
B-2. Control Flow Chart of Bus Unit - Pt 2
B-3. Control Flow Chart of Bus Unit - Pt 3
B-4. Control Flow Chart of Bus Unit - Pt 4
E-1. Logic Symbol of Trilat Circuit
E-2. Transistor Level Schematic of Tri-State Latch for Logic Simulation
E-3. Logic Simulation Results of Trilat
E-4. Transistor Level Schematic of Tri-State Latch for Physical Simulation
E-5. Simulation Results of Trilat

# LIST OF TABLES

1. ALU Register-Register Instruction Format
2. ALU Immediate Addressing Instruction Format
3. ALU Direct Addressing Instruction Format
4. ALU operations
5. Shifter operations
6. Miscellaneous operations
7. General Instruction Format
8. Shifter Instruction Format
9. ALU Truth Table

Abstract

The design and implementation cycle of an 8 bit CMOS microprocessor is discussed. The primary steps in the design procedure of the microprocessor consist of instruction selection, instruction encoding and organizational specification. A simple architecture is chosen so that the emphasis of this investigation can be placed upon the entire design procedure.

Software behavioral models of functional blocks within the processor are used to validate the architecture. The functional blocks are then replaced with logic circuit models and tested.

After logical simulations of all blocks have been completed, physical simulations of the logic circuits are performed using a SPICE-like simulator to extract the delay characteristics of the longest circuit paths. Using this delay information, a preliminary estimate of processor speed is possible.

Layout of the processor is generated using the Department of Computer Engineering's 2 μm CMOS Standard Cell Library.
Introduction

Because of the sheer number of devices and the complexity of today's monolithic microprocessors, techniques have evolved to effectively manage their design and implementation.

In the early days of microprocessor development, it was common to build a hardware prototype of the processor before integrating it onto silicon. The prototype was built to enable the designers to validate proper system operation. With a processor of 5,000 gates, for example, this was no small undertaking. One small change in logic could very well affect the operation of the rest of the system. Basic operational problems also exist with prototype implementation. How could one be sure that a problem with the circuit was a design flaw and not a broken trace, a loose wire wrap or a faulty part? Also, would the integrated circuit design operate properly, or with the same characteristics, as the discrete hardware prototype?

After the design was validated, a physical layout drawing was created in order to fabricate the circuit on silicon. This was accomplished by using rulers and pencils to draw the various masks on graph paper at a draftsman's table.

By today's standards, however, a prototype is not only unwise but virtually impossible to realize. With the number of devices on a single chip exceeding the 2 million mark, a discrete model would occupy a significant amount of area as well as dissipate enough energy to heat a moderate size room. Also, the basic implementation problems would still exist, except that they would be exacerbated by several orders of magnitude. The solution lies in using computers to model computers; more specifically, in creating software that will behave like the proposed architecture.

Once the architecture of a system has been established, it is possible to write software to model its behavior. Intuitively enough, this type of software is referred to as a behavioral model. Much of the time the architecture is broken down to reveal some of the proposed organization as well. This is done in order to partition the processor into functionally independent sections. The interface between individual parts of the microprocessor can then be established and tested. This encourages a modular design wherein techniques and even circuitry within a particular module may be changed without affecting the operation of other sections, as long as the interface remains constant.

After validation of the architecture has been completed, it is possible to start creating logic circuit models to replace the behavioral models. This technique has several advantages, not the least of which is that by replacing behavioral modules individually, the logic circuit designs may be tested within a 'perfect' environment. That is, if the processor simulation operated according to specifications with the behavioral software models, then the replaced module can be the focus of attention, and (hopefully) any problems encountered will be the result of the logic model. This procedure is repeated until all modules in the design are composed of logic models.

After simulations of the logic models ensure proper system operation, simulations of the physical circuit models must be made to verify that the device sizing is adequate to drive the attached loads in the specified periods of time.

1.0 Theory

The design and implementation procedure outlined in the Introduction section is essentially the one followed in the completion of this microprocessor.

The first step was to create the architecture for the microprocessor. Because this was an experiment in the design process as a whole, and not one devoted strictly to computer architecture, a simple 8 bit architecture was chosen. The starting point was a list of all the instructions that were desirable to have in a microprocessor from a software engineer's point of view. This list was successively reduced over several iterations, leaving Table 4 as the ALU related operations, Table 5 as the shifter operations and Table 6 as the miscellaneous instructions.

The criteria for excluding instructions were their usefulness in executing a reasonable program and intuition as to their difficulty of implementation. The designer's conception of the system organization, and the fact that the entire project was being done by him in approximately one school year, also played an important part in limiting the size of the architecture.

As one can see in Tables 4-6, the list of instructions is simple and contains nearly all the basic operations one would expect to find in any general purpose processor on the market today. The columns define the instruction mnemonic, the actual op-code for the various addressing modes, the PSW (Processor Status Word) flags, and a simple description of the operation.

The PSW is a nibble containing flags NZVC (Negative, Zero, Overflow and Carry). These flags report the status of various operations performed by the processor and allow for all standard conditional jumps.
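
As an illustration of how the four flags could be derived, the following is a minimal sketch for an 8-bit addition. The bit positions within the nibble (N in bit 3 down to C in bit 0), the name `add8_flags`, and the conventional two's complement definitions of overflow and carry are assumptions made for this example; the text above names the flags but does not define them at this level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical PSW layout: bit 3 = N, bit 2 = Z, bit 1 = V, bit 0 = C. */
enum { PSW_N = 0x8, PSW_Z = 0x4, PSW_V = 0x2, PSW_C = 0x1 };

/* Add two bytes and report the flags a conventional 8-bit ALU would raise. */
uint8_t add8_flags(uint8_t a, uint8_t b, uint8_t *psw)
{
    unsigned wide = (unsigned)a + (unsigned)b;   /* 9-bit intermediate result */
    uint8_t sum = (uint8_t)wide;
    uint8_t flags = 0;

    assert(psw != 0);
    if (sum & 0x80)  flags |= PSW_N;             /* sign bit of the result */
    if (sum == 0)    flags |= PSW_Z;
    /* Overflow: both operands share a sign that the result does not. */
    if (~(a ^ b) & (a ^ sum) & 0x80) flags |= PSW_V;
    if (wide > 0xFF) flags |= PSW_C;             /* carry out of bit 7 */
    *psw = flags;
    return sum;
}
```

A conditional jump can then test a single bit of the returned nibble.
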

Depending upon the instruction, up to three different addressing modes could apply. The first mode is register-register addressing. This is used when an instruction deals only with data that is contained within registers. This mode applies to all of the ALU related operations, the shifter operations and the multiply instruction.

Because the ALU operations may be used with any of the three addressing modes, the General Instruction Format shown in Table 7 has three specific forms that correspond to these addressing modes. For the register-register addressing mode, the instruction format is as follows:

<table>
<thead>
<tr>
<th>01xxx100</th>
<th>-DDD-SSS</th>
</tr>
</thead>
<tbody>
<tr>
<td>xxx: encoded ALU function</td>
<td>DDD: destination register</td>
</tr>
<tr>
<td></td>
<td>SSS: source register</td>
</tr>
</tbody>
</table>

TABLE 1 ALU Register-Register Instruction Format

The next mode is immediate addressing, which causes a constant 8-bit value to be used in an operation. This mode may be specified with the ALU instructions, load or store. The format for ALU instructions in this mode is given by:

<table>
<thead>
<tr>
<th>01xxx000</th>
<th>-DDD----</th>
<th>ZZZZZZZZ</th>
</tr>
</thead>
<tbody>
<tr>
<td>xxx: encoded ALU function</td>
<td>DDD: destination register</td>
<td>ZZZZZZZZ: immediate data</td>
</tr>
</tbody>
</table>

TABLE 2 ALU Immediate Addressing Instruction Format

The last mode available is the direct addressing mode. Direct addressing requires that, along with the operation code, the instruction contain the 8-bit address where the desired data resides.

<table>
<thead>
<tr>
<th>01xxx010</th>
<th>-DDD----</th>
<th>ZZZZZZZZ</th>
</tr>
</thead>
<tbody>
<tr>
<td>xxx: encoded ALU function</td>
<td>DDD: destination register</td>
<td>ZZZZZZZZ: address of data</td>
</tr>
</tbody>
</table>

TABLE 3 ALU Direct Addressing Instruction Format

The load and store instructions have only the immediate and direct addressing modes associated with them. A register-register load or store has no meaning in this processor.
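
The three ALU formats of Tables 1-3 can be summarized in a short encoding sketch. The function name `encode_alu`, the numbering of registers as 0-7, and the choice to write the unused ('-') bits as 0 are assumptions made for this example; the function codes follow from the opcodes listed in Table 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ALU function codes (the xxx field), read off the opcodes in Table 4. */
enum alu_fn { ALU_ADD, ALU_SUB, ALU_NOT, ALU_AND, ALU_OR, ALU_XOR, ALU_NEG, ALU_CMP };

/* Low three bits of the first byte for each addressing mode (Tables 1-3). */
enum addr_mode { MODE_IMMEDIATE = 0x0, MODE_DIRECT = 0x2, MODE_REG_REG = 0x4 };

/* Hypothetical helper: writes the instruction into out[] and returns its length.
 * 'operand' is the source register for MODE_REG_REG, otherwise the
 * immediate value or the 8-bit address. */
size_t encode_alu(enum alu_fn fn, enum addr_mode mode,
                  uint8_t dest, uint8_t operand, uint8_t out[3])
{
    assert(fn <= ALU_CMP && dest <= 7);
    out[0] = (uint8_t)(0x40 | (fn << 3) | mode);     /* 01xxxRM- */
    if (mode == MODE_REG_REG) {
        assert(operand <= 7);
        out[1] = (uint8_t)((dest << 4) | operand);   /* -DDD-SSS */
        return 2;
    }
    out[1] = (uint8_t)(dest << 4);                   /* -DDD---- */
    out[2] = operand;                                /* ZZZZZZZZ */
    return 3;
}
```

Encoding an ADD in register-register mode yields a first byte of $44, in agreement with Table 4.
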
<table>
<thead>
<tr>
<th>Mnemonic</th>
<th>Opcode</th>
<th>Flags NZVC</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ADD</td>
<td>$40 immed.</td>
<td>XXXX</td>
<td>2's complement Addition</td>
</tr>
<tr>
<td></td>
<td>$42 direct</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>$44 reg-reg</td>
<td></td>
<td></td>
</tr>
<tr>
<td>SUB</td>
<td>$48 immed.</td>
<td>XXXX</td>
<td>2's complement Subtraction</td>
</tr>
<tr>
<td></td>
<td>$4A direct</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>$4C reg-reg</td>
<td></td>
<td></td>
</tr>
<tr>
<td>NOT</td>
<td>$50 immed.</td>
<td>XX00</td>
<td>Logical Inversion</td>
</tr>
<tr>
<td></td>
<td>$52 direct</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>$54 reg-reg</td>
<td></td>
<td></td>
</tr>
<tr>
<td>AND</td>
<td>$58 immed.</td>
<td>XX00</td>
<td>Logical AND</td>
</tr>
<tr>
<td></td>
<td>$5A direct</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>$5C reg-reg</td>
<td></td>
<td></td>
</tr>
<tr>
<td>OR</td>
<td>$60 immed.</td>
<td>XX00</td>
<td>Logical OR</td>
</tr>
<tr>
<td></td>
<td>$62 direct</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>$64 reg-reg</td>
<td></td>
<td></td>
</tr>
<tr>
<td>XOR</td>
<td>$68 immed.</td>
<td>XX00</td>
<td>Logical Exclusive-OR</td>
</tr>
<tr>
<td></td>
<td>$6A direct</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>$6C reg-reg</td>
<td></td>
<td></td>
</tr>
<tr>
<td>NEG</td>
<td>$70 immed.</td>
<td>XXXX</td>
<td>2's complement Inversion</td>
</tr>
<tr>
<td></td>
<td>$72 direct</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>$74 reg-reg</td>
<td></td>
<td></td>
</tr>
<tr>
<td>CMP</td>
<td>$78 immed.</td>
<td>XXXX</td>
<td>Byte Comparison</td>
</tr>
<tr>
<td></td>
<td>$7A direct</td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>$7C reg-reg</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

TABLE 4 ALU operations
### TABLE 5 Shifter operations

<table>
<thead>
<tr>
<th>Mnemonic</th>
<th>Opcode</th>
<th>Flags</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LSR</td>
<td>$C0</td>
<td>0X0X</td>
<td>Logical Shift Right</td>
</tr>
<tr>
<td>LSL</td>
<td>$C8</td>
<td>XX0X</td>
<td>Logical Shift Left</td>
</tr>
<tr>
<td>ASR</td>
<td>$D0</td>
<td>XX0X</td>
<td>Arithmetic Shift Right</td>
</tr>
<tr>
<td>ASL</td>
<td>$D8</td>
<td>XX0X</td>
<td>Arithmetic Shift Left</td>
</tr>
<tr>
<td>ROR</td>
<td>$E0</td>
<td>XX0X</td>
<td>Rotate Right</td>
</tr>
<tr>
<td>ROL</td>
<td>$E8</td>
<td>XX0X</td>
<td>Rotate Left</td>
</tr>
<tr>
<td>Push</td>
<td>$88</td>
<td>----</td>
<td>Push Register</td>
</tr>
<tr>
<td>Pop</td>
<td>$80</td>
<td>----</td>
<td>Pop Register</td>
</tr>
</tbody>
</table>

### TABLE 6 Miscellaneous operations

<table>
<thead>
<tr>
<th>Mnemonic</th>
<th>Opcode</th>
<th>Flags</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LD</td>
<td>$08 immed.</td>
<td>----</td>
<td>Load Register</td>
</tr>
<tr>
<td></td>
<td>$0A direct</td>
<td></td>
<td></td>
</tr>
<tr>
<td>ST</td>
<td>$10 direct</td>
<td>----</td>
<td>Store Register</td>
</tr>
<tr>
<td></td>
<td>$12 indirect</td>
<td></td>
<td></td>
</tr>
<tr>
<td>MULT</td>
<td>$98 reg-reg</td>
<td>----</td>
<td>Multiply (c:d &lt;- a*b)</td>
</tr>
<tr>
<td>SCB</td>
<td>$B0</td>
<td>---1</td>
<td>Set Carry Bit</td>
</tr>
<tr>
<td>CCB</td>
<td>$A8</td>
<td>---0</td>
<td>Clear Carry Bit</td>
</tr>
<tr>
<td>JMP</td>
<td>$B8</td>
<td>----</td>
<td>Jump Always</td>
</tr>
<tr>
<td></td>
<td>$B9</td>
<td></td>
<td>Jump if equal/zero</td>
</tr>
<tr>
<td></td>
<td>$BA</td>
<td></td>
<td>Jump if overflow/underflow</td>
</tr>
<tr>
<td></td>
<td>$BB</td>
<td></td>
<td>Jump if carry/borrow</td>
</tr>
<tr>
<td></td>
<td>$BC</td>
<td></td>
<td>Jump if negative</td>
</tr>
<tr>
<td>Halt</td>
<td>$A0</td>
<td>----</td>
<td>Halt</td>
</tr>
<tr>
<td>NOP</td>
<td>$00</td>
<td>----</td>
<td>No Operation (Idle Cycle)</td>
</tr>
</tbody>
</table>

The Design and Implementation of an 8 bit CMOS microprocessor  J. Correll  6/29/92  -7-

Table 7 shows the general instruction format devised for the processor. Each field of the instruction is listed and named:

<table>
<thead>
<tr>
<th>TTxxxRM-</th>
<th>-DDD-SSS</th>
<th>ZZZZZZZZ</th>
</tr>
</thead>
<tbody>
<tr>
<td>TT: operation type</td>
<td>DDD: destination register</td>
<td>ZZZZZZZZ: immediate data or direct address</td>
</tr>
<tr>
<td>xxx: encoded instruction</td>
<td>SSS: source register</td>
<td></td>
</tr>
<tr>
<td>R: register-register bit</td>
<td></td>
<td></td>
</tr>
<tr>
<td>M: data modifier bit</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**TABLE 7 General Instruction Format**

Some explanation of the possible values of each field and their uses is in order:

**TT**: the two bit operation type field distinguishes which of the general types of operations the op-code belongs to. The basic types are 00, which corresponds to a NOP, LOAD or STORE; 01 and 11, which indicate that the encoded data bits are sent directly as control signals to the ALU and shifter respectively; and 10, which represents miscellaneous instructions whose bits do not correspond to any special control line encoding.

**xxx**: these three bits determine which specific instruction is to be executed within a specific operation type. For instance if the TT bits are 01 and these bits are 000 this would indicate an ALU add operation since setting the control lines of the ALU to 000 would cause an add to be performed.

**R**: this bit designates whether the op-code is a register-register operation (R=1) or a register-memory function (R=0).

**M**: the data modifier bit works in conjunction with the R bit to determine more specifically which type of addressing is used. When register-register addressing is used this bit has no significance but should be set to 0. However, if register-memory addressing is used, M=0 indicates immediate addressing and M=1 indicates direct addressing.

**DDD**: this field of the second byte determines where the result of an operation is to be placed. It also specifies the first data source in register-register and register-memory operations.

**SSS**: this field specifies the second data source in register-register operations. It only becomes active when R=1.

**ZZZZZZZZ**: holds the data for an immediate data operation or the address for a direct addressing mode function.
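
The field definitions above amount to a handful of shifts and masks. The sketch below is one way to express them; the names `struct fields`, `decode_fields` and `addressing_mode` are hypothetical and do not come from the processor's BLM code.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the general format TTxxxRM- / -DDD-SSS. */
struct fields {
    uint8_t tt;   /* operation type */
    uint8_t xxx;  /* instruction within the type */
    uint8_t r;    /* 1 = register-register */
    uint8_t m;    /* with R=0: 0 = immediate, 1 = direct */
    uint8_t ddd;  /* destination and first source register */
    uint8_t sss;  /* second source register, meaningful when R=1 */
};

/* Split the first two instruction bytes into their fields. */
struct fields decode_fields(uint8_t byte1, uint8_t byte2)
{
    struct fields f;
    f.tt  = (byte1 >> 6) & 0x3;
    f.xxx = (byte1 >> 3) & 0x7;
    f.r   = (byte1 >> 2) & 0x1;
    f.m   = (byte1 >> 1) & 0x1;
    f.ddd = (byte2 >> 4) & 0x7;
    f.sss = byte2 & 0x7;
    return f;
}

/* Name the addressing mode implied by the R and M bits. */
const char *addressing_mode(const struct fields *f)
{
    assert(f != 0);
    if (f->r)
        return "register-register";
    return f->m ? "direct" : "immediate";
}
```

For example, the byte $6A (XOR, direct) decodes to TT=01, xxx=101, R=0 and M=1.
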

One additional note on the subject of instruction encoding is in order. Since all of the shifter related functions are by definition single register functions, their encoding scheme is slightly different. Table 8 shows the modified format for shifter operations. The only differences are that the instruction occupies only one byte and that the source/destination register is encoded in the three least significant bits of the instruction.
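
A one-byte shifter instruction can be packed as sketched below. The name `encode_shift` and the register numbering 0-7 are assumptions for this example; the function codes are read off the opcodes in Table 5.

```c
#include <assert.h>
#include <stdint.h>

/* Shifter function codes (the xxx field), read off the opcodes in Table 5. */
enum shift_fn { SH_LSR, SH_LSL, SH_ASR, SH_ASL, SH_ROR, SH_ROL };

/* Pack a one-byte shifter instruction of the form 11xxxSSS. */
uint8_t encode_shift(enum shift_fn fn, uint8_t reg)
{
    assert(fn <= SH_ROL && reg <= 7);
    return (uint8_t)(0xC0 | (fn << 3) | reg);
}
```
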

Although there are truly only three addressing modes with which to deal with data in operations, there are some instructions that do not use any of the aforementioned addressing modes. These instructions could be classified as using inherent addressing (TT=10) due to the nature of their operations.

One such instruction is multiply. This instruction has the source and destination registers hardcoded into it. The way in which multiply works can be explained by the following RTL statement:

\[ \text{C} : \text{D} = \text{A} \ast \text{B} \]

This equation states that the A register is the multiplier, the B register is the multiplicand, and the C and D registers hold the most significant and least significant bytes of the result, respectively.
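
The register transfer above can be mirrored in a few lines of C. This is only a sketch: the `struct regs` type is hypothetical, and treating both operands as unsigned bytes is an assumption, since the text does not say whether the multiplier is signed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file holding only the four registers MULT touches. */
struct regs { uint8_t a, b, c, d; };

/* C:D = A * B, treating both operands as unsigned bytes (an assumption). */
void mult(struct regs *r)
{
    assert(r != 0);
    uint16_t product = (uint16_t)(r->a * r->b);  /* at most 255 * 255 = 65025 */
    r->c = (uint8_t)(product >> 8);              /* most significant byte */
    r->d = (uint8_t)(product & 0xFF);            /* least significant byte */
}
```
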

| 11xxxSSS |
|---|
| 11: specifies shifter function |
| xxx: encoded shifter function |
| SSS: source/destination register |

TABLE 8 Shifter Instruction Format

The instructions for Set Carry Bit (SCB), Clear Carry Bit (CCB), Jump, and Halt also reside in the inherent category. All of these instructions have only one form, except for the jump instruction, which requires some explanation. As one can observe from Table 6, the jump instruction has several forms, each of which corresponds to a jump on a certain condition. In order for the instruction to jump to a new address, the address must be provided in the second byte of the jump instruction in the form of an eight bit number, which corresponds to the absolute address of the destination.
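
The jump forms of Table 6 can be sketched as a function that selects the next program counter value. The PSW bit positions, the name `next_pc`, and the assumption that an untaken jump continues at the byte following the two-byte instruction are all hypothetical; only the opcodes and their conditions come from Table 6.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PSW bit positions, following the NZVC order. */
enum { PSW_N = 0x8, PSW_Z = 0x4, PSW_V = 0x2, PSW_C = 0x1 };

/* Return the next program counter after a two-byte jump located at pc.
 * opcode is $B8..$BC from Table 6; target is the second instruction byte. */
uint8_t next_pc(uint8_t pc, uint8_t opcode, uint8_t target, uint8_t psw)
{
    bool taken;

    assert(opcode >= 0xB8 && opcode <= 0xBC);
    switch (opcode) {
    case 0xB8: taken = true;                break;  /* jump always */
    case 0xB9: taken = (psw & PSW_Z) != 0;  break;  /* equal/zero */
    case 0xBA: taken = (psw & PSW_V) != 0;  break;  /* overflow/underflow */
    case 0xBB: taken = (psw & PSW_C) != 0;  break;  /* carry/borrow */
    default:   taken = (psw & PSW_N) != 0;  break;  /* 0xBC: negative */
    }
    /* An untaken jump continues with the instruction after its two bytes. */
    return taken ? target : (uint8_t)(pc + 2);
}
```
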

The processor's address space was restricted to eight bits. This was done in order to keep the address busses and data busses the same size, thus reducing the complexity of the processor. If the expansion of the address space to 16 bits or more is desired, the same design techniques could be followed, and the same architecture used, with some changes to accommodate the larger sizes.

If the word size of the processor and the address bus size are increased together so that they remain the same, then the only changes needed are that all registers that hold data or addresses, and all of the data and address busses, must be made large enough to accommodate the new word size.

If the address space is to be increased by a different amount than the word size, then major design changes will have to be made. This will involve determining how an address will be fetched from memory on the data bus. One technique would be to read in portions of the address over several cycles. For instance if the word size is to remain eight bits but the address space is increased to a 32 bit size, then an address could be acquired from memory in four reads (one read per byte).
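
The multi-cycle address fetch described above might look as follows in a software model. This is a sketch only: the callback type `read_byte_fn`, the demo memory, and the most-significant-byte-first order are assumptions, since the text does not fix a byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical byte-wide memory read, supplied by the caller. */
typedef uint8_t (*read_byte_fn)(uint32_t addr);

/* Fetch a 32-bit address stored at 'at' using four 8-bit reads.
 * Most significant byte first is an assumption; the text fixes no order. */
uint32_t fetch_address32(read_byte_fn read_byte, uint32_t at)
{
    uint32_t addr = 0;

    assert(read_byte != 0);
    for (int i = 0; i < 4; i++)
        addr = (addr << 8) | read_byte(at + (uint32_t)i);  /* one read per byte */
    return addr;
}

/* Tiny demo memory so the sketch can be exercised. */
static const uint8_t demo_mem[8] = { 0x00, 0x12, 0x34, 0x56, 0x78, 0x9A, 0x00, 0x00 };

uint8_t demo_read(uint32_t addr)
{
    return demo_mem[addr % 8];
}
```
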

There also remains the problem of reading this address into an internal register such as the Stack Pointer. One solution would be to have an internal data bus of 32 bits, of which only one byte would be used during a data operation.

Since the address and data busses of this microprocessor are the same size, the problems just mentioned will not be addressed in detail.

1.1 Organization and Behavioral Modeling

The organization of the processor was developed in the latter half of the architecture specification. It includes partitioning the processor into several functional blocks. These blocks, along with descriptions of their functions and diagrams, are presented here.

The first step in the implementation of this processor was to model each of the processor's modules in software. Two languages were examined for this purpose. The first was VHDL (VHSIC Hardware Description Language). After consideration, this option was rejected because the release of the VHDL system available from RIT's Department of Computer Engineering required a slightly different version of the simulator than the one used by the basic EDA tools. This would prevent the simulation of VHDL models with circuits using the RIT Department of Computer Engineering's Standard Cell Library. Also, because this VHDL was an alpha release from Mentor Graphics, it was not a full implementation of the language. It was felt that overcoming the problems associated with this version of the software would require a significant amount of effort that could instead be spent on the project at hand.

The language that was chosen was Mentor Graphics' BLM (Behavior Language Model), or more specifically C with function calls and other 'hooks' into Mentor Graphics' Quicksim Logic Simulator. This proved to be a good choice because of the author's familiarity with C and the ease with which one could construct these models.

After the organization had been decided upon, the interfaces to the various modules were available. The first step in constructing a BLM was to create a graphically meaningful symbol for the part in SYMED (the Mentor Graphics Symbol Editor). For instance, the BLM for the ALU was constructed to take on the standard 'V' like shape (Figure 7) normally associated with an ALU. These symbols could then be interconnected with NETED, the Mentor Graphics Schematic Editor, to interface the modules with one another.

In the descriptions of the processor that are presented here, all control signals are to be considered active high signals unless otherwise specified.

An important point that must be stated here is the author's use of reading and writing as they apply to functional blocks and circuits within the processor. Because the functional blocks were created and modeled on an individual basis, the use of the terms read and write differs from the usual one. For instance, a register is said to write when it presents its stored data on an output bus. Similarly, a register reads when it captures data at one of its inputs. This convention is maintained throughout the entire processor description.
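
The convention can be made concrete with a small register model. The type `struct reg8` and the functions `reg_read` and `reg_write` are hypothetical names chosen for this sketch and are not taken from the BLM code in Appendix C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct reg8 { uint8_t stored; };  /* hypothetical 8-bit register model */

/* "Read": the register captures the value present at its input. */
void reg_read(struct reg8 *r, uint8_t input)
{
    assert(r != 0);
    r->stored = input;
}

/* "Write": the register presents its stored value on the output bus.
 * When not enabled the bus is left untouched, as a tri-state output would. */
void reg_write(const struct reg8 *r, bool enable, uint8_t *bus)
{
    assert(r != 0 && bus != 0);
    if (enable)
        *bus = r->stored;
}
```
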
Figure 1 Block Diagram of Processor
1.1.1 Control Unit (CU)

This block operates as a Mealy state machine; therefore, the BLM written to emulate it keeps a memory of its current state, reads its input values, and follows the execution steps given in the Control Unit Control Flow Chart found in Appendix A.
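
The general shape of such a model is sketched below. The three states and the two inputs are invented for illustration and are far simpler than the flow chart in Appendix A; only the output names `read` and `illegal_instr` correspond to pins described in this section. What the sketch does show is the Mealy property: the outputs depend on both the remembered state and the current inputs.

```c
#include <assert.h>
#include <stdbool.h>

enum cu_state { CU_FETCH, CU_DECODE, CU_EXECUTE };    /* hypothetical states */

struct cu_inputs  { bool instr_ready; bool legal; };  /* hypothetical inputs */
struct cu_outputs { bool read; bool illegal_instr; };

/* One evaluation of the machine: outputs depend on state AND inputs (Mealy). */
struct cu_outputs cu_step(enum cu_state *state, struct cu_inputs in)
{
    struct cu_outputs out = { false, false };

    assert(state != 0);
    switch (*state) {
    case CU_FETCH:
        out.read = in.instr_ready;          /* latch the byte only when present */
        if (in.instr_ready)
            *state = CU_DECODE;
        break;
    case CU_DECODE:
        out.illegal_instr = !in.legal;
        *state = in.legal ? CU_EXECUTE : CU_FETCH;
        break;
    case CU_EXECUTE:
        *state = CU_FETCH;
        break;
    }
    return out;
}
```
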

If one reads the next few sections describing the input and output pins of the remaining blocks, a good understanding of most of the pins on the CU can be gained. However, there are still a few pins that require explanations:

illegal_instr: This output is raised when an illegal instruction occurs.

read: This output is used to indicate when the instruction register must read its input bus.

i0(7:0): By directly connecting the CU to the i0 bus, the PSW from the flags register is used during a conditional jump to determine if the condition is true or false.

psw_mux: This is a control signal connected to a multiplexer which chooses the PSW output from the ALU or the shifter and routes it to the Flags Register Input.

mux_sel: This is also a control signal to a multiplexer except that this chooses between the output of the shifter or the ALU.

mux_oe: This control signal enables or disables the multiplexer's tri-state output from the ALU or shifter to the i0 bus.
Figure 2 Control Unit Block Symbol
1.1.2 Instruction Register (IR)

This register is a read-only register that accepts an instruction byte and decodes the operation of that byte. It provides the control unit with information such as instruction type (ALU operation, load, store, shift, etc) as well as addressing mode (immediate, register-memory, register-register).

![Instruction Register Block Symbol]

Figure 3 Instruction Register Block Symbol

The decoding of opcodes read from memory is performed by combinational circuitry attached to the read-only register. The BLM decoding was performed in a straightforward manner by reading in data from the ibus input when the clk and read inputs were both high. After the data was acquired, the byte was cast into a C union structure so that individual bit fields could be addressed. This would allow for simple conditional statements to be used in determining how the output lines should be set for the CU.
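The union technique mentioned above can be illustrated with a short C fragment. This is only a sketch: the type `ir_word`, the function `ir_instr_2_0` and the 3/5 bit split are hypothetical, chosen so that the low field lines up with the instr(2:0) bus, and they are not taken from the thesis's instruction encoding. The order of bit-fields within a byte is implementation-defined in C, so the field comments assume the common little-endian GCC/Clang layout.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical field split; the real opcode map is defined in the
   instruction encoding section and is not reproduced here. */
typedef union {
    uint8_t byte;                /* raw byte as read from the ibus input */
    struct {
        unsigned int low3  : 3;  /* assumed to line up with instr(2:0) */
        unsigned int high5 : 5;  /* remaining opcode bits */
    } field;
} ir_word;

/* Load an opcode byte and return the bit-field that would be forwarded
   on instr(2:0). Bit-field order is implementation-defined; this matches
   the usual little-endian GCC/Clang layout. */
unsigned ir_instr_2_0(uint8_t opcode)
{
    ir_word ir;
    memset(&ir, 0, sizeof ir);   /* clear the whole storage unit first */
    ir.byte = opcode;            /* "cast" the byte into the union */
    return ir.field.low3;        /* address the field by name */
}
```

Once the fields are addressable by name, the output lines for the CU can be set with ordinary `if` or `switch` statements on those fields.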

The determination of what output lines the instruction register would have was decided during the instruction encoding phase of design. The output lines can be directly associated with different cases discussed in the instruction encoding section.
For instance when the reg output is high it indicates a register-register instruction. Similarly if the alu_op line is high then the bits contained in the instr(2:0) bus are routed directly to the ALU as control signals. All of the other signals are fairly obvious with the exception of src_dest(2:0) which is the source/destination register specification for a shift register instruction.

1.1.3 General Purpose Registers

These 4 registers (denoted A,B,C and D) are used as intermediate locations to hold results of operations. They are read/write registers having access to all four internal data busses.

![Figure 4 General Purpose Register Block Symbol](image)

Since all of the registers in the processor were to operate similarly, the BLM code (and later the logic circuit schematics) was simply copied and altered to suit the individual requirements of specific registers. The construction of the General Register models was done in this way: using the basic code from the Instruction Register, modifications were made to include two input and two output busses, a pin for selecting the bus (a or b), a pin for selecting the mode (a high on the read line would read an input bus, a low would write to an output) and an enable pin.

1.1.4 Stack Pointer (SP)

The Stack Pointer Register is used to maintain the address used in pushes and pops (and also in subroutine jumps if implemented). This register is essentially the same as the General Purpose Registers with the exception that it can automatically increment or decrement the value it has stored with the application of a high signal on the inc or dec inputs respectively.

![Diagram of Stack Pointer Register Block Symbol](image)

Figure 5 Stack Pointer Register Block Symbol
1.1.5 Flags Register (Flags)

The Flags Register was designed to maintain the four PSW bits and have the ability to set or reset the carry bit on command. Although the input i1 and output o0 are shown to be eight bits wide, this was done only to make the register compatible with the existing eight-bit-wide internal data busses.

![Flags Register Block Symbol](image)

Figure 6 Flags Register Block Symbol

The Flags Register has only one output. This was done because the only places the PSW needs to go are to memory, to be written as data, and through the ALU during a CMP instruction. The lack of another output bus was also advantageous, since it allowed an otherwise unused combination of the register's inputs to serve as the means of setting or resetting the carry bit. When the bus_sel input was set high and the read input was low (thus attempting to write to the nonexistent output bus o1), the register would assign the value of the sc (set carry) input to the carry bit in the register.
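The reuse of the "write to o1" combination can be sketched as a small behavioral model in C. This is an illustration only, not the BLM from the thesis: the names `flags_reg` and `flags_step` are hypothetical, clocking is reduced to one function call per active clock, the single input is assumed to ignore bus_sel on a read, and the carry is assumed to sit in bit 0 of the PSW (which agrees with the 0x08 to 0x09 change reported for the test trace in Section 2.5).

```c
#include <stdint.h>

/* Hypothetical behavioral model of the Flags Register. */
typedef struct {
    uint8_t psw;                 /* only the low four bits are meaningful */
} flags_reg;

#define CARRY_MASK 0x01u         /* assumption: carry is bit 0 of the PSW */

/* Evaluate one active clock. Returns 1 and stores the PSW in *o0 when the
   register drives its single output bus; returns 0 otherwise. */
int flags_step(flags_reg *r, int en, int read, int bus_sel, int sc,
               uint8_t in, uint8_t *o0)
{
    if (!en)
        return 0;                /* register not selected */
    if (read) {
        r->psw = in & 0x0Fu;     /* "read": capture the PSW at the input */
        return 0;
    }
    if (bus_sel) {
        /* "write" to the nonexistent o1 bus: copy sc into the carry bit */
        if (sc)
            r->psw = (uint8_t)(r->psw | CARRY_MASK);
        else
            r->psw = (uint8_t)(r->psw & ~CARRY_MASK);
        return 0;
    }
    *o0 = r->psw;                /* ordinary "write" to o0 */
    return 1;
}
```

Calling `flags_step` with en high, read low and bus_sel high changes only the carry bit; the other three PSW bits keep their stored values.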
1.1.6 Arithmetic Logic Unit (ALU)

The ALU was conceived to be a generic, combinational circuit having 2 data bus inputs \( a \) and \( b \) and one resultant data output. Also present was a 4-bit control bus to select among the nine available functions, and a 4-bit bus output containing the PSW.

![ALU Block Symbol](image)

Figure 7 ALU Block Symbol

The nine functions that the ALU will perform can be found in Table 4, with the addition of pass_a and pass_b modes that pass the \( a \) and \( b \) inputs respectively through to the output. If one looks at the ALU instruction chart, it is apparent that there is no ADD instruction with a carry in. This was excluded in order to make the ALU simpler.

The PSW output of the ALU will change whenever data is presented at the inputs of the ALU. However, this output will not be stored in the Flags register until the
appropriate control signals are applied. In this way, the PSW output may still be referred to as the Processor Status Word since it is not stored on a cycle to cycle basis but rather on an instruction basis. Upon completion of the ALU specifications, the C code for the block was written and may be found in Appendix C.

1.1.7 Multiplier (Mult)
The multiplier was the simplest of the BLMs to create. It simply multiplied the values of both inputs together and wrote the high byte to the out_hi(7:0) output and the low byte to the out_lo(7:0) output when the oe input was high. Since the multiplier was a combinational circuit, the BLM was written to emulate a combinational circuit. This was done by having the multiplier recalculate the output whenever any of the inputs changed.
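A behavioral model of this kind reduces to a few lines of C. The sketch below is not the thesis's BLM: the names `mult_out` and `mult_eval` are hypothetical, the Quicksim interface calls are omitted, and the operands are assumed to be unsigned. The function would be re-evaluated whenever any input changes, which is how a combinational block is emulated.

```c
#include <stdint.h>

/* Hypothetical result record for the multiplier model. */
typedef struct {
    uint8_t out_hi;   /* high byte of the 16-bit product */
    uint8_t out_lo;   /* low byte of the 16-bit product */
    int     driven;   /* 1 when oe is high and the outputs are enabled */
} mult_out;

/* Recompute the outputs from the current inputs (assumes unsigned operands). */
mult_out mult_eval(uint8_t a, uint8_t b, int oe)
{
    mult_out r = {0, 0, 0};
    if (oe) {
        uint16_t p = (uint16_t)((unsigned)a * (unsigned)b);
        r.out_hi = (uint8_t)(p >> 8);
        r.out_lo = (uint8_t)(p & 0xFFu);
        r.driven = 1;
    }
    return r;
}
```

Under the unsigned assumption, multiplying $FF by $FF gives $FE on out_hi and $01 on out_lo.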

Figure 8 Multiplier Block Symbol
1.1.8 Memory Data Register (MDR)

This register has read/write access to all 4 data paths as well as write capability of high and low nibbles to the CU. The purpose of the high/low nibble write is to allow the CU to decode the register(s) involved in operations.

![Memory Data Register Block Symbol](image)

**Figure 9  Memory Data Register Block Symbol**

The MDR again used the basic register BLM code. The exceptions this time were that the MDR had only one input and one output bus thus eliminating the need for a `bus_sel` line. However there were two additional four bit output busses added: `hi_nib(7:4)` and `lo_nib(3:0)`. These two busses were used to transport the hi and low nibbles of the second byte in an op-code (that which contained the source and destination registers when applicable) to the Control Unit.

These two additional busses were easily implemented. By simply using logical bit masks and shifts, the appropriate nibbles of the stored data byte could be separated and sent out on the correct bus.
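The mask-and-shift step amounts to the following C, given here as a minimal sketch. The function names `mdr_hi_nib` and `mdr_lo_nib` are hypothetical and stand in for whatever the BLM used to drive hi_nib(7:4) and lo_nib(3:0).

```c
#include <stdint.h>

/* Upper four bits of the stored byte, shifted down to bits 3..0. */
uint8_t mdr_hi_nib(uint8_t data)
{
    return (uint8_t)((data >> 4) & 0x0Fu);
}

/* Lower four bits of the stored byte. */
uint8_t mdr_lo_nib(uint8_t data)
{
    return (uint8_t)(data & 0x0Fu);
}
```

For a stored byte of $A5, the high nibble is $A and the low nibble is $5.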
1.1.9 Shifter

The next block of the processor to be created was the shifter. Since all of the functions in the data path of the processor were to be completed in 1 clock cycle, it was decided that a shifter using switching techniques would be used. It would have a data input and output as well as inputs for control lines and a PSW output. Although Figure 10 shows a four bit wide PSW bus, the only pertinent signals are N, Z, and C since an overflow will never occur in the shifter.

![Shifter Block Symbol](image)

Figure 10 Shifter Block Symbol

The shifter block diagram also shows a carry-in input (cin) which is used to input the value of a carry bit for a ROL or ROR operation. This carry bit is provided by the Flags Register.

1.1.10 Bus Unit (BU)

This module was the most complicated model to create and to debug due mainly to its functional complexity. It is a state machine in itself responsible for interfacing with
the external environment.

This module also maintains the PC due to the fact that all addressing operations are absolute and not offset from the PC or any other register. Within the BU is a hardware queue four bytes deep that is used to store instructions that have been fetched from memory at the location specified by the PC. When not synchronized with the CU to perform some specific task, the BU attempts to keep its prefetch full.

Upon system reset, the Bus Unit will bootstrap the processor from a hardcoded memory address. When a reset occurs, the bus unit will automatically read address $00 and fetch that byte which is actually a pointer to the first instruction of the program code to be executed.

The inputs and outputs are described as follows:

**inbus(7:0)**: This input bus is used to present data that is to be written into memory to the bus unit.

**addr(7:0)**: This input bus is used to provide the proper address information to allow the bus unit to write data to, or read data from, external memory. This bus is also used to load the Bus Unit's PC register with an instruction address.

**outbus(7:0)**: This bus takes the data from the bus unit to other parts of the processor. The data carried on this bus is from a memory read or a request to the bus unit for the next op-code in the prefetch.

**func(3:0)**: The bus unit has five possible requests that can be made of it. These include:

**jump_taken**: This function informs the bus unit that a jump was taken. It should clear the bus unit's prefetch queue and accept the address on the *addr* bus as the new memory location to begin an instruction fetch.

**write**: The data on the *inbus* is written in the location specified by the *addr* bus.

**read**: The data at the address specified by the *addr* bus is read and placed on the *outbus*.

**next_opcode**: The control unit is requesting the next op-code in memory. If the bus unit has the data in its prefetch, it immediately presents it on the *outbus* and raises the *data_valid* signal. If the data is not in the prefetch, the BU makes the CU wait until it can fetch it from memory and then raises the *data_valid* signal.

**direct_fetch**: The BU takes the first byte in the prefetch and uses it as the address from which to read a data byte. If the prefetch is empty, then the BU gets the next byte and then does the fetch.

**en_func**: This signal enables a command given to the bus unit on the *func* bus.

**data_valid**: This output signal is used to inform the CU that the bus unit is presenting the data requested on the *outbus*.

**dbus(7:0)**: The bi-directional data bus to the memory chip.

**abus(7:0)**: This is the address bus to the memory chip.

**oe**: The control signal that allows the memory chip to write to the *dbus*.

**we**: This signal causes memory to latch the *abus* and *dbus* during a memory write.

**reset**: This signal informs the BU that a reset condition is in effect and that it should proceed to do a direct_fetch on memory address $00 to get the first instruction.

![Bus Unit Block Symbol](image)

**Figure 11**  Bus Unit Block Symbol

The bus unit also offers several features that make it unique. One feature involves the direct fetching operations. When an instruction is in the direct addressing mode, the next byte in memory is the address of the data. The bus unit has the capability of automatically taking the first byte from the internal prefetch and writing it to memory as the address. This prevents the needless reading and writing of a memory address register. A Control Flow Chart of the Bus Unit is located in Appendix B.
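The interplay between the prefetch queue and the jump_taken, next_opcode and direct_fetch requests can be sketched in C as follows. This is a simplified illustration, not the state machine of Appendix B: the struct `bus_unit` and the `bu_*` function names are hypothetical, memory is a plain 256-byte array, the reset bootstrap and the external oe/we handshaking are not modeled, and each call to `bu_prefetch` stands for one idle bus cycle.

```c
#include <stdint.h>

#define QUEUE_DEPTH 4            /* the hardware queue is four bytes deep */

typedef struct {
    const uint8_t *mem;          /* external memory, 8-bit address space */
    uint8_t  pc;                 /* address of the next byte to prefetch */
    uint8_t  queue[QUEUE_DEPTH];
    unsigned head;               /* index of the oldest queued byte */
    unsigned count;              /* number of valid bytes in the queue */
} bus_unit;

/* Idle cycle: fetch one more byte if the queue has room. */
void bu_prefetch(bus_unit *bu)
{
    if (bu->count < QUEUE_DEPTH) {
        bu->queue[(bu->head + bu->count) % QUEUE_DEPTH] = bu->mem[bu->pc];
        bu->pc++;                /* wraps from $FF to $00 */
        bu->count++;
    }
}

/* next_opcode: hand over the oldest byte, fetching first if empty. */
uint8_t bu_next_opcode(bus_unit *bu)
{
    uint8_t byte;
    if (bu->count == 0)
        bu_prefetch(bu);         /* the CU would wait for data_valid here */
    byte = bu->queue[bu->head];
    bu->head = (bu->head + 1) % QUEUE_DEPTH;
    bu->count--;
    return byte;
}

/* jump_taken: flush the queue and restart fetching at addr. */
void bu_jump_taken(bus_unit *bu, uint8_t addr)
{
    bu->head = 0;
    bu->count = 0;
    bu->pc = addr;
}

/* direct_fetch: the first byte in the prefetch is the operand's address. */
uint8_t bu_direct_fetch(bus_unit *bu)
{
    uint8_t addr = bu_next_opcode(bu);
    return bu->mem[addr];
}
```

In this model `bu_direct_fetch` never needs a separate memory address register, since the queued byte is used as the address directly.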

In addition, it was decided that the processor would have four 8-bit data busses: two that would connect the register outputs to the ALU, multiplier and the BU, and two that would connect the outputs of the ALU, multiplier and BU to the register inputs.

2.0 Logic Circuit Construction

Once the BLM simulations were completed and the architecture and the organization were validated by running a variety of test programs, the functional blocks of the processor were replaced with logic circuit models.

The general testing philosophy was that blocks would be constructed from logic parts to functionally emulate the BLMs. The new logic circuits would be thoroughly tested in a stand alone fashion. Once this segregated testing was completed, the new hardware block would be placed into the software model of the entire processor and the test vectors re-run.

It was originally intended to use this procedure until all the blocks of the processor were replaced and a model composed of only hardware could be tested. However, due to the large number of logic circuits in the design and the relative inefficiency of circuit models compared to software models, it was decided that only a few logic blocks would appear in the design at any one time. Although this approach may not uncover all of the problems associated with an all-logic version of the processor, it was felt that the results would be acceptable due to the constraint that the processor's fabrication was not imminent.

After logic simulation of each of the circuit blocks was completed using Mentor Graphics' logic simulator Quicksim, the circuits were physically simulated using a SPICE-like simulator, also from Mentor Graphics, called Accusim. This physical simulation was done in order to extract the delay times of the components, thus providing an estimate of the maximum clock speed for the processor.

A careful examination of the logic circuit design was used to determine the longest delay paths. The delay time is designated as the elapsed time from 50% of final value of the input signal to 50% of final value of the latest output signal. To simulate loading, an external capacitor of 0.5 pF was placed on the output lines. The simulation temperature was 27°C.

The logic parts (and layout cells) used in this processor were taken from the RIT Department of Computer Engineering's 2 μm CMOS Standard Cell Library created by Larry Rubin. They will henceforth be referred to as The Library. Those parts that were required but not present in The Library were completely developed and tested. These cells will be discussed in detail in Appendix E.

The following is a brief description of The Library components used that are not immediately obvious:

buf - buffer circuit implemented with two inverters
cmux2 - 2 input mux
cmux4 - 4 input mux
crxfr - CMOS resistive transmission gate
cxfr - CMOS transmission gate
dff - D-type Flip-Flop
dffar - D-type Flip-Flop with asynchronous reset
dffarcr - D-type Flip-Flop with asynchronous reset; active high clock
dffarscr - D-type Flip-Flop with asynchronous reset and set; active high clock
dffascr - D-type Flip-Flop with asynchronous set; active high clock
dlat - D latch
dlatar - D latch with asynchronous reset
ietri - tri-state inverter

The Design and Implementation of an 8 bit CMOS microprocessor J. Correll 6/29/92 -29-
2.1 Control Unit

The largest of the functional blocks to be implemented was the Control Unit. Its large size is due chiefly to the sizable PLA implemented to produce the many signals needed to control the other blocks in the processor.

The input signals to the Control Unit are relatively few (compared to the number of outputs). The majority of these signals originate in the Instruction Register.

The schematic in Figure 12 is the top level hierarchical diagram of the Control Unit. Upon examination of this schematic, one immediately notices that there are several large functional blocks present.

The block designated PPLA (Positive clock edge PLA) is the PLA and associated latches used to generate most of the signals that occur in the control unit. A more detailed description of the contents of the PPLA block is found in Figure 13. This figure actually only consists of the transistor logic for the PLA (POS_PLA) and the latches used to capture the output signals when they are valid. The internal circuitry of the POS_PLA block will not be shown because the sheer size of the circuit precludes any detail from being shown when printed.

The NPLA (Negative clock edge PLA) is used to generate control signals on the negative clock edge. This block is architecturally the same as the PPLA block with Figure 14 showing the block diagram containing the transistor level logic of the PLA (NEG_PLA) and the signal latches. The transistor schematic of the NEG_PLA is also too dense to show much detail, so again, this schematic is excluded.
Figure 12 Logic Circuit of Control Unit
Figure 13  Logic Circuit of Control Unit Positive Clock PLA
Figure 14 Logic Circuit of Control Unit Negative Clock PLA
The next most sophisticated block in the design is the Register Control (REG_CNTRL) logic. This block is used to control the operation of the General Purpose Registers, MDR, SP, and the Flags Register. Since these registers must be operated either from the Control Unit itself or from another source, such as the source/destination byte of a register-register instruction, a scheme of multiple control points for each register had to be devised. The REG_CNTRL block serves this purpose. At the top of Figure 15, one notices that there are en, read, and bus_sel (where applicable) lines for each of the registers that are fed directly into OR gates whose outputs go directly to the respective registers. These inputs are fed from the POS_PLA and NEG_PLA directly to operate specific registers at a particular time (such as the hardcoded register designation during a MULT operation).

In the middle of the schematic are located three functional blocks called REG_DECODE (Register Decode). These blocks take the register-encoded bits that appear in the source/destination byte or the last three bits in the shifter instruction and decode them to operate the processor's registers. Each of these blocks also has an enable line which controls whether or not the encoded bits the block represents are decoded and passed out of the block. A logic level diagram of the REG_DECODE block may be found in Figure 16.

This technique of register decoding relies on the assembler used with the processor to ensure that first, no reference is made to a nonexistent register and second, that a conflicting signal to a register does not occur (such as one register being told to read and write at the same time).
Figure 15 Logic Circuit of Register Control Block

Figure 16 Logic Circuit of Control Unit Register Decode Circuitry
The next block to be discussed is the ALU_CNTRL (ALU Control) block. This logic was necessary due to the fact that the ALU needed to be controlled either through an op-code such as AND, or by the Control Unit explicitly for such things as a CMP which actually subtracts one value from another.

The output labeled compare checks to see if the instruction being processed is the CMP instruction. If it is, the Control Unit needs some indication so that it does not try to write the result of the subtraction into any register other than the flags register. The one AND gate used to perform this function could have been placed in the IR and another output from the IR to the Control Unit could have been added. There is no advantage to placing it in one place versus the other.

![Logic Circuit for ALU Control](image)

Figure 17 Logic Circuit for ALU Control

The last block of circuitry in the Control Unit is the JUMP_CNTRL (Jump Control) logic. This circuit compares the type of conditional jump that is being processed to the data received from the Flags Register (the lower nibble of bus i0). If there is a match between these two values, a high jump_good signal is passed into the POS_PLA where
the appropriate action is taken.

Figure 18 Logic Circuit for Jump Control

The worst case circuit paths in the PLA are defined as those signal paths with the largest number of transistor gates attached to them. These paths would have the largest load capacitance and therefore react the slowest to input stimuli. After simulation of these circuit paths in the Control Unit PLA, the propagation time was measured to be 63 nS.
2.2 Instruction Register

Although this register is discussed before the General Purpose Register in the next section, it was that register that served as the basis for this design.

The circuit for the Instruction Register can be found in Figure 19. On the left portion of the diagram, one can observe the storage portion of the register (a more detailed explanation of circuit operation can be found in the following section 2.3). The instruction decode portion of the circuit demands the most attention. Because of the way in which the instructions were encoded, the most effort (and circuitry) went into the decoding of the 'special' functions such as: Pop, Push, Jump, etc. The remaining circuitry is there simply to decode the addressing mode, general category of the instruction (alu_op, shift_op, nop) and whether the instruction is illegal or not.

An output listing from the Quicksim simulation of the Instruction Register can be found in Appendix D.
Figure 19 Logic Circuit of Instruction Register
2.3 General Purpose Registers

The general register logic design was made a straightforward exercise by using the D flip-flop blocks designated DFF in The Library. The logic diagram of the register can be seen in Figure 20 below.

Figure 20  Logic Circuit of General Purpose Registers
The basic operation of the register is simple. The external logic is configured such that the DFF is always accepting data into its master stage. The DFF will latch the data into the slave stage when a high signal is placed on the read and en inputs. The bus_sel input causes the input multiplexer to select which of the two busses data is accepted from. The bus_sel input also selects which of the tri-state output buffers becomes active when the read signal is low and the en signal is high.

The basic design used in this register will serve as the basis for the design of later registers such as the Stack Pointer Register and the Flags Register.

Figure 21 below is a plot of the logical simulation results for the general register. By examining the diagram, one can observe that the inputs of the register i0 and i1 are presented with $AA and $CC respectively. When the en and read signals are both activated at time 10.0, the register captures the data present at its i0 input since the bus_sel line is low. This data is then written out to the o0 output at time 39.0. Similarly the value of $CC is read in from i1 and written to o1.

![Figure 21 Logic Simulation Results of General Register](image)
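The read/write behaviour exercised in this simulation can be mimicked with a small C model. It is a sketch under stated assumptions rather than the thesis's BLM or circuit: the names `gp_reg`, `gp_out` and `gp_step` are hypothetical, one call stands for one active clock, and a tri-stated output is represented by a "driven" flag instead of a high-impedance value.

```c
#include <stdint.h>

/* Hypothetical model of one General Purpose Register. */
typedef struct {
    uint8_t value;               /* byte held in the flip-flops */
} gp_reg;

typedef struct {
    int     o0_driven, o1_driven; /* 0 means the output is tri-stated */
    uint8_t o0, o1;
} gp_out;

/* One active clock. With en high, read high captures the input bus chosen
   by bus_sel; read low drives the output bus chosen by bus_sel. */
gp_out gp_step(gp_reg *r, int en, int read, int bus_sel,
               uint8_t i0, uint8_t i1)
{
    gp_out out = {0, 0, 0, 0};
    if (!en)
        return out;              /* both outputs stay tri-stated */
    if (read) {
        r->value = bus_sel ? i1 : i0;
    } else if (bus_sel) {
        out.o1_driven = 1;
        out.o1 = r->value;
    } else {
        out.o0_driven = 1;
        out.o0 = r->value;
    }
    return out;
}
```

Stepping the model with i0 = $AA and i1 = $CC, first with bus_sel low and then with bus_sel high, reproduces the sequence described above: $AA appears on o0 and later $CC appears on o1.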

The general purpose register was also tested using the Accusim simulator to obtain the time delay of stable signals appearing on the register's output when a write occurs. The signal rise time was 3.56 nS and the fall time was 4.05 nS.
2.4 Stack Pointer

Probably the most difficult register to implement in this processor was the Stack Pointer. The complication of design arises from its need to be self-incrementing and self-decrementing. Much time was spent on this design and several different schemes were tried. The final choice was the simplest conceptually and the most straightforward to implement. By using a bank of adders and asynchronously settable and resettable flip-flops, it was possible to perform the functions at a reasonable speed with a tolerable circuit size.

The basic register design came from the General Purpose Register. However, due to the extra functionality required by this register, many additions were made. Upon examination of Figure 22 one can pick out the basic circuitry ported from the General
Purpose Register schematic. The way in which the self-incrementing/decrementing is performed is as follows:

1. When either the inc or dec input is raised, the DLAT (D-type latch) circuit captures the present value of the register. The output of the DLAT is fed into the appropriate bit position in the adder bank.

2. The inc control line is used to determine whether the number added to the present value of the register is $01 or $FF, which would be an increment or a decrement respectively.

3. The two NAND gates at the rst* and set* inputs of the DFFARSCR flip-flop are used so that the output of the respective adder block is used to set the flip-flop (if the adder output is 1) or reset the flip-flop (if the adder output is 0).
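The arithmetic behind step 2 is that adding $FF to an eight-bit value, and discarding the carry out, is the same as subtracting 1. A minimal C sketch follows; the function name `sp_step` is hypothetical, and the latch and set/reset details of steps 1 and 3 are not modeled.

```c
#include <stdint.h>

/* One self-increment or self-decrement of the stack pointer.
   inc != 0 adds $01; inc == 0 (the dec case) adds $FF. */
uint8_t sp_step(uint8_t sp, int inc)
{
    uint8_t addend = inc ? 0x01u : 0xFFu;
    return (uint8_t)(sp + addend);   /* carry out of bit 7 is discarded */
}
```

Incrementing $FF and decrementing $00 both wrap around in this model, and they are the cases in which a carry ripples through every adder stage.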

Since the circuitry contained in this register is very similar to that of the General Purpose Register, it shares the rise and fall time of that register as well. The times required for incrementing and decrementing the values stored in the Stack Pointer were found to be 5.6 nS for both operations. The time was the result of incrementing $FF by 1 and decrementing $00 by 1. These operations caused a carry bit to be propagated throughout the entire adder circuit, thus making it the longest path.
Figure 23 Results of Stack Pointer Simulation
2.5 Flags Register

All registers presented thus far have been eight bits wide. The Flags Register is the exception. Because there are only four PSW bits, it was not necessary to build a byte register. The previous requirement of 2 output busses was also not implemented here, since the PSW needs only to pass through the ALU or go to the BU to be written to memory. A more detailed explanation of the general design considerations of the Flags Register can be found in Section 1.1.5.
The logic to implement the setting and resetting of the carry bit was a simple matter of decoding the bus_sel, read, sc and en signals so that when a write to bus1 was attempted the register holding the carry bit would take on the value of the sc line.

Figure 24 Logic Circuit of Flags Register
A test trace of the Flags Register is found below in Figure 25. As one can see from the o0 output, the value stored in the register is 0x08. The value then changes to 0x09 when a write to bus1 is attempted and sc is high. Similarly the stored value returns to 0x08 when sc is low and another write is performed.

This register is also a virtual clone of the General Purpose Register. Therefore a physical simulation will not be presented here and the delay times of this register are considered to be the same as those of its origin register.
Figure 25 Simulation Results of Flags Register

2.6 ALU

The ALU was one of the most interesting circuits in the processor to design. The ALU is based on the Full-Adder circuit. It was noticed that the Full-Adder truth table contained several logic functions embedded in it. By examining Table 9, it is noticed that if the C variable is maintained at 0, then the Sum column is actually \( B \oplus A \) and the Carry column is \( B \cdot A \). If C is made a 1, then the Sum and Carry columns become \( \overline{B \oplus A} \) and \( B + A \) respectively.

<table>
<thead>
<tr>
<th>C</th>
<th>BA</th>
<th>Sum</th>
<th>Carry</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>00</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>01</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>10</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>11</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>00</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>01</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>10</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>11</td>
<td>1</td>
<td>1</td>
</tr>
</tbody>
</table>

TABLE 9 ALU Truth Table
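The functions embedded in Table 9 can be checked with a one-bit full adder written in C. This is only an illustrative sketch; the names `fa_out` and `full_adder` are hypothetical and the code describes the truth table, not the gate-level cell from The Library.

```c
/* Result of a one-bit full adder. */
typedef struct {
    unsigned sum;
    unsigned carry;
} fa_out;

/* One-bit full adder; only bit 0 of each argument is used. */
fa_out full_adder(unsigned a, unsigned b, unsigned c)
{
    fa_out r;
    a &= 1u;
    b &= 1u;
    c &= 1u;
    r.sum   = a ^ b ^ c;                 /* C=0: XOR,  C=1: XNOR */
    r.carry = (a & b) | (c & (a ^ b));   /* C=0: AND,  C=1: OR   */
    return r;
}
```

Holding `c` at 0 makes `sum` the exclusive-or and `carry` the AND of the two inputs; holding `c` at 1 turns them into the exclusive-nor and the OR.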

By making these observations, it was apparent that four of the nine functions were already available for use (\( \overline{B \oplus A} \) was not used, and addition is the function of the Full-Adder itself). A scheme was developed, using extra logic, to create the remaining functions. The following table summarizes the remaining functions to be implemented and the technique by which each was done.

NEG B: \( B = \overline{B} ; A = 0 \), Carry-in = 1
NOT B: \( B = \overline{B} ; A = 0 \)
PASS_A: \( B = 0 ; A = A \)
PASS_B: \( A = 0 ; B = B \)
SUB: \( B = \overline{B} ; A = A \)
Signals that needed to be inverted for use in the full adder were passed through Exclusive-Or gates, which served as programmable inverters. At certain times, a zero was required at one of the inputs to the adder. This was accomplished by AND gates placed before the Exclusive-Or gates, each of which could be made to pass a 0 simply by placing a low signal on one of its inputs.
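The AND-then-XOR conditioning of the adder inputs can be sketched at the byte level in C. The names `condition`, `adder8` and `alu_*` are hypothetical, and the control signals `pass` and `invert` stand in for whatever the decode logic actually drives; the sketch covers only NEG B, NOT B, PASS_A and PASS_B as listed above.

```c
#include <stdint.h>

/* AND gate followed by XOR gate on every bit of an operand.
   pass == 0 forces zeros; invert != 0 then complements the result. */
uint8_t condition(uint8_t x, int pass, int invert)
{
    uint8_t gated = pass ? x : 0x00u;                    /* AND stage */
    return (uint8_t)(gated ^ (invert ? 0xFFu : 0x00u));  /* XOR stage */
}

/* Eight-bit adder with carry-in; the carry out is discarded here. */
uint8_t adder8(uint8_t a, uint8_t b, int cin)
{
    return (uint8_t)(a + b + (cin ? 1 : 0));
}

/* NEG B: A forced to 0, B inverted, carry-in = 1. */
uint8_t alu_neg_b(uint8_t a, uint8_t b)
{
    return adder8(condition(a, 0, 0), condition(b, 1, 1), 1);
}

/* NOT B: A forced to 0, B inverted. */
uint8_t alu_not_b(uint8_t a, uint8_t b)
{
    return adder8(condition(a, 0, 0), condition(b, 1, 1), 0);
}

/* PASS_A: B forced to 0. */
uint8_t alu_pass_a(uint8_t a, uint8_t b)
{
    return adder8(condition(a, 1, 0), condition(b, 0, 0), 0);
}

/* PASS_B: A forced to 0. */
uint8_t alu_pass_b(uint8_t a, uint8_t b)
{
    return adder8(condition(a, 0, 0), condition(b, 1, 0), 0);
}
```

Because the AND stage comes before the XOR stage, an operand can be forced to zero regardless of its value and still be inverted afterwards if required.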

The problem of choosing between the Sum and Carry column functions was easily solved through the use of multiplexers at the output. These multiplexers were controlled by the decode logic to select the proper adder outputs.

The last circuitry to be implemented was that which generated the PSW bits. The Carry bit (when applicable) was simply the Carry-out of the most significant Full-adder block. The Sum output of the most significant stage was passed as the Negative bit. The Zero signal can be described by the equation \( \text{Zero} = \overline{\text{out}(7)} \cdot \overline{\text{out}(6)} \cdots \overline{\text{out}(0)} \). Finally, the Overflow could only be true during an add when the most significant bit (MSb) of both the \( a \) and \( b \) inputs differed from the MSb of the output, or \( \text{O} = \overline{\text{out}(7)} \cdot a(7) \cdot b(7) + \text{out}(7) \cdot \overline{a(7)} \cdot \overline{b(7)} \).
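The four PSW bits for an addition can be written out in C as a check on these equations. This is a sketch only: the names `psw_bits` and `psw_after_add` are hypothetical, and the Zero bit is taken to mean that every bit of the result is 0.

```c
#include <stdint.h>

/* The four PSW bits as separate flags. */
typedef struct {
    int n;   /* Negative: MSb of the result */
    int z;   /* Zero: all result bits are 0 */
    int c;   /* Carry: carry out of the most significant stage */
    int v;   /* Overflow */
} psw_bits;

/* Add a and b, store the eight-bit result in *out, return the PSW. */
psw_bits psw_after_add(uint8_t a, uint8_t b, uint8_t *out)
{
    unsigned wide = (unsigned)a + (unsigned)b;   /* nine-bit sum */
    uint8_t  res  = (uint8_t)wide;
    int a7 = (a >> 7) & 1;
    int b7 = (b >> 7) & 1;
    int o7 = (res >> 7) & 1;
    psw_bits p;

    p.c = (int)((wide >> 8) & 1u);
    p.n = o7;
    p.z = (res == 0);
    /* O = !out(7).a(7).b(7) + out(7).!a(7).!b(7) */
    p.v = (!o7 && a7 && b7) || (o7 && !a7 && !b7);
    *out = res;
    return p;
}
```

Adding $7F and $01 sets Negative and Overflow without a Carry, while adding $FF and $01 sets Zero and Carry without an Overflow.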

Physical simulations of the ALU through its longest circuit paths were performed to calculate the maximum time required for the output lines to settle at correct values. The traces of the simulations are too large to be seen clearly when printed, so they will be excluded. The maximum circuit path involved an add with a carry that propagated through the entire circuit. This can be realized by adding $01 with $01. Since the adder design is essentially a ripple adder, the carry must pass through the entire circuit before the carry bit of the PSW will become valid. The low to high signal delay time for all of the outputs was measured to be 9.77 nS and the high to low delay time was 8.11 nS.
Figure 26 Logic Circuit of ALU

2.7 Multiplier

The next block to be discussed is the multiplier block. Figure 27 is a schematic of the highest level of hierarchy in the multiplier. The core has 16 inputs (2 eight-bit busses) and a 16 bit output broken into two bytes out_hi and out_lo. The results of a multiplication appear at the outputs and are permitted out of the circuit when the oe signal is raised enabling the tri-state buffers.

Figure 27 Logic Circuit of Highest Multiplier Hierarchy Level

The multiplier core (mult) is shown in Figure 28 as a combinational circuit made up mainly of full adder blocks and NAND gates. The algorithm for multiplication is based on a modification of Booth's Algorithm.

Figure 28  Logic Circuit of Multiplier Core

Figure 29 is an example of a few numbers passed into the multiplier during a test run.
Figure 29  Simulation Results of Multiplier
Because of the complexity of the multiplier and its presence in the data path, it was expected that this block would be the limiting factor in the speed of processor operation. By examining the circuit, it is possible to deduce that the longest circuit path is the one used in the multiplication of $FF by $FF. Using Accusim, the time for all signals to become stable with this multiplication was found to be 14.89 nS.

2.8 Memory Data Register

The register that was by far the easiest to implement is the Memory Data Register. Its circuitry follows that of the General Purpose Register but is simpler. By examining Figure 30 it can be seen that the logic for the register is straightforward because of the existence of only one input and one output bus. The only new additions to the basic design were the high and low nibble busses, which required no extra logic.
Figure 30  Logic Circuit of Memory Data Register
Figure 31 Simulation Results of Memory Data Register
2.9 Shifter

The construction of the shifter circuit was simple. Since the shifter was to be combinational it could be done by simply switching bit positions within the byte being operated on. The most straightforward way to accomplish this was through the use of multiplexers. This was also the most efficient way to accomplish the shifting due to the fact that the multiplexers were implemented using CMOS transmission gates.

There are two varieties of multiplexers used in the design. The middle six bits use two-input muxes to shift either left or right. The most and least significant bits require larger muxes (four-input and three-input respectively, since the LSb does not maintain a sign bit on an ASL or LSL, which are functionally identical) because of the possibility of replacing those bits with the carry-in bit, a zero, the sign bit, or a shifted bit.

There is also logic present to generate the PSW. The Zero bit is evaluated in the same way as it was for the ALU (see Section 2.6). The Overflow bit is always set to a low since no overflow is possible in the shifter. The most significant bit is passed out as the Negative bit and the carry bit is the original \( din(7) \) or \( din(0) \) for a left shift or right shift respectively.
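The multiplexer arrangement and PSW generation described above can be modeled with a short C sketch. Only ASL, LSL and ROR are named in the text; the other operation names (`SH_LSR`, `SH_ASR`, `SH_ROL`), the function names, and the reuse of the (N Z V C) bit weights from the ALU model are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed operation set; ASL is treated as identical to LSL. */
typedef enum { SH_LSL, SH_LSR, SH_ASR, SH_ROL, SH_ROR } shift_op;

/* Assumed PSW packing (N Z V C). */
enum { PSW_N = 8, PSW_Z = 4, PSW_V = 2, PSW_C = 1 };

/* Two-input multiplexer: returns in1 when sel is nonzero, else in0. */
static unsigned mux2(unsigned sel, unsigned in0, unsigned in1)
{
    return sel ? in1 : in0;
}

/* Hypothetical combinational shifter model; *psw receives the flags. */
uint8_t shift8(shift_op op, uint8_t din, unsigned carry_in, unsigned *psw)
{
    unsigned left = (op == SH_LSL || op == SH_ROL);
    unsigned lsb, msb, i;
    uint8_t dout = 0;

    assert(psw && carry_in <= 1u);

    /* Middle six bits: each takes its right neighbour (left shift)
     * or its left neighbour (right shift). */
    for (i = 1; i <= 6; i++) {
        unsigned bit = mux2(left, (din >> (i + 1)) & 1u, (din >> (i - 1)) & 1u);
        dout |= (uint8_t)(bit << i);
    }

    /* End bits: LSb chooses among zero, carry-in and a shifted bit;
     * MSb chooses among zero, carry-in, the sign bit and a shifted bit. */
    switch (op) {
    case SH_LSL: lsb = 0u;              msb = (din >> 6) & 1u; break;
    case SH_ROL: lsb = carry_in;        msb = (din >> 6) & 1u; break;
    case SH_LSR: lsb = (din >> 1) & 1u; msb = 0u;              break;
    case SH_ASR: lsb = (din >> 1) & 1u; msb = (din >> 7) & 1u; break;
    default:     lsb = (din >> 1) & 1u; msb = carry_in;        break; /* SH_ROR */
    }
    dout |= (uint8_t)(lsb | (msb << 7));

    *psw = 0;                                   /* Overflow is always low */
    if (dout & 0x80u) *psw |= PSW_N;
    if (dout == 0)    *psw |= PSW_Z;
    if ((left ? (din >> 7) : din) & 1u) *psw |= PSW_C;  /* din(7) or din(0) */
    return dout;
}
```

Under these assumptions, `shift8(SH_ROR, 0x01, 1, &psw)` returns 0x80 with N and C set, since the incoming carry enters bit 7 and the original din(0) becomes the new carry.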

After logical simulation, the longest path in the circuit was simulated with Accusim to obtain the limiting propagation time. The actual propagation time for a stable signal through the shifter was 1.82 nS using an ROR instruction which imports a new carry bit. The time that limits the speed of shifter operation was that required to generate the PSW. The time until the Zero bit became stable was 5.18 nS. This is to be expected since the generation of this bit requires all other bits to be stable.
Figure 32 Logic Circuit of Shifter
Figure 33 Simulation Results of Shifter
2.10 Bus Unit

The circuit shown in Figure 36 is the top level schematic of the Bus Unit. This block was the most complicated to design of the entire processor. The complexity arose from the many functions that it performs such as reading from and writing to memory, and maintaining the prefetch and the functions involved with the prefetch, like direct fetching.

The PLA and its associated latches are shown in Figure 37. This PLA controls the various other parts of the Bus Unit. The latches are used to capture the outputs of the PLA when they become stable.

The next schematic, Figure 38, is that of the Bus Unit Timer. This timer is used to provide adequate setup and hold times for writes to and reads from memory. The important times associated with memory reads and writes are shown in Figures 34 and 35 respectively.

Figure 34 Read Cycle Timing Diagram
The different outputs on the timer correspond to lengths of time that must have passed before certain actions can be taken. For instance, in Figure B-3, the Bus Unit would remain in State C until the setup1 time has elapsed.
Figure 36 Logic Circuit for Bus Unit
Figure 37 Logic Circuit for Bus Unit PLA

Figure 38 Logic Circuit for Bus Unit Timer

Figure 39 shows the circuit for the parallel loaded eight-bit counter. This counter is used as the PC in the Bus Unit. The counter is a ripple counter that is activated when its clk line is pulsed. It also has the capability to reset to all zeros when the reset line is brought low. Using the set and reset inputs of the DFFARSCR flip-flop, some external logic is used to selectively reset, or set the flip-flops according to the value of the appropriate input bit. The set value is loaded when the ld line is brought high.
Figure 39 Logic for Parallel Load 8-bit Counter
Although this counter is nearly identical to the Stack Pointer, it was implemented differently for variety and to investigate the difference in performance between a ripple counter and the full-adder implementation used for the stack pointer. As expected, the full adder SP implementation was superior. An increment in the parallel loaded eight-bit register took 27.8 nS through the longest path. This register has no decrement capability.
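The counter behavior can be sketched in C as follows. The per-bit gating of the set and reset inputs (set = ld AND d(n), reset = ld AND NOT d(n)) is an assumption about the external logic, and the count of toggled stages is used only as a rough stand-in for ripple delay; neither is taken from Figure 39.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t q; } counter8;   /* hypothetical model state */

/* Asynchronous inputs: reset_n is active low, ld is active high. */
void counter8_async(counter8 *c, unsigned reset_n, unsigned ld, uint8_t d)
{
    assert(c);
    if (!reset_n) {                      /* reset line low: all zeros */
        c->q = 0;
        return;
    }
    if (ld) {
        uint8_t set_lines   = d;              /* assumed: ld AND d(n)     */
        uint8_t reset_lines = (uint8_t)~d;    /* assumed: ld AND NOT d(n) */
        c->q = (uint8_t)((c->q | set_lines) & (uint8_t)~reset_lines);
    }
}

/* One pulse on clk. Stage 0 always toggles; each later stage toggles only
 * when the stage before it fell from 1 to 0. Returns the number of stages
 * that toggled in sequence. */
unsigned counter8_clock(counter8 *c)
{
    unsigned toggled = 0;

    assert(c);
    for (unsigned i = 0; i < 8; i++) {
        unsigned old = (c->q >> i) & 1u;
        c->q ^= (uint8_t)(1u << i);
        toggled++;
        if (old == 0)                    /* a 0 to 1 edge stops the ripple */
            break;
    }
    return toggled;
}
```

In this model an increment from $FF$ toggles all eight stages one after another, which corresponds to the longest path, whereas an increment from $00$ toggles only the first stage.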

Figure 40 Logic for 8-bit Tri-State Latch

Figure 40 is the schematic representation of the eight bit tristate latch created for the processor. This latch serves to hold data that needs to be written to memory or an internal bus. The tristate capability can be used for shared busses either internal or external to the processor.

In order to create the eight bit latch above, it was necessary to design the tristate latch it uses. Although there were latches in the library as well as tristate buffers, there was no part that combined the two, and the tristate buffers that do exist are used to drive extremely large capacitive loads.

Figure 41  Logic for Reset State Latch

After the processor is reset, the Bus Unit will be placed in the default state (all PLA state inputs will be 0) since a reset clears the contents of the PLA's latches. From this state the BU will automatically bootstrap itself by going out to memory location $00$ to fetch the starting address of the program code. Since some of the states of the BU that are used after a reset are the same as those used in normal instruction fetching, some sort of 'flag' is needed to inform the PLA of its present condition. The circuit of Figure 41 serves this purpose.

The master reset line is connected to the set* input of the Reset State Latch. When a reset occurs the output of the latch becomes high and serves to inform the BU that it has been reset. After the BU has performed all of the steps called for after a reset, the BU will clear the latch by toggling the rst* input of the latch.
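The role of the Reset State Latch can be illustrated with a small C model of a set/reset latch with active-low inputs. The type and function names are hypothetical, and giving set* priority when both inputs are low is an assumption, since that case is not described.

```c
#include <assert.h>

typedef struct { unsigned q; } reset_state_latch;   /* hypothetical model */

/* Evaluate the latch for the given active-low inputs.
 * set_n == 0 (master reset asserted) drives the output high;
 * rst_n == 0 (BU finished its post-reset steps) drives it low;
 * with both inputs high the latch holds its previous value. */
unsigned latch_eval(reset_state_latch *l, unsigned set_n, unsigned rst_n)
{
    assert(l);
    if (!set_n)
        l->q = 1;          /* assumed priority when both are low */
    else if (!rst_n)
        l->q = 0;
    return l->q;
}
```

A sequence of master reset, idle, then a pulse on rst* produces the outputs 1, 1, 0 in this model, so the PLA sees the flag for as long as the post-reset steps are in progress.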

During normal operation, the BU attempts to pre-fetch instructions and store them in an internal queue for later access. The circuitry that implements this function is shown as Figure 42. Although the prefetch is actually only four elements deep, the prefetch system uses five storage elements in order to differentiate between the queue being full, empty, or in between.
Figure 42 Logic for Prefetch Queue

The manner in which the prefetch operates is actually very straightforward. Each of the storage elements designated REG8B_TRI is an 8-bit tri-state register. The read and write lines of these registers are controlled by the outputs of two counters and AND gates to enable or disable the reading and writing. The write input of each register is connected to the QUEUE_frnt counter and the read input of each register is connected to the QUEUE_rear counter.

During normal operation, the rear counter controls which of the registers will accept the new data, whereas the front counter designates which register will present its data to the processor. To enter a new piece of data into the prefetch, the read input is raised. This sensitizes the AND gates to pass a high signal to whichever register is to read the data as specified by the rear counter. After the read takes place, the read line is lowered and the counter incremented. The operation of making the prefetch write to the processor's internal data bus operates similarly.

Also present in the prefetch are two circuits to determine if the prefetch is full or empty. A full prefetch occurs when the rear counter has been incremented to the point where one more increment would make the front and rear counters equal. An empty queue occurs if the front and rear counter are equal.
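The full and empty conditions correspond to the usual circular queue that sacrifices one slot. A minimal C sketch follows, using integer indices in place of the hardware counters; the type and function names are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_SLOTS 5            /* five storage elements, four usable */

typedef struct {
    uint8_t  slot[QUEUE_SLOTS];
    unsigned front;              /* register that presents data to the processor */
    unsigned rear;               /* register that accepts the next fetched byte  */
} prefetch_queue;

void pq_reset(prefetch_queue *q)
{
    assert(q);
    q->front = 0;
    q->rear = 0;
}

/* Empty: front and rear counters are equal. */
int pq_empty(const prefetch_queue *q)
{
    return q->front == q->rear;
}

/* Full: one more increment of rear would make it equal to front. */
int pq_full(const prefetch_queue *q)
{
    return (q->rear + 1) % QUEUE_SLOTS == q->front;
}

/* Store a fetched byte; returns 0 without storing if the queue is full. */
int pq_put(prefetch_queue *q, uint8_t byte)
{
    if (pq_full(q))
        return 0;
    q->slot[q->rear] = byte;
    q->rear = (q->rear + 1) % QUEUE_SLOTS;
    return 1;
}

/* Hand the oldest byte to the processor; returns 0 if the queue is empty. */
int pq_get(prefetch_queue *q, uint8_t *byte)
{
    if (pq_empty(q))
        return 0;
    *byte = q->slot[q->front];
    q->front = (q->front + 1) % QUEUE_SLOTS;
    return 1;
}
```

After a reset, four calls to `pq_put` succeed and the fifth is refused, which shows why five storage elements are needed for a queue that is four elements deep: without the spare element, the full and empty states would both appear as equal counters.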

Using two counter circuits called QUEUE_COUNTER in the diagram, the front and rear queue pointers are maintained. The actual implementation of the counter is shown in Figure 43 as a ring-counter. This category of circuit was chosen because no decoding was required to enable a particular storage element. All that was needed were AND gates at the output of the counter to control which of the storage elements read or wrote from and to the bus.
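The reason a ring counter needs no decoder can be seen from a short C sketch of a one-hot counter. A five-stage ring (one stage per storage element) and the function names are assumptions for illustration; the actual circuit is the one in Figure 43.

```c
#include <assert.h>

#define RING_STAGES 5u           /* assumed: one stage per storage element */

/* Advance a one-hot ring counter by one position. Exactly one bit of
 * `state` is set; it moves to the next stage and wraps at the end. */
unsigned ring_next(unsigned state)
{
    unsigned mask = (1u << RING_STAGES) - 1u;

    assert(state != 0 && (state & (state - 1u)) == 0);   /* one-hot */
    assert((state & ~mask) == 0);                         /* in range */
    return ((state << 1) | (state >> (RING_STAGES - 1u))) & mask;
}

/* Per-register enables: each stage output ANDed with the shared strobe.
 * No binary-to-one-hot decoder is needed. */
unsigned ring_enables(unsigned state, unsigned strobe)
{
    return strobe ? state : 0u;
}
```

Starting from state 0x01, successive calls to `ring_next` yield 0x02, 0x04, 0x08, 0x10 and then 0x01 again, and `ring_enables` passes the strobe only to the register selected by the single set bit.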
Figure 43 Logic for Prefetch Queue Counter
The last circuit in the BU is the 8-bit Tri-State Register. This register is used to hold the data bytes that have been prefetched from memory. The circuit is merely a DFFAR flip-flop with a tristate inverter at its output. When a system reset occurs the \( \text{rst}^* \) line allows for the contents of the register to be cleared. The \( \text{rd} \) and \( \text{wr} \) signals are fairly intuitive. The \( \text{rd} \) signal latches the data present on the \( d(7:0) \) bus into the first stage of the flip-flop. The \( \text{wr} \) signal causes the second stage to accept the data from the first stage and activates the tri-state inverter, which presents this data onto the output bus.

Figure 44 Logic for 8-bit Tri-State Register
3.0 Microprocessor Layout

The final step of the implementation involved creating physical mask layouts for all of the circuits used in the processor. The task of creating these layouts was made easier because of the functional block partitioning that was done earlier.

The smaller functional blocks such as the registers, the ALU and the shifter were implemented as a whole without the use of layout hierarchy. The strategy behind the layout procedure was, naturally, to make the blocks as small as possible. In addition, the functional blocks were made as close to square as possible to reduce the block perimeter.

The larger functional blocks (the Control Unit and the Bus Unit) were further partitioned to make layout more manageable. This allowed only the desired levels of the hierarchy to be shown, thus improving the performance of the workstation used by reducing the number of devices that needed to be constantly redrawn. As the smaller blocks were completed, they were simply connected together in a fashion that would allow for minimum area use as well as short signal line lengths.

Much time was spent on the processor floorplan. The original goal of the layout was to arrange the blocks such that minimum distances existed between the registers, ALU and multiplier, thus reducing the time delay in the data path. However, due to the fact that the layout turned out to be very large, it was decided that simply fitting the circuitry within the pad ring would be the goal.

By placing the layout blocks in the pad ring and arranging them in a number of positions, a floorplan was finally arrived at that was a compromise between reduced line lengths and area usage. As one can see from Figure 45, there is not a great deal of space between the functional blocks. Because of this fact, signal and bus routing was very difficult. Fortunately, since most of the blocks in the data path are lined up vertically, it was possible to run the many signals over these blocks in the second metal layer.

The processor was placed into a 4600x6800 $\mu$m$^2$ pad ring.
Figure 45  Microprocessor Layout Floorplan
4.0 Conclusions

Although no specific investigative goals were stated at the beginning of this paper, several important things were discovered during the course of this project.

During the instruction encoding portion of the design, it was anticipated that by using a so-called horizontal encoding of the instructions, instruction space would be wasted, but the instruction decoding would be greatly simplified. After examining the layout of the processor, it is felt that this was the correct choice since chip area is at a premium.

Probably the most important lessons learned (if not the most confounding ones) were those dealing with behavioral modeling. Because of the circumstances, it was necessary to choose Mentor Graphics' BLM to model the processor. This technique uses a C language program to generate output responses for the input stimuli applied to a circuit model, which is not unlike the way in which VHDL would work. However, because it is the common C language, timing of signals within the model is not automatically regulated. For example, it was necessary to be exceedingly careful during modeling that when an input signal occurred, adequate time elapsed before output responses or changes in state were allowed to proceed. This problem proved to be the most troublesome in the modeling of the state machines. To use an aphorism, C used in the modeling of state machines is "magic" and must be used with the utmost care in order to ensure that the functional block does not exhibit characteristics that are physically impossible.
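The timing hazard can be illustrated with a small standalone C sketch in which an output change is scheduled for a later simulation time instead of being applied as soon as the input handler runs. This is not the Quicksim mechanism; the structure, the function names and the single pending event per output are simplifying assumptions.

```c
#include <assert.h>

/* Hypothetical model of one output pin with a single pending change. */
typedef struct {
    int  value;            /* value currently visible to the rest of the model */
    int  pending_value;    /* value waiting to appear                          */
    long pending_time;     /* simulation time at which it becomes visible      */
    int  has_pending;
} delayed_output;

void out_init(delayed_output *o, int initial)
{
    assert(o);
    o->value = initial;
    o->pending_value = initial;
    o->pending_time = 0;
    o->has_pending = 0;
}

/* Called from an input handler at time `now`: the new value is recorded
 * but does not become visible until `delay` time units have elapsed. */
void out_schedule(delayed_output *o, int new_value, long now, long delay)
{
    assert(o && delay >= 0);
    o->pending_value = new_value;
    o->pending_time = now + delay;
    o->has_pending = 1;
}

/* Returns the value visible at time `now`, applying a matured change. */
int out_read(delayed_output *o, long now)
{
    assert(o);
    if (o->has_pending && now >= o->pending_time) {
        o->value = o->pending_value;
        o->has_pending = 0;
    }
    return o->value;
}
```

If a change to 1 is scheduled at time 100 with a delay of 5, `out_read` still returns the old value at times 100 through 104 and returns 1 from time 105 onward. A state machine written in plain C that skipped this step would react to its own outputs in zero time, which is the physically impossible behavior described above.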

Upon examination of the finished layout, it was noticed that the PLAs used in the Control Unit and the Bus Unit occupy approximately 23% of the entire chip layout. Although this area is many times smaller than would be required if all of the logic were implemented with combinational logic, the PLAs are still inefficient due to their sparse population. A more area efficient technique of implementing the control store would have been to use a microcode engine.

Although much was learned by the author from the execution of this project, it was decidedly too ambitious a project for one person to attempt in the time frame allotted.

In summary, the microprocessor incorporated approximately 12,500 transistors into the design. The nominal estimated speed was calculated to be 8 MHz without measurement of signal line capacitance which would contribute to speed reduction of the system.
5.0 References


Appendix A:

Figure A-1  Control Flow Chart of Control Unit - Pt 1
Figure A-2 Control Flow Chart of Control Unit - Pt 2
Figure A-3  Control Flow Chart of Control Unit  Pt 3

6. Pass Func Field to Shifter
   Set ALU to pass A input
   ALUmux select Shifter out
   Enable ALUmux output
   Reg (MDM_hl_nibble, write, bus1)
   Enable Flags
   Flags Reg Write bus_0
   Set PSW mux to read Shifter
(Falling CLK Edge)

7. Set SP to Write bus_0
   Enable SP
   Set B.U. to Write
   Enable B.U.
   Reg (SRC_DEST_nibble, write, bus1)

8. Decrement SP

Figure A-4 Control Flow Chart of Control Unit - Pt 4

9. Enable Multiplier output
   Set Reg_A to write bus_0
   Set Reg_B to write bus_1
   Set Reg_C to read bus_1
   Set Reg_D to read bus_0

10. Set Flags Reg to Write bus_1
    Enable Flags
    Set Flags Carry Bit = 1

11. Set Flags Reg to Write bus_1
    Enable Flags
    Set Flags Carry Bit = 0

Figure A-5 Control Flow Chart of Control Unit - Pt 5
Figure A-6  Control Flow Chart of Control Unit   Pt 6
Set ALU to Subtract
MDR read bus_D
Set PSW mux to read ALU
Enable Flags_Reg
Flags Reg Read bus_1
IF REG
  Reg (MDR_lo_nibble, write, busD)
ELSE
  Set MDR to Write
  Enable MDR

Set B.U. to Jump_Taken
Enable B.U.
MDR read bus_D

Set B.U. to Next Op Code
Enable B.U.

Figure A-7  Control Flow Chart of Control Unit - Pt 7
Appendix B:

Set Data_Valid to FALSE
Set Out_bus to Hi-Z
Latch In_bus
Latch Addr_bus
Latch Func_bus
Reset Timer

Decision boxes (Y/N branches): Enable Func FALSE; Prefetch full; Next Op-code; Pre-fetch empty; Pre-fetch empty
Figure B-1 Control Flow Chart of Bus Unit - Pt 1
Figure B-2  Control Flow Chart of Bus Unit - Pt 2
Figure B-3  Control Flow Chart of Bus Unit - Pt 3
Pull OE low
Set Data-valid to TRUE
Connect Dbus to Outbus

Enable Func
FALSE

Pull OE low
Connect Abus to Abus_latch

Timer >= setup2

Pull OE high
Set Abus to Hi-Z
Set Dbus to Hi-Z
Set Data-valid to TRUE

Figure B-4  Control Flow Chart of Bus Unit - Pt 4
Appendix C:

This BLM was the one used to behaviorally simulate the ALU and serves as an example of how the other models were written. Due to the large size of the BLMs for the remaining blocks within the processor, they will not be included here.

C Code for ALU BLM:

/*******************************************************************************
* Component: ALU
* Author: Jeff Correll
* Date : 22 July 1991
* Comments:
* This blm models the alu of my thesis processor
*******************************************************************************/

#define BUS_WIDTH 8 /* number of bits in the a and b busses */
#define PSW_WIDTH 4 /* number of bits in the psw */
#define FUNC_WIDTH 4 /* number of bits in the function bus */
#define TRUE 1
#define FALSE 0

/* These are the codes that would appear on the 'func' lines of the alu for the corresponding operation to take place */
#define ADD 0x00
#define SUB 0x01
#define NOT_A 0x02
#define AND 0x03
#define OR 0x04
#define XOR 0x05
#define NEG_A 0x06
#define CMP 0x07
#define PASS_A 0x08
#define PASS_B 0x09

#define NOW 0 /* Delay time of Zero, happens now */
#define PROP_DELAY 5 /* token delay through alu */

#define PIN_CON_VAL(pin_name,bitnum) \
  (qsim_con_value[(*(qsim_instance_ptr->alu_I_ \
  pin_name))->bits[bitnum]])

#define PIN_OUT(pin,value,time) \
  (qsim_output(&qsim_instance_ptr->alu_O_ \
  pin,value,time))

#define PIN_STATE(pin_name, bit_number) \
  (*((qsim_instance_ptr->alu_I_ \
  pin_name))->bits[(bit_number)])

#include <string.h> /* strncpy */
#include "idea/sys/ins/qsim.h"
#include "user/jac8396/thesis/blm/blm_lib.h"
#include "alu_pin.h" /* include the pin file generated by PFGEN */
/******

Description: This function is used to initialize the alu. It sets all of
the flags and internal variables to appropriate values.

Input: 
Output:

*******************************************************************************/

{ long delay; /* delay until output pin changes state */
char level[BUS_WIDTH]; /* new level of changing output pins */
int cnt; /* counter variable */
long time; /* holds present simulation time */

dfi_$object_identifier inst_id;
dfi_$inst_rec_ptr inst_ptr;

inst_id = qsim_get_instance_id();
dfi_$get_inst(&qsim_dfi_channel, &inst_id, &inst_ptr);

strncpy(TD_String, inst_ptr->instance_pathname.chars, inst_ptr->instance_pathname.len);

for (cnt=0; cnt<BUS_WIDTH; cnt++) /* set hi-z and unknown (x) array to */
{ /* appropriate values */
  bus_hi_z[cnt] = QSIM_XZ;
  bus_x[cnt] = QSIM_XS;
}

for (cnt=0; cnt<PSW_WIDTH; cnt++)
{
  psw_hi_z[cnt] = QSIM_XZ;
  psw_x[cnt] = QSIM_XS;
}

delay = NOW; /* init the outputs to unknown, now */
PIN_OUT(out, bus_x, &delay);
PIN_OUT(psw, psw_x, &delay); /* psw is PSW_WIDTH bits wide, so use psw_x */
}

/*******************************************************************************/

alu_a()
/*******************************************************************************
* Description: Pin handler for input bus a
* Input: Data from the input bus a
* Output:
*******************************************************************************/
{
  do_alu_func();
}
/*******************************************************************************/

alu_b()
 /*******************************************************************************
* Description: Pin handler for input bus b
* Input: Data from the input bus b
* Output:
*******************************************************************************/
{ 
  do_alu_func();
}
/*******************************************************************************/

alu_func()
 /*******************************************************************************
* Description: Pin handler for function input bus
* Input: Data from the input bus func
* Output:
*******************************************************************************/
{ 
  do_alu_func();
}
/*******************************************************************************/

do_alu_func()
 /*******************************************************************************
* Description: This function performs all of the functions of the alu. These
* include normal alu functions such as ADD, SUB, etc as well
* as setting up the PSW word.
* Input:
* Output:
*******************************************************************************/
{

  char abus[BUS_WIDTH], /* these arrays hold the Quicksim values of */
         bbus[BUS_WIDTH], /* the input busses */
         out[BUS_WIDTH];
  char psw[PSW_WIDTH],
         func[FUNC_WIDTH];

  int bit; /* used as a counter for each bit in the array */
  int a_error, /* error in conversion flags */

b_error,
func_error;

int abus_int=0, /* integer value of input and output buses */
  bbus_int=0,
  func_int=0,
  psw_int=0,
  out_int=0;

int zero=0, /* integer values of psw bits */
carry=0,
negative=0,
overflow=0;

long delay=PROP_DELAY; /* delay in output signal appearance */

for(bit=0; bit < BUS_WIDTH; bit++) /* read a-bus and b-bus qsim values */
{ 
  abus[bit] = PIN_STATE(a,bit);
  bbus[bit] = PIN_STATE(b,bit);
}

for(bit=0; bit < FUNC_WIDTH; bit++) /* read the func bus qsim values */
  func[bit] = PIN_STATE(func,bit);

a_error = qstoi(abus, BUS_WIDTH, &abus_int); /* convert qsim value to int */
if (a_error)
  qmsg("abus not convertible...");

b_error = qstoi(bbus, BUS_WIDTH, &bbus_int); /* convert qsim value to int */
if (b_error)
  qmsg("bbus not convertible...");

func_error = qstoi(func,FUNC_WIDTH, &func_int); /* convert qsim value to int */
if (func_error)
  qmsg("func not convertible...");

qmsg("func_int = 0x%2.2x",func_int);
if (!func_error)
  switch(func_int) /* perform correct alu function */
  {
    case ADD:
      qmsg("ADD");
      out_int = abus_int + bbus_int;
      break;
case SUB:
    qmsg("SUB");
    out_int = abus_int - bbus_int;
    break;

case AND:
    qmsg("AND");
    out_int = abus_int & bbus_int;
    break;

case OR:
    qmsg("OR");
    out_int = abus_int | bbus_int;
    break;

case NOT_A:
    qmsg("NOT_A");
    out_int = ~abus_int;
    break;

case XOR:
    qmsg("XOR");
    out_int = abus_int ^ bbus_int;
    break;

case NEG_A:
    qmsg("NEG_A");
    out_int = (abus_int*(-1))&0xFF;
    break;

case CMP:
    qmsg("CMP");
    break;

case PASS_A:
    qmsg("PASS_A");
    out_int = abus_int;
    qmsg("Passing 0x%2.2x",abus_int);
    break;

case PASS_B:
    qmsg("PASS_B");
    out_int = bbus_int;
    qmsg("Passing 0x%2.2x",bbus_int);
    break;
default:
    qmsg("Illegal function");
    break;
}

qmsg("out_int = 0x%2.2x",out_int);

    if ( ((out_int > 0xFF) || (out_int < 0x00)) /* should carry be set */
     && ( (func_int == NEG_A)||(func_int == ADD) ||
         (func_int == SUB)||(func_int == CMP) )
    )
    carry = 1;

    if (out_int & 0x80) /* should negative be set */
        negative = 1;

    if (((abus_int & 0x80) == (bbus_int & 0x80)) && carry)
        overflow = 1;

    if ((out_int & 0xff) == 0x00) /* should zero be set */
        zero = 1;

    out_int = out_int & 0xFF; /* keep only the 8 least significant bits */

/* if there are no conversion errors or there are errors converting
the b-bus when the function is only a-bus related output the
results else put out unknowns on the output bus */

    if ((b_error && (func_int == PASS_A || func_int == NOT_A ||
                     func_int == NEG_A))
        || !(a_error || b_error || func_error))
    {
        itoqs(out_int,BUS_WIDTH,out);
        PIN_OUT(out,out,&delay);

        psw_int = (negative*8) + (zero*4) + /* set up the psw (NZVC) */
                  (overflow*2) + carry;
        itoqs(psw_int,PSW_WIDTH,psw);
        PIN_OUT(psw,psw,&delay);
        qmsg("PSW=0x%2.2x",psw_int);
    }
    else
    {
        PIN_OUT(out,bus_x,&delay);
        PIN_OUT(psw,psw_x,&delay);
    }
}
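
For readers who wish to experiment with the (N Z V C) packing outside of Quicksim, the following standalone sketch computes the PSW nibble for an 8-bit add. It uses the textbook two's complement overflow rule (operands of equal sign, result of opposite sign), which is an assumption and is not necessarily the test used in the BLM or in the gate-level ALU; the function and constant names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit weights matching psw_int = N*8 + Z*4 + V*2 + C. */
enum { PSW_N = 8, PSW_Z = 4, PSW_V = 2, PSW_C = 1 };

/* Hypothetical helper: adds two bytes, stores the 8-bit result in *result
 * and returns the PSW nibble. */
unsigned add8_flags(uint8_t a, uint8_t b, uint8_t *result)
{
    unsigned wide = (unsigned)a + (unsigned)b;   /* 9-bit sum */
    uint8_t r = (uint8_t)(wide & 0xFFu);
    unsigned psw = 0;

    assert(result);

    if (r & 0x80u)
        psw |= PSW_N;                    /* negative: bit 7 of the result */
    if (r == 0)
        psw |= PSW_Z;                    /* zero: all result bits clear   */
    if (~(a ^ b) & (a ^ r) & 0x80u)
        psw |= PSW_V;                    /* signed overflow (textbook)    */
    if (wide > 0xFFu)
        psw |= PSW_C;                    /* carry out of bit 7            */

    *result = r;
    return psw;
}
```

For example, adding $40$ and $40$ gives $80$ with N and V set and C clear, while adding $FF$ and $01$ gives $00$ with Z and C set and V clear.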
Appendix D:

Instruction Register Logical Simulation Results:

<table>
<thead>
<tr>
<th></th>
<th>XXr</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
<th>X</th>
</tr>
</thead>
<tbody>
<tr>
<td>0.0</td>
<td>0</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
</tr>
<tr>
<td>200.0</td>
<td>0</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
</tr>
<tr>
<td>250.0</td>
<td>1</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
</tr>
<tr>
<td>300.0</td>
<td>0</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
</tr>
<tr>
<td>309.0</td>
<td>0</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
<td>X</td>
</tr>
<tr>
<td>311.0</td>
<td>0</td>
<td>0</td>
<td>X</td>
<td>0</td>
<td>X</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>X</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>X</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>312.0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>X</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>X</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>313.0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>X</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>X</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>400.0</td>
<td>40</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>450.0</td>
<td>40</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>500.0</td>
<td>40</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>511.0</td>
<td>40</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>512.0</td>
<td>40</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>600.0</td>
<td>42</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>650.0</td>
<td>42</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>700.0</td>
<td>42</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>709.0</td>
<td>42</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>710.0</td>
<td>42</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>800.0</td>
<td>44</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>2</td>
<td></td>
<td></td>
</tr>
<tr>
<td>850.0</td>
<td>44</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>2</td>
<td></td>
<td></td>
</tr>
<tr>
<td>900.0</td>
<td>44</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>2</td>
<td></td>
<td></td>
</tr>
<tr>
<td>909.0</td>
<td>44</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>4</td>
<td></td>
<td></td>
</tr>
<tr>
<td>910.0</td>
<td>44</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>4</td>
<td></td>
<td></td>
</tr>
<tr>
<td>911.0</td>
<td>44</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>4</td>
<td></td>
<td></td>
</tr>
<tr>
<td>912.0</td>
<td>44</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>4</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1000.0</td>
<td>48</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>4</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1050.0</td>
<td>48</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>4</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1100.0</td>
<td>48</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>4</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1109.0</td>
<td>48</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1111.0</td>
<td>48</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1112.0</td>
<td>48</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1200.0</td>
<td>50</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1250.0</td>
<td>50</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1300.0</td>
<td>50</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>1309.0</td>
<td>50</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>2</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

```
TIME   ^ibus  ^immed  ^alu_op  ^shift_op  ^clc  ^instr
       ^read  ^halt  ^load  ^push  ^mult  ^jump  ^src_dest
       ^reg  ^nop  ^store  ^pop  ^stc  ^illegal
```
The Design and Implementation of an 8 bit CMOS microprocessor  J. Correll  6/29/92  -D-3-
ALU Logical Simulation Results:

0.0 0 00 00 X XX ADD 200.0 1 00 00 8 80 SUB
5.0 0 00 00 X XX 207.0 1 00 00 8 XX
12.0 0 00 00 X XX 208.0 1 00 00 8 FF
13.0 0 00 00 X XX 209.0 1 00 00 8 FE
16.0 0 00 00 X 00 213.0 1 00 00 8 FC
19.0 0 00 00 X 00 217.0 1 00 00 8 F8
20.0 0 00 00 4 00 221.0 1 00 00 8 F0
100.0 0 FF FF 4 00 225.0 1 00 00 8 E0
104.0 0 FF FF 6 00 229.0 1 00 00 8 C0
107.0 0 FF FF X XX 233.0 1 00 00 8 80
109.0 0 FF FF 6 00 237.0 1 00 00 0 00
110.0 0 FF FF X 00 240.0 1 00 00 4 00
111.0 0 FF FF X XX 250.0 1 FF FF 4 00
112.0 0 FF FF X XX 257.0 1 FF FF X XX
113.0 0 FF FF X FE 259.0 1 FF FF 4 00
114.0 0 FF FF X FE 260.0 1 FF FF X 00
116.0 0 FF FF X FE 262.0 1 FF FF 4 00
118.0 0 FF FF 8 FE 300.0 1 01 AA 4 00
150.0 0 00 80 8 FE 308.0 1 01 AA X XX
158.0 0 00 80 X XX 309.0 1 01 AA X XX
159.0 0 00 80 X XE 310.0 1 01 AA C AB
160.0 0 00 80 8 FE 311.0 1 01 AA C XX
161.0 0 00 80 X X0 312.0 1 01 AA X XX
162.0 0 00 80 8 XX 313.0 1 01 AA 8 FF
163.0 0 00 80 8 80 <-- 316.0 1 01 AA X XX
163.0 0 00 80 8 80 <-- 318.0 1 01 AA 0 57 <--

Column order for both the ADD (left) and SUB (right) listings: TIME, func, a, b, psw, out
Appendix E:

During the course of the design, some circuit parts were needed that did not exist in the Library. One of these parts was the TRILAT (Tri-State Latch), a combination of a one-bit latch and a tri-state buffer.

The performance of the circuit is discussed here; however, not all of the variables present on the schematic and the symbol are covered. For more information on these, refer to the Rochester Institute of Technology's CMOS Standard Cell Library Administrator's Manual by Larry Rubin et al.

Figure E-1 Logic Symbol of Trilat Circuit
The schematic of the circuit is shown in Figure E-2. Various sizes of the circuit were designed; however, only the 'C' size is used in the microprocessor.

The captured trace from a Quicksim logical simulation of the Trilat circuit shows that the output of the circuit is in the high-impedance condition when the e (enable) signal is low. When the e line is brought high, the circuit drives its stored contents onto the output as a strong signal.
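The enable behavior described above can be illustrated with a small behavioral sketch. It is not the thesis's model: the class name `TriStateLatch`, the `load` method, and the symbols `"Z"` and `"X"` are assumptions made for illustration, and the latch's own clocking is deliberately left out.

```python
from typing import Optional, Union

HIGH_Z = "Z"    # assumed symbol for the high-impedance condition
UNKNOWN = "X"   # assumed symbol for a latch that was never loaded


class TriStateLatch:
    """Hypothetical one-bit latch followed by a tri-state buffer."""

    def __init__(self) -> None:
        self.stored: Optional[int] = None

    def load(self, d: int) -> None:
        """Store one bit (how the real latch is gated is not modeled here)."""
        if d not in (0, 1):
            raise ValueError("d must be 0 or 1")
        self.stored = d

    def output(self, e: int) -> Union[int, str]:
        """Return the pin value: high impedance when e is low, else the stored bit."""
        if not e:
            return HIGH_Z
        return UNKNOWN if self.stored is None else self.stored


latch = TriStateLatch()
latch.load(1)
print(latch.output(0), latch.output(1))
```

With the enable low the sketch reports high impedance regardless of the stored bit, which is the property the simulation trace is said to show.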
Figure E-4 Transistor Level Schematic of Tri-State Latch for Physical Simulation

A trace of an Accusim physical simulation of the C-sized Trilat circuit driving a 0.5 pF capacitive load at 25 °C is shown in Figure E-5. The high and low delay times were measured from 50% of the maximum input signal to 50% of the maximum output signal. Using this technique, the high delay time of the circuit was 3.13 ns and the low delay time was 1.70 ns.
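The 50%-to-50% measurement can be sketched on sampled waveforms as follows. This is only an illustration of the technique: the function names, the linear interpolation between samples, and the sample data are assumptions, and the numbers are not taken from the Accusim run.

```python
def crossing_time(times, values, level, rising):
    """First time the sampled waveform crosses `level` in the given direction.

    Linear interpolation between adjacent samples is assumed.
    """
    samples = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        crosses_up = rising and (v0 < level <= v1)
        crosses_down = (not rising) and (v0 > level >= v1)
        if crosses_up or crosses_down:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(times, v_in, v_out, v_max, in_rising, out_rising):
    """Delay from 50% of the input swing to 50% of the output swing."""
    half = 0.5 * v_max
    t_in = crossing_time(times, v_in, half, in_rising)
    t_out = crossing_time(times, v_out, half, out_rising)
    return t_out - t_in


# Made-up sample data (ns, volts), not the thesis waveforms.
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
vin = [0.0, 5.0, 5.0, 5.0, 5.0, 5.0]
vout = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
delay = propagation_delay(t, vin, vout, 5.0, in_rising=True, out_rising=True)
print(delay)
```

The same function gives the low delay time when `out_rising` is set to `False` and the output waveform falls.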
Figure E-5  Simulation Results of Trilat
Appendix F:

The following listings are test programs used in the verification of the processor behavioral and logic circuit models. The programs used are presented first. These listings are in the format memory address/opcode. Next to the first byte of each instruction is a mnemonic comment describing the instruction and the expected result.

These program listings are only a fraction of the ones used in testing; however, they are a good representation of the kinds of tests performed.
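As an aid to reading the listings, the following sketch parses lines of the form `address/byte;  #comment` into a memory image. The function name `parse_listing` and the regular expression are assumptions made for illustration; the sample text is the start of the compare test program given later in this appendix.

```python
import re

# Two hex fields separated by "/", terminated by ";", then an optional comment.
LINE = re.compile(r"^\s*([0-9a-fA-F]{1,2})/([0-9a-fA-F]{2});\s*(.*)$")


def parse_listing(text):
    """Return (memory, notes): address -> byte and address -> comment text."""
    memory, notes = {}, {}
    for raw in text.splitlines():
        match = LINE.match(raw)
        if match is None:
            continue  # header comments and blank lines are skipped
        addr = int(match.group(1), 16)
        memory[addr] = int(match.group(2), 16)
        comment = match.group(3).lstrip("#").strip()
        if comment:
            notes[addr] = comment
    return memory, notes


sample = """\
# start of the compare test program
00/01;
01/08; #ld a, #$05
02/00;
03/05;
"""
memory, notes = parse_listing(sample)
print(memory, notes)
```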

```
# This program tests the loading and storing
# instructions for all modes of addressing.
#
00/01;
01/08;  #lda #$25  a<-0x25
02/00;  
03/25;  
04/10;  #sta #$a0  [$a0]<-0x25
05/00;  
06/a0;  
07/08;  #ldb #$a0  b<-a0
08/10;  
09/a0;  
0a/10;  #stb #$b0  [$b0]<-a0
0b/10;  
0c/b0;  
0d/0a;  #ldc $b0  c<-[b0]<-a0
0e/20;  
0f/b0;  
10/08;  #ldd #$e0  d<-0xe0
11/30;  
12/e0;  
13/12;  #std $a0  [$a0]<-e0 = [0x25]<-e0
14/30;  
15/a0;  
16/0a;  #lda $a0  a<-[25]<-e0
17/00;  
```
# 250.000: /CONTROL_UNIT:: State A
# 500.000: /CONTROL_UNIT:: Reset activated
# 750.000: /CONTROL_UNIT:: State T
# 1250.000: /CONTROL_UNIT:: State T
# 1750.000: /CONTROL_UNIT:: State T
# 2250.000: /CONTROL_UNIT:: State T
# 2750.000: /CONTROL_UNIT:: State T
# 3250.000: /CONTROL_UNIT:: State T
# 3750.000: /CONTROL_UNIT:: State T
# 4250.000: /CONTROL_UNIT:: State T
# 4750.000: /CONTROL_UNIT:: State T
# 5250.000: /CONTROL_UNIT:: State R
# 5500.000: /INSTR_REG:: instr = 0x08
# 5500.000: /INSTR_REG:: instr(2:0)=0x01
# 5500.000: /INSTR_REG:: decode type = 0x00
# 5750.000: /INSTR_REG:: instr = 0x08
# 5750.000: /INSTR_REG:: instr(2:0)=0x01
# 5750.000: /INSTR_REG:: decode type = 0x00
# 5750.000: /CONTROL_UNIT:: state U
# 6250.000: /CONTROL_UNIT:: State W
# 6750.000: /CONTROL_UNIT:: State W
# 7250.000: /CONTROL_UNIT:: State W
# 7750.000: /CONTROL_UNIT:: State W
# 8250.000: /CONTROL_UNIT:: State W
# 8750.000: /CONTROL_UNIT:: State W
# 9250.000: /MDR:: reading Data=0x00
# 9250.000: /CONTROL_UNIT:: State W
# 9500.000: /MDR:: reading Data=0x00
# 9750.000: /MDR:: reading Data=0x00
# 9750.000: /CONTROL_UNIT:: State Y
# 10250.000: /CONTROL_UNIT:: State Y
# 10750.000: /CONTROL_UNIT:: State Y
# 11250.000: /CONTROL_UNIT:: State V
# 11750.000: /CONTROL_UNIT:: State V
# 12250.000: /CONTROL_UNIT:: State V
# 12750.000: /CONTROL_UNIT:: State V
# 13250.000: /CONTROL_UNIT:: State V
# 13750.000: /CONTROL_UNIT:: State V
# 14250.000: /CONTROL_UNIT:: State V
# 14750.000: /CONTROL_UNIT:: State V
# 15250.000: /CONTROL_UNIT:: State E
# 15255.000: REGA:: reading Data = 0x25
<table>
<thead>
<tr>
<th>Time</th>
<th>Event Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>125750.000</td>
<td>/CONTROL_UNIT:: State F</td>
</tr>
<tr>
<td>126000.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>126000.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>126250.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>126250.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>126250.000</td>
<td>/CONTROL_UNIT:: State F</td>
</tr>
<tr>
<td>126500.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>126500.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>126750.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>126750.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>126750.000</td>
<td>/CONTROL_UNIT:: State F</td>
</tr>
<tr>
<td>127000.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>127000.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>127250.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>127250.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>127250.000</td>
<td>/CONTROL_UNIT:: State F</td>
</tr>
<tr>
<td>127500.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>127500.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>127750.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>127750.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>127750.000</td>
<td>/CONTROL_UNIT:: State F</td>
</tr>
<tr>
<td>128000.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>128000.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>128250.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>128250.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>128250.000</td>
<td>/CONTROL_UNIT:: State F</td>
</tr>
<tr>
<td>128500.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>128500.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>128750.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>128750.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>128750.000</td>
<td>/CONTROL_UNIT:: State F</td>
</tr>
<tr>
<td>129000.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>129000.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>129250.000</td>
<td>/MDR:: writing Data=0x25</td>
</tr>
<tr>
<td>129250.000</td>
<td>/REGD:: Writing 0xe0 on bus1</td>
</tr>
<tr>
<td>129250.000</td>
<td>/CONTROL_UNIT:: State A</td>
</tr>
<tr>
<td>129750.000</td>
<td>/CONTROL_UNIT:: State A</td>
</tr>
<tr>
<td>130250.000</td>
<td>/CONTROL_UNIT:: State A</td>
</tr>
<tr>
<td>130750.000</td>
<td>/CONTROL_UNIT:: State Q</td>
</tr>
<tr>
<td>131250.000</td>
<td>/CONTROL_UNIT:: State Q</td>
</tr>
<tr>
<td>131750.000</td>
<td>/CONTROL_UNIT:: State Q</td>
</tr>
<tr>
<td>132250.000</td>
<td>/CONTROL_UNIT:: State Q</td>
</tr>
<tr>
<td>132750.000</td>
<td>/CONTROL_UNIT:: State Q</td>
</tr>
<tr>
<td>133250.000</td>
<td>/CONTROL_UNIT:: State Q</td>
</tr>
<tr>
<td>133750.000</td>
<td>/CONTROL_UNIT:: State Q</td>
</tr>
<tr>
<td>134250.000</td>
<td>/CONTROL_UNIT:: State R</td>
</tr>
<tr>
<td>134500.000</td>
<td>/INSTR_REG:: instr = 0x0a</td>
</tr>
</tbody>
</table>


# 149750.000: /INSTR_REG:: instr(2:0)=0x00
# 149750.000: /INSTR_REG:: decode type = 0x00
# 149750.000: /CONTROL_UNIT:: state U
# 149750.000: /CONTROL_UNIT:: Nop..next state = A
# 150250.000: /CONTROL_UNIT:: State A
# 150750.000: /CONTROL_UNIT:: State Q
This program tests the compare instruction.

The A and B registers are loaded with numbers and then the compare is performed. The results are placed in the Flags Register.

```
00/01;
01/08; #ld a, #$05
02/00;
03/05;
04/08; #ld b, #$10
05/10;
06/10;
07/7c; #cmp a,b PSW=0x08
08/01;
09/7c; #cmp b,a PSW=0x00
0a/10;
0b/78; #cmp a, #05 PSW=0x04
0c/0f;
0d/05;
0e/00;
0f/00;
```
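The expected PSW values in the compare listing can be reproduced with a short sketch. The bit positions used here (zero = 0x04, negative = 0x08) are inferred from the expected-result comments in this appendix; treating the compare as an 8-bit subtraction that sets only those two flags is an assumption, and `compare_psw` is a hypothetical name.

```python
ZERO = 0x04  # inferred from "zero=1, PSW=0x04" in the listings
NEG = 0x08   # inferred from "neg=1, PSW=0x08" in the listings


def compare_psw(x, y):
    """Flags for `cmp x, y`, modeled as the 8-bit difference x - y.

    Carry and overflow are not modeled (assumption).
    """
    diff = (x - y) & 0xFF
    psw = 0
    if diff == 0:
        psw |= ZERO
    if diff & 0x80:
        psw |= NEG
    return psw


a, b = 0x05, 0x10
print(hex(compare_psw(a, b)), hex(compare_psw(b, a)), hex(compare_psw(a, 0x05)))
```

Under these assumptions the three compares yield 0x08, 0x00 and 0x04, matching the comments in the listing.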

```
# 500.000: /BUS_UNIT:: Reset Activated
# 500.000: /BUS_UNIT:: State C
# 1000.000: /BUS_UNIT:: State C
# 1500.000: /BUS_UNIT:: state G
# 2000.000: /BUS_UNIT:: state P
# 2500.000: /BUS_UNIT:: state I
# 3000.000: /BUS_UNIT:: state O
# 3500.000: /BUS_UNIT:: state L
# 4000.000: /BUS_UNIT:: state D
# 4500.000: /BUS_UNIT:: state N
# 5000.000: /BUS_UNIT:: state N
# 5500.000: /INSTR_REG:: instr = 0x08
# 5500.000: /INSTR_REG:: instr(2:0)=0x01
# 5500.000: /INSTR_REG:: decode type = 0x00
# 5500.000: /BUS_UNIT:: state N
# 5750.000: /INSTR_REG:: instr = 0x08
# 5750.000: /INSTR_REG:: instr(2:0)=0x01
# 5750.000: /INSTR_REG:: decode type = 0x00
# 6000.000: /BUS_UNIT:: State A
# 6000.000: /BUS_UNIT:: Pre-fetch empty...
# 6000.000: /BUS_UNIT:: Fill the prefetch
```
# 58000.000: /INSTR_REG:: decode type = 0x01
# 58000.000: /BUS_UNIT:: state J
# 58250.000: /INSTR_REG:: instr = 0x78
# 58250.000: /INSTR_REG:: instr(2:0)=0x07
# 58250.000: /INSTR_REG:: decode type = 0x01
# 58500.000: /BUS_UNIT:: state F
# 58755.000: /MDR:: reading Data=0x78
# 59000.000: /MDR:: reading Data=0x78
# 59000.000: /BUS_UNIT:: State A
# 59000.000: /BUS_UNIT:: Pre-fetch empty...
# 59500.000: /BUS_UNIT:: State C
# 60000.000: /BUS_UNIT:: State C
# 60500.000: /BUS_UNIT:: state G
# 61000.000: /BUS_UNIT:: state P
# 61500.000: /BUS_UNIT:: state H
# 62000.000: /BUS_UNIT:: state J
# 62250.000: /MDR:: reading Data=0x0f
# 62500.000: /MDR:: reading Data=0x0f
# 62500.000: /BUS_UNIT:: state J
# 62750.000: /MDR:: reading Data=0x0f
# 63000.000: /BUS_UNIT:: state J
# 63500.000: /BUS_UNIT:: state F
# 64000.000: /BUS_UNIT:: State A
# 64000.000: /BUS_UNIT:: Pre-fetch empty...
# 64000.000: /BUS_UNIT:: Fill the prefetch
# 64500.000: /BUS_UNIT:: State C
# 65000.000: /BUS_UNIT:: State C
# 65500.000: /BUS_UNIT:: state G
# 66000.000: /BUS_UNIT:: state P
# 66500.000: /BUS_UNIT:: state H
# 67000.000: /BUS_UNIT:: State A
# 67000.000: /BUS_UNIT:: Q[0] = 0x05

# 67500.000: /BUS_UNIT:: state J
# 68000.000: /BUS_UNIT:: state J
# 68255.000: /MDR:: reading Data=0x05
# 68500.000: /MDR:: reading Data=0x05
# 68500.000: /BUS_UNIT:: state J
# 68750.000: /MDR:: reading Data=0x05
# 68755.000: /MDR:: writing Data=0x05
# 68755.000: REGA:: Writing 0x05 on bus1
# 69000.000: /MDR:: writing Data=0x05
# 69000.000: /FLAGS_REG:: Just took i0 as 04

# This program is used to test the functionality
# of the shifter. Also the carry bit in the
# flags register is toggled to vary the carry
# input to the shifter.
#
00/01;  
01/08;  #lda #$01
02/00;
03/01;
04/c8;  #lsla carry=0, a=#$02, PSW=0x00
05/08;  #ldb #$f1
06/10;
07/f1;
08/c9;  #lslb carry=1, b=#$e2, PSW=0x09
09/08;  #ldc #$91
0a/20;
0b/91;
0c/d2;  #asrc carry=1, c=#$c8, PSW=0x09
0d/c2;  #lsrc carry=0, c=#$64, PSW=0x00
0e/b0;  #scb carry=1
0f/e0;  #rora carry=0, a=#$32, PSW=0x00
10/b0;  #scb carry=1
11/e9;  #rolb carry=0, b=#$64, PSW=0x00
12/b0;  #scb carry=1
13/a8;  #ccb carry=0
14/00;  
15/00;  
16/00;  
17/00;  
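The expected results of the first four shift instructions in this program can be checked with the sketch below. The flag bit positions (carry = 0x01, zero = 0x04, negative = 0x08) are inferred from the expected-result comments in this appendix, the function names are hypothetical, and the rotate instructions are left out because their carry handling is not described here.

```python
CARRY, ZERO, NEG = 0x01, 0x04, 0x08  # inferred bit positions (assumption)


def _psw(result, carry):
    """Build the flag byte from an 8-bit result and the bit shifted out."""
    psw = CARRY if carry else 0
    if result == 0:
        psw |= ZERO
    if result & 0x80:
        psw |= NEG
    return psw


def lsl(x):
    """Logical shift left: bit 7 goes to carry."""
    result = (x << 1) & 0xFF
    return result, _psw(result, x & 0x80)


def lsr(x):
    """Logical shift right: bit 0 goes to carry, bit 7 becomes 0."""
    result = (x >> 1) & 0x7F
    return result, _psw(result, x & 0x01)


def asr(x):
    """Arithmetic shift right: bit 0 goes to carry, bit 7 is preserved."""
    result = ((x >> 1) | (x & 0x80)) & 0xFF
    return result, _psw(result, x & 0x01)


for name, fn, operand in [("lsla", lsl, 0x01), ("lslb", lsl, 0xF1),
                          ("asrc", asr, 0x91), ("lsrc", lsr, 0xC8)]:
    value, psw = fn(operand)
    print(name, hex(value), hex(psw))
```

The operand 0xF1 for `lslb` is the byte stored at address 07 of the listing.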

# 250.000: /CONTROL_UNIT:: State A  
# 500.000: /CONTROL_UNIT:: Reset activated  
# 500.000: /BUS_UNIT:: Reset Activated  
# 500.000: /BUS_UNIT:: State C  
# 750.000: /CONTROL_UNIT:: State T  
# 1000.000: /BUS_UNIT:: State C  
# 1250.000: /CONTROL_UNIT:: State T  
# 1500.000: /BUS_UNIT:: state G  
# 1750.000: /CONTROL_UNIT:: State T  
# 2000.000: /BUS_UNIT:: state P  
# 2250.000: /CONTROL_UNIT:: State T  
# 2500.000: /BUS_UNIT:: state I  
# 2750.000: /CONTROL_UNIT:: State T  
# 3000.000: /BUS_UNIT:: state O  
# 3250.000: /CONTROL_UNIT:: State T  
# 3500.000: /BUS_UNIT:: state L  
# 3750.000: /CONTROL_UNIT:: State T  
# 4000.000: /BUS_UNIT:: state D  

# 4250.000: /CONTROL_UNIT:: State T
# 4500.000: /BUS_UNIT:: state N
# 4750.000: /CONTROL_UNIT:: State T
# 5000.000: /BUS_UNIT:: state N
# 5250.000: /CONTROL_UNIT:: State R
# 5500.000: /INSTR_REG:: instr = 0x08
# 5500.000: /INSTR_REG:: instr(2:0)=0x01
# 5500.000: /INSTR_REG:: decode type = 0x00
# 5500.000: /BUS_UNIT:: state N
# 5750.000: /INSTR_REG:: instr = 0x08
# 5750.000: /INSTR_REG:: instr(2:0)=0x01
# 5750.000: /INSTR_REG:: decode type = 0x00
# 5750.000: /CONTROL_UNIT:: state U
# 6000.000: /BUS_UNIT:: State A
# 6000.000: /BUS_UNIT:: Pre-fetch empty...
# 6000.000: /BUS_UNIT:: Fill the prefetch
# 6250.000: /CONTROL_UNIT:: State W
# 6750.000: /CONTROL_UNIT:: State W
# 7250.000: /CONTROL_UNIT:: State W
# 7750.000: /CONTROL_UNIT:: State W
# 8250.000: /CONTROL_UNIT:: State W
# 8750.000: /CONTROL_UNIT:: State W
# 9000.000: /BUS_UNIT:: state J
# 9250.000: /MDR:: reading Data=0x00
# 9250.000: /CONTROL_UNIT:: State W
# 9500.000: /MDR:: reading Data=0x00
# 9500.000: /BUS_UNIT:: state J
# 9750.000: /MDR:: reading Data=0x00
# 9750.000: /CONTROL_UNIT:: State Y
# 10000.000: /BUS_UNIT:: state J
# 10250.000: /CONTROL_UNIT:: State Y
# 10500.000: /BUS_UNIT:: state F
# 10750.000: /CONTROL_UNIT:: State Y
# 11000.000: /BUS_UNIT:: State A
# 11000.000: /BUS_UNIT:: Pre-fetch empty...
# 11000.000: /BUS_UNIT:: Fill the prefetch
# 11250.000: /CONTROL_UNIT:: State V
# 11500.000: /BUS_UNIT:: State C
# 11750.000: /CONTROL_UNIT:: State V
# 12000.000: /BUS_UNIT:: State C
# 12250.000: /CONTROL_UNIT:: State V
# 12500.000: /BUS_UNIT:: state G
# 12750.000: /CONTROL_UNIT:: State V
# 13000.000: /BUS_UNIT:: state P
# 13250.000: /CONTROL_UNIT:: State V
# 13500.000: /BUS_UNIT:: state H
# 13750.000: /CONTROL_UNIT:: State V
# 22000.000: /BUS_UNIT:: State A
# 22000.000: /BUS_UNIT:: Pre-fetch empty...
# 22000.000: /BUS_UNIT:: Fill the prefetch
# 22005.000: REGA:: reading Data = 0x02
# 22250.000: REGA:: reading Data = 0x02
# 22250.000: /FLAGS_REG:: Just took i0 as 00
# 22250.000: /CONTROL_UNIT:: State A
# 22500.000: /BUS_UNIT:: State C
# 22750.000: /CONTROL_UNIT:: State Q
# 23000.000: /BUS_UNIT:: State C
# 23250.000: /CONTROL_UNIT:: State Q
# 23500.000: /BUS_UNIT:: state G
# 23750.000: /CONTROL_UNIT:: State Q
# 24000.000: /BUS_UNIT:: state P
# 24250.000: /CONTROL_UNIT:: State Q
# 24500.000: /BUS_UNIT:: state H
# 24750.000: /CONTROL_UNIT:: State Q
# 25000.000: /BUS_UNIT:: State A
# 25000.000: /BUS_UNIT:: Q[0] = 0x08
# 25250.000: /CONTROL_UNIT:: State Q
# 25500.000: /BUS_UNIT:: state J
# 25750.000: /CONTROL_UNIT:: State Q
# 26000.000: /BUS_UNIT:: state J
# 26250.000: /CONTROL_UNIT:: State R
# 26500.000: /INSTR_REG:: instr = 0x08
# 26500.000: /INSTR_REG:: instr(2:0)=0x01
# 26500.000: /INSTR_REG:: decode type = 0x00
# 26500.000: /BUS_UNIT:: state J
# 26750.000: /INSTR_REG:: instr = 0x08
# 26750.000: /INSTR_REG:: instr(2:0)=0x01
# 26750.000: /INSTR_REG:: decode type = 0x00
# 26750.000: /CONTROL_UNIT:: state U
# 27000.000: /BUS_UNIT:: state F
# 27250.000: /CONTROL_UNIT:: State W
# 27255.000: /MDR:: reading Data=0x08
# 27500.000: /MDR:: reading Data=0x08
# 27500.000: /BUS_UNIT:: State A
# 27500.000: /BUS_UNIT:: Pre-fetch empty...
# 27750.000: /CONTROL_UNIT:: State W
# 28250.000: /CONTROL_UNIT:: State W
# 28750.000: /CONTROL_UNIT:: State W
# 29250.000: /CONTROL_UNIT:: State W
# 29750.000: /CONTROL_UNIT:: State W
# 30000.000: /BUS_UNIT:: state H
# 30250.000: /CONTROL_UNIT:: State W
# 30500.000: /BUS_UNIT:: state J
# 86250.000: /INSTR_REG:: instr = 0xb0
# 86250.000: /INSTR_REG:: instr(2:0)=0x06
# 86250.000: /INSTR_REG:: decode type = 0x02
# 86250.000: /CONTROL_UNIT:: state U
# 86500.000: /BUS_UNIT:: state F
# 86750.000: /CONTROL_UNIT:: State M
# 87000.000: /FLAGS_REG:: Data now = 0x01
# 87000.000: /BUS_UNIT:: State A
# 87000.000: /BUS_UNIT:: Pre-fetch empty...
# 87000.000: /BUS_UNIT:: Fill the prefetch
# 87250.000: /FLAGS_REG:: Data now = 0x01
# 87250.000: /CONTROL_UNIT:: State A
# 87500.000: /BUS_UNIT:: State C
# 87750.000: /CONTROL_UNIT:: State Q
# 88000.000: /BUS_UNIT:: State C
# 88250.000: /CONTROL_UNIT:: State Q
# 88500.000: /BUS_UNIT:: state G
# 88750.000: /CONTROL_UNIT:: State Q
# 89000.000: /BUS_UNIT:: state P
# 89250.000: /CONTROL_UNIT:: State Q
# 89500.000: /BUS_UNIT:: state H
# 89750.000: /CONTROL_UNIT:: State Q
# 90000.000: /BUS_UNIT:: State A
# 90000.000: /BUS_UNIT:: Q[0] = 0xe9

# 90250.000: /CONTROL_UNIT:: State Q
# 90500.000: /BUS_UNIT:: state J
# 90750.000: /CONTROL_UNIT:: State Q
# 91000.000: /BUS_UNIT:: state J
# 91250.000: /CONTROL_UNIT:: State R
# 91500.000: /INSTR_REG:: instr = 0xe9
# 91500.000: /INSTR_REG:: instr(2:0)=0x05
# 91500.000: /INSTR_REG:: decode type = 0x03
# 91500.000: /BUS_UNIT:: state J
# 91750.000: /INSTR_REG:: instr = 0xe9
# 91750.000: /INSTR_REG:: instr(2:0)=0x05
# 91750.000: /INSTR_REG:: decode type = 0x03
# 91750.000: /CONTROL_UNIT:: state U
# 92000.000: /BUS_UNIT:: state F
# 92250.000: /CONTROL_UNIT:: State G
# 92255.000: REGC:: Writing 0x32 on bus0
# 92500.000: REGC:: Writing 0x32 on bus0
# 92500.000: /BUS_UNIT:: State A
# 92500.000: /BUS_UNIT:: Pre-fetch empty...
# 92500.000: /BUS_UNIT:: Fill the prefetch
# 92750.000: /FLAGS_REG:: Just took i0 as 00
# 92750.000: REGC:: reading Data = 0x64

This program is used to test the unconditional jump instruction. The choice of the mult instruction is not important.

00/01;
01/08;  #lda #$02
02/00;
03/02;
04/08;  #ldb #$03
05/10;
06/03;
07/b8;  #jump $0a
08/0a;
09/a0;  #halt
0a/98;  #mult c=#$00, d=#$06
0b/00;
0c/00;
0d/00;
0e/00;
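The expected result noted beside the `mult` instruction (c=#$00, d=#$06 after loading a=#$02 and b=#$03) is consistent with a 16-bit product split across two registers. The sketch below assumes that reading, with the high byte in c and the low byte in d; `mult8` is a hypothetical name and not the thesis's multiplier model.

```python
def mult8(a, b):
    """Unsigned 8 x 8 multiply returning (high byte, low byte) of the product.

    Placing the high byte in c and the low byte in d is an assumption
    based on the expected-result comment in the listing.
    """
    product = (a & 0xFF) * (b & 0xFF)
    return (product >> 8) & 0xFF, product & 0xFF


c, d = mult8(0x02, 0x03)
print(hex(c), hex(d))
```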

# 500.000: /BUS_UNIT:: Reset Activated
# 1000.000: /BUS_UNIT:: State C
# 2000.000: /BUS_UNIT:: State C
# 3000.000: /BUS_UNIT:: state G
# 4000.000: /BUS_UNIT:: state P
# 5000.000: /BUS_UNIT:: state I
# 6000.000: /BUS_UNIT:: state O
# 7000.000: /BUS_UNIT:: state L
# 8000.000: /BUS_UNIT:: state D
# 9000.000: /BUS_UNIT:: state N
# 10000.000: /BUS_UNIT:: state N
# 11000.000: /INSTR_REG:: instr = 0x08
# 11000.000: /INSTR_REG:: instr(2:0)=0x01
# 11000.000: /INSTR_REG:: decode type = 0x00
# 11000.000: /BUS_UNIT:: state N
# 11500.000: /INSTR_REG:: instr = 0x08
# 11500.000: /INSTR_REG:: instr(2:0)=0x01
# 11500.000: /INSTR_REG:: decode type = 0x00
# 12000.000: /BUS_UNIT:: State A
# 12000.000: /BUS_UNIT:: Pre-fetch empty...
# 12000.000: /BUS_UNIT:: Fill the prefetch
# 13000.000: /BUS_UNIT:: State C
# 14000.000: /BUS_UNIT:: State C
# 74000.000: /INSTR_REG:: decode type = 0x02
# 74000.000: /BUS_UNIT:: state J
# 74500.000: /INSTR_REG:: instr = 0xb8
# 74500.000: /INSTR_REG:: instr(2:0)=0x07
# 74500.000: /INSTR_REG:: decode type = 0x02
# 75000.000: /BUS_UNIT:: state F
# 75505.000: /MDR:: reading Data=0xb8
# 76000.000: /MDR:: reading Data=0xb8
# 76000.000: /BUS_UNIT:: State A
# 76000.000: /BUS_UNIT:: Pre-fetch empty...
# 77000.000: /BUS_UNIT:: State C
# 78000.000: /BUS_UNIT:: State C
# 79000.000: /BUS_UNIT:: state G
# 80000.000: /BUS_UNIT:: state P
# 81000.000: /BUS_UNIT:: state H
# 82000.000: /BUS_UNIT:: state J
# 82500.000: /MDR:: reading Data=0x0a
# 83000.000: /MDR:: reading Data=0x0a
# 83000.000: /BUS_UNIT:: state J
# 83500.000: /MDR:: reading Data=0x0a
# 84000.000: /BUS_UNIT:: state J
# 85000.000: /BUS_UNIT:: state F
# 86000.000: /BUS_UNIT:: State A
# 86000.000: /BUS_UNIT:: Pre-fetch empty...
# 86000.000: /BUS_UNIT:: Fill the prefetch
# 86505.000: /MDR:: writing Data=0x0a
# 87000.000: /MDR:: writing Data=0x0a
# 87000.000: /BUS_UNIT:: State C
# 87500.000: /MDR:: writing Data=0x0a
# 88000.000: /MDR:: writing Data=0x0a
# 88000.000: /BUS_UNIT:: State C
# 88500.000: /MDR:: writing Data=0x0a
# 89000.000: /MDR:: writing Data=0x0a
# 89000.000: /BUS_UNIT:: state G
# 89500.000: /MDR:: writing Data=0x0a
# 90000.000: /MDR:: writing Data=0x0a
# 90000.000: /BUS_UNIT:: state P
# 90500.000: /MDR:: writing Data=0x0a
# 91000.000: /MDR:: writing Data=0x0a
# 91000.000: /BUS_UNIT:: state H
# 91500.000: /MDR:: writing Data=0x0a
# 92000.000: /MDR:: writing Data=0x0a
# 92000.000: /BUS_UNIT:: State A
# 92000.000: /BUS_UNIT:: Q[0] = 0xa0
# 92500.000: /MDR:: writing Data=0x0a
# 93000.000: /MDR:: writing Data=0x0a
# This program tests whether or not the conditional
# jumps work correctly. None of the jumps should
# occur since the Flags Register is set to $00.
#
00/01;  # ld flags #$00
02/00;
04/b9;  # jmpz #$aa (jump if z=1)
05/aa;
07/ab;  # jmpo #$ab (jump if over/under)
09/ac;
0b/ad;  # jmpc #$ac (jump if carry)
0d/00;
0e/00;
0f/00;
10/00;
11/00;
aa/a0;
ab/a0;
ac/a0;
ad/a0;

# 500.000: /BUS_UNIT:: Reset Activated
# 500.000: /BUS_UNIT:: State C
# 1000.000: /BUS_UNIT:: State C
# 1500.000: /BUS_UNIT:: state G
# 2000.000: /BUS_UNIT:: state P
# 2500.000: /BUS_UNIT:: state I
# 3000.000: /BUS_UNIT:: state O
# 3500.000: /BUS_UNIT:: state L
# 4000.000: /BUS_UNIT:: state D
# 4500.000: /BUS_UNIT:: state N
# 5000.000: /BUS_UNIT:: state N
# 5500.000: /INSTR_REG:: instr = 0x08
# 5500.000: /INSTR_REG:: instr(2:0)=0x01
# 5500.000: /INSTR_REG:: decode type = 0x00
# 5500.000: /BUS_UNIT:: state N
# 5750.000: /INSTR_REG:: instr = 0x08
# 5750.000: /INSTR_REG:: instr(2:0)=0x01
# 5750.000: /INSTR_REG:: decode type = 0x00
# 75000.000: /BUS_UNIT:: State A
# 75000.000: /BUS_UNIT:: Q[0] = 0x00  Q[1] = 0x00  Q[2] = 0x00
# This program tests the conditional jump
# instructions. All of the jumps should be
# successful since the flags register is set
# to all true.
#
00/01;
01/08;  #ld flags #$0f
02/40;
03/0f;
04/b9;  #jmpz $07
05/07;
06/a0;  #halt
07/ba;  #jmpo $0a
08/0a;
09/a0;  #halt
0a/bb;  #jmpc $0d
0b/0d;
0c/a0;  #halt
0d/bc;  #jmpn $10
0e/10;
0f/a0;  #halt
10/00;  #nop
11/a0;  #halt (goal)
12/00;
13/00;
14/00;
15/00;
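The intended control flow of this program can be followed with a very small interpreter sketch. Several things here are assumptions rather than facts from the thesis: the instruction lengths (3 bytes for the load, 2 for the jumps, 1 otherwise), the flag bit tested by each jump (zero = 0x04, carry = 0x01 and negative = 0x08 are inferred from the listings, overflow = 0x02 is assumed by elimination), the use of 0x40 as the flags register code, and the treatment of the byte at address 00 as a one-byte instruction with no effect.

```python
HALT, LOAD_IMM, FLAGS_REG = 0xA0, 0x08, 0x40
# Opcode -> flag mask tested (see the assumptions in the lead-in).
COND_JUMPS = {0xB9: 0x04, 0xBA: 0x02, 0xBB: 0x01, 0xBC: 0x08}


def run(memory, max_steps=100):
    """Return the address of the halt instruction that stops the program."""
    pc, flags = 0, 0
    for _ in range(max_steps):
        op = memory.get(pc, 0x00)
        if op == HALT:
            return pc
        if op == LOAD_IMM:
            if memory[pc + 1] == FLAGS_REG:
                flags = memory[pc + 2]
            pc += 3
        elif op in COND_JUMPS:
            taken = flags & COND_JUMPS[op]
            pc = memory[pc + 1] if taken else pc + 2
        else:
            pc += 1  # nop, or any byte this sketch does not model
    raise RuntimeError("no halt reached")


program = dict(enumerate(bytes.fromhex(
    "01 08 40 0f b9 07 a0 ba 0a a0 bb 0d a0 bc 10 a0 00 a0")))
print(hex(run(program)))
```

With all four flags set, the sketch takes every jump and stops at the halt marked "(goal)" at address 0x11; if the loaded value is changed to 0x00 it stops at the first halt, at address 0x06.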

# 500.000: /BUS_UNIT:: Reset Activated
# 500.000: /BUS_UNIT:: State C
# 1000.000: /BUS_UNIT:: State C
# 1500.000: /BUS_UNIT:: state G
# 2000.000: /BUS_UNIT:: state P
# 2500.000: /BUS_UNIT:: state I
# 3000.000: /BUS_UNIT:: state O
# 3500.000: /BUS_UNIT:: state L
# 4000.000: /BUS_UNIT:: state D
# 4500.000: /BUS_UNIT:: state N
# 5000.000: /BUS_UNIT:: state N
# 5500.000: /INSTR_REG:: instr = 0x08
# 5500.000: /INSTR_REG:: instr(2:0)=0x01
# 5500.000: /INSTR_REG:: decode type = 0x00
# 5500.000: /BUS_UNIT:: state N
# 5750.000: /INSTR_REG:: instr = 0x08
# 5750.000: /INSTR_REG:: instr(2:0)=0x01
# 61000.000: /BUS_UNIT:: State A
# 61000.000: /BUS_UNIT:: Q[2] = 0xa0
# 61250.000: /MDR:: writing Data=0x0d
# 61500.000: /MDR:: writing Data=0x0d
# 61500.000: /BUS_UNIT:: state B
# 61500.000: /BUS_UNIT:: New addr = d
# 61750.000: /MDR:: writing Data=0x0d
# 62000.000: /MDR:: writing Data=0x0d
# 62000.000: /BUS_UNIT:: state B
# 62000.000: /BUS_UNIT:: New addr = d
# 62250.000: /MDR:: writing Data=0x0d
# 62500.000: /BUS_UNIT:: state B
# 63000.000: /BUS_UNIT:: State A
# 63000.000: /BUS_UNIT:: Pre-fetch empty...
# 63000.000: /BUS_UNIT:: Fill the prefetch
# 63500.000: /BUS_UNIT:: State C
# 64000.000: /BUS_UNIT:: State C
# 64500.000: /BUS_UNIT:: state G
# 65000.000: /BUS_UNIT:: state P
# 65500.000: /BUS_UNIT:: state H
# 66000.000: /BUS_UNIT:: State A
# 66000.000: /BUS_UNIT:: Q[0] = 0xbc
# 66500.000: /BUS_UNIT:: state J
# 67000.000: /BUS_UNIT:: state J
# 67500.000: /INSTR_REG:: instr = 0xbc
# 67500.000: /INSTR_REG:: instr(2:0) = 0x07
# 67500.000: /INSTR_REG:: decode type = 0x02
# 67500.000: /BUS_UNIT:: state J
# 67750.000: /INSTR_REG:: instr = 0xbc
# 67750.000: /INSTR_REG:: instr(2:0) = 0x07
# 67750.000: /INSTR_REG:: decode type = 0x02
# 68000.000: /BUS_UNIT:: state F
# 68255.000: /MDR:: reading Data=0xbc
# 68500.000: /MDR:: reading Data=0xbc
# 68500.000: /BUS_UNIT:: State A
# 68500.000: /BUS_UNIT:: Pre-fetch empty...
# 69000.000: /BUS_UNIT:: State C
# 69500.000: /BUS_UNIT:: State C
# 70000.000: /BUS_UNIT:: state G
# 70500.000: /BUS_UNIT:: state P
# 71000.000: /BUS_UNIT:: state H
# 71500.000: /BUS_UNIT:: state J
# 93500.000: /BUS_UNIT:: state G
# 94000.000: /BUS_UNIT:: state P
# 94500.000: /BUS_UNIT:: state H
# 95000.000: /BUS_UNIT:: State A
# 95000.000: /BUS_UNIT:: Q[2] = 0x00  Q[3] = 0x00

# 95000.000: /BUS_UNIT:: Fill the prefetch
# 95500.000: /BUS_UNIT:: State C
# 96000.000: /BUS_UNIT:: State C
# 96500.000: /BUS_UNIT:: state G
# 97000.000: /BUS_UNIT:: state P
# 97500.000: /BUS_UNIT:: state H
# 98000.000: /BUS_UNIT:: State A
# 98000.000: /BUS_UNIT:: Q[2] = 0x00  Q[3] = 0x00  Q[0] = 0x00

# 98500.000: /BUS_UNIT:: State A
# 98500.000: /BUS_UNIT:: Q[2] = 0x00  Q[3] = 0x00  Q[0] = 0x00

# 99000.000: /BUS_UNIT:: State A
# 99000.000: /BUS_UNIT:: Q[2] = 0x00  Q[3] = 0x00  Q[0] = 0x00

# This program tests the SUB, NOT and ADD instructions.
#
00/01;
01/08;  #lda #$ff
02/00;
03/ff;
04/54;  #not  a=#$00, zero=1, PSW=0x04
05/0F;
06/40;  #adda #$80  a=#$80, neg=1, PSW=0x08
07/00;
08/80;
09/48;  #suba #$70  a=#$10, zero=neg=0, PSW=0x00
0a/00;
0b/70;
0c/08;  #ldb #$ff  b=#$ff
0d/10;
0e/ff;
0f/40;  #addb #$01  b=#$00, zero=carry=1, PSW=0x05
10/10;
11/01;
12/00;
13/00;
14/00;
15/00;
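The expected values in this program can be traced with the sketch below. The flag bit positions (carry = 0x01, zero = 0x04, negative = 0x08) are inferred from the expected-result comments; setting carry only on a carry out of an addition, and leaving overflow unmodeled, are assumptions chosen because they agree with those comments. The function names are hypothetical.

```python
CARRY, ZERO, NEG = 0x01, 0x04, 0x08  # inferred bit positions (assumption)


def _psw(result, carry=False):
    """Flag byte for an 8-bit result."""
    psw = CARRY if carry else 0
    if result == 0:
        psw |= ZERO
    if result & 0x80:
        psw |= NEG
    return psw


def not8(x):
    result = ~x & 0xFF
    return result, _psw(result)


def add8(x, y):
    total = x + y
    result = total & 0xFF
    return result, _psw(result, total > 0xFF)


def sub8(x, y):
    result = (x - y) & 0xFF
    return result, _psw(result)  # borrow is not reported (assumption)


a, psw = not8(0xFF)
print(hex(a), hex(psw))
a, psw = add8(a, 0x80)
print(hex(a), hex(psw))
a, psw = sub8(a, 0x70)
print(hex(a), hex(psw))
b, psw = add8(0xFF, 0x01)
print(hex(b), hex(psw))
```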

# 250.000: /CONTROL_UNIT:: State A
# 500.000: /CONTROL_UNIT:: Reset activated
# 500.000: /BUS_UNIT:: Reset Activated
# 500.000: /BUS_UNIT:: State C
# 750.000: /CONTROL_UNIT:: State T
# 1000.000: /BUS_UNIT:: State C
# 1250.000: /CONTROL_UNIT:: State T
# 1500.000: /BUS_UNIT:: state G
# 1750.000: /CONTROL_UNIT:: State T
# 2000.000: /BUS_UNIT:: state P
# 2250.000: /CONTROL_UNIT:: State T
# 2500.000: /BUS_UNIT:: state I
# 2750.000: /CONTROL_UNIT:: State T
# 3000.000: /BUS_UNIT:: state O
# 3250.000: /CONTROL_UNIT:: State T
# 3500.000: /BUS_UNIT:: state L
# 3750.000: /CONTROL_UNIT:: State T
# 4000.000: /BUS_UNIT:: state D
# 4250.000: /CONTROL_UNIT:: State T
#
# This program is used to test the OR, XOR and AND instructions
#
00/01;
01/08;  #ldb #$00
02/10;
03/00;
04/60;  #orb #$ff  b=#$ff, neg=1, PSW=0x08
05/10;
06/ff;
07/68;  #xorb #$55  b=#$aa, neg=1, PSW=0x08
08/10;
09/55;
0a/58;  #andb #$44  b=#$00, zero=1, PSW=0x04
0b/10;
0c/44;
0d/08;  #ldc #$00  c=#$00
0e/20;
0f/00;
10/68;  #xorc #$aa  c=#$aa, neg=1, PSW=0x08
11/20;
12/aa;
13/00;
14/00;
15/00;
16/00;
17/00;
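
The listing format itself is simple enough to load programmatically. The Python sketch below is an illustration, not the loader used with the simulator in the thesis. It assumes each entry is a hexadecimal address, a slash, a hexadecimal byte and a semicolon, and that `#` starts a comment running to the end of the line. That reading is inferred from the listings, and `load_image` is a hypothetical name.

```python
import re

# One memory entry: "<hex address>/<hex byte>;"
ENTRY = re.compile(r"([0-9A-Fa-f]+)\s*/\s*([0-9A-Fa-f]+)\s*;")


def load_image(text):
    """Parse a memory-image listing into a dict of address -> byte value."""
    memory = {}
    for line in text.splitlines():
        code = line.split("#", 1)[0]  # drop the comment, if any
        for addr, data in ENTRY.findall(code):
            memory[int(addr, 16)] = int(data, 16)
    return memory


SAMPLE = """\
# This program is used to test the OR, XOR and AND instructions
00/01;
01/08;  #ldb #$00
02/10;
03/00;
04/60;  #orb #$ff
05/10;
06/ff;
"""

if __name__ == "__main__":
    image = load_image(SAMPLE)
    for addr in sorted(image):
        print(f"{addr:02x}: {image[addr]:02x}")
```

Because entries are found with `findall`, a line carrying several entries is handled the same way as a line carrying one.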

# 250.000: /CONTROL_UNIT:: State A
# 500.000: /CONTROL_UNIT:: Reset activated
# 500.000: /BUS_UNIT:: Reset Activated
# 500.000: /BUS_UNIT:: State C
# 750.000: /CONTROL_UNIT:: State T
# 1000.000: /BUS_UNIT:: State C
# 1250.000: /CONTROL_UNIT:: State T
# 1500.000: /BUS_UNIT:: state G
# 1750.000: /CONTROL_UNIT:: State T
# 2000.000: /BUS_UNIT:: state P
# 2250.000: /CONTROL_UNIT:: State T
# 2500.000: /BUS_UNIT:: state I
# 2750.000: /CONTROL_UNIT:: State T
# 3000.000: /BUS_UNIT:: state O
# 3250.000: /CONTROL_UNIT:: State T
# 3500.000: /BUS_UNIT:: state L
# 3750.000: /CONTROL_UNIT:: State T
# 4000.000: /BUS_UNIT:: state D