1. Introduction
This document describes a simulator for a processor called sim68k,
which is defined in Section 2. The sim68k processor supports a subset
of the addressing modes of the MC68000 and has 32 instructions, most of
which are derived from those of the MC68000. The simulator takes as
input a program written in sim68k assembly/machine language and
executes it, one sim68k machine instruction at a time.
The following is provided:
- The description of sim68k, containing:
  - The description of the sim68k architecture
  - The fetch-execute (or instruction) cycle of sim68k
  - The description of the 6 addressing modes of sim68k
  - A list of the 32 instructions of sim68k
  - The description of the instruction encoding and of the status bit
    (HNZVC) updates
  - The list of errors detected
- A simulator (Sim68k.class) written in Java, together with a function
  library (SimUnit.class)
- A translator/compiler (Translator.class) used to convert sim68k
  assembly programs (.68a) to binary files (.68b) readable by the
  simulator
- Several test programs used to illustrate how to program and use this
  simulator
2. General Architecture of sim68k
In this description of sim68k, the following type definitions are used:
- All data and addresses are in hexadecimal.
- A "bit" represents either TRUE (if it is 1) or FALSE (if it is 0).
- A "twobits" represents two bits and is used as an index from 0 to 3.
- A "byte" represents 2 hexadecimal digits ($00...$FF).
- A "word" represents 2 bytes ($0000...$FFFF).
- A "long" represents a long word, which is 2 words, i.e., 4 bytes
  ($0000 0000 ... $FFFF FFFF).
- If a byte, word or long word represents data (i.e., a value), then it
  is interpreted as a signed binary integer in two's complement form
  (2's CF).
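As an illustration of the last point, here is a minimal Java sketch of how a raw bit pattern could be interpreted as a signed value for each size. The class and method names (SignedValue, toSigned) are hypothetical and are not part of SimUnit.

```java
// Hypothetical helper (not part of SimUnit): two's complement interpretation.
final class SignedValue {
    /**
     * Interprets the low `bits` bits of raw (8 for a byte, 16 for a word,
     * 32 for a long word) as a signed integer in two's complement form.
     */
    static long toSigned(long raw, int bits) {
        if (bits != 8 && bits != 16 && bits != 32) {
            throw new IllegalArgumentException("unsupported size: " + bits);
        }
        int shift = 64 - bits;
        // Move the sign bit to bit 63, then shift back arithmetically so
        // that the sign bit is replicated into the upper bits.
        return (raw << shift) >> shift;
    }
}
```

For example, the byte $FF is read as -1 while the word $00FF is read as 255.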
Memory
- We simulate the memory as an array. The elements of the memory array
  are of type "byte" because the memory is byte addressable. The array
  index is of type "word" because a memory address is a word. Addresses
  range from $0000 to $1000 (4097 bytes in total).
- The communication between the CPU and the memory is done via four
  registers, represented by global variables in the simulator:
  - MAR (Memory Address Register, type word)
  - MDR (Memory Data Register, type long)
  - RW (READ/WRITE, type bit): for a read, RW = TRUE; for a write,
    RW = FALSE
  - DS (Data Size, type twobits): for a byte, DS = 00; for a word,
    DS = 01; for a long word, DS = 10
- To read or write a value V (byte, word, or long word) from or into
  the memory location X, you will use a procedure similar to the one in
  tc2111.
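The access procedure itself is not reproduced in this document. The following Java sketch shows one way it could look, under two stated assumptions: multi-byte values are stored most significant byte first (as on the MC68000), and the class name Memory and method name access are hypothetical.

```java
// Hypothetical sketch of a memory access driven by MAR, MDR, RW and DS.
final class Memory {
    static final int SIZE = 0x1001;       // addresses $0000..$1000
    final byte[] cells = new byte[SIZE];

    int mar;        // Memory Address Register (word)
    long mdr;       // Memory Data Register (long, low 32 bits used)
    boolean rw;     // true = read, false = write
    int ds;         // 0 = byte, 1 = word, 2 = long word

    /** Performs one read or write. Byte order is an assumption (big-endian). */
    void access() {
        if (ds < 0 || ds > 2) {
            throw new IllegalStateException("invalid data size: " + ds);
        }
        int n = 1 << ds;                  // 1, 2 or 4 bytes
        if (mar < 0 || mar + n > SIZE) {
            throw new IllegalStateException("address out of range: " + mar);
        }
        if (rw) {
            long value = 0;
            for (int i = 0; i < n; i++) {
                value = (value << 8) | (cells[mar + i] & 0xFF);
            }
            mdr = value;
        } else {
            for (int i = 0; i < n; i++) {
                cells[mar + i] = (byte) (mdr >>> (8 * (n - 1 - i)));
            }
        }
    }
}
```

To write V at location X, one would set MAR to X, MDR to V, DS to the size and RW to FALSE before calling access(); a read sets RW to TRUE and finds the value in MDR afterwards.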

CPU Registers
In addition to MAR, MDR, RW and DS, the sim68k has the following
registers:
- PC (word) = program counter. This register holds the memory address
  of the NEXT instruction to be executed.
- OpCode (word) = the opcode of the instruction currently being
  executed.
- OpAddr1 (word) = the address of the first operand of the instruction
  currently being executed (if and when needed; see Sections 5 and 6).
- OpAddr2 (word) = the address of the second operand of the instruction
  currently being executed (if and when needed; see Sections 5 and 6).
- D0 and D1 (long) = data registers. The MC68000 has 8 such registers,
  but we will use only 2 in the simulator.
- A0 and A1 (long) = address registers. The MC68000 has 8 such
  registers, but we will use only 2 in the simulator.
- TMPS, TMPD, TMPR (long) = buffer (or temporary) registers, used as
  input and output of the ALU (Source, Destination and Result).
Status Bits
sim68k has the following status bits:
- H (bit) = Status bit "Halt"
- N (bit) = Status bit "Negative"
- Z (bit) = Status bit "Zero"
- V (bit) = Status bit "Overflow"
- C (bit) = Status bit "Carry"
During the execution of some instructions, some of these bits are
updated. See Section 6 for information on when and how these bits are
updated.
In the MC68000, the status bits are located in a special register (CCR,
the Condition Code Register). In your simulator, you will implement
them as individual boolean variables to simplify their manipulation.
3. Fetch-Execute Cycle of sim68k
In the simulator, the "controller" simulates the fetch-execute cycle
with the following algorithm:
1. Initialization
   (* Set PC to $0000 and set the status bits to FALSE *)
2. Repeat
   2.1 Fetch_OpCode
   2.2 Decode_Instruction
       (* According to the fields in the opcode      *)
       (* Example: OpName = bits 15 to 11 in OpCode  *)
       (*          DS     = bits 10 and 9 in OpCode  *)
       (*          NbOper = bit 8 in OpCode + 1      *)
       (*          M1     = bits 7 to 5 in OpCode    *)
       (*          N1     = bit 4 in OpCode, etc.    *)
   2.3 Fetch_Operands
       (* In OpAddr1 and OpAddr2, if operands are required. *)
   2.4 If NOT(H) Then
       2.4.1 Execute_Instruction
             (* Execution of most instructions follows these steps: *)
             (* 1. Fill TMPS (if necessary)                         *)
             (* 2. Fill TMPD (if necessary)                         *)
             (* 3. Compute TMPR using TMPS and TMPD                 *)
             (* 4. Update status bits HNZVC (if necessary)          *)
             (* 5. Store result in the destination (if necessary)   *)
   Until (H = TRUE) (* If H = TRUE, then halt *)
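The control flow of this algorithm can be sketched in Java as follows. This is only a skeleton under stated assumptions: the class name Controller and the four abstract methods are hypothetical placeholders for the steps above, and they say nothing about how Sim68k.class is actually organized.

```java
// Hypothetical skeleton of the fetch-execute cycle; the four steps are
// left abstract because their details depend on the rest of the simulator.
abstract class Controller {
    int pc;                    // program counter
    boolean h, n, z, v, c;     // status bits

    abstract void fetchOpCode();
    abstract void decodeInstruction();
    abstract void fetchOperands();
    abstract void executeInstruction();

    final void run() {
        // 1. Initialization
        pc = 0x0000;
        h = n = z = v = c = false;
        // 2. Repeat ... Until (H = TRUE)
        do {
            fetchOpCode();
            decodeInstruction();
            fetchOperands();       // may set H if an error is detected
            if (!h) {
                executeInstruction();
            }
        } while (!h);
    }
}
```

A do/while loop matches the Repeat ... Until form: the body always runs at least once and the H bit is tested at the end of each cycle.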
4. Addressing Modes of sim68k
Six addressing modes are available to most instructions:
- Data register direct (using D0 or D1)
- Address register direct (using A0 or A1)
- Address register indirect (using (A0) or (A1))
- Address register indirect with post-increment (using (A0)+ or (A1)+)
- Address register indirect with pre-decrement (using -(A0) or -(A1))
- Relative/Absolute addressing (using a label, an identifier or an
  absolute address between $0000 and $1000)
Several instructions are constrained to specific addressing modes:
- Branching instructions (BRA, BVS, BEQ, BCS, BGE, BLE) can only use
  Relative/Absolute addressing. In this case, the branching address is
  given by the relative or absolute address, which is a label or an
  absolute address between $0000 and $1000.
  - For example, "BRA.W address" means branch unconditionally to the
    given address, i.e., PC := address.
- The "MOVEA Source, Destination" instruction only allows
  Relative/Absolute addressing for the Source operand and address
  register direct for the Destination operand. In this case, the Source
  operand is given by the relative or absolute address, which is an
  identifier or an absolute address between $0000 and $1000.
  - For example, "MOVEA.W address, Ai" means copy the given address
    itself to the register Ai, i.e., Ai := address.
IMPORTANT: Several assembly instructions in the test programs make use
of labels (LABEL) as well as variable definitions (DEF). All these
names are preceded by an at sign (@). The value of a label or variable
is the address where it has been defined. Hence, it is simple to define
a loop label (LABEL @Loop) and then reference it (BRA @Loop). The
compiler (assembler) computes the address of each label and variable
name in the program and replaces each name by its address in the binary
code.
5. sim68k Instruction Set
The instruction set of sim68k contains 32 instructions where:
- S is the source operand.
- D is the destination operand.
- "data" represents a 4-bit integer constant (i.e., $0..$F) directly
  encoded in the instruction as an immediate operand.
- "address" represents an address between $0000 and $1000.
- At the end of the explanation for each instruction, you will find an
  interpretation of the operation in a pseudo-language, which shows
  what the instruction does.
- Data Size (.B, .W, .L) is not indicated here but will be indicated in
  the assembly/machine language programs.
Arithmetic (in 2's CF)
* ADD S, D
Binary integer addition (Regular). D := D + S;
* ADDQ data, D
Binary integer addition (Quick). D := D + data;
* SUB S, D
Binary integer subtraction (Regular). D := D - S;
* SUBQ data, D
Binary integer subtraction (Quick). D := D - data;
* MULS S, D
Binary integer multiplication (Signed). D.L := D.W * S.W;
* DIVS S, D
Binary integer division (Signed). LSW(D) := D.L DIV S.W;
MSW(D) := D.L MOD S.W;
* NEG D
Binary integer negation (Regular). D := 2's complement of D;
* CLR D
Clear (set to 0). D := 0;
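MULS and DIVS are the two arithmetic instructions whose operand sizes differ from the result size. The Java sketch below illustrates the pseudo-language above; the class name MulDiv is hypothetical, and it assumes that DIV and MOD truncate toward zero, as Java's / and % operators do (the document does not state the rounding rule).

```java
// Hypothetical sketch of MULS and DIVS on 32-bit register values.
final class MulDiv {
    /** MULS: D.L := D.W * S.W (signed 16 x 16 -> 32 bits). */
    static int muls(int d, int s) {
        return (short) d * (short) s;     // low words, sign-extended
    }

    /**
     * DIVS: LSW(D) := D.L DIV S.W; MSW(D) := D.L MOD S.W.
     * Throws ArithmeticException when the divisor is 0.
     */
    static int divs(int d, int s) {
        short divisor = (short) s;
        int quotient = d / divisor;       // assumption: truncates toward zero
        int remainder = d % divisor;      // assumption: sign of the dividend
        return (remainder << 16) | (quotient & 0xFFFF);
    }

    /** True when the quotient does not fit in a signed word. */
    static boolean divsOverflows(int d, int s) {
        long quotient = (long) d / (short) s;
        return quotient < Short.MIN_VALUE || quotient > Short.MAX_VALUE;
    }
}
```

With these assumptions, dividing 7 by 2 leaves $0001 in the most significant word (remainder) and $0003 in the least significant word (quotient).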
Logic
* NOT D
Logical NOT. D := NOT(D);
* AND S, D
Logical AND. D := D AND S;
* OR S, D
Logical OR. D := D OR S;
* EOR S, D
Logical Exclusive-OR. D := D XOR S;
Shift/rotate
* ASL data, D
Arithmetic Shift Left. D := D SHL data;
* ASR data, D
Arithmetic Shift Right. D := D SHR data where the sign of the new value
of D is the same as that of the previous value of D;
* ROL data, D
Rotate Left. D := D ROL data;
* ROR data, D
Rotate Right. D := D ROR data;
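Java has no rotate or shift operators for 8-bit and 16-bit operands, so the four operations have to be done on a wider type and masked back to the operand size. A minimal sketch, assuming the operand is held in a long and that the shift count is the 4-bit "data" value (0 to 15); the class name ShiftRotate is hypothetical.

```java
// Hypothetical sketch of ASL, ASR, ROL and ROR on an operand of `bits`
// bits (8, 16 or 32), where r is the shift count in 0..15.
final class ShiftRotate {
    static long mask(int bits) {
        return (1L << bits) - 1;
    }

    static long asl(long d, int r, int bits) {
        return (d << r) & mask(bits);
    }

    static long asr(long d, int r, int bits) {
        int shift = 64 - bits;
        long signed = (d << shift) >> shift;   // sign-extend to 64 bits
        return (signed >> r) & mask(bits);     // sign bit is preserved
    }

    static long rol(long d, int r, int bits) {
        d &= mask(bits);
        r %= bits;                             // a full turn changes nothing
        if (r == 0) return d;
        return ((d << r) | (d >>> (bits - r))) & mask(bits);
    }

    static long ror(long d, int r, int bits) {
        d &= mask(bits);
        r %= bits;
        if (r == 0) return d;
        return ((d >>> r) | (d << (bits - r))) & mask(bits);
    }
}
```

For instance, ASR.B of $80 by 1 gives $C0 (the sign is kept), while ROL.B of $81 by 1 gives $03.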
Comparison
* CMP S, D
Compare. (* Adjust HNZVC according to D - S *)
* TST D
Test. (* Adjust HNZVC according to D *)
Branch
* BRA address
Branch unconditionally. PC := address;
* BVS address
Branch if overflow is set. if V then PC := address;
* BEQ address
Branch if equal. if Z then PC := address;
* BCS address
Branch if carry is set. if C then PC := address;
* BGE address
Branch if greater than or equal. if (N XOR V)' then PC := address;
* BLE address
Branch if less than or equal. if (N XOR V) OR Z then PC := address;
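The branch conditions above translate directly into boolean expressions on the status bits. A minimal Java sketch follows; the class name BranchCondition and the use of the mnemonic as a string are illustrative assumptions, since a real decoder would more likely switch on OpName.

```java
// Hypothetical sketch: decides whether a branch is taken from N, Z, V, C.
final class BranchCondition {
    static boolean taken(String mnemonic, boolean n, boolean z, boolean v, boolean c) {
        switch (mnemonic) {
            case "BRA": return true;
            case "BVS": return v;
            case "BEQ": return z;
            case "BCS": return c;
            case "BGE": return !(n ^ v);        // (N XOR V)'
            case "BLE": return (n ^ v) || z;    // (N XOR V) OR Z
            default:
                throw new IllegalArgumentException("not a branch: " + mnemonic);
        }
    }
}
```

When taken(...) is true, the controller sets PC := address; otherwise PC keeps pointing at the next instruction.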
Transfer
* MOVE S, D
Move (Regular). D := S;
* MOVEQ data, D
Move (Quick). D := data;
* EXG S, D
Swap S and D. S <--> D;
* MOVEA address, Ai
Move address to the register Ai. Ai := address;
Others (not in MC68000)
* INP D
Input from keyboard.
* DSP S
Display on terminal (Source and its content).
System.out.println("S: " + S);
* DSR
Display on terminal the contents of the status bits
(i.e., Display Status Register).
System.out.println("H:" + H + " N:" + N + " Z:" + Z + " V:" + V + " C:" + C);
* HLT
Halt program. H := True;
Formats
All opcodes are on 16 bits. Opcodes may be of two formats.
- F1 is a format where at most two ordinary operands are allowed.
- F2 is a format where 1 immediate operand, encoded directly in the
  opcode, and at most 1 ordinary operand are allowed.
In format F1, Opref-1 (reference to operand-1) and Opref-2 (reference
to operand-2) give information on operand-1 and operand-2,
respectively.
In format F2, Opref-2 (reference to operand-2) gives information on
operand-2.

     |     OpCodeInfo     |  Opref-1  |  Opref-2  |
     +----------+----+----+------+----+------+----+
F1:  |    O     | DS | P  |  M1  | N1 |  M2  | N2 |
     +----------+----+----+------+----+------+----+

     |     OpCodeInfo     |  Opref-1  |  Opref-2  |
     +----------+----+----+-----------+------+----+
F2:  |    O     | DS | P  |   Data    |  M2  | N2 |
     +----------+----+----+-----------+------+----+

* O (5 bits)   : Opcode name (i.e., OpName)
* P (1 bit)    : Number of operands minus 1 (i.e., NbOper - 1)
                 (i.e., P = 0 ==> one operand; P = 1 ==> two operands)
                 As a special case, if the number of operands is zero,
                 then P = 0.
* DS (2 bits)  : Size (i.e., 00 ==> Byte, 01 ==> Word, 10 ==> Long Word)
* M1 (3 bits)  : Addressing mode of operand-1 (if any)
* N1 (1 bit)   : Register number of operand-1 (if any)
                 (Not considered when M1 = 011)
* M2 (3 bits)  : Addressing mode of operand-2 (if any)
* N2 (1 bit)   : Register number of operand-2 (if any)
                 (Not considered when M2 = 011)
* Data (4 bits): 4-bit integer constant (i.e., $0..$F)
                 One has to form a byte by putting $0 to the left of
                 the hexadecimal digit represented by "data".
                 For example, if "data" is $A, then form the byte $0A.
                 If the size is .B, then use the byte $0A.
                 If the size is .W or .L, then extend the byte $0A to
                 the required length by putting zeros to its left (the
                 sign bit of such a byte is always 0, so this is also
                 its sign extension).
Addressing modes (M1 and M2) are encoded on 3 bits:
- 000: Data register direct
- 001: Address register direct
- 011: Relative/Absolute addressing
- 100: Address register indirect
- 110: Address register indirect with post-increment
- 111: Address register indirect with pre-decrement
If M1 = 000, then N1 refers to either D0 or D1 (i.e., N1=0 refers to
D0, N1=1 refers to D1).
If M1 = 001 or 100 or 110 or 111, then N1 refers to either A0 or A1
(i.e., N1=0 refers to A0, N1=1 refers to A1).
If one replaces M1 by M2 and N1 by N2, the above two statements remain
valid.
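The field layout can be turned into shifts and masks. The sketch below reads the bit positions off the format diagrams, taking the fields left to right from bit 15 down to bit 0 (O = bits 15 to 11, DS = bits 10 and 9, P = bit 8, and so on); this reading is an assumption, although it agrees with the example opcodes $0900, $9060 and $9200 of Section 7. The class name DecodedOpCode is hypothetical.

```java
// Hypothetical decoder for a 16-bit sim68k opcode (formats F1 and F2).
final class DecodedOpCode {
    final int opName;   // O,  5 bits (bits 15..11)
    final int ds;       // DS, 2 bits (bits 10..9)
    final int nbOper;   // P + 1      (bit 8)
    final int m1, n1;   // F1 only: Opref-1 (bits 7..5 and bit 4)
    final int data;     // F2 only: immediate constant (bits 7..4)
    final int m2, n2;   // Opref-2 (bits 3..1 and bit 0)

    DecodedOpCode(int opCode) {
        opName = (opCode >> 11) & 0x1F;
        ds     = (opCode >> 9) & 0x3;
        nbOper = ((opCode >> 8) & 0x1) + 1;
        m1     = (opCode >> 5) & 0x7;
        n1     = (opCode >> 4) & 0x1;
        data   = (opCode >> 4) & 0xF;
        m2     = (opCode >> 1) & 0x7;
        n2     = opCode & 0x1;
    }

    /** Name of the register selected by a mode/number pair, or null for mode 011. */
    static String register(int mode, int number) {
        if (mode == 0b011) return null;                  // relative/absolute
        if (mode == 0b000) return "D" + number;          // data register
        if (mode == 0b001 || mode == 0b100 || mode == 0b110 || mode == 0b111) {
            return "A" + number;                         // address register
        }
        throw new IllegalArgumentException("invalid addressing mode: " + mode);
    }
}
```

Both m1/n1 and data are always extracted; which of them is meaningful depends on whether the instruction uses format F1 or F2.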
6. Instruction encoding and status bits updating
The following table contains more information on the format and
encoding of each instruction.
* Mnemo: Mnemonic
* Fmt: Format (F1 or F2)
* OpName: Opcode name (in binary, 5 bits).
* P: Number of operands minus 1 (in binary, 1 bit).
* Size: Allowed Size(s). B = Byte, W = Word, L = Long.
* HNZVC: Status bits.
For updating the status bits, we will use the following notation:
* -  : Not affected
* 0  : Set to false
* 1  : Set to true
* *  : Affected as follows: N=true iff Rm=1; Z=true iff R=0
* ?  : See comments
* Sm : the most significant bit of Source operand S
* Dm : the most significant bit of Destination operand D
* Rm : the most significant bit of Result R
* r  : Shift count
In the comments, "." denotes AND, "+" denotes OR and "'" denotes NOT.
When we refer to the most significant bit (Sm, Dm, or Rm), we have to
consider what size is in use. For example, for Dm:
* If Size=B, then Dm is bit #07 of D
* If Size=W, then Dm is bit #15 of D
* If Size=L, then Dm is bit #31 of D
Mnemo  Fmt  OpName  P  Size   HNZVC  Comments
-----  ---  ------  -  -----  -----  ------------------------
ADD    F1   00000   1  B,W,L  -**??  V=Sm.Dm.Rm' + Sm'.Dm'.Rm
                                     C=Sm.Dm + Rm'.Dm + Sm.Rm'
ADDQ   F2   00001   0  B,W,L  -**??  V=Sm.Dm.Rm' + Sm'.Dm'.Rm
                                     C=Sm.Dm + Rm'.Dm + Sm.Rm'
SUB    F1   00010   1  B,W,L  -**??  V=Sm'.Dm.Rm' + Sm.Dm'.Rm
                                     C=Sm.Dm' + Rm.Dm' + Sm.Rm
SUBQ   F2   00011   0  B,W,L  -**??  V=Sm'.Dm.Rm' + Sm.Dm'.Rm
                                     C=Sm.Dm' + Rm.Dm' + Sm.Rm
MULS   F1   00100   1  W      -**00
DIVS   F1   00101   1  L      -**?0  V=division overflow
NEG    F1   00110   0  B,W,L  -**??  V=Dm.Rm, C=Dm+Rm
CLR    F1   00111   0  B,W,L  -**00
NOT    F1   01000   0  B,W,L  -**00
AND    F1   01001   1  B,W,L  -**00
OR     F1   01010   1  B,W,L  -**00
EOR    F1   01011   1  B,W,L  -**00
ASL    F2   01100   0  B,W,L  -**??  V=Dm.Rm' + Dm'.Rm
                                     If r > 0 then C=D(m-r+1) else C=false
ASR    F2   01101   0  B,W,L  -**0?  If r > 0 then C=D(r-1) else C=false
ROL    F2   01110   0  B,W,L  -**0?  If r > 0 then C=D(m-r+1) else C=false
ROR    F2   01111   0  B,W,L  -**0?  If r > 0 then C=D(r-1) else C=false
CMP    F1   10000   1  B,W,L  -**??  V=Sm'.Dm.Rm' + Sm.Dm'.Rm
                                     C=Sm.Dm' + Rm.Dm' + Sm.Rm
TST    F1   10001   0  B,W,L  -**00
BRA    F1   10010   0  W      -----  Operand is an address (M1=011)
BVS    F1   10011   0  W      -----  Operand is an address (M1=011)
BEQ    F1   10100   0  W      -----  Operand is an address (M1=011)
BCS    F1   10101   0  W      -----  Operand is an address (M1=011)
BGE    F1   10110   0  W      -----  Operand is an address (M1=011)
BLE    F1   10111   0  W      -----  Operand is an address (M1=011)
MOVE   F1   11000   1  B,W,L  -**00
MOVEQ  F2   11001   0  B,W,L  -**00
EXG    F1   11010   1  L      ---00  Both operands are registers
MOVEA  F1   11011   1  W      -**00  Source operand is an address (M1=011)
                                     Destination is an address register
INP    F1   11100   0  B,W,L  -**00
DSP    F1   11101   0  B,W,L  -----
DSR    F1   11110   0  B      -----  Although P = 0, there is no operand.
HLT    F1   11111   0  B      1----  Although P = 0, there is no operand.
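To show how the V and C formulas of the table are meant to be applied, here is a Java sketch for ADD and SUB (the SUB formulas are shared by SUBQ and CMP). The class name Flags and its two factory methods are hypothetical; operands are assumed to be passed as longs together with the operand size in bits.

```java
// Hypothetical sketch of the N, Z, V, C updates for ADD and SUB.
final class Flags {
    boolean n, z, v, c;

    static boolean msb(long x, int bits) {
        return ((x >>> (bits - 1)) & 1L) == 1L;
    }

    /** R = D + S on `bits` bits (8, 16 or 32). */
    static Flags afterAdd(long s, long d, int bits) {
        long mask = (1L << bits) - 1;
        s &= mask;
        d &= mask;
        long r = (d + s) & mask;
        boolean sm = msb(s, bits), dm = msb(d, bits), rm = msb(r, bits);
        Flags f = new Flags();
        f.n = rm;
        f.z = (r == 0);
        f.v = (sm && dm && !rm) || (!sm && !dm && rm);   // V=Sm.Dm.Rm' + Sm'.Dm'.Rm
        f.c = (sm && dm) || (!rm && dm) || (sm && !rm);  // C=Sm.Dm + Rm'.Dm + Sm.Rm'
        return f;
    }

    /** R = D - S on `bits` bits (8, 16 or 32). */
    static Flags afterSub(long s, long d, int bits) {
        long mask = (1L << bits) - 1;
        s &= mask;
        d &= mask;
        long r = (d - s) & mask;
        boolean sm = msb(s, bits), dm = msb(d, bits), rm = msb(r, bits);
        Flags f = new Flags();
        f.n = rm;
        f.z = (r == 0);
        f.v = (!sm && dm && !rm) || (sm && !dm && rm);   // V=Sm'.Dm.Rm' + Sm.Dm'.Rm
        f.c = (sm && !dm) || (rm && !dm) || (sm && rm);  // C=Sm.Dm' + Rm.Dm' + Sm.Rm
        return f;
    }
}
```

For example, ADD.B with S = $01 and D = $7F gives R = $80, which sets N and V but not C, whereas S = $01 and D = $FF gives R = $00, which sets Z and C but not V.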
7. Errors detected
Here are several types of errors that have to be detected while
executing a given program. When an error is detected, the H bit must be
set to true after your simulator displays an appropriate error message.
* Invalid number of operands (report the address and, if you want, the
  instruction):
      For instance, if the address $0124 contains the opcode $0900,
      i.e., (ADDQ.B #$0, D0), the following message could be displayed:
      *** ERROR *** Invalid number of operands for ADDQ at address $0124.
* Invalid data size (report the address and, if you want, the
  instruction):
      For instance, if the address $0124 contains the opcode $9060,
      i.e., (BRA.B $<some address>), the following message could be
      displayed:
      *** ERROR *** Invalid data size for BRA at address $0124.
* Invalid addressing mode (report the address and, if you want, the
  instruction):
      For instance, if the address $0124 contains the opcode $9200,
      i.e., (BRA.W D0), the following message could be displayed:
      *** ERROR *** Invalid addressing mode for BRA at address $0124.
* Division by 0 (report the address and, if you want, the instruction):
      For instance, if the address $0124 has a signed division (DIVS)
      with 0 as an operand, the following message could be displayed:
      *** ERROR *** Division by 0 for DIVS at address $0124.
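As an illustration, the checks for a branching instruction could be sketched in Java as follows. The class name ErrorCheck, the order of the checks and the decision to return the message rather than print it are all assumptions; the field positions follow the format diagrams of Section 5 (DS = bits 10 and 9, P = bit 8, M1 = bits 7 to 5).

```java
// Hypothetical sketch: validates the opcode of a branching instruction
// (P = 0, Size = W, M1 = 011 according to the table of Section 6).
final class ErrorCheck {
    /** Returns an error message, or null if the opcode is valid. */
    static String checkBranch(String mnemonic, int opCode, int address) {
        int ds = (opCode >> 9) & 0x3;
        int p  = (opCode >> 8) & 0x1;
        int m1 = (opCode >> 5) & 0x7;
        if (p != 0)      return error("Invalid number of operands", mnemonic, address);
        if (ds != 0b01)  return error("Invalid data size", mnemonic, address);
        if (m1 != 0b011) return error("Invalid addressing mode", mnemonic, address);
        return null;
    }

    static String error(String what, String mnemonic, int address) {
        return String.format("*** ERROR *** %s for %s at address $%04X.",
                what, mnemonic, address);
    }
}
```

The caller would display a non-null message and then set the H bit to true, which ends the fetch-execute cycle.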
8. Test programs
Assembly file format .68a
Contains the source code in sim68k assembly language. Comments start
with a semicolon (;). It is advised to put only one instruction per
line.
Binary file format .68b
- Comments are between slashes ( / comment / ).
- Programs are shown as a sequence of bytes in hexadecimal, each
  preceded by $ ($3E $5A $03...).
- Empty lines are not advised. Please use:
  - Comment
  - Comment Byte, Byte, Byte, Byte...
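Reading one line of a .68b file could be sketched in Java as shown below. The exact grammar is not specified here, so this sketch makes assumptions: a comment is any text between two slashes on the same line, and bytes are separated by spaces and/or commas. The class name BinaryLine is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for one line of a .68b file.
final class BinaryLine {
    /** Strips "/ ... /" comments and returns the bytes written as $XX. */
    static List<Integer> parse(String line) {
        String code = line.replaceAll("/[^/]*/", " ");
        List<Integer> bytes = new ArrayList<>();
        for (String token : code.trim().split("[\\s,]+")) {
            if (token.isEmpty()) {
                continue;                       // comment-only line
            }
            if (!token.startsWith("$")) {
                throw new IllegalArgumentException("expected $XX, got: " + token);
            }
            int value = Integer.parseInt(token.substring(1), 16);
            if (value < 0 || value > 0xFF) {
                throw new IllegalArgumentException("not a byte: " + token);
            }
            bytes.add(value);
        }
        return bytes;
    }
}
```

A loader would call parse on every line and copy the resulting bytes into consecutive memory locations starting at $0000.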
Tests
These tests in binary format (.68b) can be used to check the
simulator. The source files are also provided.
- Addressing modes: modes.68a , modes.68b , modes.res
- Arithmetic instructions: arith.68a , arith.68b , arith.res
- Logic instructions: logic.68a , logic.68b , logic.res
- Shift/Rotation instructions: shiftrot.68a , shiftrot.68b ,
  shiftrot.res
- High-Low game (branching instructions): hilow.68a , hilow.68b ,
  hilow.res
- Detectable errors: erreurs.68b , erreurs.res
9. Translator and Simulator
Translator
- Translator.class
- Usage: java Translator program
- Effect: program.68a (assembly) will be converted to program.68b
  (binary). A temporary file program.tmp will also be created along the
  way; it is not needed once the translation is done.
Simulator
You can also get all these files (together with the HTML description)
by downloading sim68k.zip .