Sketch of Next Gen Stack CPU ISA

A. Instruction Reference

The instructions documented below assume a 64-bit cell width. However, the ISA is designed to be extensible, from 16 bits (the minimum supported width) up to 1024 bits. Note that some CPUs may re-allocate opcodes intended for unsupported widths to other purposes. Such opcodes are machine-specific and are not guaranteed to be upward compatible with future revisions of the ISA.

A.1. Group 0 Instructions

Instructions in group 0 are hard to categorize elsewhere.

A.1.1. BRK ( -- )

IF COND THEN
    Trap(BREAKPOINT)
END

Performs a breakpoint trap.

A.1.2. SC ( ... -- ... )

IF COND THEN
    Trap(SYSCALL)
END

Performs a system call trap. Generally speaking, a service number is placed onto the top of the data stack, indicating which service the operating system is to perform. The input and output stack effects are, much like any subroutine, defined by the service performed.

A.1.3. POP ( -- x ) (R: x -- )

Move a cell from the return stack to the data stack.

A.1.4. PUSH ( x -- ) (R: -- x )

Move a cell from the data stack to the return stack.

A.1.5. JMPDI ( a -- )

address := POP(D)
IF COND=TRUE THEN
    PC := address
END

Jump to the absolute address on the data stack. This instruction may jump conditionally if prefixed with the COND prefix.

A.1.6. CALLDI ( a -- ) (R: -- pc+1 )

IF COND=TRUE THEN
    address := POP(D)
    PUSH(R, PC+1)
    PC := address
END

Call the subroutine whose address is on the data stack. This instruction may jump conditionally if prefixed with the COND prefix.

A.1.7. RET (aka JMPRI) ( -- ) (R: a -- )

address := POP(R)
IF COND=TRUE THEN
    PC := address
END

Return from the current subroutine by jumping to the address at the top of the return stack. This instruction may jump conditionally if prefixed with the COND prefix.

A.1.8. SWITCH (aka CALLRI) ( -- ) (R: a -- pc+1 )

IF COND=TRUE THEN
    address := POP(R)
    PUSH(R, PC+1)
    PC := address
END

Switch co-routines by swapping the next instruction's address and the address at the top of the return stack. This instruction may jump conditionally if prefixed with the COND prefix.
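
To make the four indirect transfers concrete, here is a minimal Python sketch of their pseudocode. The Machine class and the function names are hypothetical, invented only for this illustration. The sketch also assumes that an untaken transfer simply advances PC by one, which the pseudocode above does not state.

```python
class Machine:
    """Toy model holding only the program counter and the two stacks."""
    def __init__(self, pc=0):
        self.pc = pc
        self.d = []  # data stack; the top is the last element
        self.r = []  # return stack; the top is the last element

def jmpdi(m, cond=True):
    address = m.d.pop()              # popped whether or not the jump is taken
    m.pc = address if cond else m.pc + 1

def calldi(m, cond=True):
    if cond:
        address = m.d.pop()
        m.r.append(m.pc + 1)         # return address is the next opcode
        m.pc = address
    else:
        m.pc += 1                    # assumption: fall through when not taken

def ret(m, cond=True):
    address = m.r.pop()              # popped whether or not the jump is taken
    m.pc = address if cond else m.pc + 1

def switch(m, cond=True):
    if cond:
        address = m.r.pop()
        m.r.append(m.pc + 1)         # swap: our resume point replaces theirs
        m.pc = address
    else:
        m.pc += 1                    # assumption: fall through when not taken

m = Machine(pc=100)
m.d.append(200)
calldi(m)    # enter the routine at 200; 101 is saved on the return stack
switch(m)    # resume the caller at 101; 201 is saved in its place
switch(m)    # and back into the co-routine at 201
```

Each SWITCH leaves exactly one address on the return stack, which is what lets two routines hand control back and forth indefinitely.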

A.1.9. CR! ( x1 x2 -- )

Stores x1 into control register x2. The side effects of doing so depend on the control register.

NOTE: On some processor variants and/or control registers, this instruction may trap for emulation in software.

A.1.10. CR@ ( x -- x )

Reads a control register's current value and places it onto the data stack. This might have side-effects; refer to the control register's documentation for more details.

NOTE: On some processor variants and/or control registers, this instruction may trap for emulation in software.

A.1.11. DI ( -- )

Disable interrupts. This is typically a faster and more atomic shortcut for regNum CR@ mask BIC regNum CR! .

NOTE: On some processor variants, this instruction may trap for emulation in software. Some processor variants may not support interrupts.

A.1.12. EI ( -- )

Enable interrupts. This is typically a faster and more atomic shortcut for regNum CR@ mask OR regNum CR! .

NOTE: On some processor variants, this instruction may trap for emulation in software. Some processor variants may not support interrupts.

A.1.13. SEC ( -- )

The COND instruction typically performs its comparisons against the top of the data stack (T). This prefix alters COND so that it works with the second top of stack (S). It has no other effects.

A.1.14. R@ ( -- x ) (R: x -- x )

Fetches the current top of the return stack and places it onto the data stack. It does NOT pop the return stack. This is a faster and more atomic equivalent of POP DUP PUSH.

A.2. Group 1 Instructions

Instructions in group 1 push a signed or unsigned literal onto the data stack.

.-----------+---+---------------.
|    siz    | S | 0   0   0   1 |
`-----------+---+---------------'

The siz field indicates the size of the datum to push onto the stack. The S bit is set if the value is to be sign-extended, and clear if it is to be zero-extended.

A minimum of 8- and 16-bit quantities must be supported.
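
As a rough illustration of what the S bit does, the following Python sketch widens a literal to the cell width. The function name extend_literal and its parameters are made up for this example. The relation size_bits = 8 << siz is inferred from the opcode map (ULIT8, ULIT16, ULIT32, ULIT64) and is not stated elsewhere in this document.

```python
def extend_literal(raw, siz, signed, cell_bits=64):
    """Widen a literal of 8 << siz bits to a full cell."""
    size_bits = 8 << siz                      # assumption: 000 -> 8, 001 -> 16, ...
    mask = (1 << size_bits) - 1
    value = raw & mask
    if signed and value & (1 << (size_bits - 1)):
        # Sign bit is set: fill every bit above the literal with ones.
        value |= ((1 << cell_bits) - 1) & ~mask
    return value

print(hex(extend_literal(0x88, 0, signed=True)))   # SLIT8 $88
print(hex(extend_literal(0x88, 0, signed=False)))  # ULIT8 $88
```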

A.3. BOOL2 ( x1 x2 -- x ) and BOOL2 ( x1 x2 -- x1 x2 x )

Instructions in groups 2 and 3 comprise the BOOL2 instruction. These two groups differ in whether the input parameters are first popped off the stack (group 3) or not (group 2).

The BOOL2 instruction computes a boolean function given two parameters from the data stack. The upper four bits of the opcode form a look-up table which determines the operation to perform.

.---------------+-----------+---.
| a   b   c   d | 0   0   1 | D |
`---------------+-----------+---'

Given two bits (one each from x1 and x2), use abcd above to calculate the result according to this table:

x1 x2 || r    AND  NAND  OR   NOR  XOR  XNOR  BIC
--------------------------------------------------
 0  0    a    0    1     0    1    0    1     0
 0  1    b    0    1     1    0    1    0     0
 1  0    c    0    1     1    0    1    0     1
 1  1    d    1    0     1    0    0    1     0

You can also use BOOL2 to push zero and negative-1 constants onto the stack more quickly and compactly than you can with any of the LIT instructions. Setting abcd=0000 pushes zero, while setting abcd=1111 pushes negative one.

Many stack manipulation operations are implemented using BOOL2 as well.

     DROP    NIP     OVER    DUP
     D=1     D=1     D=0     D=0

a    0       0       0       0
b    0       1       0       1
c    1       0       1       0
d    1       1       1       1

Some common operations are encoded as follows:

00000010    0                       00000011    2DROP 0
00010010    2DUP AND                00010011    AND
00100010    2DUP BIC                00100011    BIC
00110010    OVER                    00110011    DROP
01000010    2DUP SWAP BIC           01000011    SWAP BIC
01010010    DUP                     01010011    NIP
01100010    2DUP XOR                01100011    XOR
01110010    2DUP OR                 01110011    OR
10000010    2DUP NOR                10000011    NOR
10010010    2DUP XNOR               10010011    XNOR
10100010    DUP INVERT              10100011    INVERT NIP
10110010    2DUP INVERT OR          10110011    INVERT OR
11000010    OVER INVERT             11000011    DROP INVERT
11010010    2DUP SWAP INVERT OR     11010011    SWAP INVERT OR
11100010    2DUP NAND               11100011    NAND
11110010    -1                      11110011    2DROP -1
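
The look-up-table behavior can be sketched in a few lines of Python. The function name bool2 and the width parameter are hypothetical. The sketch computes only the result value x and leaves the D bit (whether the inputs are popped) to the caller.

```python
def bool2(abcd, x1, x2, width=64):
    """Apply the 4-bit look-up table abcd to each bit pair of x1 and x2."""
    result = 0
    for i in range(width):
        b1 = (x1 >> i) & 1
        b2 = (x2 >> i) & 1
        index = (b1 << 1) | b2             # 0 selects a, 1 -> b, 2 -> c, 3 -> d
        bit = (abcd >> (3 - index)) & 1    # a is the most significant LUT bit
        result |= bit << i
    return result

x1, x2 = 0b1100, 0b1010
for name, lut in [("AND", 0b0001), ("BIC", 0b0010), ("XOR", 0b0110), ("OR", 0b0111)]:
    print(name, format(bool2(lut, x1, x2, width=4), "04b"))
```

With abcd=0011 the result is simply x1, and with abcd=0101 it is x2, which is why OVER, DROP, DUP, and NIP fall out of the same instruction.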

A.4. BOOL1 ( x1 -- x ) and BOOL1 ( x1 -- x1 x )

Instructions in groups 4 and 5 comprise the BOOL1 instruction. These two groups differ in whether the input parameter is first popped off the stack (group 5) or not (group 4).

The BOOL1 instruction computes a boolean function given one parameter from the data stack. Two of the upper four bits of the opcode (c and d below) form a look-up table which determines the operation to perform; the other two are always zero.

.---------------+-----------+---.
| 0   0   c   d | 0   1   0 | D |
`---------------+-----------+---'

For each bit in x1, use cd above to calculate the result according to this table:

x1 || r    ZERO  NEG1  INVERT  NOP
----------------------------------
 0    c    0     1     1       0
 1    d    0     1     0       1

You can also use BOOL1 to push zero and negative-1 constants onto the stack more quickly and compactly than you can with any of the LIT instructions. Setting cd=00 pushes zero, while setting cd=11 pushes negative one. While this overlaps with BOOL2 instruction encodings, it's useful to have these instructions for those cases where you only want to encode DROP 0 or DROP -1 instead of 2DROP 0 or 2DROP -1.

Some common operations are encoded as follows:

00000100    0               00000101    DROP 0
00010100    DUP             00010101    NOP
00100100    DUP INVERT      00100101    INVERT
00110100    -1              00110101    DROP -1
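
The one-input case reduces to choosing what a 0 bit and a 1 bit each map to. A small Python sketch, with the hypothetical name bool1 and an assumed width parameter:

```python
def bool1(cd, x1, width=64):
    """Map every 0 bit of x1 to c and every 1 bit of x1 to d."""
    mask = (1 << width) - 1
    c = (cd >> 1) & 1
    d = cd & 1
    from_zero_bits = (~x1 & mask) if c else 0   # positions where x1 is 0
    from_one_bits = (x1 & mask) if d else 0     # positions where x1 is 1
    return from_zero_bits | from_one_bits

for name, cd in [("ZERO", 0b00), ("NOP", 0b01), ("INVERT", 0b10), ("NEG1", 0b11)]:
    print(name, format(bool1(cd, 0b1010, width=4), "04b"))
```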

A.4. ADDSUB2 ( x1 x2 -- x ) and ADDSUB2 ( x1 x2 -- x1 x2 x )

Instructions in groups 6 and 7 comprise the ADDSUB2 instruction. These two groups differ in whether the input parameters are first popped off the stack (group 7) or not (group 6).

The ADDSUB2 instruction computes a two's-complement sum given two parameters from the data stack. The upper bits of the opcode control the precise data path through the ALU used to calculate this sum. The result can be used for addition or subtraction, depending on configuration.

.---+---+-------+-----------+---.
| 0 | b |  cc   | 0   1   1 | D |
`---+---+-------+-----------+---'

The b bit inverts the x2 operand if set; otherwise, it leaves the value unchanged. The cc field offers input carry control:

00  Ignore carry flag; assume carry is clear.
01  Ignore carry flag; assume carry is set.
10  Use carry flag as-is.
11  Use inverted carry flag.

Note that this instruction always updates carry.

Some common operations are encoded as follows:

00000110   2DUP ADD        00000111    ADD
00010110   2DUP ADD 1+     00010111    ADD 1+
00100110   2DUP ADC        00100111    ADC
00110110   2DUP ADC 1-     00110111    ADC 1-
01000110   2DUP SUB 1-     01000111    SUB 1-
01010110   2DUP SUB        01010111    SUB
01100110   2DUP SBC        01100111    SBC
01110110   2DUP SBC 1-     01110111    SBC 1-
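
The b and cc fields can be modelled with one adder, as in this Python sketch. The function name addsub2 is hypothetical, and the sketch assumes the updated carry flag is the adder's raw carry-out, so after a subtraction a carry of 1 means no borrow. The document itself only says that carry is always updated.

```python
def addsub2(b, cc, x1, x2, carry, width=64):
    """Return (sum, carry_out) for the given b and cc opcode fields."""
    mask = (1 << width) - 1
    operand = (~x2 if b else x2) & mask        # b inverts the x2 operand
    carry_in = {
        0b00: 0,           # ignore the flag; carry assumed clear
        0b01: 1,           # ignore the flag; carry assumed set
        0b10: carry,       # use the flag as-is
        0b11: carry ^ 1,   # use the inverted flag
    }[cc]
    total = (x1 & mask) + operand + carry_in
    return total & mask, total >> width        # assumption: raw adder carry-out

print(addsub2(0, 0b00, 5, 3, carry=0, width=8))  # ADD
print(addsub2(1, 0b01, 5, 3, carry=0, width=8))  # SUB: x1 + ~x2 + 1
```

Subtraction works because x1 + ~x2 + 1 equals x1 - x2 in two's-complement arithmetic, which is why SUB needs cc=01 rather than cc=00.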

A.5. COND ( x2 -- ) and SEC COND ( x1 x2 -- x2 )

IF not prefixed with SEC THEN
    value := POP(D)
ELSE
    value := POP(S)
END
COND := (((value < 0) & a) | ((value = 0) & b) | (carry & c)) == y

The COND prefix alters the behavior of a subsequent control flow instruction, such as the instructions in the JUMPI or BRANCHES groups. The COND prefix has no effect on instructions which do not transfer control.

.---+---+---+---+-----------+---.
| 0 | a | b | c | 1   0   0 | y |
`---+---+---+---+-----------+---'

Assuming the c bit is set to 0, the encoding of a, b, and y can give the following checks:

        Signed       Unsigned
a b y   Check        Check
---------------------------------
0 0 0   always       always
0 0 1   never        never
0 1 0   value != 0   value > 0
0 1 1   value = 0    value = 0
1 0 0   value >= 0
1 0 1   value < 0
1 1 0   value > 0
1 1 1   value <= 0

With the c bit set, there is an additional check on the carry flag.

SEC is a prefix that modifies the COND prefix to work with the second top of stack instead of the direct top of stack. This is most often used for emulating S16X4A control flow instructions.
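
The condition formula can be checked against the table with a short Python sketch. The function name cond is hypothetical, and the sketch takes the value as an unsigned cell whose top bit is the sign. It evaluates only the predicate and does not model popping the stack.

```python
def cond(a, b, c, y, value, carry=0, width=64):
    """Evaluate the COND predicate for the opcode bits a, b, c and y."""
    mask = (1 << width) - 1
    negative = (value >> (width - 1)) & 1      # sign bit of the cell
    zero = int((value & mask) == 0)
    selected = (negative & a) | (zero & b) | (carry & c)
    return selected == y

minus_one = 0xFF  # -1 as an 8-bit cell
print(cond(1, 0, 0, 1, minus_one, width=8))  # a b y = 1 0 1: value < 0
print(cond(1, 1, 0, 0, 5, width=8))          # a b y = 1 1 0: value > 0
```

Note how y=0 yields the complement of each y=1 check, so the eight rows of the table are really four tests and their negations.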

A.6. Direct Jumps and Calls

Instructions in group 10 are responsible for direct transfer of program control. Without the COND prefix, the control flow transfers are unconditional; with the COND prefix, they are conditional.

.---+---+---+---+---------------.
|    sss    | C | 1   0   1   0 |
`---+---+---+---+---------------'

The size (sss) field indicates how big the displacement to the program counter is. The C bit is set for a subroutine call, clear for a simple jump.

A.7. Shifts and Rotations

The instructions in group 11 perform bitwise rotations and shifts.

.---+-----------+---------------.
| 0 |    fff    | 1   0   1   1 |
`---+-----------+---------------'

The function (fff) field selects the precise operation to perform, according to the opcode encodings below:

00001011   LSL     ( n cnt -- n' )
00011011   LSR     ( n cnt -- n' )
00101011   ASR     ( n cnt -- n' )
00111011   PERMUTE ( n idx -- n' )
01001011   RL      ( n cnt -- n' )
01011011   RLC     ( n cnt -- n' )
01101011   RR      ( n cnt -- n' )
01111011   RRC     ( n cnt -- n' )

The LSL and LSR operations perform logical shifts (left and right, respectively). ASR also performs a right shift, but does so arithmetically. RL and RR perform rotations left and right without cycling through the carry flag. RLC and RRC do so including the carry flag as an additional bit. For example, if we execute the instructions SLIT8 $88 ULIT8 $01, then the following instructions will produce the following results:

LSL

 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 1 |...| 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |   | 1 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  |                                           ^
  |                                           |
  `-------------------------------------------'

LSR

 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 0 |...| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |-->| 0 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'

ASR

 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 1 |...| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |-->| 0 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'

RL

 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 1 |...| 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |   | 1 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  |                                   ^       ^
  |                                   |       |
  `-----------------------------------'-------'

RLC

 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 1 |...| 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |<--| 1 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  |                                           ^
  |                                           |
  `-------------------------------------------'

RR

 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 0 |...| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |-->| 0 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  ^                                   |
  |                                   |
  `-----------------------------------'

RRC

 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 0 |...| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |-->| 0 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  ^                                           |
  |                                           |
  `-------------------------------------------'
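
The single-bit behavior drawn above can be sketched in Python as follows. The function names step and shift are hypothetical. The sketch assumes a 64-bit cell, that n already fits in the cell, that the carry flag starts out clear, and that a count greater than one behaves like repeated single-bit steps; none of these points is settled by the text above.

```python
WIDTH = 64
MASK = (1 << WIDTH) - 1

def step(op, n, carry):
    """Shift or rotate n by one bit; return (result, carry_out)."""
    msb = (n >> (WIDTH - 1)) & 1
    lsb = n & 1
    if op == "LSL":
        return (n << 1) & MASK, msb
    if op == "LSR":
        return n >> 1, lsb
    if op == "ASR":
        return (n >> 1) | (msb << (WIDTH - 1)), lsb   # sign bit is replicated
    if op == "RL":
        return ((n << 1) & MASK) | msb, msb           # old top bit enters bit 0
    if op == "RLC":
        return ((n << 1) & MASK) | carry, msb         # old carry enters bit 0
    if op == "RR":
        return (n >> 1) | (lsb << (WIDTH - 1)), lsb   # old bit 0 enters the top
    if op == "RRC":
        return (n >> 1) | (carry << (WIDTH - 1)), lsb # old carry enters the top
    raise ValueError(op)

def shift(op, n, cnt, carry=0):
    """Assumption: a count of cnt is modelled as cnt single-bit steps."""
    for _ in range(cnt):
        n, carry = step(op, n, carry)
    return n, carry

n = 0xFFFFFFFFFFFFFF88   # the cell left by SLIT8 $88
for op in ["LSL", "LSR", "ASR", "RL", "RLC", "RR", "RRC"]:
    result, c = shift(op, n, 1)
    print(f"{op:4} {result:016X} C={c}")
```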

TODO: Look into ways of unifying rotations and shifts using option bits. It might not be possible due to having too few bits available; but, if it can be done, we should use that approach instead of function decodes.

The PERMUTE instruction is useful for re-arranging the bytes within a multi-byte cell. You have direct control of which nybbles in an input cell land in an output cell. If multiple source nybbles are routed to the same destination nybble, they are logically-ORed.

For example, on a 32-bit processor, a do-nothing permutation would look like this: ULIT32 $12345678 ULIT32 $76543210 PERMUTE. To reverse the bytes: ULIT32 $12345678 ULIT32 $10325476 PERMUTE.

This instruction is quite useful for implementing conversions to/from big-endian representation.
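
One reading that is consistent with both examples and with the OR rule is a scatter: nybble i of the index names the destination of source nybble i. The Python sketch below follows that reading, which is an assumption and not something the text confirms. The function name permute is hypothetical, and ignoring selectors that point past the end of the cell is a further assumption.

```python
def permute(n, idx, width=32):
    """Scatter each source nybble of n to the position named in idx."""
    nybbles = width // 4
    result = 0
    for src in range(nybbles):
        dest = (idx >> (4 * src)) & 0xF
        if dest >= nybbles:
            continue                       # assumption: out-of-range selectors are ignored
        value = (n >> (4 * src)) & 0xF
        result |= value << (4 * dest)      # colliding sources are ORed together
    return result

print(hex(permute(0x12345678, 0x76543210)))  # identity
print(hex(permute(0x12345678, 0x10325476)))  # byte reversal
```

Both examples from the text are their own inverses, so they cannot distinguish a scatter from a gather. Only the OR rule hints at the scatter interpretation used here.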

A.8. PC-Relative Effective Addresses

The group 12 instructions push PC-relative addresses onto a stack for subsequent access by loads and stores (group 13 and 14 instructions).

.---+---+---+---+---------------.
|    sss    | S | 1   1   0   0 |
`---+---+---+---+---------------'

The size (sss) field indicates how big the PC-relative displacement is (8 bits, 16 bits, etc.). The S bit selects the stack that receives the effective address: the return stack if clear, the data stack if set.

A.9. Stores and Loads

The group 13 instructions store data into memory. Group 14 instructions retrieve it again through a set of signed and unsigned loads.

Note that loads and stores may cause a trap. For example, some CPUs may offer memory protection, while others may require loads and stores to occur only on naturally aligned fields in memory.

A.9.1. Stores

.---+---+---+-------------------.
|    sss    | 0   1   1   0   1 |
`---+---+---+-------------------'

The size field (sss) indicates the size of the data to store into memory. As indicated below, the data stored into memory comes from the lowest set of bits.

 6               3       1
 3               1       5   7 0
.---------------------------+---.
|///////////////////////////|   |  sss=000  Byte
+-----------------------+---+---+
|///////////////////////|       |  sss=001  Half-word
+---------------+-------+-------+
|///////////////|               |  sss=010  Word
+---------------+---------------+
|                               |  sss=011  Double-word
`-------------------------------'

A.9.2. Loads

.---+---+---+---+---------------.
|    sss    | S | 1   1   1   0 |
`---+---+---+---+---------------'

The size field (sss) indicates the size of the data to load from memory. The S bit indicates if the data retrieved is interpreted as an unsigned (0; zero-extended) or signed (1; sign-extended) quantity. A load always affects the full cell width in the data stack.

 6               3       1
 3               1       5   7 0
.---------------------------+---.
|///////////////////////////|   |  sss=000  Byte
+-----------------------+---+---+
|///////////////////////|       |  sss=001  Half-word
+---------------+-------+-------+
|///////////////|               |  sss=010  Word
+---------------+---------------+
|                               |  sss=011  Double-word
`-------------------------------'

.---.
|///| Sign-extension or zero-extension, depending on S bit.
`---'
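
A store followed by a load can be sketched in Python as below. The function names store and load are hypothetical, memory is modelled as a bytearray, and little-endian byte order is an assumption, since this document does not specify endianness. Traps for protection or alignment are not modelled.

```python
def store(memory, address, value, sss):
    """Write the low 1 << sss bytes of value; upper bits are discarded."""
    size = 1 << sss                                    # 1, 2, 4 or 8 bytes
    low_bits = value & ((1 << (8 * size)) - 1)
    memory[address:address + size] = low_bits.to_bytes(size, "little")

def load(memory, address, sss, signed, cell_bits=64):
    """Read 1 << sss bytes and widen them to a full cell."""
    size = 1 << sss
    raw = int.from_bytes(memory[address:address + size], "little",
                         signed=bool(signed))
    return raw & ((1 << cell_bits) - 1)                # two's-complement cell

mem = bytearray(16)
store(mem, 0, 0x1234567890ABCDEF, sss=0)   # only $EF reaches memory
print(hex(load(mem, 0, sss=0, signed=0)))  # zero-extended
print(hex(load(mem, 0, sss=0, signed=1)))  # sign-extended
```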

B. S16X4(A) to ISA/NG Migration

Just as the 8086 was intended to be source-code compatible with the 8008 and 8080, so too is ISA/NG intended to be as source-code compatible with the S16X4(A) MISC processors as possible. However, they are not binary compatible.

Here's the complete mapping of instruction sequences from the S16X4 to the ISA/NG.

S16X4A      ISA/NG
======      ======
NOP         NOP
LIT         ULIT16 or PEAD16 depending on context
FWM         ULD16
SWM         ST16
ADD         ADD
AND         AND
XOR         XOR
LIT/ZGO     COND(T=0)/JMP  or  ULIT16/SEC/COND/JMPDI
ZGO         SEC/COND/JMPDI
FBM         ULD8
SBM         ST8
LCALL       CALL
ICALL       CALLDI  or  SWITCH (depending on context)
LIT/NZGO    COND(T<>0)/JMP  or  ULIT16/SEC/COND/JMPDI
NZGO        SEC/COND/JMPDI
LIT/GO      JMP  or  ULIT16/JMPDI
GO          JMPDI or RET (depending on context)

C. Instruction Mapping

C.1. Binary Encoding

00000000        BRK         EEPROM Patch Breakpoint
00010000        SC          System Call
001f0000        PUSHPOP     PUSH and POP instructions
01sc0000        JUMPI       Indirect Control Flow
100f0000        CRLDST      Control Register accessors
101e0000        EIDI        Enable/Disable interrupts
11000000        SEC         Prefix for COND
11010000        R@          Return stack accessor
sssS0001 ...    LIT         Literal load group
ffff001d        BOOL2
00ff010d        BOOL1
0bcc011d        ADDSUB2
0abc100y        COND
sssc1010 ...    BRANCHES    Direct control flow
ffff1011        MULSHF2
sssy1100 ...    PEA         Support for PC-relative programs
sss01101        STORES
sssS1110        LOADS

Illegal Opcode Encodings (These will trap)

111x0000
01xx010x
1xxx01xx
1xxx1011
xxx11101
xxxx1111
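
The bit patterns above lend themselves to a table-driven decoder. The Python sketch below treats every character other than 0 or 1 as a wildcard; the names GROUPS, ILLEGAL, matches, and classify are hypothetical. It checks the illegal patterns first, which is an assumption made so that 1xxx1011 takes priority over the broader ffff1011.

```python
GROUPS = [
    ("00000000", "BRK"), ("00010000", "SC"), ("001f0000", "PUSHPOP"),
    ("01sc0000", "JUMPI"), ("100f0000", "CRLDST"), ("101e0000", "EIDI"),
    ("11000000", "SEC"), ("11010000", "R@"), ("sssS0001", "LIT"),
    ("ffff001d", "BOOL2"), ("00ff010d", "BOOL1"), ("0bcc011d", "ADDSUB2"),
    ("0abc100y", "COND"), ("sssc1010", "BRANCHES"), ("ffff1011", "MULSHF2"),
    ("sssy1100", "PEA"), ("sss01101", "STORES"), ("sssS1110", "LOADS"),
]
ILLEGAL = ["111x0000", "01xx010x", "1xxx01xx", "1xxx1011", "xxx11101", "xxxx1111"]

def matches(pattern, opcode):
    """True if every literal 0 or 1 in pattern agrees with the opcode bit."""
    bits = format(opcode, "08b")
    return all(p not in "01" or p == b for p, b in zip(pattern, bits))

def classify(opcode):
    if any(matches(p, opcode) for p in ILLEGAL):
        return "illegal"                 # assumption: illegal patterns win
    for pattern, name in GROUPS:
        if matches(pattern, opcode):
            return name
    return "unassigned"                  # matched by neither list

for op in (0x00, 0x53, 0x57, 0x6E, 0x8B, 0x88):
    print(f"{op:02X} {classify(op)}")
```

Running this over all 256 opcodes is a quick way to spot encodings, such as $88, that appear in neither list.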

JUMPI Group

01000000   JMPDI   PC=T
01010000   CALLDI  R=PC+1; PC=T
01100000   RET     PC=R
01110000   SWITCH  R=PC+1; PC=R

BRANCHES

sss01010   JMP     PC+ea
sss11010   CALL    PC+ea

BOOL2

00000010   0
00010010   2DUP AND
00100010   2DUP BIC
00110010   OVER
01000010   2DUP SWAP BIC
01010010   DUP
01100010   2DUP XOR
01110010   2DUP OR
10000010   2DUP NOR
10010010   2DUP XNOR
10100010   DUP INVERT
10110010   2DUP INVERT OR
11000010   OVER INVERT
11010010   2DUP SWAP INVERT OR
11100010   2DUP NAND
11110010   -1
00000011   2DROP 0
00010011   AND
00100011   BIC
00110011   DROP
01000011   SWAP BIC
01010011   NIP
01100011   XOR
01110011   OR
10000011   NOR
10010011   XNOR
10100011   INVERT NIP
10110011   INVERT OR
11000011   DROP INVERT
11010011   SWAP INVERT OR
11100011   NAND
11110011   2DROP -1

BOOL1

00000100   0
00010100   DUP
00100100   DUP INVERT
00110100   -1
00000101   DROP 0
00010101   NOP
00100101   INVERT
00110101   DROP -1

ADDSUB2

00000110   2DUP ADD
00010110   2DUP ADD 1+
00100110   2DUP ADC
00110110   2DUP ADC 1-
01000110   2DUP SUB 1-
01010110   2DUP SUB
01100110   2DUP SBC
01110110   2DUP SBC 1-
00000111   ADD
00010111   ADD 1+
00100111   ADC
00110111   ADC 1-
01000111   SUB 1-
01010111   SUB
01100111   SBC
01110111   SBC 1-

MULSHF2

00001011   LSL
00011011   LSR
00101011   ASR
00111011   PERMUTE
01001011   RL
01011011   RLC
01101011   RR
01111011   RRC
1xxx1011   illegal

C.2. Opcode Map

        0             1           2           3           4           5           6           7
0       BRK           ULIT8       BOOL2       BOOL2       BOOL1       BOOL1       ADDSUB2     ADDSUB2
1       SC            SLIT8       BOOL2       BOOL2       BOOL1       BOOL1       ADDSUB2     ADDSUB2
2       POP           ULIT16      BOOL2       BOOL2       BOOL1       BOOL1       ADDSUB2     ADDSUB2
3       PUSH          SLIT16      BOOL2       BOOL2       BOOL1       BOOL1       ADDSUB2     ADDSUB2
4       JMPDI (JUMPI) ULIT32      BOOL2       BOOL2       ---         ---         ADDSUB2     ADDSUB2
5       CALLDI(JUMPI) SLIT32      BOOL2       BOOL2       ---         ---         ADDSUB2     ADDSUB2
6       RET   (JUMPI) ULIT64      BOOL2       BOOL2       ---         ---         ADDSUB2     ADDSUB2
7       SWITCH(JUMPI) SLIT64      BOOL2       BOOL2       ---         ---         ADDSUB2     ADDSUB2
8       CR!           ---         BOOL2       BOOL2       ---         ---         ---         ---
9       CR@           ---         BOOL2       BOOL2       ---         ---         ---         ---
A       DI            ---         BOOL2       BOOL2       ---         ---         ---         ---
B       EI            ---         BOOL2       BOOL2       ---         ---         ---         ---
C       SEC[1]        ---         BOOL2       BOOL2       ---         ---         ---         ---
D       R@            ---         BOOL2       BOOL2       ---         ---         ---         ---
E       ---           ---         BOOL2       BOOL2       ---         ---         ---         ---
F       ---           ---         BOOL2       BOOL2       ---         ---         ---         ---

        8         9           A           B           C           D           E           F
0       COND[2]   COND[2]     JMP8        LSL         PEAR8       ST8         ULD8        ---
1       COND[2]   COND[2]     CALL8       LSR         PEAD8       ---         SLD8        ---
2       COND[2]   COND[2]     JMP16       ASR         PEAR16      ST16        ULD16       ---
3       COND[2]   COND[2]     CALL16      PERMUTE     PEAD16      ---         SLD16       ---
4       COND[2]   COND[2]     JMP32       RL          PEAR32      ST32        ULD32       ---
5       COND[2]   COND[2]     CALL32      RLC         PEAD32      ---         SLD32       ---
6       COND[2]   COND[2]     JMP64       RR          PEAR64      ST64        ULD64       ---
7       COND[2]   COND[2]     CALL64      RRC         PEAD64      ---         SLD64       ---
8       ---       ---         ---         ---         ---         ---         ---         ---
9       ---       ---         ---         ---         ---         ---         ---         ---
A       ---       ---         ---         ---         ---         ---         ---         ---
B       ---       ---         ---         ---         ---         ---         ---         ---
C       ---       ---         ---         ---         ---         ---         ---         ---
D       ---       ---         ---         ---         ---         ---         ---         ---
E       ---       ---         ---         ---         ---         ---         ---         ---
F       ---       ---         ---         ---         ---         ---         ---         ---

[1] - Instruction Prefix.  Modifies behavior of COND.
[2] - Instruction Prefix.  Modifies behavior of control flow instructions.