A Possible STEbus Interface for the 65816 Processor

The 65816's local bus is known to be very simple; indeed, one might even say too simple. It is generally synchronous, although I have documented how to implement a 68000-like asynchronous interface with only a small amount of logic.

Unlike the 68000's bus, the STEbus isn't pseudo-asynchronous; it is truly asynchronous. That said, it is perfectly legal to implement it in a pseudo-asynchronous way. A 16MHz clock is provided on each expansion slot for this convenience; however, none of the signals on the remaining pins are referenced to this clock in any way.

So, how does one interface a fully synchronous processor to a fully asynchronous expansion backplane?

State machines. And registers. Lots of registers.

I'm going to focus on the asynchronous handshake that happens between the bus master (65816) and slave peripherals. Note: From here on out, I'm going to use the terms initiator and peripheral, respectively, instead.

Writes to Memory or I/O

The simplest possible transaction to support is a write into memory or I/O space. Writing requires registering the address bus, the data bus, and R/W to generate the corresponding signals on the STEbus side.

                        +----------+
                        |          |+
    A0-A19 (cpu) ======>|D  '373  Q|=====> A0-A19 (ste)
                        |          ||
           CPUWR o------|LE        ||
         _STEDRA o-----o|/OE       ||
                        |          ||
                        +----------+|
                        |          |+
            R/W  o------|D        Q|-----> CM0
            A20  o------|D  '373  Q|-----> CM1
            +Vcc o------|D        Q|-----> CM2
                        |          |
           CPUWR o------|LE        |
         _STEDRA o-----o|/OE       |
                        |          |
                        +----------+
                        |          |
     D0-D7 (cpu) ======>|D  '373  Q|=====> D0-D7 (ste)
                        |          |
           CPUWR o------|LE        |
         _STEDRD o-----o|/OE       |
                        |          |
                        +----------+

This circuit will latch the desired address and the data from the initiator. We don't actually drive the bus until _STEDRA and _STEDRD (STE drive address and data, respectively) are asserted. That will be the topic of the subsequent circuit below.

The data path is fairly straightforward, and is directed by the following signals:

| Signal | Description |
|--------|-------------|
| CPUWR | Asserted to latch the 65816's address and data buses. |
| phi2 | The 65816's synchronous bus clock. For the ForthBox design, it'll run between 4MHz and 8MHz. |
| R/W | 1 if the 65816 is reading from memory, 0 if writing. |
| STECS | Asserted if the 65816's address bus indicates a byte somewhere in STE address space (memory or I/O). |
| STEDRA | Asserted to drive the STEbus address and command modifier buses with the latched values from the 65816. |
| STEDRD | Asserted to drive the STEbus data bus with the latched value of the 65816's data bus. |
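
To make the latch-then-drive behaviour concrete, here is a minimal Python sketch of a single '373-style transparent latch. It is a behavioural model only, under a couple of assumptions: the class name `Latch373` and its `update` method are made up for illustration, and it models one latch of arbitrary width rather than the several 8-bit chips a 20-bit address would actually need.

```python
from typing import Optional


class Latch373:
    """Behavioural model of a '373-style transparent latch (illustrative only)."""

    def __init__(self) -> None:
        self._q = 0  # value currently held inside the latch

    def update(self, d: int, le: int, oe_n: int) -> Optional[int]:
        """Apply the inputs; return the output pins (None means high-impedance)."""
        if le:
            # Latch enable high: the latch is transparent, so Q follows D.
            self._q = d
        # /OE is active-low: the outputs are driven only while it is 0.
        return self._q if oe_n == 0 else None


addr = Latch373()
# CPUWR high, _STEDRA negated (high): capture the address, but stay off the bus.
print(addr.update(d=0x12345, le=1, oe_n=1))
# CPUWR low: the CPU has moved on to another address; the latch keeps the old one.
print(addr.update(d=0x0FFFF, le=0, oe_n=1))
# _STEDRA asserted (low): the held address now appears on the STEbus side.
print(hex(addr.update(d=0x0FFFF, le=0, oe_n=0)))
```

The point of the sketch is that capturing a value (LE) and driving it onto the bus (/OE) are independent, which is what lets the state machine decide when the STEbus actually sees the address and data.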

While the data path is really simple, the control logic is where things start to get intricate. The state machine we want to implement is as follows:

    T0      Assert CPURDY to let the CPU continue.
            Wait for CPUWR high.

    T1      Wait for CPUWR low.

    T2      Drive the address bus and data bus.
            Negate CPURDY to prevent an accidental back-to-back write while the STEbus is busy.

    T3      Assert `_ADRSTB` and `_DATSTB`.
            If `_DATACK` and `_TFRERR` are negated, wait; else goto T4.

    T4      Release the address bus and data bus.
            Negate `_ADRSTB` and `_DATSTB`.
            If either `_DATACK` or `_TFRERR` are asserted, wait; else goto T0.

or, put more formally,

| T | Transition Rule | Signals Asserted | Description |
|---|-----------------|------------------|-------------|
| 0 | If (phi2, STECS, R/W) = (1, 1, 0), goto T1; else T0. | CPURDY | Wait for a write to STE space. |
| 1 | If phi2 = 1, goto T1; else T2. | CPURDY, CPUWR | Wait for write-cycle to complete; latch signals while we wait. |
| 2 | Goto T3. | STEDRA, STEDRD | Drive address/command bus with latched signals. (Provide set-up time.) |
| 3 | If xxxACK = 0, goto T3; else T4. | STEDRA, STEDRD, xxxSTB | Drive strobes; wait for DATACK or TFRERR. |
| 4 | If xxxACK = 1, goto T4; else T0. | (none) | Clear address, command, data; wait for cycle to end. |

NOTE: In the table above, I treat every signal as 1 if it's asserted or 0 if it's negated. In a real circuit, some signals will be active-high and some will be active-low, as dictated by the needs of the various interface chips. This is why the table references, e.g., STEDRA, while the datapath schematic references _STEDRA instead. This might seem awkward if you're not used to the convention; however, when working with programmable logic (especially FPGAs!), it is usually the norm.
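
The table above translates almost directly into code. The following Python sketch is one way to model it, assuming the 1 = asserted convention from the note and treating xxxACK as "DATACK or TFRERR is asserted". The names `OUTPUTS` and `next_state`, and the stimulus sequence, are illustrative inventions rather than part of any real design files.

```python
# Signals asserted in each state (1 = asserted, regardless of pin polarity).
OUTPUTS = {
    0: {"CPURDY"},
    1: {"CPURDY", "CPUWR"},
    2: {"STEDRA", "STEDRD"},
    3: {"STEDRA", "STEDRD", "xxxSTB"},
    4: set(),
}


def next_state(t, phi2, stecs, rw, ack):
    """Advance the write state machine by one state-machine clock."""
    if t == 0:  # wait for a write to STE space
        return 1 if (phi2, stecs, rw) == (1, 1, 0) else 0
    if t == 1:  # latch while phi2 is high; move on when it falls
        return 1 if phi2 == 1 else 2
    if t == 2:  # one clock of set-up time before the strobes
        return 3
    if t == 3:  # strobes asserted; wait for an acknowledge
        return 3 if ack == 0 else 4
    if t == 4:  # bus released; wait for the acknowledge to go away
        return 4 if ack == 1 else 0
    raise ValueError(f"unknown state T{t}")


# One write cycle; each tuple is (phi2, STECS, R/W, xxxACK) at a clock edge.
stimulus = [
    (1, 1, 0, 0),  # write to STE space detected
    (1, 1, 0, 0),  # phi2 still high
    (0, 1, 0, 0),  # phi2 falls: CPU write cycle is over
    (0, 1, 0, 0),  # set-up time elapses
    (1, 1, 0, 0),  # peripheral has not answered yet
    (1, 1, 0, 1),  # peripheral acknowledges
    (0, 1, 0, 1),  # acknowledge still asserted
    (0, 1, 0, 0),  # acknowledge released
]

t = 0
trace = [t]
for inputs in stimulus:
    t = next_state(t, *inputs)
    trace.append(t)

print(trace)  # [0, 1, 1, 2, 3, 3, 4, 4, 0]
print([sorted(OUTPUTS[s]) for s in trace])
```

Because the outputs depend only on the current state, this is a Moore machine, which tends to map cleanly onto registered outputs in programmable logic.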

This assumes we have a state machine synchronously clocked at a rate much faster than the host CPU. In our case, since ADRSTB*/DATSTB* need a setup time of 35ns, 28.5MHz (a clock period of just over 35ns) seems a natural fit. This results in a timing sequence like the following, which shows a back-to-back write into STEbus space:

                    |          write          |          wait           |           write          |
                                  ____________              ____________               ____________
    phi2            \____________/            \____________/            \_____________/            \
                    _____
    R/W                  \__________________________________________________________________________
                    _______________
    _STECS                     \\\\\________________________________________________________________
                                      _______________                                   ____________
    CPUWR           _________________/               \_________________________________/
                    _________________________________                            ___________________
    CPURDY                                           \__________________________/
                    ________________ ________________ _____ _________ _________ _________ __________
    T               _______0________X_______1________X__2__X____3____X____4____X____0____X_____1____
                    __________________________________                 _____________________________
    _STEDRx                                           \_______________/
                                                       _______________
    A0-A19 (ste)    ----------------------------------<_______________>-----------------------------
                                                       _______________
    D0-D7 (ste)     ----------------------------------<_______________>-----------------------------
                    ________________________________________           _____________________________
    xxxSTB*                                                 \_________/
                    _______________________________________________           ______________________
    DATACK*                                                        \_________/

The state machine guarantees that the CPU's RDY signal is negated to prevent mutual interference. (We assume the address decoding logic routes the CPURDY signal back to the CPU's RDY input for as long as _STECS is asserted. If the CPU were to address another bus resource after kicking off an STEbus write, the wait state would be avoided entirely.)

While the state machine is in motion, it starts by driving the STEbus with the latched value of the address, data, and command buses. 35ns later (thanks to the 28.5MHz clocking of the state machine), ADRSTB* and DATSTB* are asserted. While neither DATACK* nor TFRERR* are asserted, the machine waits in its current state.

Once one of the acknowledges is asserted, it progresses to the next state, which removes the address, data, and commands from the bus, and sits and waits for the acknowledge to be released. Only once this is done will the CPU's RDY signal assert again, and the state machine is ready for another transfer.
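
As a quick sanity check on the numbers used above, the arithmetic can be sketched in a few lines of Python. The constant and function names are illustrative only; the 35ns set-up time, the 28.5MHz state-machine clock, and the 4MHz to 8MHz phi2 range are the figures quoted in this article.

```python
FSM_CLOCK_HZ = 28.5e6  # state-machine clock from the text
SETUP_NS = 35.0        # ADRSTB*/DATSTB* set-up time from the text

# One state-machine clock period, in nanoseconds.
period_ns = 1e9 / FSM_CLOCK_HZ

# Spending exactly one clock in T2 must cover the set-up requirement.
setup_met = period_ns >= SETUP_NS


def fsm_clocks_per_phi2_cycle(phi2_hz):
    """How many state-machine clocks fit into one full phi2 cycle."""
    return FSM_CLOCK_HZ / phi2_hz


print(f"state-machine period: {period_ns:.3f} ns, set-up met: {setup_met}")
for phi2_hz in (4e6, 8e6):
    clocks = fsm_clocks_per_phi2_cycle(phi2_hz)
    print(f"phi2 = {phi2_hz / 1e6:.0f} MHz: {clocks} state clocks per CPU cycle")
```

The second figure gives a rough feel for how many state transitions can fit inside one CPU bus cycle at each end of the phi2 range.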

Reading

Memory writes look complicated, and compared to the typical peripheral interfaces, they kind of are. But reading is more complex still, because we cannot rely on the parallel operation of the CPU and the bus bridge. Instead, the CPU really does have to wait for the data to arrive off the STEbus before it can continue. In other words, CPURDY must be negated immediately after a read from STE-space is detected, not just after the CPU write is complete.

Instead of the CPU writing a byte that we then ferry to STE-space, the state machine must first request the byte from STE-space and then ferry it to the 65816.

ASIDE: If you've ever wondered why writes to video cards on PCs were always so much faster than reads from the video frame buffer, this is exactly why: with writes, you can fire-and-forget, and let the bus bridge deal with the responsibility of ferrying data while the CPU is off doing other things. When reading, the CPU has no choice but to wait for the bridge to accomplish its goal first.

We still latch the address and command like before (and can reuse that circuitry). However, instead of driving the STEbus data bus, we capture it in a transparent latch whose outputs are CPU-facing. So, our data path now looks something like this:

                                        +----------+
                                        |          |+
                    A0-A19 (cpu) ======>|D  '373  Q|=====> A0-A19 (ste)
                                        |          ||
                           CPUWR o------|LE        ||
                         _STEDRA o-----o|/OE       ||
                                        |          ||
                                        +----------+|
                                        |          |+
                            R/W  o------|D        Q|-----> CM0
                            A20  o------|D  '373  Q|-----> CM1
                            +Vcc o------|D        Q|-----> CM2
                                        |          |
                           CPUWR o------|LE        |
                         _STEDRA o-----o|/OE       |
                                        |          |
                                        +----------+
                                        |          |
                                 #=====>|D  '373  Q|=====#
                                 #      |          |     #
                     CPUWR o------------|LE        |     #
                   _STEDRD o-----------o|/OE       |     #
                                 #      |          |     #=====> D0-D7 (ste)
               D0-D7 (cpu) >=====#      +----------+     #
                                 #      +----------+     #
                                 #      |          |     #
                                 #======|Q  '373  D|<====#
                                        |          |
                           STERD o------|LE        |
                         _CPUDRD o-----o|/OE       |
                                        |          |
                                        +----------+

Our timing diagram can be refactored into sections that show writes distinctly from reads. Two different sets of states cover writes (T0-T4, as before) and reads (T0, T5-T9), respectively.

                                  ____________              ____________               ____________
    phi2            \____________/            \____________/            \_____________/            \
                    _______________
    _STECS                     \\\\\________________________________________________________________
                    ___ _________ __ ____ ___________ ______ __ _____________________ __ ___________
    D0-D7 (wr)      |||X_________X__X||||X____W1_____X||||||X__X_________W2__________X__X_______W2__
                    ___ _________ __ _______________________ __ ______ ______ _______ __ ___________
    D0-D7 (rd)      |||X_________X__X|||||||||||||||||||||||X__X||||||X__R1__X|||||||X__X___________

                    . . . . . . . . . . . . . . . . . . . writes . . . . . . . . . . . . . . . . . .

                                      _______________                                   ____________
    CPUWR           _________________/               \_________________________________/
                    _________________________________                            ___________________
    CPURDY                                           \__________________________/
                    ________________ ________________ _____ _________ _________ _________ __________
    T               _______0________X_______1________X__2__X____3____X____4____X____0____X_____1____
                    __________________________________                 _____________________________
    _STEDRx                                           \_______________/
                                                       _______________
    A0-A19 (ste)    ----------------------------------<_______________>-----------------------------
                                                       _______________
    D0-D7 (ste)     ----------------------------------<_______________>-----------------------------
                    ________________________________________           _____________________________
    xxxSTB*                                                 \_________/
                    _______________________________________________           ______________________
    DATACK*                                                        \_________/

                    . . . . . . . . . . . . . . . . . . . reads . . . . . . . . . . . . . . . . . .

                    ________________ _____ ______________________ _____ _____ ______________________
    T               _______0________X__5__X___________6__________X__7__X__9__X___________0__________
                                  ____________              ____________               ____________
    phi2            \____________/            \____________/            \_____________/            \
                    _______________                                           ______________________
    _STECS                     \\\\\_________________________________________///////////
                    ___ _________ __ _______________________ __ ______ ______ _______ __ ___________
    D0-D7 (rd)      |||X_________X__X|||||||||||||||||||||||X__X||||||X__R1__X|||||||X__X___________

                    __________________                                  ____________________________
    CPURDY (rd)                       \________________________________/             
                                       ____________________________
    A0-A19 (ste)    ------------------<____________________________>--------------------------------
                                              ___________________ ______
    D0-D7 (ste)     -------------------------<___________________X__R1__>---------------------------
                    _______________________                        _________________________________
    xxxSTB* (rd)                           \______________________/
                    _____________________________________________       ____________________________
    DATACK*                                                      \_____/
                    __________________                              ________________________________
    _STEDRA                           \____________________________/
                                       ____________________________
    STERD           __________________/                            \________________________________

The T-state table is modified as follows (note how T0 is adjusted, and T5-T9 are added):

| T | Transition Rule | Signals Asserted | Description |
|---|-----------------|------------------|-------------|
| 0 | If (phi2, STECS, R/W) = (1, 1, 0), goto T1; else if (1, 1, 1) goto T5; else T0. | CPURDY | Wait for a read or write to STE space. |
| 1 | If phi2 = 1, goto T1; else T2. | CPURDY, CPUWR | Wait for write-cycle to complete; latch signals. |
| 2 | Goto T3. | STEDRA, STEDRD | Drive address/command bus with latched signals. (Provide set-up time.) |
| 3 | If xxxACK = 0, goto T3; else T4. | STEDRA, STEDRD, xxxSTB | Drive strobes; wait for DATACK or TFRERR. |
| 4 | If xxxACK = 1, goto T4; else T0. | (none) | Clear address, command, data; wait for cycle to end. |
| 5 | Goto T6. | CPUWR, STEDRA | Halt CPU right away. Latch address and command from CPU. |
| 6 | If xxxACK = 0, goto T6; else T7. | STEDRA, xxxSTB, STERD | Latch data bus from STEbus; wait for acknowledgement. |
| 7 | If (xxxACK, phi2) = (1, 0), goto T8; else if (1, 1), goto T9; else T7. | (none) | Wait for acknowledge to clear. |
| 8 | If phi2 = 0, goto T8; else T9. | (none) | Wait for phi2 high (synchronize against 65816 bus cycle). |
| 9 | If phi2 = 1, goto T9; else T0. | CPURDY, CPUDRD | Wait for end of host CPU cycle. Drive data for host CPU. |
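
For anyone who wants to poke at the combined table before committing it to programmable logic, here is a small Python sketch of it. It transcribes the transition rules literally as written above (including the T7 rule), uses the 1 = asserted convention, and treats xxxACK as "DATACK or TFRERR is asserted". The names `OUTPUTS`, `next_state`, and `run` are illustrative only.

```python
# Signals asserted in each state (1 = asserted, regardless of pin polarity).
OUTPUTS = {
    0: {"CPURDY"},
    1: {"CPURDY", "CPUWR"},
    2: {"STEDRA", "STEDRD"},
    3: {"STEDRA", "STEDRD", "xxxSTB"},
    4: set(),
    5: {"CPUWR", "STEDRA"},
    6: {"STEDRA", "xxxSTB", "STERD"},
    7: set(),
    8: set(),
    9: {"CPURDY", "CPUDRD"},
}


def next_state(t, phi2, stecs, rw, ack):
    """Advance the combined read/write machine by one state-machine clock."""
    if t == 0:
        if (phi2, stecs, rw) == (1, 1, 0):
            return 1  # write to STE space
        if (phi2, stecs, rw) == (1, 1, 1):
            return 5  # read from STE space
        return 0
    if t == 1:
        return 1 if phi2 == 1 else 2
    if t == 2:
        return 3
    if t == 3:
        return 3 if ack == 0 else 4
    if t == 4:
        return 4 if ack == 1 else 0
    if t == 5:
        return 6
    if t == 6:
        return 6 if ack == 0 else 7
    if t == 7:
        if (ack, phi2) == (1, 0):
            return 8
        if (ack, phi2) == (1, 1):
            return 9
        return 7
    if t == 8:
        return 8 if phi2 == 0 else 9
    if t == 9:
        return 9 if phi2 == 1 else 0
    raise ValueError(f"unknown state T{t}")


def run(stimulus, start=0):
    """Feed (phi2, STECS, R/W, xxxACK) tuples through the machine; return the states visited."""
    trace = [start]
    for inputs in stimulus:
        trace.append(next_state(trace[-1], *inputs))
    return trace


# One read cycle: the peripheral answers while phi2 is low.
read_stimulus = [
    (1, 1, 1, 0),  # read from STE space detected
    (1, 1, 1, 0),  # address and command latched
    (0, 1, 1, 0),  # strobes out, no answer yet
    (0, 1, 1, 1),  # peripheral acknowledges
    (0, 1, 1, 1),  # acknowledge seen with phi2 low
    (0, 1, 1, 0),  # still waiting for phi2 to rise
    (1, 1, 1, 0),  # phi2 rises
    (1, 1, 1, 0),  # data driven to the CPU while phi2 is high
    (0, 1, 1, 0),  # phi2 falls: CPU has taken the byte
]
read_trace = run(read_stimulus)
print(read_trace)  # [0, 5, 6, 6, 7, 8, 8, 9, 9, 0]
print(["CPURDY" in OUTPUTS[s] for s in read_trace])
```

Running a few hand-written stimulus sequences like this is a cheap way to catch transcription mistakes before the table is turned into equations for a PLD or FPGA.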

As you can imagine, the logic for implementing this table isn't particularly hard to design; however, it is laborious and error-prone to build out of discrete components. This is why most (all?) STEbus initiators use some flavor of programmable logic.