The 65xx bus is single-cycle (meaning one CPU cycle maps to one memory cycle), which makes creating expansion backplanes for it somewhat difficult because of the timing requirements involved. As if that weren't enough, expensive connector technologies add to the cost of creating peripherals. The most prevalent method is to use PCI slots, which can cost upwards of $4 per slot if you know where to get them. The same is true of 96-pin Eurocard connectors. For an interconnect, that's quite hefty in terms of price.
From a mechanical point of view, there is the minor issue of deciding the final form-factor of the product as well. In particular, do you want 2 slots? 4 slots? 8 slots? Maybe 6? How far apart do we space them? Are they buffered or unbuffered? Do we expose the raw CPU bus, or make a CPU independent bus? What if I don't want a card cage at all? What if I do, but want it to be a half-rack unit instead of a full-rack unit? If I design it for a rack unit, how many Us do I make it? I think you get the idea.
These issues all add to the expense of solving the basic problem: getting data from point A to point B. None of them addresses the software side of things, either!
The operating system would need to be aware of how to probe the attached hardware for device identification, so that it may intelligently seek out the suitable device drivers. Or, maybe the card will place its own drivers in ROM. Or, maybe the OS will load the drivers into local RAM from a ROM-disk provided by the card. Do I have split configuration and I/O spaces? Do I merge configuration, I/O, and drivers into a single address space? Do I use parallel ROMs or serial ROMs for storing driver/device ID information? Etc.
The hardware is greatly simplified by employing a basic serial interface, to be documented elsewhere. However, a side effect of using a serial interface is that peripherals must employ some element of intelligence in order to be useful. It is not possible, with the resources available to the Kestrel, for the hardware to perform a write to an I/O location and to have that write transmitted over the serial link, to be decoded by a remote DMA engine. (This is precisely how FireWire works, but it requires extremely specialized hardware on both sides of the link.) Therefore, a higher-level protocol must be defined to enable a cheap interconnect to have maximum benefit.
The SBDS register selects which device on the bus is currently addressed. The register layout is as follows:
| Bit 15 | Bits 14-3 | Bits 2-0 |
|---|---|---|
| TxDone | 0 | DeviceID |
Bit 15 is set if the current transmission is complete. It does not mean that the operation performed on the currently addressed peripheral is done. It just means that you are allowed to send another byte or deselect the current device if you wish. While transmitting, this bit is clear.
Bits 14 through 3 are currently undefined, and must be written as zeros for future compatibility. Do not depend on them being zero when reading, however.
Bits 2 through 0 select which device on the serial bus you wish to talk to. Seven devices are supported. Device 0 is reserved for the null device, a "device" that ignores all data sent to it. In other words, if you wish to deselect all attached devices, select device 0. Devices 1 through 7 may or may not be present. Detection of peripherals is discussed later.
The SBDS register is read/write.
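The bit layout above can be modelled in a few lines of Python. This is only an illustrative sketch of the field arithmetic, not Kestrel code: the names `TX_DONE`, `DEVICE_MASK`, `sbds_select_value`, `sbds_tx_done`, and `sbds_device` are hypothetical and do not appear in any Kestrel source.

```python
TX_DONE = 0x8000      # bit 15: the current transmission is complete
DEVICE_MASK = 0x0007  # bits 2-0: ID of the selected device


def sbds_select_value(device: int) -> int:
    """Return the value to write to SBDS to select `device` (0 deselects all)."""
    if not 0 <= device <= 7:
        raise ValueError("device ID must be in the range 0..7")
    # Bits 14-3 are left as zeros, as required for future compatibility.
    return device & DEVICE_MASK


def sbds_tx_done(value: int) -> bool:
    """True if a value read from SBDS says another byte may be sent."""
    return bool(value & TX_DONE)


def sbds_device(value: int) -> int:
    """Extract the device ID from a value read from SBDS.

    The mask discards bits 14-3, which are not guaranteed to read as zero.
    """
    return value & DEVICE_MASK
```

Masking on read is the important habit here: software that compares the whole register against a constant would break as soon as one of the reserved bits acquires a meaning.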
Each device attached to the SerBus provides a single, flat address space. Either I/O registers or RAM may appear in this address space. Although the protocol currently permits address spaces to be as wide as 120 bits, spaces larger than 64 bits are currently undefined.
The protocol is optimized to favor smaller addresses over the wire. This means that it is faster to address location $01 than it is $010000, since fewer bytes need to be transmitted to address it. Therefore, it should be advantageous to place the most commonly used I/O resources in low memory, and buffers higher up in memory.
Example: A disk device will put its command registers in zero-page ($00-$FF), while the RAM buffers for I/O transactions appear higher up in memory (e.g., $0400-$07FF).
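To make the cost of an address concrete, the following Python sketch computes how many address bytes must travel over the wire for a given location. The function name `address_length` is hypothetical; the 8-byte ceiling reflects the statement above that spaces larger than 64 bits are currently undefined.

```python
def address_length(address: int) -> int:
    """Smallest number of bytes that can carry `address` on the wire."""
    if address < 0:
        raise ValueError("addresses are unsigned")
    # At least one address byte is always sent, even for address 0.
    length = max(1, (address.bit_length() + 7) // 8)
    if length > 8:
        raise ValueError("addresses wider than 64 bits are currently undefined")
    return length
```

Under this model the disk device's command registers at $00-$FF need one address byte, while its buffers at $0400-$07FF need two.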
This specification does not specify what appears in the memory spaces of peripherals. Autoconfiguration support is planned, but the requirements for autoconfiguration will be documented separately.
Each peripheral on the SerBus is an SPI device. The SPI controller provides the coupling between the SPI bus and the peripheral's normal operations. Indeed, it is the SPI controller which ultimately provides the memory space abstraction discussed above. There is no requirement that the peripheral itself adhere to the memory map imposed by the SPI controller.
Because the SPI controller provides a memory-based abstraction, and because the SerBus lacks a reset signal, the controller must operate completely independently of the peripheral. That is, when a host PC requests to read a register in the peripheral's address space, the controller must respond, even if the peripheral itself is still busy with a previous operation.
For this reason, the SPI interface on a real-world hardware device will typically appear as a separate microcontroller chip, which typically costs in the $2 range (at the time this document was written). However, the precise architecture chosen by a peripheral's designer is not specified in this document.
When a device is selected via the SBDS register, the SPI controller must be attentive to incoming data immediately. The communications frame is discussed below.
When a device is deselected, the SPI controller may safely ignore the SPI bus until it is selected again. If it has duties other than tending to the SPI bus, it may safely attend to them now.
When selected, the SPI controller will expect a specific frame of data. The frame format is defined as follows:
| CmdAddr | .. address .. | .. data .. |
The CmdAddr byte is broken up into two fields. Bits 3-0 give the length of the address field, as described after the command table. Bits 7-4 specify the SPI controller command to perform. The following commands are specified:
| Command | Description |
|---|---|
| 0 | Write data to the device, with address auto-increment. |
| 1 | Read data from the device, with address auto-increment. |
| 2 | Write data to the device, without address auto-increment. |
| 3 | Read data from the device, without address auto-increment. |
| 4 | Reserved for SerBus switches (used to expand the number of peripherals you can attach to the bus). Devices which are not switches MUST ignore this command. |
| 5-15 | Not specified. Peripherals MUST ignore these commands. |
The address field is transmitted least significant byte first. The length of this field is specified by bits 3-0 of the CmdAddr byte. Lengths 1 through 8 are supported by the standard to refer to addresses up to 8 bytes long. Lengths of 9-15 and of 0 are reserved for future expansion. Specifying an unsupported address width MUST result in the whole operation being ignored.
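As a worked illustration of the CmdAddr byte and the address field, here is a small Python sketch that assembles the header bytes of a frame. It is a model of the encoding only; the constant names and `build_header` are hypothetical, and choosing the shortest possible address length is a convenience of this sketch rather than a requirement of the protocol.

```python
WRITE_AUTOINC = 0  # write data, address auto-increments
READ_AUTOINC = 1   # read data, address auto-increments
WRITE_FIXED = 2    # write data, address stays put
READ_FIXED = 3     # read data, address stays put


def build_header(command: int, address: int) -> bytes:
    """Return the CmdAddr byte followed by the address, low byte first."""
    if not 0 <= command <= 3:
        raise ValueError("only commands 0-3 are defined for ordinary devices")
    if address < 0:
        raise ValueError("addresses are unsigned")
    # Shortest encoding: one byte minimum, eight bytes maximum.
    length = max(1, (address.bit_length() + 7) // 8)
    if length > 8:
        raise ValueError("address lengths above 8 bytes are reserved")
    cmd_addr = (command << 4) | length  # bits 7-4 command, bits 3-0 length
    return bytes([cmd_addr]) + address.to_bytes(length, "little")
```

For example, an auto-incrementing read starting at $0400 yields the three header bytes $12, $00, $04: command 1 with a 2-byte address, sent least significant byte first.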
The data field is either sent to, or received from, the peripheral. NOTE: When receiving data, there is no gap byte between the command/address header and the first data byte. The very next byte transferred is the byte read at the specified address. It is incumbent on the SPI controller to ensure this happens.
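The frame handling described above can be sketched as a toy Python model of a peripheral's SPI controller, where each call to `exchange` stands for one byte clocked over SPI. Everything here is an assumption made for illustration: the class name `ModelController`, the 256-byte memory, the address wraparound, returning 0 while the header is being received, and ignoring the rest of a frame after an unsupported header are choices of this sketch, not statements from the specification.

```python
class ModelController:
    """Toy model of a peripheral-side SPI controller (illustrative only)."""

    def __init__(self, size: int = 256):
        self.memory = bytearray(size)
        self.deselect()

    def deselect(self) -> None:
        """Forget any partial frame; the next byte seen is a CmdAddr byte."""
        self.command = None   # None means "waiting for CmdAddr"
        self.remaining = 0    # address bytes still expected
        self.shift = 0        # bit position of the next address byte
        self.address = 0
        self.ignoring = False

    def exchange(self, mosi: int) -> int:
        """Accept one byte from the host and return the byte sent back."""
        if self.ignoring:
            return 0
        if self.command is None:
            self.command = mosi >> 4
            self.remaining = mosi & 0x0F
            # Unknown commands and unsupported address lengths are ignored.
            if self.command > 3 or not 1 <= self.remaining <= 8:
                self.ignoring = True
            return 0
        if self.remaining:
            # The address arrives least significant byte first.
            self.address |= mosi << self.shift
            self.shift += 8
            self.remaining -= 1
            return 0
        slot = self.address % len(self.memory)  # model-only wraparound
        if self.command & 1:                    # commands 1 and 3 read
            reply = self.memory[slot]
        else:                                   # commands 0 and 2 write
            self.memory[slot] = mosi
            reply = 0
        if self.command < 2:                    # commands 0 and 1 auto-increment
            self.address += 1
        return reply
```

A short session shows the absence of a gap byte: after writing $AA and $BB at address $10, a read frame returns $AA in the very first exchange that follows the address.

```python
dev = ModelController()
for byte in (0x01, 0x10, 0xAA, 0xBB):   # write, 1-byte address $10, two data bytes
    dev.exchange(byte)
dev.deselect()
replies = [dev.exchange(b) for b in (0x11, 0x10, 0x00, 0x00)]
# replies[2] and replies[3] hold the two bytes read back from $10 and $11
```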
The code that follows is not intended to be canonical. It has not been tested as of this writing (however, its Forth equivalents have been tested). The purpose of this code is to demonstrate the mechanics of sending and receiving data on the bus, and to demonstrate working with the SerBus chip in the Kestrel-2's I/O space.
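.proc waitForSerBus
; Spin until TxDone (SBDS bit 15) is set. Preserves A.
; BPL tests bit 15 only when the accumulator is 16 bits wide.
pha
stillBusy:
lda SBDS
bpl stillBusy
pla
rts
.endproc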
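.proc selectDevice
; A = device (0 deselects all devices)
jsr waitForSerBus
and #7 ; keep the device ID; bits 14-3 must be written as zeros
sta SBDS ; plain store: ORing with the old SBDS would keep the old ID bits
rts
.endproc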
.proc sendByte
jsr waitForSerBus
sta SBDATX
rts
.endproc
.proc determineAddressLength
cpx #256
bcs longAddress
lda #1
rts
longAddress:
lda #2
rts
.endproc
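.proc sendAddress
; A = CmdAddr byte (low nibble = address length in bytes)
; X = address on device (16-bit address max)
and #$0F ; keep only the address length
pha
txa ; A = address
plx ; X = address length
jsr sendByte ; least significant byte goes first
dex
beq done ; one-byte address: nothing more to send
xba ; bring the high byte of the address into the low half of A
jsr sendByte
done:
rts
.endproc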
.proc sendWriteHeader
; A = device
; X = address on device (16-bit address max)
jsr selectDevice
jsr determineAddressLength
jsr sendByte
jmp sendAddress
.endproc
.proc writeByte
; A = device
; X = address on device (16-bit address max)
; Y = byte to send
jsr sendWriteHeader
tya
jsr sendByte
lda #0
jmp selectDevice
.endproc
.proc writeBuffer
; A = device
; X = address on device
; $4000 <= Y < $4400 buffer containing data to send
jsr sendWriteHeader
notEndOfBufferYet:
lda a:0,y
jsr sendByte
iny
cpy #$4400
bcc notEndOfBufferYet
lda #0
jmp selectDevice
.endproc
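.proc receiveByte
; Returns the next byte from the selected device in A.
lda #0 ; dummy byte: sending it clocks the reply in
jsr sendByte
jsr waitForSerBus ; wait until the exchange has finished
lda SBDARX
rts
.endproc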
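CMD_READ = $10 ; command 1 (read, auto-increment) in bits 7-4 of CmdAddr
.proc sendReadHeader
; A = device
; X = address on device (16-bit address max)
jsr selectDevice
jsr determineAddressLength
ora #CMD_READ
jsr sendByte
jmp sendAddress
.endproc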
.proc readBuffer
; A = device
; X = address on device
; $4000 <= Y < $4400 buffer to contain the data
jsr sendReadHeader
notEndOfBufferYet:
jsr receiveByte
sta a:0,y
iny
cpy #$4400
bcc notEndOfBufferYet
lda #0
jmp selectDevice
.endproc