|
Unsuitable on Unsuitable: Concepts
Samuel A. Falvo II
kc5tja -at- arrl.net 2010 Jun 26 20:23 PDT
Most open source and commercial applications alike lack an overview of the core concepts used in its development. Without this overview, anyone trying to understand the program will experience difficulty internalizing the design. In this article, I discuss three of Unsuitable's core concepts. Without these key concepts, Unsuitable would be substantially more complex to understand and implement. 1 Persistence at the Lowest LevelAt the lowest level, Unsuitable's machine model assumes two distinct, but related, address spaces. The first address space, named the dictionary by ANSI Forth specifications, offers a non-persistent, private, temporary storage space which comes and goes as web requests commence and finish. In contrast, all CGI handler instances share the persistent space, which outlasts all CGI handler lifetimes, and thus stores the content the blog delivers. Unsuitable treats both address spaces as byte-accessible using fetch and store operations. Unsuitable distinguishes between dictionary and persistent spaces — erroneous behavior results from using a persistent pointer with @ or ! or a dictionary address with @f or !f. Address 10 in the dictionary does not refer to address 10 in persistent space. The very first definitions issued on every CGI request provides the Forth words necessary to access this persistent address space (taken from mappings.fs): : >core dup 10 rshift block swap 1023 and + ; : @f >core @ ; : !f >core ! update ; The @f (fetch-field) word fetches a cell from persistent storage. Except for the fact that it reads from persistent space, a programmer uses @f exactly the same way he'd use @. Likewise, !f (store-field) stores a cell to persistent storage, used exactly like !. Note that !f automatically takes care of marking altered blocks as dirty for you. @f and !f work with byte addresses; hence, a 32-bit Forth system will allow access to 4GiB of persistent storage space. A 64-bit Forth environment will of course allow much larger address spaces. While Unsuitable should be able to run on 16-bit Forth systems, you probably won't get anything more than a toy, as 64KiB of persistent storage space isn't much to work with. 2 Memory Map
When I first conceived of this twin space model of storage management, I anticipated creating a system of dynamic memory management that would encompass everything I needed to preserve. With address 0 reserved as a pointer to some As my work on Unsuitable unfolded, it turned out that a mixture of static and dynamic allocation techniques would be used. Static allocation dominates the functional storage requirements, while dynamic allocation manages the space used in the General Object Store (GOS). Below lies the current persistent memory map (at the time of this writing): FROM TO DESCRIPTION
00040000 0007FFFF 256K GOS
00013800 0003FFFF Unused.
00011000 000137FF 10K Article Relation
00010800 00010FFF 2K GOS Handle Table
00000800 000107FF 64K unused. (Formerly used by GOS before expanding it to 256K)
00000400 000007FF GOS and Article Table Metablock
00000400 GOS allocation fencepost
00000404 Next article ID
00000408-07FF Unused.
00000000 000003FF A vestigial block reserved by ANSI Forth for boot-loader purposes,
dating back to when Forth systems didn't require OSes to boot up.
Thankfully, we don't need to memorize base addresses; the remainder of the code refers to the various allotments and their sizes by name. Thus, the source file mappings.fs exists to define these names1: : >core dup 10 rshift block swap 1023 and + ; : blocks 1024 * ; : @f >core @ ; : !f >core ! update ; ( block 1 = meta-block ) : g-fencepost 1 block ; : a-nextId 1 block [ 1 cells ] literal + ; ( blocks 2 .. 65 = 64K unused space -- formerly general object store ) ( blocks 66 .. 67 = handle table for general object store ) 1 blocks constant /hfields 66 blocks constant addrs addrs /hfields + constant lens ( blocks 68 .. 77 = article table columns ) 2 blocks constant /afields 68 blocks constant articleIds articleIds /afields + constant titles titles /afields + constant leads leads /afields + constant bodies bodies /afields + constant timestamps ( blocks 256 .. 511 = General Object Store [256K version] ) 256 blocks constant gorg gorg 256 blocks + constant gend 3 Parallel ArraysUnsuitable proudly uses parallel arrays (column-major storage) to store structured data instead of data structures or records (row-major storage). Forth lacks integrated support for structures, and I find the level of abstraction offered by them not worth the hassle to implement. They also permit adding or removing whole columns of data without affecting already persisted information, quite a handy feature while the software evolves! The Wikipedia article linked to above lists four objections to using parallel arrays. I'd like to address them here.
4 What's NextWe discussed a few concepts which future articles will treat as a foundation on which to build. In the next article, I discuss the most important storage abstraction Unsuitable has: the General Object Store. Stay tuned. 1 For those who are familiar with coding for or administering the IBM mainframe line, mappings.fs serves the same role as defining data-sets in MVS or z/OS. Since Forth thankfully deals with logical block addresses, though, I posit Forth is better than MVS and its successors, at least in this respect, because I don't have to deal with physical CHS addressing. 2 Or, at least, organized as if it were; note Forth also lacks syntactic support for module-level encapsulation. |