The Commodore PET
(Model : CBM 8096)
Software - File Level Access
Introduction
As described on the Disk Images page, the easiest way of working
with SD card floppy disk emulators is to use CBM format disk
images. But what if you want to read and write individual files
on the SD card?
The information on this page applies to petSD / petSD+ and
petSD-duo, but will also apply to other SD card floppy disk
emulators using sd2iec firmware.
PET/CBM SD Card Discrete File Access
The floppy disk emulator firmware makes individual SD card files
appear to PET/CBM BASIC in the same way that the Disk Operating
System (DOS) in a Commodore disk drive would. The low level disk
routines are handled by the firmware or DOS, but there are some
issues to be aware of.
Question : When is a "P" not a "P" ?
Answer : When it's in a PET !
The video display for the 8032/96 is generated using a MOS 6545
Cathode Ray Tube Controller (CRTC), a close relative of the
Motorola 6845. The display is character based, each character
being made up of a block of 8 x 8 pixels. At 64 bits (8 bytes)
per character, the bit patterns for a set of up to 256 characters
can be stored in a 2kB ROM. However, not all of the "characters"
are printable; the range also includes control codes such as Line
Feed (0Ah), Carriage Return (0Dh), etc.
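The ROM size quoted above can be checked with a few lines of Python on a PC. This is only a sanity check of the arithmetic, using the figures from the text; the variable names are made up for illustration.

```python
# Character ROM sizing, using the figures quoted in the text.
PIXELS_WIDE = 8
PIXELS_HIGH = 8
CHARACTERS = 256

bits_per_character = PIXELS_WIDE * PIXELS_HIGH    # 8 x 8 pixel block
bytes_per_character = bits_per_character // 8     # one byte per pixel row
rom_bytes = CHARACTERS * bytes_per_character      # total storage needed

print(bits_per_character, bytes_per_character, rom_bytes)
```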
When considering displays based on the ASCII standard, printable
characters usually start with character 32 (20h)
representing a <space>, character 80 (50h)
representing "P", etc.
Things are a little more complicated in a PET though . . . .
The PET character set is not the same as the ASCII set that was
typically used in similar applications of the time. Instead, it
uses a custom set of characters, commonly referred to as PETSCII.
The character ROM contains two separate character sets. One set
has upper and lower case characters for use in text only mode;
the other set has upper case and graphics characters that can
be used for simple graphic displays. The character mode can be
swapped by a POKE to memory address 59468. This has the effect
of switching ALL of the characters on the screen to the
alternate character set, so characters from both sets cannot be
displayed on the screen at the same time without additional
programming.
There were minor changes to the PETSCII character set between
different Commodore computers, and different start-up behaviour
between different models of PET. Early PETs such as the
Model 2001 started up in upper case/graphics mode, whilst later
PETs, such as the Model 8032, started up in upper/lower case
mode. The 8032 can be switched to upper case/graphics by entering
"POKE 59468,12" and back to upper/lower case with "POKE
59468,14". The case can also be changed on an 8032 by "print
chr$(142)" (upper case/graphics) and "print chr$(14)"
(upper/lower case). Unlike the POKE, "print chr$(14)" also
increases the line spacing.
Question : Why is this relevant to File Access ?
Answer : Because of the way that Commodore implemented
BASIC 4.0.
Let's take an example using my 8032. After start-up, pressing
the letter "P" on the keyboard displays a letter "p" on the
screen. That's as expected, since we know that the 8032 starts
up in lower case mode.
Typing print 10 returns "10", again as expected. However,
PRINT 10 returns "? syntax error".
Perhaps not a total surprise, but Commodore BASIC 4.0 is
evidently case sensitive. That in itself would only be a minor
inconvenience, but it has wider implications because of the
difference between how key presses are displayed and how they
are interpreted. In the table below, the 5th and 6th columns
show how keys pressed on an 8032 would be displayed on screen,
but this does not represent how the key strokes are interpreted
by PET BASIC 4.0.
| | Key | ASCII | PETSCII | Upper / Lower | Upper / Graphics |
| Unshifted | p | 112 (70h) | 80 (50h) | p | P |
| Shifted | P | 80 (50h) | 208 (D0h) | P | [Symbol] |
When using the upper case/graphics character set, it is easy to
see the impact of the shift key from the feedback on the screen.
Typing a shifted letter results in the display of a graphic
symbol, so it is obvious if the resultant display is not as
intended.
However, the BASIC interpreter only recognises commands using
text in the code range 65 (41h) to 90 (5Ah), that is, characters
"a" to "z" in the upper/lower case character set and characters
"A" to "Z" in the upper case/graphics character set.
This explains why entering PRINT 10 returns "? syntax error"
when entering a command in immediate mode using the
upper/lower case character set - the interpreter does not
recognise character 208 etc. as valid and reports an error.
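The rule described above can be modelled in a few lines of Python on a PC. This is only a sketch of the behaviour described on this page, not the BASIC interpreter's actual tokeniser, and the function names are made up for illustration. It assumes that an upper case letter in the input stands for a shifted key press and that a shifted letter is the unshifted code plus 128, as in the table (80 and 208 for "P").

```python
def petscii_from_key(letter: str, shifted: bool) -> int:
    """Return the PETSCII code for a letter key, per the table above."""
    if len(letter) != 1 or not (letter.isascii() and letter.isalpha()):
        raise ValueError("expected a single letter A-Z")
    base = ord(letter.upper())            # 65..90 for the unshifted key
    return base + 128 if shifted else base


def basic_accepts(code: int) -> bool:
    """True if the code lies in the 65..90 range used for commands."""
    return 65 <= code <= 90


def typed_keyword_ok(word: str) -> bool:
    """Treat each upper case letter in `word` as a shifted key press."""
    return all(
        basic_accepts(petscii_from_key(ch, ch.isupper())) for ch in word
    )


print(petscii_from_key("p", shifted=False))   # unshifted P key
print(petscii_from_key("P", shifted=True))    # shifted P key
print(typed_keyword_ok("print"), typed_keyword_ok("PRINT"))
```

With these assumptions, "print" consists only of codes in the accepted range, whereas every letter of "PRINT" falls outside it.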
The situation is made worse for a PET "newbie" like me when
entering a BASIC program, as PET BASIC does not perform
syntax checking when a line is entered, only when it is run.
Entering a BASIC program in exactly the same form as, say, a
printed example, may appear to have been accepted without
error, but will fail when executed if upper case letters have
been used. Listing the program will display what the BASIC
interpreter "sees" when it tries to run the program:
| As typed | Result | As listed |
| 10 REM | "? syntax error" | 10 |
| 20 PRINT "Hello World" | (not reached) | 20 "Hello World" |
| 10 rem | (nothing) | 10 rem |
| 20 print "Hello World" | "Hello World" | 20 print "Hello World" |
This issue is compounded when working with files on disk and
using capital letters in filenames. Was that disk file created
in upper/lower case mode, or was it in upper case/graphics mode?
Regardless of how they may appear on the display, the file names
are NOT the same. Using only lower case file names will avoid
such potential confusion, at least in the PET domain. Things are
even more complex when we consider reading and writing the SD
card from, for example, a Windows based PC.
PC <-> Emulator SD Card File Transfer
PET SD card floppy disk emulators typically use FAT32 formatted
SD cards. When the card is connected to a PC running, for
example, Microsoft Windows, disk images and discrete files can
be transferred to and from the SD card using standard operating
system tools such as Windows Explorer.
SD Card Filenames
Like any other computer data, a filename is stored internally
as a sequence of numbers. This requires a mechanism for
translating the letters into the numbers to be stored in the
computer, i.e., the data needs to be encoded. One of the earliest
character encoding schemes was the American Standard Code for
Information Interchange (ASCII), which, not surprisingly, is
based on the English alphabet. A 7-bit encoding scheme such as
ASCII can only handle 128 characters, well short of what would
be required to encode international character sets.
Nowadays, encoding is done using
Unicode,
described on the Unicode
Consortium website as "The standard for digital
representation of the characters used in writing all of the
world's languages. Unicode provides a uniform means for storing,
searching, and interchanging text in any language. It is used by
all modern computers and is the foundation for processing text
on the Internet." Unicode itself can be implemented using
different character encodings, a common one being
UTF-8.
A FAT32 formatted SD card will normally store filenames
encoded with Unicode, but there is no way to automatically map
PETSCII to other encoding formats. To map PETSCII codes to
Unicode (or ASCII), the device would need to know which
Commodore computer with which nationalisation stored the files
and whether or not it was in upper case/graphics mode or in
lower case/upper case mode.
If the computer sent a filename character encoded as 65,
should it map to Unicode 'a' or 'A'? The US ASCII PET character
set is different from the German DIN character set, etc. Since
the device has no way to acquire all the required information,
it is unable to do a proper character set conversion.
Even worse, even if we had all the required information, the
mapping would still not be bijective, i.e. not one-to-one in
both directions. It is therefore impossible to convert from
PETSCII to Unicode and back to PETSCII without a possible loss
of information. This may break programs that expect a particular
filename, which can no longer be found because a different code
was used for the same glyph.
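The ambiguity over code 65 can be illustrated with a small Python sketch. It is not how sd2iec or any other firmware converts names; the two functions and the mode labels "text" (upper/lower case) and "graphics" (upper case/graphics) are hypothetical, and only letters are handled. It shows what happens when a name is converted to Unicode under one mode assumption and converted back under the other.

```python
def petscii_to_unicode(code: int, mode: str) -> str:
    """Map a PETSCII letter code to a Unicode character for a display mode."""
    if 65 <= code <= 90:                       # unshifted letter
        letter = chr(code)
        return letter.lower() if mode == "text" else letter
    if 193 <= code <= 218:                     # shifted letter
        # In graphics mode these are symbols; this sketch has no glyph
        # table, so it returns the Unicode replacement character instead.
        return chr(code - 128) if mode == "text" else "\ufffd"
    raise ValueError("this sketch handles letter codes only")


def unicode_to_petscii(ch: str, mode: str) -> int:
    """Map a Unicode letter back to a PETSCII code for a display mode."""
    if len(ch) != 1 or not (ch.isascii() and ch.isalpha()):
        raise ValueError("this sketch handles single letters only")
    if mode == "text" and ch.isupper():
        return ord(ch) + 128                   # upper case needs the shift key
    return ord(ch.upper())                     # unshifted letter code


original = [80, 69, 84]                        # unshifted P, E, T
stored = "".join(petscii_to_unicode(c, "graphics") for c in original)
restored = [unicode_to_petscii(ch, "text") for ch in stored]
print(stored, restored, restored == original)
```

The name comes back as three shifted codes rather than the three unshifted codes that were sent, so a program asking for the original name would not find it.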
Another problem is the usage of characters that aren't
allowed in a FAT file system.
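As a rough illustration, a name can be checked on the PC for characters that FAT long filenames do not permit. The function name is made up, and the sketch says nothing about how any particular firmware deals with such names; it only lists the offending characters.

```python
# Characters that are not permitted in FAT long filenames, plus
# control characters (codes below 32).
FAT_FORBIDDEN = set('\\/:*?"<>|')


def fat_unsafe_characters(name: str) -> list:
    """Return the characters in `name` that FAT would reject, in order."""
    return [ch for ch in name if ch in FAT_FORBIDDEN or ord(ch) < 32]


print(fat_unsafe_characters("hello world"))
print(fat_unsafe_characters("game/level:1"))
```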
In view of the constraints described, Nils Eilers, the designer
of petSD, advises that the exchange of simple, single file
programs between PC and petSD works fine, but PET/CBM programs
that expect to load other files from disk could have problems.
For anything but the simplest of programs, it is safer to use
disk images [1].
Credits :
1. Some of the above information about PET to SD card encoding
was provided by Nils Eilers, but any errors or omissions are all
mine!