Commodore PET (Software)

The Commodore PET (Model : CBM 8096)

Software - File Level Access

Introduction

As described on the Disk Images page, the easiest way of working with SD card floppy disk emulators is to use CBM format disk images. But what if you want to read and write individual files on the SD card?

The information on this page applies to petSD / petSD+ and petSD-duo, but will also apply to other SD card floppy disk emulators using sd2iec firmware.

PET/CBM SD Card Discrete File Access

The floppy disk emulator firmware makes individual SD card files appear to PET/CBM BASIC in the same way that the Disk Operating System (DOS) in a Commodore disk drive would. The low-level disk routines are handled by the firmware or DOS, but there are some issues to be aware of.

Question : When is a "P" not a "P" ?

Answer : When it's in a PET !

The video display for the 8032/96 is generated using a MOS 6545, a close relative of the Motorola 6845 Cathode Ray Tube Controller (CRTC). The display is character based, each character being made up of a block of 8 x 8 pixels. At 64 bits/character, the corresponding bit patterns for a set of up to 256 characters can be stored in a 2kB ROM. However, not all of the "characters" are printable: the range also includes control codes such as Line Feed (0A_h), Carriage Return (0D_h), etc. In displays based on the ASCII standard, printable characters usually start with character 32 (20_h) representing a <space>, with character 80 (50_h) representing "P", etc.
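The ROM size quoted above follows directly from the character dimensions. A minimal sketch of the arithmetic (the variable names are illustrative only):

```python
# Character ROM sizing, using the figures quoted in the text.
PIXELS_WIDE = 8
PIXELS_HIGH = 8
CHARACTERS = 256

bits_per_char = PIXELS_WIDE * PIXELS_HIGH   # 64 bits per character
bytes_per_char = bits_per_char // 8         # 8 bytes, one per pixel row
rom_bytes = bytes_per_char * CHARACTERS     # total bytes for one 256-character set
rom_kb = rom_bytes // 1024                  # expressed in kB

print(bits_per_char, bytes_per_char, rom_bytes, rom_kb)
```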

Things are a little more complicated in a PET though . . . .

The PET character set is not the same as the ASCII set that was typically used in similar applications of the time. Instead, it uses a custom set of characters, commonly referred to as PETSCII. The character ROM contains two separate character sets: one has upper and lower case characters for use in text only mode, the other has upper case and graphics characters that could be used for simple graphic displays. The character mode can be swapped by a POKE to memory address 59468. This switches ALL of the characters on the screen to the alternate character set, so characters from both sets cannot be displayed on the screen at the same time without additional programming.

There were minor changes to the PETSCII character set between different Commodore computers, and different models of PET behave differently at start-up. Early PETs such as the Model 2001 started up in upper case mode, whilst later PETs such as the Model 8032 started up in lower case mode. The 8032 can be switched to upper case/graphics by entering "POKE 59468,12" and back to upper/lower case with "POKE 59468,14". The case can also be changed on an 8032 with "print chr$(142)" (upper case/graphics) and "print chr$(14)" (upper/lower case). Unlike the POKE, "print chr$(14)" also increases the line spacing.
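As a small aid to reading the POKE values, the sketch below only does the arithmetic: it shows the address in hexadecimal and which bit differs between the two values. It makes no claim about the hardware behind that address, and the constant names are illustrative.

```python
# Values taken from the POKE commands quoted in the text.
CHARSET_ADDR = 59468      # address that is POKEd
UPPER_GRAPHICS = 12       # POKE 59468,12
UPPER_LOWER = 14          # POKE 59468,14

addr_hex = f"${CHARSET_ADDR:04X}"             # same address in hex notation
changed_bits = UPPER_GRAPHICS ^ UPPER_LOWER   # XOR leaves only the differing bits

print(addr_hex, bin(UPPER_GRAPHICS), bin(UPPER_LOWER), bin(changed_bits))
```

The two values differ in a single bit, which is why one POKE is enough to flip between the character sets.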

Question : Why is this relevant to File Access ?

Answer : Because of the way that Commodore implemented BASIC 4.0.

Let's take an example using my 8032. After start-up, pressing the letter "P" on the keyboard displays a letter "p" on the screen. That's as expected, since we know that the 8032 starts up in lower case mode.

Typing print 10 returns "10", again as expected. However, PRINT 10 returns "? syntax error".

Perhaps not a total surprise, but Commodore BASIC 4.0 is obviously case sensitive. That in itself would be a minor inconvenience, but it has wider implications because of the difference between how key presses are displayed and how they are interpreted. In the table below, the 5th and 6th columns show how keys pressed on an 8032 would be displayed on screen, but this does not represent how the key strokes are interpreted by PET BASIC 4.0.

Key	Shift	ASCII	PETSCII	Upper / Lower	Upper / Graphics
P	Unshifted	112 (70_h)	80 (50_h)	p	P
P	Shifted	80 (50_h)	208 (D0_h)	P	[Symbol]

When using the upper case/graphics character set, it is easy to see the impact of the shift key from the feedback on the screen. Typing a shifted letter results in the display of a graphic symbol, and it is obvious if the resulting display is not as intended.

However, the BASIC interpreter only recognises commands using text in the code range 65 (41_h) to 90 (5A_h), that is, characters "a" to "z" in the upper/lower case character set and characters "A" to "Z" in the upper case/graphics character set.

This explains why entering PRINT 10 returns "? syntax error" when entering a command in immediate mode using the upper/lower case character set: the interpreter does not recognise character 208 etc. as valid and reports an error. The situation is made worse for a PET "newbie" like me when entering a BASIC program, as PET BASIC does not perform syntax checking when a line is entered, only when it is run. A BASIC program entered in exactly the same form as, say, a printed example may appear to have been accepted without error, but will fail when executed if upper case letters were used. Listing the program will display what the BASIC interpreter "sees" when it tries to run the program:

As typed	Result	As listed
10 REM	"? syntax error"	10
20 PRINT "Hello World"	(not reached)	20 "Hello World"

10 rem	(nothing)	10 rem
20 print "Hello World"	"Hello World"	20 print "Hello World"
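The behaviour described above can be modelled in a few lines. This is a deliberately simplified sketch based only on the codes quoted on this page (unshifted letters from 65, shifted letters from 193, keywords recognised only in the range 65 to 90). It is not a model of the real BASIC tokeniser, and the function names are hypothetical.

```python
def petscii_for_key(letter: str, shifted: bool) -> int:
    """PETSCII code sent by a letter key, following the key table on this page."""
    if len(letter) != 1 or not letter.isascii() or not letter.isalpha():
        raise ValueError("expected a single letter A-Z")
    index = ord(letter.upper()) - ord("A")   # A=0 ... Z=25
    base = 193 if shifted else 65            # shifted P is 208, unshifted P is 80
    return base + index


def is_keyword_letter(code: int) -> bool:
    """True if the code is in the range the interpreter accepts in keywords."""
    return 65 <= code <= 90


def interpreter_accepts(word: str, shifted: bool) -> bool:
    """Simplified check: every letter of the keyword must be in range 65..90."""
    return all(is_keyword_letter(petscii_for_key(ch, shifted)) for ch in word)


print(petscii_for_key("P", shifted=False))           # code for unshifted P
print(petscii_for_key("P", shifted=True))            # code for shifted P
print(interpreter_accepts("print", shifted=False))   # typed without shift
print(interpreter_accepts("print", shifted=True))    # typed with shift held
```

In upper/lower case mode the shifted word looks like a perfectly normal "PRINT" on screen, which is exactly why the failure is so easy to miss.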

This issue is compounded when working with files on disk and using capital letters in filenames. Was that disk file created in upper/lower case mode, or was it in upper case/graphics mode? Regardless of how they may appear on the display, the file names are NOT the same. Using only lower case file names will avoid such potential confusion, at least in the PET domain. Things are even more complex when we consider reading and writing the SD card from, for example, a Windows based PC.

PC < - > Emulator SD Card File Transfer

PET SD card floppy disk emulators typically use FAT32 formatted SD cards. When the card is connected to a PC running, for example, Microsoft Windows, disk images and discrete files can be transferred to and from the SD card using standard Operating System tools such as Windows Explorer.

SD Card Filenames

Like any other computer data, a filename is stored internally as a sequence of numbers. This requires a mechanism for translating the letters into the numbers to be stored in the computer, i.e., the data needs to be encoded. One of the earliest character encoding schemes was the American Standard Code for Information Interchange (ASCII), which, not surprisingly, is based on the English alphabet. A 7-bit encoding scheme such as ASCII can only handle 128 characters, well short of what would be required to encode international character sets.

Nowadays, encoding is done using Unicode, described on the Unicode Consortium website as "The standard for digital representation of the characters used in writing all of the world's languages. Unicode provides a uniform means for storing, searching, and interchanging text in any language. It is used by all modern computers and is the foundation for processing text on the Internet." Unicode itself can be implemented using different character encodings, a common one being UTF-8.
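A short illustration of the two points above, using only Python's built-in string encoding: a 7-bit scheme has room for 128 codes, while UTF-8 keeps ASCII characters as single bytes and spends extra bytes on everything else. The accented letter is just an arbitrary example of a non-ASCII character.

```python
ascii_codes = 2 ** 7                       # number of values a 7-bit code can hold

p_bytes = "P".encode("utf-8")              # an ASCII letter stays one byte in UTF-8
e_acute_bytes = "\u00e9".encode("utf-8")   # 'e' with acute accent needs two bytes

try:
    "\u00e9".encode("ascii")               # not representable in 7-bit ASCII
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

print(ascii_codes, list(p_bytes), list(e_acute_bytes), fits_in_ascii)
```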

A FAT32 formatted SD card will normally store filenames encoded with Unicode, but there is no way to automatically map PETSCII to other encoding formats. To map PETSCII codes to Unicode (or ASCII), the device would need to know which Commodore computer with which nationalisation stored the files and whether or not it was in upper case/graphics mode or in lower case/upper case mode.

If the computer sent a filename character encoded as 65, should it map to Unicode 'a' or 'A'? The US ASCII PET character set is different from the German DIN character set, etc. Since the device has no way to acquire all the required information, it is unable to do a proper character set conversion.
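The ambiguity can be shown with a small sketch. It assumes only what is stated on this page, namely that codes 65 to 90 display as "A" to "Z" in upper case/graphics mode and as "a" to "z" in upper/lower case mode. The function name and mode strings are hypothetical.

```python
def decode_letter(code: int, mode: str) -> str:
    """Glyph shown for PETSCII codes 65..90 in the given display mode."""
    if not 65 <= code <= 90:
        raise ValueError("this sketch only covers codes 65..90")
    if mode == "upper/graphics":
        return chr(code)          # displayed as 'A'..'Z'
    if mode == "upper/lower":
        return chr(code + 32)     # displayed as 'a'..'z'
    raise ValueError("unknown mode")


# The same stored byte gives two different answers, and the byte alone
# does not say which display mode the PET was in when the file was named.
print(decode_letter(65, "upper/graphics"))
print(decode_letter(65, "upper/lower"))
```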

Worse still, even if we had all the required information, the mapping would still not be bijective. It is therefore impossible to convert from PETSCII to Unicode and back to PETSCII without a possible loss of information. This may break programs that expect a particular filename which is no longer available because a different code was used for the same glyph. Another problem is the use of characters that aren't allowed in a FAT file system.
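The round-trip problem can be sketched with a tiny, hand-picked mapping. The three-entry table below is an illustrative assumption, not the sd2iec firmware's actual conversion: it assumes upper/lower case mode and the commonly documented PETSCII layout in which code 97 duplicates the glyph of code 193. The forbidden-character set is the usual one for Windows/FAT long filenames.

```python
# Illustrative mapping only: two different PETSCII codes share one glyph.
TO_UNICODE = {65: "a", 97: "A", 193: "A"}

# Building the reverse table must pick ONE code per character;
# here the highest code wins because it is assigned last.
FROM_UNICODE = {}
for code, char in sorted(TO_UNICODE.items()):
    FROM_UNICODE[char] = code


def round_trip(codes):
    """PETSCII -> Unicode -> PETSCII for a list of codes."""
    return [FROM_UNICODE[TO_UNICODE[c]] for c in codes]


# Characters that cannot appear in a Windows/FAT long filename.
FAT_FORBIDDEN = set('\\/:*?"<>|')


def fat_safe(name: str) -> bool:
    """True if the name contains none of the forbidden characters."""
    return not any(ch in FAT_FORBIDDEN for ch in name)


print(round_trip([65, 193]))   # these codes survive the round trip
print(round_trip([97]))        # this one comes back as a different code
print(fat_safe("game:1"))      # ':' is not allowed on FAT
```

A program that asks for the filename using code 97 would not find a file that was written back with code 193, even though both look identical on screen.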

In view of the constraints described, Nils Eilers, the designer of petSD, advises that the exchange of simple, single file programs between PC and petSD works fine, but PET/CBM programs that expect to load other files from disk could have problems. For anything but the simplest of programs, it is safer to use disk images.^[1]

Credits :

1. Some of the above information about PET to SD card encoding was provided by Nils Eilers, but any errors or omissions are all mine!