CFOG's PIP, October 1987, Volume 5 No. 12, Whole No. 60, page 3
CP/M and MS-DOS Compared: User Areas and Sub-Directories
by Phil Hess
[This article originally appeared in the BAMDUA Newsletter published by the Bay Area Micro Decision Users Association, Berkeley, CA. -- bhc]
Introduction
While most of us are contentedly putting up with CP/M's little oddities, there are millions of people out there, caught up in the swirl of microcomputer history, who are struggling with another operating system, called MS-DOS. MS-DOS is an acronym for Microsoft Disk Operating System, although IBM calls its version PC-DOS. Most MS-DOS users simply call it DOS ("doss"), just as most users of IBM systems, without stopping to think, refer to their computers as "PC's", even though all microcomputers are "PC's".
And while these MS-DOS users don't concern us much in our day-to-day computing, nevertheless many of us will eventually join their ranks, either because we will use a computer at work which runs MS-DOS, or else we'll buy a second computer (such as Morrow's Pivot) which runs MS-DOS, or we'll install a 16-bit coprocessor in our Micro Decisions to run MS-DOS part-time.
For one reason or another, many of us will find ourselves learning a second operating system, whether we want to or not. And while we're at an advantage over a lot of users since we already know one operating system, there are some differences between CP/M and MS-DOS which are bound to be confusing. The purpose of this article then is to point out a few of those differences, while revealing a few things about CP/M as well. This should help pave the way for those of us forced by circumstance or history to learn more than one operating system.
(I don't want to sound as though we're all going to be making this move any day now. Hardly. Our Micro Decisions still run swell, and Morrow users' groups offer great help and support... I certainly have no intention of getting rid of mine. But it's in the cards that a lot of us are going to be learning MS-DOS.)
CP/M user areas
Disk user areas are a feature of CP/M which you don't need to know much about if you're using floppy disks. Knowing about them won't provide you with any more disk space. You normally need to use only user area 0, which is the "logged-in" user area when you power up your system and, unless you change it, is still the logged-in user area when you power your system down. However, other user areas, numbered 1 through 15, exist on the disk. Or rather, the potential for other user areas exists. (You can log into one of these other user areas by entering USER followed by a space and a number between 1 and 15. Press <RETURN>, then enter DIR to see if there are any files in the user area -- there probably won't be -- then enter USER 0 to get back to where you started.)
When I first read about user areas in the CP/M 2.2 manual, I couldn't figure out what they were all about. In my literal-minded way I assumed that these were areas of the disk reserved for or available to other users, even though I could see that only one user -- normally me -- would ever be using this computer.
Actually, user areas are not separate or even potentially separate areas on the disk. Rather, they're just another way of further specifying a file. In addition to a file name and extension, a user number is stored in the directory for every file on the disk. However, CP/M only "sees" the files with the logged-in user number. When you enter DIR, CP/M displays only the names of those files on the disk with the logged-in user area's number. With floppy-based systems, this number is normally 0 for every file on the disk. This means that when you power up into user area 0, you can display the names of all files on the disk, list any file, delete any file, and so on. This is because unless you have used programs written specifically to access other user areas, all the files on your disks were created in user area 0.
So, unless you specify otherwise, all files are created "in" the logged-in user area (normally 0) and you don't need to worry any more about user areas. The potential for putting files in other areas exists, but you probably have no need for it.
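The idea that a user area is just a number stored with each directory entry can be sketched in a few lines of Python. This is only a toy model, not CP/M's actual code: the file names and the dir_listing function are made up for illustration.

```python
# Toy model of a CP/M directory: every entry carries a user number
# alongside the file name. The entries below are invented examples.
DIRECTORY = [
    (0, "PIP.COM"),
    (0, "WS.COM"),
    (0, "LETTER.TXT"),
    (3, "BUDGET.DAT"),
]

def dir_listing(directory, logged_in_user=0):
    """Return the names DIR would show: only those entries whose
    user number matches the logged-in user area."""
    return [name for user, name in directory if user == logged_in_user]

print(dir_listing(DIRECTORY))       # logged into user area 0
print(dir_listing(DIRECTORY, 3))    # after entering USER 3
print(dir_listing(DIRECTORY, 7))    # a user area with no files: []
```

All four files sit in the same directory on the same disk; the user number only decides which of them are visible at the moment.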
However, as soon as you begin to think about a CP/M system with a fixed disk such as Morrow's MD-5 or MD-11, you must begin thinking about user areas. This is because with a large-capacity fixed disk, there needs to be some way of organizing the files on the disk, not only for the sake of operating system efficiency but for your own efficiency as well. Listing the directory in alphabetical order helps some, as does listing a few files at a time instead of listing the entire directory, but at some point -- maybe 50 files, maybe 100 files, whatever your breaking point -- there get to be so many files in the directory that you no longer can keep track of what's going on.
This is where user areas come in handy. You can put a bunch of little-used files in a little-used user area, put a few more files in another user area to keep them logically together, and maybe keep the bulk of your programs in user area 0 so that they're ready to go when you power up into user area 0. Whatever system works best for you is best.
This is also where having CP/M 3.0 comes in handy because of its additional features aimed specifically at managing more than one user area. For one thing, it's easier to copy files from one user area to another with the CP/M 3.0 PIP command. With CP/M 2.2, you first had to follow the cryptic instructions at the bottom of page 22 of the manual before you could even use PIP to put files in a user area other than user area 0.
Some quite understandable confusion is possible when using more than one user area. For one thing, more than one file with the same name can now exist on the same disk. This is possible because even though two files on a disk may have the same name and extension, their user numbers will be different. CP/M can tell the files apart and know which goes where by the user numbers stored with the file names in the directory.
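For the curious, here is a rough Python sketch of how two files with the same name can coexist. It assumes the usual CP/M 2.2 directory entry layout of 32 bytes, with the user number in the first byte, eight bytes of file name, and three bytes of extension. The remaining bytes (extent and block allocation information) are simply left as zeros here, and make_entry and parse_entry are hypothetical helper names.

```python
def make_entry(user, name, ext):
    """Build a simplified 32-byte directory entry: user number,
    8-character name, 3-character extension, remainder zeroed."""
    head = bytes([user]) + name.ljust(8).encode("ascii") + ext.ljust(3).encode("ascii")
    return head.ljust(32, b"\x00")

def parse_entry(entry):
    """Return (user number, 'NAME.EXT') from a 32-byte entry."""
    user = entry[0]
    # Mask the high bit of each character, which CP/M uses for file attributes.
    name = bytes(b & 0x7F for b in entry[1:9]).decode("ascii").rstrip()
    ext = bytes(b & 0x7F for b in entry[9:12]).decode("ascii").rstrip()
    return user, f"{name}.{ext}"

# Two entries with identical names but different user numbers.
entries = [make_entry(0, "REPORT", "TXT"), make_entry(2, "REPORT", "TXT")]
parsed = [parse_entry(e) for e in entries]
print(parsed)
```

The two entries differ only in their first byte, which is all CP/M needs to tell them apart.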
There are several ways of thinking about user areas. You can think of the user areas as different disks named A0, A1, A2, through A15, for example, but sixteen disks that are always available on-line, with disk A0 being the logged-in disk when you power up. Or you can think of them as sixteen pigeon holes, each of which can be empty or can contain any number of files, with user area 0 having a special significance. Or you can simply think of them as user areas, sixteen "logical" (as opposed to physical) areas on the disk that share the same fixed amount of disk space. Whatever is easiest for you to conceptualise is fine, just as long as you're comfortable with user areas and understand their usefulness.
Here's one possible way to think about user areas:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
The sixteen user areas are all on the same "level" and all are "equal" in the sense that you can be logged into any user area and access files in the same way from any other user area (with the exception of user area 0 with CP/M 3.0, which we'll discuss shortly).
MS-DOS subdirectories
Similarly, MS-DOS has the ability to divide the directory into named "subdirectories" in order to manage the potentially large number of files on a fixed disk. Or rather, the main or "root" directory can contain one or more subdirectories in addition to files, and each of these subdirectories can also contain files and more subdirectories. This is what is called a tree-structured file directory, although it's kind of an upside-down tree.
Here's a diagram of a possible directory structure, excluding the files which would normally be present in each of the five subdirectories:
             ROOT
               |
              / \
             /   \
            /     \
        SUB.1     SUB.2
          |         |
         / \         \
     SUB.4 SUB.5     SUB.3
With MS-DOS, each subdirectory can have a name, including an extension, just like a file. In fact, these subdirectories are actually special files that you create and name yourself as needed using a special set of commands. Only one directory is considered "current" at any time. Initially, after booting the system, the "root" directory is the current directory. As with user area 0 in CP/M, the root directory remains the current directory unless you make one of the subdirectories the current directory (analogous to logging into a different user area).
With CP/M, storing a user number with each file name in the directory doesn't take up much directory space on the disk, in fact only one byte per directory entry. However, when each subdirectory has a name, the entire subdirectory name would have to be stored with each file name. In fact, with a tree-structured directory, the entire "path" of how to get to a file from the root directory would have to be stored in the directory for each file. (With CP/M, this "path" is simple: it consists of the file's user area number.) In the above example, a file named FILE.1 in subdirectory SUB.3 would have to be stored with its path from the root directory as follows:
SUB.2|SUB.3|FILE.1
(The backslash is used to separate subdirectory names when specifying the path to a file.) [I've used a | because the printwheels that I have don't have a backslash. -- bhc]
This could quickly take up a lot of disk space. But fixed disks have a lot of space, right? Well, yes, they do, but as with CP/M, the amount of space on an MS-DOS disk set aside for the directory is fixed in advance during formatting. This disk space is called the "directory tracks" and is normally a part of the disk you never have to worry about. It's already been subtracted from the disk's total capacity during formatting.
Structural Differences
With CP/M 2.2 on a Morrow MD2, 4 kilobytes of disk space were reserved during formatting for the file directory, leaving 186K for the files. This 4K is enough room to store directory information for up to 128 files on each disk, which is more than enough for a disk of that size.
With CP/M 2.2 on an MD3, the directory takes up 6K of disk space, leaving 384K for up to 192 files. And on the MD5, 64K of the 5,436K capacity of the fixed disk is reserved for the fixed disk's directory. This 64K directory allows for the possibility of 2048 directory entries, which with CP/M 3.0 consist not only of file names, but file date and time information, file passwords, and a disk label.
With MS-DOS 2.0, the root directory is limited to 112 files with a double-sided floppy disk. This directory takes up 3.5K, leaving 354K for files on a non-system disk. But since one or more of these files can be a subdirectory, the number of files that can be stored on a disk is practically limited only by the capacity of the disk.
What this means is that with MS-DOS, if you want to put more than the maximum number of files allowed on a disk, or if you want to create subdirectories, you have to give up some of the disk space reserved for files to store the additional directory information.
In practice, little of this is apparent to most of us, and it really doesn't matter how the operating system does it -- you'll probably never run out of directory space on a disk. Rather, you'll almost always run out of file space first.
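The directory sizes quoted in this section can be checked with a little arithmetic, assuming that each directory entry occupies 32 bytes on both systems (the article does not state the entry size; it is the figure that makes the numbers above work out). A minimal Python sketch:

```python
ENTRY_SIZE = 32  # bytes per directory entry (assumption, see above)

def directory_kilobytes(entries):
    """Directory space in K needed for a given number of entries."""
    return entries * ENTRY_SIZE / 1024

SYSTEMS = [
    ("MD2, CP/M 2.2", 128),
    ("MD3, CP/M 2.2", 192),
    ("MD5 fixed disk", 2048),
    ("MS-DOS 2.0 double-sided floppy", 112),
]

for label, entries in SYSTEMS:
    print(f"{label}: {entries} entries -> {directory_kilobytes(entries):g}K")
```

This gives 4K, 6K, 64K and 3.5K respectively, matching the figures above. Note that with CP/M the file counts are upper limits, since a large file can occupy more than one directory entry.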
Operational differences
Beyond the differences in directory storage dictated by the use of a named, tree-structured directory (MS-DOS) as opposed to a numbered, single-level directory (CP/M), there are the more important differences (to us) of how to specify a file.
If you stay in user area 0 with CP/M and if you stick to the root directory with MS-DOS, you won't notice many differences. With CP/M, for example, you copy a file using PIP:
PIP FILE.2=FILE.1
With MS-DOS, you would use the COPY command to create the file:
COPY FILE.1 FILE.2
This is not a very important difference. All you have to remember is the name of the copy command, the order of the file names, and whether to separate them with an equal sign or a space. The rules for naming files are the same and basic commands like ERA (ERASE or DEL in MS-DOS), REN and DIR do about the same thing in MS-DOS as they do in CP/M.
The complications arise when you begin to organize your files. Actually, with CP/M it's still pretty simple since the user areas are numbered and they're all on the same "level". For example, to copy a file from user area 2 to the logged-in user area, you still use PIP (assume disk A is logged-in):
PIP A:=FILE.1[G2]
With MS-DOS, things also stay pretty simple if you're copying files to the root directory. For example, suppose the current subdirectory is SUB.2 and you want to copy a file named FILE.1 back to the root directory (refer to the previous diagram):
COPY FILE.1 |
[Again, the | represents a backslash. -- bhc]
As with CP/M, you don't need to specify the destination file's name if it will stay the same. The backslash indicates the root directory.
Everything becomes more complicated, though, when you want to copy a file to a subdirectory. For example, the following is required to copy a file to the SUB.4 subdirectory if SUB.2 is the current subdirectory:
COPY FILE.1 |SUB.1|SUB.4
[Again, the | represents a backslash. -- bhc]
The first backslash indicates that the path to our destination starts in the root directory, descends into the SUB.1 subdirectory, and finally ends in the SUB.4 subdirectory.
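The rule for reading a path can be sketched in Python as follows. This is a simplified illustration, not how MS-DOS itself is written: the TREE dictionary mirrors the earlier diagram as I read it (SUB.4 and SUB.5 under SUB.1, SUB.3 under SUB.2), the resolve function is a made-up name, and the ".." notation for the parent directory is not handled.

```python
# The example directory tree, as nested dictionaries (files omitted).
TREE = {
    "SUB.1": {"SUB.4": {}, "SUB.5": {}},
    "SUB.2": {"SUB.3": {}},
}

def resolve(path, current):
    """Return the list of subdirectory names, starting from the root,
    that `path` refers to. `current` is the same kind of list for the
    current subdirectory; an empty list means the root directory."""
    # A leading backslash means "start at the root"; otherwise the
    # path is taken relative to the current subdirectory.
    parts = [] if path.startswith("\\") else list(current)
    parts += [name for name in path.split("\\") if name]

    # Walk down the tree to confirm every level really exists.
    node = TREE
    for name in parts:
        if name not in node:
            raise FileNotFoundError(f"no subdirectory {name!r}")
        node = node[name]
    return parts

print(resolve("\\SUB.1\\SUB.4", current=["SUB.2"]))  # ['SUB.1', 'SUB.4']
print(resolve("\\", current=["SUB.2"]))              # [] (the root)
print(resolve("SUB.3", current=["SUB.2"]))           # ['SUB.2', 'SUB.3']
```

The last call shows why the leading backslash matters: without it, the same name is looked up beneath the current subdirectory instead of beneath the root.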
It can get pretty tricky remembering all the levels of the subdirectories and where the backslash goes. In my experience, as soon as most people begin asking (innocently) about subdirectories and get the low-down on "root" and "backslash" and "path", their eyes glaze over. As a result, I would not recommend that anyone venture below the first level of subdirectories when first organizing files on a disk. Stay in the root directory at first, where hopefully you've put all the necessary DOS programs, and put files in first-level subdirectories only to get them out of the way. Then, later, you might attempt to work from within a subdirectory or create even lower-level subdirectories.
Subtle differences
This brings up another difference between CP/M and MS-DOS. With CP/M 2.2, the various user areas are normally pretty well closed off from one another except via PIP. In general, programs cannot access files in other user areas unless, like NewWord, they have been specifically written to do so. With CP/M 3.0, this is also true with the exception that programs and files in user area 0 which have been designated as system (SYS) files can be accessed for read-only purposes from all user areas. [Actually, there are a good many CP/M programs for both 2.2 and 3.0 that deal easily and directly with files in user areas other than the current one. Examples include VDE, NSWP, and many utilities written since fixed disks, and hence user areas, became commonplace on CP/M systems. -- bhc]
For example, with CP/M 3.0, you could put all three WordStar program files in user area 0 and assign them the system status using the SET command. Then, when you're logged into another user area and type WS, CP/M first looks in that user area for WS.COM. If it can't find it there, CP/M looks for a system file named WS.COM in user area 0. Similarly, when WS.COM calls WSMSGS.OVR and WSOVLY1.OVR, CP/M will locate these files in user area 0 if they're not in the logged-in user area. With CP/M 3.0, you might say that there's an alternate "path" that the operating system can take in locating a file to run or read if it can't find it in the logged-in user area. In this case, the path always leads to user area 0. [For CP/M 2.2 users, installing ZCPR3 offers a path to specified user areas, too. -- bhc]
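The search order just described can be sketched in Python. This is only a model of the behavior as the article describes it; the FILES table, the document names, and the locate function are hypothetical.

```python
# (user number, file name) -> set of attributes. Invented contents.
FILES = {
    (0, "WS.COM"): {"SYS"},
    (0, "WSMSGS.OVR"): {"SYS"},
    (0, "WSOVLY1.OVR"): {"SYS"},
    (0, "NOTES.TXT"): set(),        # in user area 0 but not a SYS file
    (4, "CHAPTER1.DOC"): set(),
}

def locate(name, logged_in_user):
    """Return the user area in which the file is found, or None."""
    # First look in the logged-in user area.
    if (logged_in_user, name) in FILES:
        return logged_in_user
    # Otherwise fall back to user area 0, but only for SYS files.
    if "SYS" in FILES.get((0, name), set()):
        return 0
    return None

print(locate("WS.COM", 4))        # 0: found as a SYS file in user area 0
print(locate("CHAPTER1.DOC", 4))  # 4: found in the logged-in user area
print(locate("NOTES.TXT", 4))     # None: in user area 0, but not SYS
```

Because the same lookup serves both the command line and running programs, WordStar's overlay files are found the same way WS.COM is.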
In MS-DOS, you can define your own "paths" using a special command called PATH. Thereafter, whenever you try to run a program which isn't in the current subdirectory, MS-DOS looks back along the path for the program. This can be handy, but it can also become complicated and confusing. Not only do you have to take care of setting up the path yourself, but with some programs the path doesn't do you much good.
For example, say we put the three WordStar files in the root directory and then set up a path leading back to the root so that we can work in a subdirectory where some of our document files are located. If we enter WS while SUB.1 is the current subdirectory, MS-DOS successfully locates WS.COM in the root directory and runs it. But after that the program is on its own. As soon as WordStar tries to access one of the overlay files (WSMSGS.OVR or WSOVLY1.OVR), it runs into problems and issues a file-not-found error message. This is because MS-DOS doesn't help programs out the way CP/M 3.0 does. MS-DOS only helps out in locating files at the command level, not at the program level.
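The difference between the command level and the program level can be modeled in Python. Again this is only a sketch of the behavior described above, with a made-up disk layout and hypothetical function names; the document file LETTER.DOC is invented.

```python
# Directory name -> files it contains (backslashes doubled for Python).
DISK = {
    "\\": {"WS.COM", "WSMSGS.OVR", "WSOVLY1.OVR"},
    "\\SUB.1": {"LETTER.DOC"},
}
SEARCH_PATH = ["\\"]   # what a PATH command naming the root would set up

def find_command(name, current):
    """Command level: look in the current directory, then along the path."""
    for directory in [current] + SEARCH_PATH:
        if name in DISK.get(directory, set()):
            return directory
    return None

def open_from_program(name, current):
    """Program level: a program that assumes its files are in the
    current directory gets no help from the path."""
    return current if name in DISK.get(current, set()) else None

print(find_command("WS.COM", "\\SUB.1"))           # found in the root
print(open_from_program("WSMSGS.OVR", "\\SUB.1"))  # None: file not found
print(open_from_program("LETTER.DOC", "\\SUB.1"))  # found in SUB.1
```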
Eventually these problems will be solved as more and more programs are written with subdirectories in mind; the programs will be able to search out overlay or data files themselves. [WS4 for both CP/M and MS-DOS has solved these problems for WS users. They recognize user areas and subdirectories, respectively, and WSCHANGE allows you to tell WS where the overlay files are located. -- bhc] But for now, much of the available software isn't "smart" enough to do that; many existing programs simply assume that all files are in the current subdirectory.
Summary
At first glance, it might appear as though MS-DOS would be the operating system of choice with regard to directory structure. It certainly has the features. However, in my experience, MS-DOS is often more difficult to work with, a factor which must also be considered.
When I first began working with CP/M user areas, I thought I would have difficulty remembering which user area contained which files. There were no "names" associated with the user areas, only numbers, and not even numbers I could assign myself. [ZCPR3 and CCP+, a replacement for CP/M 3.0, both offer named directories. WS4 and many utilities for Z-System recognize the named directories by their names as well as by their drive and user number, often called "du" nomenclature. -- bhc]
In practice, though, I found that with a fixed disk, the files in a particular user area rarely changed much from day to day, and so certain user area numbers soon came to mean as much to me as any name I might have come up with. Plus, user numbers required only a few keystrokes to enter.
In contrast, even though I use PC-DOS nearly every day, I still make numerous errors in entering the subdirectory names, copying files from one subdirectory to another, or even remembering which subdirectory is the current one.
Part of the problem with named subdirectories is that subdirectory names are often used in the same way as file names. For example, using the earlier diagram, suppose that SUB.2 is the current subdirectory and you want to copy a file named FILE.1 to subdirectory SUB.1, but mistype the command as follows:
COPY FILE.1 |STUB.1
[Again, the | represents a backslash. -- bhc]
Instead of creating a file named FILE.1 in the SUB.1 subdirectory, you have created a file named STUB.1 in the root directory. DOS simply did what it was told and you proceed without knowing that you have made a big mistake. When you do discover your error, confusion results. This sort of mistake would be hard to make in CP/M.
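The ambiguity behind this mistake can be shown with a small Python sketch. It models only the one decision that matters here, whether the name typed after the backslash is an existing subdirectory of the root; copy_target is a hypothetical function, not a DOS routine.

```python
# First-level subdirectories that exist in the root (from the diagram).
ROOT_SUBDIRS = {"SUB.1", "SUB.2"}

def copy_target(source_name, destination):
    """Decide where a copy lands, given the name typed after the
    leading backslash. Returns (directory, file name)."""
    if destination in ROOT_SUBDIRS:
        # The name is a subdirectory: copy into it, keeping the file name.
        return (destination, source_name)
    # Otherwise the name is taken as a new file name in the root.
    return ("ROOT", destination)

print(copy_target("FILE.1", "SUB.1"))   # ('SUB.1', 'FILE.1'), as intended
print(copy_target("FILE.1", "STUB.1"))  # ('ROOT', 'STUB.1'), the typo
```

Both commands are perfectly valid, which is why no error message appears. With CP/M's numbered user areas, a mistyped user number cannot be mistaken for a file name in this way.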
In summary, then, while the tree-structured, flexible directory of MS-DOS is often hailed as an "advance," in practice it often proves to be a source of confusion and frustration to users at all levels of experience. In contrast, while CP/M's directory structure is limited and inflexible, it is also easy to understand and, perhaps more importantly, easy to use.
(BAMDUA Editor's Note: I have found that user areas are quite a boon in a number of respects. On the MD3, using DSDD drives, we have room for over 300K for files, as you know. Sometimes it is convenient to have all the files that go with one program in a user area separate from the files that go with another program if both programs are on the same disk. Also, if you want an alternative to creating a library but do want a set of files placed together and separate from another set of files, using different user areas for the two groups is an ideal and simple solution. Of course, Newsweep can look into any area you tell it to! Even more of a boon is the use of user areas with quad density drives. In that case, well over 700K of space is available for files; thus one disk can contain a good many of the programs and utilities you like to use repeatedly. I have placed my favorite utilities in one area, separate from programs for writing, which in turn are separate from special print formatting programs. Well, you get the idea -- it is very handy!)