Unix provides a feature whereby the standard output of one program can be piped directly into the standard input of another. (MS-DOS provides a limited version of this feature.) The vertical bar ('|') is the pipe operator. For example, if I want to list only those files in my current directory whose names contain the string 'sjm', I can type:
ls | grep sjm
The 'ls' command lists the files in a directory (like the MS-DOS 'dir' command). I've piped its output directly into 'grep', a powerful utility that searches for strings specified by regular expressions. The resulting output from 'grep' is a list of those files whose names contain the string 'sjm'.
You can see what a powerful capability this is. Since OS 1100 provides no similar mechanism, this presentation examines how we might implement pipelines in user batch jobs. We will look at a couple of examples of how pipelining techniques can be used to help write smarter batch jobs.
Both of our examples will involve passing data between programs via a temporary file containing ASCII data images in the form of Stream Generation Statements (SGS), to be processed by the Symbolic Stream Generator (SSG).
Traditionally, disk space has been monitored using small utility processors that were written in assembly language and passed around from site to site. There are many of these, with names such as @DISCS, @MS, @AVAIL, @DISQUE, etc. All of these share a couple of common features. They read OS 1100's Logical Device Access Table and produce a printed report showing the names and status of configured disk drives. And they require a person to read and interpret their output.
The printed output from these processors has column headings, spacing and page breaks. While these make the report pleasant for a person to read, they make it difficult for another program to read. What another program needs is simply the data free from embellishments that aid human readers.
A common solution to this problem is to breakpoint the print to a file and then edit that file with a text editor, such as @IPF or @ED. A text editor macro can be used to massage the print file into a format that a program can read. This can work well, but it has three problems. First, writing the text editor macro is time-consuming and rather boring. Second, you are never really sure that the macro is bullet-proof: after months of working correctly, it may fail because a rarely seen warning message shows up in the print and disrupts the macro. Third, healthy paranoia tells you that the next release of the processor may slightly modify the print file format, and your macro may have to be rewritten.
Another solution is to write a program specifically to solve your problem. We might choose to write an assembly language program to check for total available fixed disk space. The main drawback to this approach is that writing single-purpose programs means we are constantly re-inventing the wheel. We would rather have a set of general-purpose utilities that can be used as building blocks for a wide variety of applications.
So a better solution would be to have a general-purpose disk space utility processor produce a standard System Data Format (SDF) file containing just the selected data in a simple, program-readable format (i.e., ASCII). This file could then be piped into another program in a manner analogous to pipelining in Unix.
@DISKPL is such a program. Written in assembly language (MASM), it is approximately 300 lines long. The only thing it does is read the Logical Device Access Table and write a record for each configured disk drive. @DISKPL writes these records to a temporary file called PL$$DISKPL. It does this by calling a general-purpose MASM subroutine, PIPELINE.
PIPELINE is a 300-line assembly language subroutine that provides three entry points for pipelining--open the pipeline, write the pipeline, close the pipeline. This subroutine is collected with the calling program.
As part of its initialization code, @DISKPL calls PIPELINE to initialize the pipeline. PIPELINE assigns a temporary SDF file called PL$$xxxxxx (where 'xxxxxx' is the processor name), attaches an @USE name of PL$$ to it, and writes the SDF header. @DISKPL then calls PIPELINE to write the RELATION SGSs to the pipeline (see below). Then, for each disk drive found, @DISKPL calls PIPELINE to write an SGS describing that device. As part of its termination logic, @DISKPL calls PIPELINE to write the SDF terminator.
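To make the calling sequence concrete, here is a rough sketch of the same open/write/close protocol, written in Python purely as an illustration. It is not part of the toolset and not OS 1100 code: the class and method names are mine, and the SDF header and terminator are represented by placeholder comment records rather than real SDF control records.

# Conceptual sketch of the PIPELINE open/write/close protocol.
# Names and file layout are illustrative only.

class PipelineWriter:
    def __init__(self, processor_name):
        # PIPELINE names the temporary file PL$$xxxxxx after its caller.
        self.path = "PL$$" + processor_name
        self.file = None

    def open_pipeline(self):
        # "Open the pipeline": assign the file and write the header.
        self.file = open(self.path, "w")
        self.file.write(". SDF header (placeholder)\n")

    def write_sgs(self, label, *fields):
        # "Write the pipeline": emit one SGS image as an ASCII record.
        self.file.write(" ".join([label] + [str(f) for f in fields]) + "\n")

    def close_pipeline(self):
        # "Close the pipeline": write the terminator and release the file.
        self.file.write(". SDF terminator (placeholder)\n")
        self.file.close()

# How a caller such as @DISKPL might drive it: RELATION SGSs first,
# then one DISK SGS per configured drive.
pl = PipelineWriter("DISKPL")
pl.open_pipeline()
pl.write_sgs("RELATION", "DISK", 1, "DISK_LDATX")
pl.write_sgs("DISK", 1, "DA0", "MDISK", "UP", "FIX101", 112, 25680, "FIX", "\\")
pl.close_pipeline()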
After @DISKPL terminates, PL$$ is available just like any other temporary file. Its format makes it ideal input for an SSG skeleton, but it can be input to any other program capable of reading SDF files, including ED and COBOL (the COBOL 'UNSTRING' verb would come in handy). Figure 1 shows a sample PL$$ file produced by @DISKPL.
Figure 1. Sample PL$$ file produced by @DISKPL.

RELATION DISK 1 DISK_LDATX . LDAT INDEX
RELATION DISK 2 DISK_DEVNAM . DEVICE NAME
RELATION DISK 3 DISK_EQUIP . EQUIPMENT MNEMONIC
RELATION DISK 4 DISK_STATUS . STATUS (UP,DN,SU,RV)
RELATION DISK 5 DISK_PACKID . LOGICAL PACK-ID
RELATION DISK 6 DISK_PREP . PREP FACTOR
RELATION DISK 7 DISK_TRKAVL . # OF TRACKS AVAILABLE
RELATION DISK 8 DISK_FIXREM . PACK TYPE (FIX,REM)
RELATION DISK 9 DISK_ASGCNT . ASSIGN COUNT
.
DISK 1 DA0 MDISK UP FIX101 112 25680 FIX \ .
DISK 2 DA1 MDISK UP FIX102 112 48501 FIX \ .
DISK 3 DA2 MDISK UP REM101 112 16451 REM 22 .
DISK 4 DA3 MDISK UP REM102 112 26550 REM 22 .
DISK 5 DA4 MDISK UP REM103 112 28671 REM 22 .
DISK 6 DA5 MDISK UP REM104 112 91397 REM 6 .
DISK 7 DA6 MDISK UP REM105 112 46216 REM 1 .
DISK 8 DA7 MDISK DN \ \ \ \ \ .
The pipeline (PL$$) file contains SGSs with two different labels. The DISK SGSs describe the disk drives configured on the system. There are fields describing device name, pack name, device type, fixed or removable usage, etc.
The RELATION SGSs describe the fields on the DISK SGSs. If we think of the DISK SGSs as a set of third normal form relations describing disk drives, then we can think of the RELATION SGSs as the relational catalog. Thus, for example, the first RELATION SGS specifies that the first field on each DISK SGS is called 'DISK_LDATX'. The RELATION SGSs provide symbolic names that can be used in SSG skeletons to refer to fields on the DISK SGSs. By using symbolic names we make our skeletons easier to read and maintain. In addition, symbolic names allow the designer of the upstream program (e.g., @DISKPL) to make future changes in the order of the DISK SGSs without adversely affecting downstream programs.
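For readers who would rather process the pipeline with something other than SSG, the sketch below shows the same idea in Python: build a field-name catalog from the RELATION SGSs, then refer to DISK fields symbolically. It is only an illustration; the file name and the parsing details are assumptions drawn from Figure 1, not a specification of the PL$$ format.

# Conceptual sketch: use the RELATION SGSs as a catalog so that DISK
# fields can be referenced by name instead of by position.

def load_pipeline(path="PL$$DISKPL"):
    catalog = {}      # field name -> 1-based field position on a DISK SGS
    disks = []        # the fields of each DISK SGS, with the label dropped
    with open(path) as f:
        for image in f:
            fields = image.split()
            if not fields or fields[0] == ".":
                continue                      # skip comment/header images
            if fields[0] == "RELATION":       # e.g. RELATION DISK 7 DISK_TRKAVL
                catalog[fields[3]] = int(fields[2])
            elif fields[0] == "DISK":         # e.g. DISK 1 DA0 MDISK UP ...
                disks.append(fields[1:])
    return catalog, disks

# Example use, assuming a PL$$DISKPL file like the one in Figure 1 exists:
catalog, disks = load_pipeline()
for d in disks:
    if d[catalog["DISK_STATUS"] - 1] == "UP":
        print(d[catalog["DISK_DEVNAM"] - 1], d[catalog["DISK_TRKAVL"] - 1])

Because the field positions come from the catalog rather than being hard-coded, this reader, like a skeleton that uses the RELATION SGSs, keeps working if @DISKPL later reorders the DISK fields.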
Figure 2 shows part of a runstream that checks to ensure that there are at least 100,000 tracks of available fixed disk space. It invokes @DISKPL and then processes the SGSs in the resulting PL$$ file with an SSG skeleton, FIXEDSKEL, shown in Figure 3.
Figure 2. Runstream that waits for at least 100,000 tracks of available fixed disk space.

@RUN BIGJOB,,SJM
@ .
@DISKPL
@SSG SKELFILE.FIXEDSKEL,PL$$.
SGS
MINFIX 100000 . minimum acceptable fixed disk tracks
WAITTIME 2 . wait time (minutes) before trying again
@EOF
@EOF
@ .
@ . We now have lots of fixed mass storage, so . . .
@ .
@ASG,T BIGFILE.,F/25000//50000
etc.
Figure 3. The FIXEDSKEL skeleton.

 1: *INCREMENT R_X TO [RELATION]
 2: *SET [RELATION,R_X,3,1] = [RELATION,R_X,2,1]
 3: *LOOP R_X
 4: *.
 5: *CLEAR TOT_FIX_TRKS . Total fixed tracks available
 6: *.
 7: *INCREMENT D TO [DISK] . For each disk drive
 8: *IF [DISK,D,DISK_FIXREM,1] = FIX AND ;
 9:     [DISK,D,DISK_STATUS,1] = UP
10: *SET TOT_FIX_TRKS = TOT_FIX_TRKS + [DISK,D,DISK_TRKAVL,1]
11: *ENDIF
12: *LOOP . D
13: *.
14: *IF +TOT_FIX_TRKS < +[MINFIX,1,1,1]
15: *DISPLAY,O 'Only [*TOT_FIX_TRKS] tracks available';
16:     'fixed disk.'
17: *DISPLAY,O 'I''ll wait [WAITTIME,1,1,1] minutes.'
18: *WAIT 60*[WAITTIME,1,1,1]
19: #DISKPL
20: #SSG [SOURCE$,1,1,1],PL$$.
21: SGS
22: MINFIX [MINFIX,1,1,1]
23: WAITTIME [WAITTIME,1,1,1]
24: #EOF
25: #EOF
26: *ENDIF
Lines 1-3 of FIXEDSKEL create global numeric variables from the RELATION SGSs (see Figure 1). These variables allow the skeleton to use symbolic field references, rather than numeric field references. This makes the skeleton more readable.
Lines 5-12 calculate the total number of tracks available on fixed mass storage. Line 14 tests whether this meets the minimum acceptable. If not, the skeleton notifies the computer operator (lines 15-17), waits the required number of minutes (line 18), and then re-invokes @DISKPL and FIXEDSKEL (lines 19-25). This loop continues until the required minimum number of tracks of fixed mass storage is available; then the job proceeds to its next ECL statement.
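Stripped of skeleton syntax, the control flow of FIXEDSKEL looks roughly like the Python sketch below. This is only a restatement for clarity: get_disks() is an assumed helper that runs @DISKPL and reads the PL$$ file (as in the earlier sketch), and the real skeleton 'loops' by regenerating the @DISKPL and @SSG calls rather than by an in-language loop.

import time

def wait_for_fixed_space(get_disks, minfix=100000, waittime=2):
    # Mirror of FIXEDSKEL: poll until enough fixed tracks are available.
    while True:
        catalog, disks = get_disks()          # run @DISKPL and read PL$$
        total_fixed = sum(
            int(d[catalog["DISK_TRKAVL"] - 1])
            for d in disks
            if d[catalog["DISK_FIXREM"] - 1] == "FIX"
            and d[catalog["DISK_STATUS"] - 1] == "UP"
        )
        if total_fixed >= minfix:
            return total_fixed                # enough space; let the job proceed
        print("Only", total_fixed, "tracks available fixed disk.")
        print("I'll wait", waittime, "minutes.")
        time.sleep(60 * waittime)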
There are many other uses to which @DISKPL may be put. For example, a related processor, called @DISK, writes a report suitable for human readers. It does this by invoking @DISKPL, then reading the PL$$ file that @DISKPL produces. @DISK is written in SSG.
Consider this problem. I want to read the Master File Directory (MFD) and select all files on our system with qualifier 'SJM' that have not been referenced for at least one year. I then want to change the security attributes for all files selected: in particular, I want to attach an Access Control Record.
Rather than develop homegrown code, we will use a commercially available product--the Mass Storage Analysis and Retention (MSAR) utility. MSAR is developed and supported by TeamQuest Corporation, and jointly marketed with Unisys. Beginning with release 5R1, MSAR can select files from the MFD and write a set of SGSs. These SGSs can then be input to a user-written SSG skeleton. (And, of course, we could process the SGSs with a text editor or a high-level language.)
Figure 4 contains the MSAR commands that will select all files with qualifier 'SJM' that have not been referenced in at least one year. MSAR writes an SGS for each file selected into a file supplied by the user. This file may be temporary or catalogued. In Figure 4, I provided a temporary file called PL$$. The 'SSG_ADD' command directs MSAR to invoke a user-supplied SSG skeleton. In my example, I put the skeleton in 'TPF$.ACRSKEL'.
Figure 4. MSAR commands that select all files with qualifier 'SJM' not referenced in at least one year.

@ELT,IQ TPF$.ACRSKEL . Define my skeleton
#SIMAN,B . B = SIMAN batch mode
*INCREMENT M TO [MFDF] . For each file selected
UPD FIL = [MFDF,M,1,1]*[MFDF,M,2,1].
    FIL_ACC = ACR_CON ATT_ACR = ARC001 ACR_OWN = SECOFF ;
*LOOP . M
#EOF
@EOF . End of skeleton
@ .
@ASG,T PL$$.,F . Temp file for SGSs
@MFDRPT,I PL$$.
QUALIFIER SJM
REF_DAYS_GT 365
SSG_ADD 'TPF$.ACRSKEL'
@EOF
Figure 5 shows a sample set of SGSs produced by these MSAR commands. MSAR would write these SGSs into the file I supplied (PL$$, in my example in Figure 4).
Note that MSAR creates one SGS for each file selected. The SGS has the label MFDF and contains various data about the file. My SSG skeleton uses the MFDF SGSs to generate Site Management Complex (SIMAN) commands to attach ACR 'ARC001' to the selected files.
Figure 5. Sample MFDF SGSs produced by the MSAR commands in Figure 4.

MFDF SJM OLDFILE 2 SJM TECHOPS ;
  FIXED 0 2000 '''' '''' ''VP'' ''F'' '''' FAS021
MFDF SJM SAVEDATA 1 PREP TECHOPS ;
  FIXED 0 2000 '''' '''' ''VP'' ''F'' '''' FAS034
MFDF SJM FISCAL92 1 PREP TECHOPS ;
  FIXED 0 2000 '''' '''' ''VP'' ''F'' '''' FAS002
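The same read-the-SGSs-and-generate-commands pattern could also be coded outside SSG. Here is a rough Python sketch; the PL$$ file name and the positions of the qualifier and filename fields follow Figures 4 and 5, the SIMAN command text is copied from my skeleton, and everything else is illustrative.

# Conceptual sketch: turn MFDF SGSs into SIMAN batch commands, as the
# skeleton in Figure 4 does. Field positions follow Figure 5.

def siman_commands(sgs_path="PL$$"):
    lines = ["#SIMAN,B"]                      # SIMAN batch mode
    with open(sgs_path) as f:
        for image in f:
            fields = image.split()
            if fields and fields[0] == "MFDF":
                qualifier, filename = fields[1], fields[2]
                lines.append(
                    "UPD FIL = " + qualifier + "*" + filename + ". "
                    "FIL_ACC = ACR_CON ATT_ACR = ARC001 ACR_OWN = SECOFF ;"
                )
    lines.append("#EOF")
    return "\n".join(lines)

The continuation images in Figure 5 (the lines beginning with FIXED) are simply skipped, since only the qualifier and filename are needed to build each UPD command.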
In addition to querying the MFD, MSAR can write another set of SGSs containing enough information for a user-written SSG skeleton to re-create your TIP File Directory. You can periodically run an MSAR job to save the TIP File Directory as a set of SGSs. After a TIP initialization boot, you can then run another job that reads these SGSs and re-creates the TIP File Directory (using FREIPS or TREG/TFUR).
When selecting files from the MFD via MSAR, you can sometimes generate more MFDF SGSs than current levels of SSG can handle. SSG's DBANK size has been limited to 262,000 words. This scaling limit is removed in SSG 23R1, due to be released with System Base 5R3.
I encourage designers of 1100/2200 software to consider adding pipelining to their products. Almost any program that has a 'list' or 'select' function is a candidate for piping its output.
@DISKPL, @DISK, the PIPELINE subroutine, plus a host of other pipelining programs, are part of the Group W Toolset, written and supported (so far as is feasible) by Tom Nelson, Bill Toner, and me. You can download it from this Web site.