The Parts of a COBOL Program
Recall that a COBOL program is made up of four mandatory divisions. They always appear in the program in the order shown in Listing 1.2.
TYPE: Listing 1.2. COBOL's four divisions.
000100 IDENTIFICATION DIVISION.
000200 ENVIRONMENT DIVISION.
000300 DATA DIVISION.
000400 PROCEDURE DIVISION.
The IDENTIFICATION DIVISION marks the beginning of a COBOL program. The name of the program, which you assign, will be entered as a statement in the IDENTIFICATION DIVISION (more on this in a moment).
The ENVIRONMENT DIVISION contains statements or commands to describe the physical environment in which the program is running. The main use of the ENVIRONMENT DIVISION is to describe the physical structure of files that will be used in the program. You won't be working with files in these early lessons, so for now this DIVISION will be little used.
The DATA DIVISION contains statements describing the data used by the program. The DATA DIVISION and the PROCEDURE DIVISION are the most important divisions in a COBOL program; they do 95 percent of the work. You will start working in the DATA DIVISION in Day 2, "Using Variables and Constants."
The PROCEDURE DIVISION contains the COBOL statements that the program will execute after the program starts running. The PROCEDURE DIVISION is the real workhorse of a COBOL program. Without a PROCEDURE DIVISION, you wouldn't have a program, because all the other divisions are used to create the environment and data that are used by the PROCEDURE DIVISION to actually do something.
You already have seen hello.cbl (in Listing 1.1). It contains no data, no environment, and only one significant statement, which is in the PROCEDURE DIVISION. However, without the DISPLAY "Hello" command, the program would do nothing at all.
Each DIVISION in a COBOL program is broken down into smaller units, like an outline. Briefly, a DIVISION can contain SECTIONs, a SECTION can contain paragraphs, and paragraphs can contain sentences. For the moment, you can ignore SECTIONs, which are introduced in Day 2. Think of a COBOL program as DIVISIONs containing paragraphs containing sentences.
The requirements for the contents of each different DIVISION can vary, but most compilers require that only two things be present in a COBOL program--other than the four divisions--in order to compile it:
· PROGRAM-ID
· STOP RUN
The PROGRAM-ID is a paragraph that must appear in the IDENTIFICATION DIVISION and is used to give the program a name.
There must also be one paragraph in the PROCEDURE DIVISION that contains the STOP RUN statement.
Listing 1.3 is an example of the smallest possible COBOL program that will compile and run on any COBOL compiler. It contains the PROGRAM-ID paragraph and only one paragraph in the PROCEDURE DIVISION.
The paragraph PROGRAM-DONE contains only one sentence, STOP RUN. This sentence causes the program to stop running when the sentence is executed. Most versions of COBOL require this explicit command as a way of identifying the point in the program where the program terminates.
TYPE: Listing 1.3. minimum.cbl, the irreducible minimum COBOL program.
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. MINIMUM.
000300 ENVIRONMENT DIVISION.
000400 DATA DIVISION.
000500 PROCEDURE DIVISION.
000600
000700 PROGRAM-DONE.
000800     STOP RUN.
OUTPUT:
Nothing!
ANALYSIS: Clearly, minimum.cbl does even less than hello.cbl. In fact, minimum.cbl does nothing except stop running as soon as it starts. Its only function is to illustrate the minimum syntax that the COBOL compiler will accept.
Of all the errors that you can make in typing a COBOL program, an incorrect DIVISION name is one of the hardest errors to locate. In one compiler that I tested, misspelling the name of the IDENTIFICATION DIVISION as INDENTIFICATION DIVISION caused the compiler to report that the PROCEDURE DIVISION was missing. This is a difficult error to spot because the real problem was three divisions away, and everything about the PROCEDURE DIVISION was fine. It is important that the DIVISIONs be typed correctly.
Listing 1.4 is more useful than minimum.cbl. You will recognize some similarities to hello.cbl, but I have divided a couple of the lines to illustrate a few more things about COBOL.
TYPE: Listing 1.4. Three levels of COBOL grammar.
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. SENTNCES.
000300 ENVIRONMENT DIVISION.
000400 DATA DIVISION.
000500 PROCEDURE DIVISION.
000600
000700 PROGRAM-BEGIN.
000800     DISPLAY "This program contains four DIVISIONS,".
000900     DISPLAY "three PARAGRAPHS".
001000     DISPLAY "and four SENTENCES".
001100 PROGRAM-DONE.
001200     STOP RUN.
OUTPUT:
C>pcobrun sentnces
Personal COBOL version 2.0 from Micro Focus
PCOBRUN V2.0.02 Copyright (C) 1983-1993 Micro Focus Ltd.
This program contains four DIVISIONS,
three PARAGRAPHS
and four SENTENCES
ANALYSIS: Strictly speaking, the PROGRAM-ID, SENTNCES, is a sentence, but it has such a specialized role in a COBOL program (identifying the program) that it is not usually considered to be a sentence. The program is named sentnces.cbl (with the word sentnces deliberately shortened) because some operating systems (especially MS-DOS) limit filenames to eight characters plus an extension, and many compilers limit program names to eight characters.
DO match the filename and program name; for example, sentnces.cbl is the filename and SENTNCES is the PROGRAM-ID.
DON'T add confusion by using a PROGRAM-ID that is different from the filename.
I will stick with the use of eight or fewer characters for the names of programs and files throughout the text.
The sentnces.cbl program contains all four DIVISIONs, three paragraphs (PROGRAM-ID, PROGRAM-BEGIN, and PROGRAM-DONE), and four sentences (the three DISPLAY statements in PROGRAM-BEGIN and STOP RUN at line 001200).
PROGRAM-ID is a required paragraph name and must be typed exactly as PROGRAM-ID. The paragraph names PROGRAM-BEGIN and PROGRAM-DONE are names I assigned when I wrote the program. Every paragraph in the PROCEDURE DIVISION has a name that you assign. The two paragraphs could just as well have been named DISPLAY-THE-INFORMATION and PROGRAM-ENDS-HERE.
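To show that the choice of paragraph names is yours, here is a sketch of Listing 1.4 with the two PROCEDURE DIVISION paragraphs renamed to DISPLAY-THE-INFORMATION and PROGRAM-ENDS-HERE. Nothing else has been changed, and I am assuming your compiler accepts paragraph names of this length (most allow up to 30 characters).

```cobol
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. SENTNCES.
000300 ENVIRONMENT DIVISION.
000400 DATA DIVISION.
000500 PROCEDURE DIVISION.
000600
000700 DISPLAY-THE-INFORMATION.
000800     DISPLAY "This program contains four DIVISIONS,".
000900     DISPLAY "three PARAGRAPHS".
001000     DISPLAY "and four SENTENCES".
001100 PROGRAM-ENDS-HERE.
001200     STOP RUN.
```

Because paragraph names only label parts of the program, this version should display the same three lines as Listing 1.4.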
All the special words in COBOL (such as PROGRAM-ID, DATA, DIVISION, STOP, and RUN), as well as the paragraph names and program name (such as SENTNCES, PROGRAM-BEGIN, and PROGRAM-DONE), are created using the uppercase letters of the alphabet A through Z, the digits 0 through 9, and the hyphen (-). The designers of COBOL chose to allow a hyphen as a way of improving the readability of COBOL words. PROGRAM-BEGIN is easier to read than PROGRAMBEGIN.
The designers of COBOL also allowed for blank lines, such as line 000600 in Listing 1.4. Blank lines mean nothing in COBOL and can be used to spread things out to make them more readable.
You should type, compile (and, if necessary, link), and run Listing 1.4. See Appendix C for details. You might need to review this appendix a couple of times before you are completely comfortable with each of the steps involved in editing, compiling, and running.
Listing 1.4 illustrates the line-by-line organization of a COBOL program. There also is a left-to-right organization that determines what can be placed in certain columns.
A COBOL source code file has five areas, extending from left to right across the page. The first six characters or columns of a line are called the sequence number area. The compiler either ignores this area or uses it only to warn you when the numbers are out of sequence.
Character position 7 is called the indicator area. This seventh position is usually blank. If an asterisk is placed in this column, everything else on that line is ignored by the compiler. This is used as a method to include comments in your source code file.
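As a small sketch of how comment lines look in practice, the program below places an asterisk in column 7 of lines 000700 and 001000. The program name CMTDEMO and the text of the messages are names I made up for this illustration.

```cobol
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. CMTDEMO.
000300 ENVIRONMENT DIVISION.
000400 DATA DIVISION.
000500 PROCEDURE DIVISION.
000600
000700* An asterisk in column 7 turns this whole line into a comment.
000800 PROGRAM-BEGIN.
000900     DISPLAY "Only this line produces output".
001000*    DISPLAY "This line is commented out".
001100 PROGRAM-DONE.
001200     STOP RUN.
```

Only the DISPLAY at line 000900 is executed. Line 001000 looks like a sentence, but the asterisk in column 7 makes the compiler skip it, which is a handy way to switch a line off temporarily without deleting it.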
The four character positions 8 through 11 are called Area A. DIVISIONs and paragraphs (and SECTIONs) must start in Area A. It is good coding practice to start DIVISIONs, SECTIONs, and paragraph names at column 8 rather than some random place in Area A.
Character positions 12 through 72 are called Area B. Sentences must start and end within Area B. It is good coding practice to start sentences at column 12 rather than some random place in Area B.
COBOL was designed as an 80-column language, but there is no formal definition of character positions 73 through 80. This is called the identification area (which has nothing to do with the IDENTIFICATION DIVISION).
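The column positions described above are easier to see against a ruler. The sketch below uses comment lines to draw one: line 000800 carries the tens digit of each column and line 000900 the units digit, covering columns 8 through 72. The program name RULER and the message text are hypothetical, chosen only for this illustration, and the listing needs a fixed-width font to line up.

```cobol
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. RULER.
000300 ENVIRONMENT DIVISION.
000400 DATA DIVISION.
000500 PROCEDURE DIVISION.
000600
000700* Ruler: tens digits above, units digits below, columns 8 to 72.
000800*--1---------2---------3---------4---------5---------6---------7--
000900*89012345678901234567890123456789012345678901234567890123456789012
001000 PROGRAM-BEGIN.
001100     DISPLAY "Names start in Area A, sentences in Area B".
001200 PROGRAM-DONE.
001300     STOP RUN.
```

Reading down from the ruler, the sequence numbers fill columns 1 through 6, the asterisks sit in column 7, the paragraph name PROGRAM-BEGIN starts in column 8 (Area A), and the word DISPLAY starts in column 12 (Area B). Nothing in this sketch extends past column 72, so positions 73 through 80 are empty.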
The identification area is left to the designer of the COBOL compiler to use as needed. COBOL editors on large computers usually allow you to define an eight-character modification code that is inserted into the identification area whenever a line is changed or a new line is added. If you add lines to an existing program or change existing lines, it could be useful to know which lines were changed. Modification codes can be used to track down where a particular change was made. They are especially useful in companies where many programmers work on many files, because they help keep track of changes, when they occurred, and who made them.
Marking lines this way usually depends on a special editor set up for COBOL that automatically places the modification code in positions 73 through 80. You probably will never see modification codes when using COBOL on a PC.
Listing 1.5 is comment.cbl, which really is sentnces.cbl with a comment included and some lines that have been tagged with modification codes. I have deliberately left some of the sequence numbers out, or put them in incorrect order. The compiler will compile this without an error, but it might generate a warning that lines are out of order or sequence numbers are not consecutive. The compiler doesn't really care what is in the first six positions, but it might provide some warning information. Because of the width limits in a book, the example modification codes in Listing 1.5 do not actually start in column 73 as they must in a real COBOL example.