From uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris Fri Nov 5 12:18:42 1993
Article: 31113 of comp.lang.c
Newsgroups: comp.lang.c
Path: uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris
From: chris@gerty.equinox.gen.nz (Christopher Sawtell)
Subject: Notes for 'C' Programmers
Message-ID: <1993Nov1.053219.1022@gerty.equinox.gen.nz>
Summary: A course of study for learning 'C'.
Keywords: C learning language
Organization: Cogicorp
X-Newsreader: TIN [version 1.2 PL0]
Date: Mon, 1 Nov 1993 05:32:19 GMT
Expires: Wed, 1 Dec 1993 00:00:00 GMT
Lines: 177

This archive contains a complete course for you to learn the 'C'
computer language itself. The language used is correct conversational
English; I have written the lessons using the same language
constructions which I would use if I were teaching you directly.
An outline of the course is available for you to read below.

The course is intended to demonstrate the language itself and a
selection of the simpler standard library functions.

I have assumed that you have had sufficient exposure to computing to
be able to use a programmer's editor of your choice and are confident
in the use of the command line interpreter, whether it be a unix
shell, or a DOS ( shudder :-) prompt.

Some knowledge of computers and their jargon is assumed, but
complicated concepts are fully explained. In other words the intent
is to teach 'C' per se, not 'the fundamentals of how to program a
computer using 'C' as a teaching medium.'

'C' is not a computer language for rank beginners. Start with an
interpreted language and proceed to a compiled language which has an
extensive error message vocabulary and run-time checking facilities.
In the interests of speed of execution 'C' does very little to
protect you from yourself!

Throughout the course the fact that a compiler is a translator from a
high level language to assembler code is kept to the fore, and you are
frequently advised to examine the assembler code which is output by
the compiler. Some minimal knowledge of computer architecture is
therefore assumed.

Whilst I have taken considerable care to ensure that this material is
free of errors, I am well aware that to err is a common human failing,
and in this I don't claim to be different from anybody else.
Therefore your gentle critique is welcome, together with notification
of any factual errors.

It is planned to make the lessons available as a printed book,
complete with a programme diskette, if there is sufficient interest.

Syllabus for the 'C' Language Course.

1    a) Historical introduction to the Language.
     b) Demonstration of a very simple program.
     c) Brief explanation of how the computer turns
        your program text into an executing program.
     d) The basic differences between 'C' and other languages.
        The advantages and disadvantages.

     We make the assumption that you are able to turn on your machine,
     use the Operating System at the Command Line Interpreter prompt
     "$ ", "c:>" or whatever, and to use an editor to enter program
     text.

2    a) How the 'C' language arranges for the storage of data.
        An explanation of the keywords associated with data.
        The storage classes and type qualifiers:-
                                 static auto volatile const.
        The variable types:-     char int long float double
        The meaning of:-         signed unsigned
     b) Introduction to the concept of pointers.
     c) Explanation of reading from the keyboard and writing to the
        screen, i.e. printf and scanf, the print formatted and
        scan formatted functions.
     d) The use of arguments to the main() function, argc argv env.
     e) A simple program to format text.

3    Structures, arrays and pointers.
     a) Explanation of more complex data structures.
     b) Programs which demonstrate uses of pointers.

4    The operators of the language, arithmetic, pointer, logical,
     bitwise.
     a) Precedence.
     b) The unique bit and shifting operators.
        ( for a high level language )

5    a) The Preprocessor.
     b) Header files. What they are and what you put in them,
        both your own and those provided by the 'C' compiler vendor.
        A simple title which includes all sorts of things,
        both very useful and a number of traps.

6    The library, why we have one, and some of the more useful
     routines.
     a) How to read the book.
     b) The string functions as an example.

7    a) Mistakes and how to avoid making them.
     b) Debugging strategies.
     c) The assert macro.

8    a) More on the representation of data, viz. struct, typedef.
     b) Tables of all sorts.
        Arrays of structures.
        Pre-initialisation of data structures.
        ( Including jump or dispatch tables )
        The bit-field.
     c) Use of header files in this.

9    a) The control structures of the language,
        what (not) to use and when.

10   a) File IO.
        This is an enormous subject and we will really only just
        scratch the surface.

11   a) Lint, and more on errors / bugs and how to avoid them.

12   The stack and a quick dip into assembler.
     a) A study of the function calling mechanism used by most 'C'
        compilers and the effect on compiler output code of using
        the register storage class and the optimiser.

13   The heap.
     a) The 'heap', its management, malloc(), calloc() and free().

14   Portability Issues.
     a) Defaults for storage sizes.
     b) 'endianism'. Yes, there are big-endian and little-endian
        computers!
     c) Functions which can be called with a variable number of
        arguments.

15   Sample programs.
     Much is to be gained from examining public domain packages,
     examining the code and reviewing the author's style.
     We will look at a number of functions and complete packages.
     In particular we will examine a number of sorting functions,
     a multi-threading technique, queues, lists, hashing, and trees.

/* ----------------------------------------- */

Copyright notice:-

(c) 1993 Christopher Sawtell.

I assert the right to be known as the author, and owner of the
intellectual property rights of all the files in this material,
except for the quoted examples which have their individual
copyright notices. Permission is granted for onward copying,
but not modification, of this course and its use for personal
study only, provided all the copyright notices are left in the
text and are printed in full on any subsequent paper reproduction.

--
+----------------------------------------------------------------------+
| NAME   Christopher Sawtell                                           |
| SMAIL  215 Ollivier's Road, Linwood, Christchurch, 8001. New Zealand.|
| EMAIL  chris@gerty.equinox.gen.nz                                    |
| PHONE  +64-3-389-3200 ( gmt +13 - your discretion is requested )     |
+----------------------------------------------------------------------+

From uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris Fri Nov 5 12:18:58 1993
Article: 31114 of comp.lang.c
Newsgroups: comp.lang.c
Path: uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris
From: chris@gerty.equinox.gen.nz (Christopher Sawtell)
Subject: Notes for 'C' Programmers Part 1.
Message-ID: <1993Nov1.054135.1137@gerty.equinox.gen.nz>
Summary: How to use 'cc' & its compilation phases.
Keywords: C cc compiler
Organization: Cogicorp
X-Newsreader: TIN [version 1.2 PL0]
Date: Mon, 1 Nov 1993 05:41:35 GMT
Expires: Wed, 1 Dec 1993 00:00:00 GMT
Lines: 232

Archive Name: c-notes/Lesson01.txt

Lesson One.

Some Historical Background.

The 'C' programming language was designed and developed by Dennis
Ritchie at Bell Laboratories, and was made widely known by the book
he wrote with Brian Kernighan. 'C' is a Language specifically created
in order to allow the programmer access to almost all of the machine's
internals - registers, I/O slots and absolute addresses.
However, at the same time, 'C' allows for as much data hiding and
programme text modularisation as is needed to allow very complex
multi-programmer projects to be constructed in an organised and
timely fashion.

During the 1960s computer Operating Systems started to become very
much more complex with the introduction of multi-terminal and
multi-process capabilities. Prior to this time Operating Systems had
been carefully and laboriously crafted using assembler codes, and
many programming teams realised that in order to have a working o/s
in anything like a reasonable time this was no longer economically
feasible.

This then was the motivation to produce the 'C' Language. Its
immediate ancestor, the language B, was implemented on a Digital
Equipment Corporation PDP-7; 'C' itself took shape on the PDP-11 in
the early 1970s. Of course once a simple version of the compiler was
working it was possible to rewrite the compiler in 'C' itself. This
was done in short order, and from then on moving the compiler to a
new machine meant changing little more than its code generator
section. 'C' was then used to re-implement the UNIX o/s. This means
that a complete UNIX can be transported, or to use the simple jargon
of today, 'ported', to a new machine in literally just a few months
by a small team of competent programmers.

Enough of the past. Let's see the various actions, or compilation
phases, through which the 'C' compilation system has to go in order
that your file of 'C' program text can be converted into a working
program.

Assuming that you are able to work an editor and can enter a script
and create a file, please enter the following tiny program.

#ident "@(#) Hello World - my first program"
#include <stdio.h>

char *format = "%s",
     *hello  = "Hello World...\n";

int main()
{
    printf ( format, hello );
    return 0;
}

Now save it in a file called hello.c. Lower case is allowed -
encouraged, no less - under the UNIX operating system.

Now type:

    cc -o hello hello.c

The computer will apparently pause for a few moments and then the
Shell, or Command Line Interpreter, prompt will re-appear. Now type:

    hello

Lo and behold the computer will print

    Hello World...

Let's just look at what the computer did during the little pause.

The first action is to activate a preliminary process called the
pre-processor. In the case of hello.c all it does is to replace the
line

    #include <stdio.h>

with the contents of the file stdio.h from the include files library.
The file stdio.h provides us with a convenient way of telling the
compiler that all the i/o functions exist. There are a few other
little things in stdio.h but they need not concern us at this stage.

In order to see what the pre-processor actually outputs, you might
like to issue the command:

    cc -P hello.c

The 'cc' command will activate the 'C' compilation system and the -P
option will stop the compilation process after the pre-processing
stage, and another file will have appeared in your directory. Have a
look, find hello.i and use the editor in view mode to have a look at
it. So issue the command:

    view hello.i

You will see that a number of lines of text have been added at the
front of the hello.c program. What's all this stuff? Well, have a
look in the file called /usr/include/stdio.h, again using the view
command.

    view /usr/include/stdio.h

Look familiar?

Now the next stage of getting from your program text to an executing
program is the compilation of your text into an assembler code
program. After all that is what a compiler is for - to turn a high
level language script into a program. Let's see what happens by
issuing the command

    cc -S hello.c

Once again there is another file in your directory - this time with
a .s suffix. Let's have a look at it in the same way as the .i file

    view hello.s

You will doubtless notice a few recognizable symbols and what appears
to be a pile of gibberish.
The gibberish is in fact the mnemonics for the machine instructions
which are going to make the computer do what you have programmed it
to do. Now this assembler code has to be turned into machine
instructions. To do this issue the command:

    cc -g -c hello.s

Now, yet again there is another file in your directory - this time
the suffix is ".o". This file is called the object file. It contains
the machine instructions corresponding exactly to the mnemonic codes
in the .s file. If you wish you can look at these machine codes using
one of the commands available to examine object files.

    dis -L -t .data hello.o >hello.dis

The output from such a command won't be very meaningful to you at
this stage; the purpose of asking you to use it is merely to register
in your mind the fact that an object file is created as a result of
the assembly process.

The next stage in the compilation process is called by a variety of
names - "loading", "linking", "link editing". What happens is that
the machine instructions in the object file ( .o ) are joined to many
more instructions selected from an enormous collection of functions
in a library. This phase of the compilation process is invoked by the
command:-

    cc -o hello hello.o

Now, at last, you have a program to execute! So make it do its thing
by putting the name of the executable file as a response to the Shell
or Command Line Interpreter prompt.

    hello

Presto, the output from your program appears on the screen.

    Hello World...

You are now allowed to rejoice and have a nice warm fuzzy to hold!
You have successfully entered a 'C' program, compiled it, linked it,
and finally, executed it!

Having gone through all the various stages of editing,
pre-processing, compiling, assembling, linking, and finally
executing, by hand as it were, you can now rest assured that all the
stages are automated by the 'cc' command, and you can forget how to
invoke them!
Just remember that the computer has to do them in order for you to
have a program to execute.

The single command you use to activate the C Compiler is:

    cc -o hello hello.c

The word after the -o option is the name of the executable file; if
you don't provide a name here the compiler dreams up the name
"a.out". The source file MUST have the .c extension, otherwise the
compiler complains and stops working.

Notes:

The command names used in the above text are those of standard UNIX.
Your particular system may well use a different name for the 'C'
compiler.

    bcc - for Borland 'C'.
    gcc - GNU 'C', which is standard on the Linux operating system.
    lc  - Lattice 'C', available on IBM and clone P.C.s as well as
          the Amiga.

Check in the Documentation which came with your compiler.
The same notions apply to the text editor.

Differences between 'C' and other languages.

In the years since 'C' was developed it has changed remarkably
little. This fact is a bouquet to the authors, who had the vision and
understanding to create a language which has endured so well.

The strengths and weaknesses should be pointed out here.

The big plus is that it is possible to do everything ( well at least
99.9% ) in 'C', while other languages compel you to write a
procedure, subroutine or function in assembler code.

'C' has very good facilities for creating tables of constant data
within the source file.

'C' doesn't do very much to protect you from yourself. This means
that the resulting code executes faster than that of most other high
level languages, but a much greater degree of both care and
understanding is demanded from the programmer.

'C' is not a strongly typed language, although the newer compilers
are offering type checking as part of the language itself, as opposed
to having to use a separate program for mechanised debugging.

'C' is a small language with very few intrinsic operations.
All the heavy work is done by explicit library function calls.

'C' allows you to directly and conveniently access most of the
internals of the machine ( the memory, input output slots, and CPU
registers ) from the language without having to resort to assembler
code.

'C' compilers have an optimisation phase which can be invoked if
desired. The output code can be optimised for either speed or memory
usage. The code will be just as good as that produced by an assembly
code programmer of normal skill - real guru programmers can do only
slightly better.

Copyright notice:-

(c) 1993 Christopher Sawtell.

I assert the right to be known as the author, and owner of the
intellectual property rights of all the files in this material,
except for the quoted examples which have their individual
copyright notices. Permission is granted for onward copying,
but not modification, of this course and its use for personal
study only, provided all the copyright notices are left in the
text and are printed in full on any subsequent paper reproduction.

--
+----------------------------------------------------------------------+
| NAME   Christopher Sawtell                                           |
| SMAIL  215 Ollivier's Road, Linwood, Christchurch, 8001. New Zealand.|
| EMAIL  chris@gerty.equinox.gen.nz                                    |
| PHONE  +64-3-389-3200 ( gmt +13 - your discretion is requested )     |
+----------------------------------------------------------------------+

From uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!usc!elroy.jpl.nasa.gov!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris Fri Nov 5 12:19:09 1993
Article: 31115 of comp.lang.c
Newsgroups: comp.lang.c
Path: uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!usc!elroy.jpl.nasa.gov!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris
From: chris@gerty.equinox.gen.nz (Christopher Sawtell)
Subject: Notes for 'C' Programmers Part 2.
Message-ID: <1993Nov1.055434.1240@gerty.equinox.gen.nz>
Summary: 'C' Data Storage.
Keywords: c data
Organization: Cogicorp
X-Newsreader: TIN [version 1.2 PL0]
Date: Mon, 1 Nov 1993 05:54:34 GMT
Expires: Wed, 1 Dec 1993 00:00:00 GMT
Lines: 580

Archive-name: c-notes/Lesson02.txt

Lesson 2

Data Storage Concepts.

It has been stated that "data + algorithms = programs".
This Lesson deals with the first part of the addition sum.

All information in a computer is stored as numbers represented using
the binary number system. The information may be either program
instructions or data elements. The latter are further subdivided into
several different types, and stored in the computer's memory in
different places as directed by the storage class used when the datum
element is defined.

These types are:

a) The Character.

This is a group of 8 data bits and in 'C' represents either a letter
of the Roman alphabet, or a small integer in the range of 0 through
to +255. So to arrange for the compiler to give you a named memory
area in which to place a single letter you would "say":

    char letter;

at the beginning of a program block. You should be aware that whether
a char is signed or unsigned is dependent on the design of the
processor underlying your compiler. In particular, note that both the
PDP-11 and the VAX-11 made by Digital Equipment Corporation have
automatic sign extension of char. This means that the range of char
is from -128 through to +127 on these machines. Consult your hardware
manual; there may be other exceptions to the trend towards unsigned
char as the default.

This test program should clear things up for you.

/* ----------------------------------------- */

#ident "@(#) - Test char signed / unsigned."
#include <stdio.h>

int main()
{
    char a;
    unsigned char b;

    a = b = 128;    /* 128 sets only the top bit of an 8 bit char   */
    a >>= 1;        /* a signed char drags its sign bit along ...   */
    b >>= 1;        /* ... an unsigned char shifts in a zero        */
    printf ( "\nYour computer has %ssigned char.\n\n",
             a == b ? "un" : "" );
    return 0;
}

/* ----------------------------------------- */

Here ( Surprise! Surprise! ) is its output on a machine which has
unsigned chars.

Your computer has unsigned char.

Cut this program out of the news file. Compile and execute it on your
computer in order to find out if you have signed or unsigned char.

b) The Integers.

As you might imagine this is the storage type in which to store whole
numbers. There are two sizes of integer which are known as short and
long. The actual number of bits used in both of these types is
Implementation Dependent. This is the way the jargonauts say that it
varies from computer to computer. Almost all machines with a word
size larger than sixteen bits have the long int fitting exactly into
a machine word and a short int represented by the contents of half a
word. It's done this way because most machines have instructions
which will perform arithmetic efficiently on both the complete
machine word as well as the half-word. For the sixteen bit machines,
the long integer is two machine words long, and the short integer is
one.

    short int smaller_number;
    long int big_number;

Either of the words short or long may be omitted as a default is
provided by the compiler. Check your compiler's documentation to see
which default you have been given. Also you should be aware that some
compilers allow you to arrange for the integers declared with just
the word "int" to be either short or long. The range for a short int
on a small computer is -32768 through to +32767, and for a long int
-2147483648 through to +2147483647.

c) The Real Numbers.

Sometimes known as floating point numbers, this number representation
allows us to store values such as 3.141593, or -56743.098. So, using
possible examples from a ship design program, you declare floats and
doubles like this:

    float length_of_water_line;   /* in meters */
    double displacement;          /* in grammes */

In the same way that the integer type offers two sizes so does the
floating point representation. They are called float and double.
Taking the values from the file /usr/include/values.h the ranges
which can be represented by float and double are:

    MAXFLOAT     3.40282346638528860e+38
    MINFLOAT     1.40129846432481707e-45
    MAXDOUBLE    1.79769313486231470e+308
    MINDOUBLE    4.94065645841246544e-324

However you should note that for practical purposes the maximum
number of significant digits that can be represented by a float is
approximately six and that by a double is approximately fifteen. Also
you should be aware that the above numbers are as defined by the IEEE
floating point standard and that some older machines and compilers do
not conform. All small machines bought retail will conform. If you
are in doubt I suggest that you refer to your machine's documentation
for the whole and exact story!

d) Signed and unsigned prefixes.

For both the character and integer types the declaration can be
preceded by the word "unsigned". This shifts the range so that 0 is
the minimum, and the maximum is roughly twice that of the signed data
type in question. It's useful if you know that it is impossible for
the number to go negative. Also if the word in memory is going to be
used as a bit pattern or a mask and not a number the use of unsigned
is strongly urged. If it is possible for the sign bit in the bit
pattern to be set and the program calls for the bit pattern to be
shifted to the right, then you should be aware that on most machines
the sign bit will be extended if the variable is not declared
unsigned. The default for the "int" types is always "signed", and, as
discussed above, that of the "char" is machine dependent.

This completes the discussion on the allocation of data types, except
to say that we can, of course, allocate arrays of the simple types
simply by adding, after the variable's name, a pair of square
brackets enclosing a number which is the size of the array:

    char client_surname[31];

This declaration reserves storage for a string of 30 characters plus
the NUL character of value zero which terminates the string.

Structures.

Data elements which are logically connected, for example - to use the
example alluded to above - the dimensions and other details about a
sea going ship, can be collected together as a single data unit
called a struct. One possible way of laying out the struct in the
source code is:

struct ship    /* The word "ship" is known as the structure's "tag". */
{
    char name[31];
    double displacement;                        /* in grammes */
    float length_of_water_line;                 /* in meters */
    unsigned short int number_of_passengers;
    unsigned short int number_of_crew;
};

Note very well that the above fragment of program text does NOT
allocate any storage, it merely provides a named template to the
compiler so that it knows how much storage is needed for the
structure. The actual allocation of memory is done either like this:

    struct ship cunarder;

Or by putting the name of the struct variable between the "}" and the
";" on the last line of the definition. Personally I don't use this
method as I find that the letters of the name tend to get "lost" in
the - shall we say - amorphous mass of characters which make up the
definition itself.

The individual members of the struct can have values assigned to them
in this fashion:

    cunarder.displacement = 97500000000.0;
    cunarder.length_of_water_line = 750.0;
    cunarder.number_of_passengers = 3575;
    cunarder.number_of_crew = 4592;

Here are a couple of files called demo1.c & demo1a.c which contain
small 'C' programs for you to compile. So, please cut them out of the
news posting file and do so.
----------------------------------------------------------------------
#ident "demo1.c" /* If your compiler complains about this line, chop it out */
#include <stdio.h>

struct ship
{
    char name[31];
    double displacement;                        /* in grammes */
    float length_of_water_line;                 /* in meters */
    unsigned short int number_of_passengers;
    unsigned short int number_of_crew;
};

char *format = "\
Name of Vessel: %-30s\n\
  Displacement: %13.3f\n\
    Water Line: %5.1f\n\
    Passengers: %4d\n\
          Crew: %4d\n\n";

int main()
{
    struct ship cunarder;

    cunarder.name = "Queen Mary";          /* This is the bad line. */
    cunarder.displacement = 97500000000.0;
    cunarder.length_of_water_line = 750.0;
    cunarder.number_of_passengers = 3575;
    cunarder.number_of_crew = 4592;

    printf ( format,
             cunarder.name,
             cunarder.displacement,
             cunarder.length_of_water_line,
             cunarder.number_of_passengers,
             cunarder.number_of_crew
           );
    return 0;
}
----------------------------------------------------------------------

Why is the compiler complaining at the line marked "This is the bad
line"? Well, 'C' is a small language and an array, such as the member
"name", cannot be the target of an assignment, so a string cannot be
assigned to it within the program text at run-time. This program
shows the correct way to copy the string "Queen Mary", using a
library routine, into the structure.

----------------------------------------------------------------------
#ident "demo1a.c" /* If your compiler complains about this line, chop it out */
#include <stdio.h>
#include <string.h>    /* declares strcpy() */

/*
** This is the template which is used by the compiler so that
** it 'knows' how to put your data into a named area of memory.
*/

struct ship
{
    char name[31];
    double displacement;                        /* in grammes */
    float length_of_water_line;                 /* in meters */
    unsigned short int number_of_passengers;
    unsigned short int number_of_crew;
};

/*
** This character string tells the printf() function how it is to output
** the data onto the screen. Note the use of the \ character at the end
** of each line. It is the 'continue the string on the next line' flag
** or escape character. It MUST be the last character on the line.
** This technique allows you to produce nicely formatted reports with all the
** ':' characters under each other, without having to count the characters
** in each character field.
*/

char *format = "\n\
Name of Vessel: %-30s\n\
  Displacement: %13.1f grammes\n\
    Water Line: %5.1f metres\n\
    Passengers: %4d\n\
          Crew: %4d\n\n";

int main()
{
    struct ship cunarder;

    strcpy ( cunarder.name, "Queen Mary" );     /* The corrected line */
    cunarder.displacement = 97500000000.0;
    cunarder.length_of_water_line = 750.0;
    cunarder.number_of_passengers = 3575;
    cunarder.number_of_crew = 4592;

    printf ( format,
             cunarder.name,
             cunarder.displacement,
             cunarder.length_of_water_line,
             cunarder.number_of_passengers,
             cunarder.number_of_crew
           );
    return 0;
}
----------------------------------------------------------------------

I'd like to suggest that you compile the program demo1a.c and execute
it.

$ cc demo1a.c
$ a.out

Name of Vessel: Queen Mary
  Displacement: 97500000000.0 grammes
    Water Line: 750.0 metres
    Passengers: 3575
          Crew: 4592

Which is the output of our totally trivial program to demonstrate the
use of structures.

Tip:

To avoid muddles in your mind and gross confusion in other minds
remember that you should ALWAYS declare a variable using a name which
is long enough to make it ABSOLUTELY obvious what you are talking
about.

Storage Classes.

The little dissertation above about the storage of variables was
concerned with the sizes of the various types of data. There is just
the little matter of the position in memory of the variables'
storage.

'C' has been designed to make the best use of memory by allowing you
to re-cycle it automatically when you have finished with it. A
variable defined in this way is known as an 'automatic' one. Although
this is the default behaviour you are allowed to put the word 'auto'
in front of the word which states the variable's type in the
definition. It is quite a good idea to use this so that you can
remind yourself that this variable is, in fact, an automatic one.

There are two other storage allocation methods, 'static' and
'register'. ( The word 'const', which you will also meet, is a type
qualifier rather than a storage class. ) The 'static' method places
the variable in main storage for the whole of the time your program
is executing. In other words it kills the 're-cycling' mechanism.
This also means that the value stored there is available all the
time.

The 'register' method is very machine and implementation dependent,
and also perhaps somewhat archaic in that the optimiser phase of the
compilation process does it all for you. For the sake of completeness
I'll explain. Computers have a small number of places to store
numbers which can be accessed very quickly. These places are called
the registers of the Central Processing Unit. The 'register'
variables are placed in these machine registers instead of stack or
main memory. For program segments which are tiny loops the speed at
which your program executes can be enhanced quite remarkably. The
optimiser compilation phase places as many of your variables into
registers as it can. However no machine can decide which of the
variables should be placed in a register, and which may be left in
memory, so if your program has many variables and two or three should
be register ones then you should specify which ones to the compiler.

All this is dealt with in much greater detail later in the course.

Pointers.

'C' has the very useful ability to set up pointers. These are memory
cells which contain the address of a data element. In the definition
the variable name is preceded by a '*' character. So, to reserve an
element of type char and a pointer to an element of type char, one
would say:

    char c;
    char *ch_p;

I always put the suffix '_p' on the end of all pointer variables
simply so that I can easily remember that they are in fact pointers.

There is also the companion unary operator '&' which yields the
address of the variable. So to initialize our pointer ch_p to point
at the char c, we have to say:

    ch_p = &c;

Note very well that the process of indirection can proceed to any
desired depth. However it is difficult for the puny brain of a normal
human to conceptualize and remember more than three levels! So be
careful to provide a very detailed and precise commentary in your
program if you put more than two or three stars.

Getting data in and out of your programs.

As mentioned before 'C' is a small language and there are no
intrinsic operators to either convert between binary numbers and
ASCII characters or to transfer information to and fro between the
computer's memory and the peripheral equipment, such as terminals or
disk stores. This is all done using the i/o functions declared in the
file stdio.h which you should have examined earlier. Right now we are
going to look at the functions "printf" and "scanf". These two
functions, together with their derivatives, perform i/o to the stdin
and stdout files, i/o to nominated files, and internal format
conversions. This last means the conversion of data from ASCII
character strings to binary numbers and vice versa completely within
the computer's memory. It's more efficient to set up a line of print
inside memory and then to send the whole line to the printer,
terminal, or whatever, instead of "squirting" the letters out in
dribs and drabs!

Study of them will give you understanding of a very convenient way to
talk to the "outside world".

So, remembering that one of the most important things you learn in
computing is "where to look it up", let's do just that.

If you are using a computer which has the unix operating system, find
your copy of the "Programmer Reference Manual" and turn to the page
printf(3S); alternatively, if your computer is using some other
operating system, then refer to the section of the documentation
which describes the functions in the program library.

You will see something like this:-

NAME
     printf, fprintf, sprintf - print formatted output.

SYNOPSIS
     #include <stdio.h>

     int printf ( format [ , arg ] ... )
     char *format;

     int fprintf ( stream, format [ , arg ] ... )
     FILE *stream;
     char *format;

     int sprintf ( s, format [ , arg ] ... )
     char *s, *format;

DESCRIPTION
     etc... etc...

The NAME section above is obvious isn't it?

The SYNOPSIS starts with the line #include <stdio.h>. This tells you
that you MUST put this #include line in your 'C' source code before
you mention any of the routines. The rest of the paragraph tells you
how to call the routines. The " [ , arg ] ... " hieroglyph in effect
says that you may have as many arguments here as you wish, but that
you need not have any at all.

The DESCRIPTION explains how to use the functions.

Important Point to Note:

Far too many people ( including the author ) ignore the fact that the
printf family of functions return a useful number which can be used
to check that the conversion has been done correctly, and that the
i/o operation has been completed without error.

Refer to the format string in the demonstration program above for an
example of a fairly sophisticated formatting string.

In order to fix the concepts of printf in your mind, you might care
to write a program which prints some text in three ways:

a) Justified to the left of the page. ( Normal printing. )
b) Justified to the right of the page.
c) Centred exactly in the middle of the page.

Suggestions and Hint.

Set up a data area of text using the first verse of "Quangle" as
data.
Here is the program fragment for the data:-

/* ----------------------------------------- */

char *verse[] =
{
  "On top of the Crumpetty Tree",
  "The Quangle Wangle sat,",
  "But his face you could not see,",
  "On account of his Beaver Hat.",
  "For his Hat was a hundred and two feet wide.",
  "With ribbons and bibbons on every side,",
  "And bells, and buttons, and loops, and lace,",
  "So that nobody ever could see the face",
  "Of the Quangle Wangle Quee.",
  NULL
  };

/* ----------------------------------------- */

Cut it out of the news file and use it in a 'C' program file called verse.c

Now write a main() function which uses printf alone for (a) & (b). You can
use both printf() and sprintf() in order to create a solution for (c) which
makes good use of the capabilities of the printf family. The big hint is
that the string controlling the format of the printing can change
dynamically as program execution proceeds.

A possible solution ( one of many, I'm sure ) is presented in the file
verse.c which is appended here. I'd like to suggest that you have a good
try at making a program of your own before looking at my solution.

/* ----------------------------------------- */

#include <stdio.h>
#include <string.h>     /* For strlen. */

char *verse[] =
{
  "On top of the Crumpetty Tree",
  "The Quangle Wangle sat,",
  "But his face you could not see,",
  "On account of his Beaver Hat.",
  "For his Hat was a hundred and two feet wide.",
  "With ribbons and bibbons on every side,",
  "And bells, and buttons, and loops, and lace,",
  "So that nobody ever could see the face",
  "Of the Quangle Wangle Quee.",
  NULL
  };

main()
{
  char **ch_pp;

  /*
  ** This will print the data left justified.
  */

  for ( ch_pp = verse; *ch_pp; ch_pp++ ) printf ( "%s\n", *ch_pp );
  printf( "\n" );

  /*
  ** This will print the data right justified.
  **
  ** ( As this will print a character in column 80 of
  ** the terminal you should make sure any terminal setting
  ** which automatically inserts a new line is turned off.
  ** )
  */

  for ( ch_pp = verse; *ch_pp; ch_pp++ ) printf ( "%79s\n", *ch_pp );
  printf( "\n" );

  /*
  ** This will centre the data.
  */

  for ( ch_pp = verse; *ch_pp; ch_pp++ )
  {
    int length;
    char format[10];

    length = 40 + strlen ( *ch_pp ) / 2;    /* Calculate the field length. */
    sprintf ( format, "%%%ds\n", length );  /* Make a format string.       */
    printf ( format, *ch_pp );              /* Print line of verse, using  */
    }                                       /* generated format string.    */
  printf( "\n" );
  }

/* ----------------------------------------- */

If you cheated and looked at my example before even attempting to have a
go, you must pay the penalty and explain fully why there are THREE "%"
signs in the line which starts with a call to the sprintf function. It's a
good idea to do this anyway!

So much for printf(). Let's examine its functional opposite, scanf().

Scanf is the family of functions used to input from the outside world and
to perform internal format conversions from character strings to binary
numbers. Refer to the entry scanf(3S) in the Programmer Reference Manual.
( Just a few pages further on from printf. )

The "Important Point to Note" for the scanf family is that the arguments to
the function are all POINTERS. The format string has to be passed in to the
function using a pointer, simply because this is the way 'C' passes
strings, and as the function itself has to store its results into your
program it ( the scanf function ) has to "know" where you want it to put
them.

Copyright notice:-

(c) 1993 Christopher Sawtell.

I assert the right to be known as the author, and owner of the
intellectual property rights of all the files in this material,
except for the quoted examples which have their individual
copyright notices. Permission is granted for onward copying,
but not modification, of this course and its use for personal
study only, provided all the copyright notices are left in the
text and are printed in full on any subsequent paper reproduction.
--
+----------------------------------------------------------------------+
| NAME   Christopher Sawtell                                           |
| SMAIL  215 Ollivier's Road, Linwood, Christchurch, 8001. New Zealand.|
| EMAIL  chris@gerty.equinox.gen.nz                                    |
| PHONE  +64-3-389-3200 ( gmt +13 - your discretion is requested )     |
+----------------------------------------------------------------------+

From uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris Fri Nov 5 12:19:15 1993
Article: 31116 of comp.lang.c
Newsgroups: comp.lang.c
Path: uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris
From: chris@gerty.equinox.gen.nz (Christopher Sawtell)
Subject: Notes for 'C' Programmers Part 3.
Message-ID: <1993Nov1.060751.1370@gerty.equinox.gen.nz>
Summary: Arrays and Pointers
Keywords: c language
Organization: Cogicorp
X-Newsreader: TIN [version 1.2 PL0]
Date: Mon, 1 Nov 1993 06:07:51 GMT
Expires: Wed, 1 Dec 1993 00:00:00 GMT
Lines: 312

Archive-name: c-notes/Lesson03.txt

Lesson 3

Arrays and Pointers.

You can allocate space at compile time for an array, with fixed dimension
sizes, of elements of any data type, even structs and pointers to
functions. So these are legal array definitions:

char name[30];                     /* An array of 30 characters.           */
char *strings[50];                 /* 50 pointers to strings.              */
unsigned long int *(*func[20])();  /* An array of 20 pointers to functions */
                                   /* which return pointers to unsigned    */
                                   /* long ints.                           */

You can declare a pointer to point at any type of data element, and as in
the array situation above functions and structs are included.
struct ship
{
  char name[30];
  double displacement;                      /* in grammes */
  float length_of_water_line;               /* in metres */
  unsigned short int number_of_passengers;
  unsigned short int number_of_crew;
  };

So using the ship concept from Lesson 2 you can declare a pointer to point
at one of the ship structs in an array.

struct ship *vessel_p;

Note the use of the suffix "_p". This is my way of reminding myself that
the variable is a pointer.

struct ship fleet[5];  /* This allocates enough storage for 5 ships' info. */

Now let's set the pointer to point at the first vessel in the fleet.

vessel_p = fleet;

This pointer can be made to point at other ships in the fleet by
incrementing it or by adding an integer to it:

vessel_p++;             /* Point at the next ship in the fleet array. */
vessel_p = fleet + 3;   /* Point at the fourth ship, fleet[3].        */

Also we can find out the index of the ship in the fleet at which we are
pointing:

i = vessel_p - fleet;

It is also legal to find out the separation of two pointers pointing at
elements in the same array:

d = vessel_p - another_vessel_p;  /* This gives the separation in elements. */

So, summarising, pointers may be incremented, decremented, and subtracted
one from another, or have an integer added to or subtracted from them. Any
other arithmetic operation is meaningless and not allowed.

Assembler programmers should note that while the pointer variables contain
a byte machine address, when the arithmetic is done using pointers the
compiler also issues either a multiply or a divide as well as the add or
subtract instruction so that the result is ALWAYS expressed in elements
rather than bytes. Have a go and write yourself a trivial little program,
and have a look at the compiler output code. Lesson 1 told you how!
When using a pointer to reference a structure we have to use a "pointer
offset" operator in order to access the member of the struct we require:

vessel_p = fleet;

strcpy ( vessel_p->name, "Queen Mary" );  /* name is an array, so the     */
                                          /* string has to be copied in.  */
vessel_p->displacement = 97500000000.0;
vessel_p->length_of_water_line = 750.0;
vessel_p->number_of_passengers = 3575;
vessel_p->number_of_crew = 4592;

Remember:

It's a "." when accessing a struct which is in storage declared in the
program.

It's a "->" when accessing a struct at which a pointer is pointing.

Initialisation of arrays.

'C' has the facility to initialise variables in a program script.

Some examples:

char *qbf = "The quick brown fox jumped over the lazy dog's back";

int tic_tac_toe[3][3] =
{
  { 1, 2, 3 },
  { 4, 5, 6 },
  { 7, 8, 9 }
  };

struct ship fleet[2] =
{
  { "Queen Elizabeth", 97500000000.0, 750.0, 3575, 4592 },
  { "Queen Mary", 115000000000.0, 875.0, 4500, 5500 }
  };

Take a careful note of where the commas and semi-colons go ( and don't go )!

Initialised Tables of Indeterminate Length.

One nice feature 'C' offers is that it is able to calculate the amount of
storage required for a table by 'looking' at the number of initialisers.

char *verse[] =
{
  "On top of the Crumpetty Tree",
  "The Quangle Wangle sat,",
  "But his face you could not see,",
  "On account of his Beaver Hat.",
  "For his Hat was a hundred and two feet wide.",
  "With ribbons and bibbons on every side,",
  "And bells, and buttons, and loops, and lace,",
  "So that nobody ever could see the face",
  "Of the Quangle Wangle Quee.",
  NULL
  };

Note the * character in the definition line. This means that we are going
to make an array of pointers to variables of type char. As there is no
number between the [ ] characters the compiler calculates it for us. With
this kind of set-up it is nice and easy to add extra information to the
table as program development proceeds. The compiler will calculate the new
dimension for you.
The point to remember is that the program has to know - from the contents
of the table - that it has come to the end of the table! So you have to
make a special entry which CANNOT under any circumstances be a real data
element. We usually use NULL for this. The other way is to calculate the
size of the table by using the sizeof operator. Note that although use of
sizeof looks like a function call it is in fact an intrinsic operator of
the language. The result is available at compile time. As sizeof gives the
size in bytes, divide by the size of one element to get the number of
entries ( which here includes the NULL ). So one can say:-

#define SIZE_OF_VERSE ( sizeof verse / sizeof verse[0] )

There is one final initialised data type, the enum. It is a fairly recent
addition to the language.

enum spectrum { red, orange, yellow, green, blue, indigo, violet } colour;

In this construct the first symbol is given the value of 0 and for each
following symbol the value is incremented. It is however possible to assign
specific values to the symbols like this:

enum tub
{
  anorexic = 65,
  slim = 70,
  normal = 80,
  fat = 95,
  obese = 135
  };

Some compilers are bright enough to detect that it is an error if an
attempt is made to assign a value to an enum variable which is not in the
list of symbols; on the other hand many are not. Take care! In practice
there is little difference between the enum language construct and a
number of #define statements except perhaps aesthetics. Here is another
trivial program which demonstrates the use of enum and a pre-initialised
array.

#include <stdio.h>

enum spectrum { red, orange, yellow, green, blue, indigo, violet } colour;

char *rainbow[] = { "red", "orange", "yellow", "green",
                    "blue", "indigo", "violet" };

main()
{
  for ( colour = red; colour <= violet; colour++ )
  {
    printf ( "%s ", rainbow[colour]);
    }
  printf ( "\n" );
  }

The output of which is ( not surprisingly ):

red orange yellow green blue indigo violet

One quite advanced use of initialised arrays and pointers is the jump or
dispatch table.
This is an efficient use of pointers and provides a very much better ( in
my opinion ) method of controlling program flow than a maze of case or
( heaven forbid ) if ( ... ) goto statements.

Please cut out this program, read and compile it.

------------------------------------------------------------------------

char *ident = "@(#) tellme.c - An example of using a pointer to a function.";

#include <stdio.h>
#include <math.h>
#include <errno.h>
#include <stdlib.h>     /* atof and exit, on ANSI systems. */
#include <string.h>     /* strcmp, on ANSI systems. */

/*
   These declarations are not in fact needed as they are all declared
   extern in math.h ( on ANSI systems atof is declared in stdlib.h ).
   However if you were to use routines which are not in a library and
   therefore not declared in a '.h' file you should declare them.
   Remember you MUST declare external routines which return a type other
   than the int type.

extern double sin ();
extern double cos ();
extern double tan ();
extern double atof ();

*/

struct table_entry
{
  char *name;             /* The address of the character string. */
  double (*function)();   /* The address of the entry point of the function. */
  };

typedef struct table_entry TABLE;

double help ( tp )
TABLE *tp;
{
  printf ( "Choose one of these functions:- " );
  fflush ( stdout );
  for ( ; tp -> name; tp++ ) printf ( "%s ", tp -> name );
  printf ( "\nRemember the input is expressed in Radians\n" );
  exit ( 0 );
  return ( 0.0 );   /* Needed to keep some nit-picking dumb compilers happy! */
  }

/*
 * This is the array of pointers to the strings and function entry points.
 * It is initialised at linking time. You may add as many functions as you
 * like in here PROVIDED you declare them to be extern, either in some .h
 * file or explicitly.
 */

TABLE interpretation_table [ ] =
{
  { "sin",  sin  },
  { "tan",  tan  },
  { "cos",  cos  },
  { "help", help },
  { NULL,   NULL }        /* To flag the end of the table. */
  };

char *output_format = { "\n %s %s = %g\n" };

extern int errno;
extern void perror();

main( argc, argv )
int argc;
char **argv;
{
  TABLE *tp;
  double x, answer;

  if ( argc > 3 )
  {
    errno = E2BIG;
    perror ( "tellme" );
    exit ( -1 );
    }

  if ( argc < 2 ) help ( interpretation_table );  /* No function was named. */

  for (;;)                /* This is the way to set up a continuous loop.
                          */
  {
    for ( tp = interpretation_table;
          ( tp -> name && strcmp ( tp -> name, argv[1] ));
          tp++
          ) ;             /* Note use of empty for loop to position tp. */

    if ( tp -> function == help ) (*tp -> function )( interpretation_table );
    if ( tp -> name == NULL )
    {
      printf ( "Function %s not implemented yet\n", argv[1] );
      exit ( 1 );
      }
    break;                /* Leave the loop. */
    }

  if ( argc < 3 ) help ( interpretation_table );  /* No number was given. */

  x = atof ( argv[2] );   /* Convert the character string to a double. */
  answer = ( *tp -> function )( x );    /* Execute the desired function. */
  printf ( output_format, /* Pointer to printf()'s format string. */
           argv[1],       /* Pointer to the name of the function. */
           argv[2],       /* Pointer to the input number ascii string. */
           answer         /* Value ( in double floating point binary ) */
           );
  }

Copyright notice:-

(c) 1993 Christopher Sawtell.

I assert the right to be known as the author, and owner of the
intellectual property rights of all the files in this material,
except for the quoted examples which have their individual
copyright notices. Permission is granted for onward copying,
but not modification, of this course and its use for personal
study only, provided all the copyright notices are left in the
text and are printed in full on any subsequent paper reproduction.

--
+----------------------------------------------------------------------+
| NAME   Christopher Sawtell                                           |
| SMAIL  215 Ollivier's Road, Linwood, Christchurch, 8001. New Zealand.|
| EMAIL  chris@gerty.equinox.gen.nz                                    |
| PHONE  +64-3-389-3200 ( gmt +13 - your discretion is requested )     |
+----------------------------------------------------------------------+

From uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris Fri Nov 5 12:19:25 1993
Article: 31117 of comp.lang.c
Newsgroups: comp.lang.c
Path: uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris
From: chris@gerty.equinox.gen.nz (Christopher Sawtell)
Subject: Notes for 'C' Programmers Part 4.
Message-ID: <1993Nov1.061057.1444@gerty.equinox.gen.nz>
Summary: Operators of the c language.
Keywords: c operators language
Organization: Cogicorp
X-Newsreader: TIN [version 1.2 PL0]
Date: Mon, 1 Nov 1993 06:10:57 GMT
Expires: Wed, 1 Dec 1993 00:00:00 GMT
Lines: 284

Archive-name: c-notes/Lesson04.txt

Lesson 4.

The operators of the language.

I have mentioned that 'C' is a small language with most of the heavy work
being done by explicit calls to library functions. There is however a rich
mix of intrinsic operators which allow you to perform bit level operations,
use pointers, and perform immediate operations on variables. In other
words, most of a machine's instruction set is able to be used in the object
program. At the time when 'C' was designed and first written these were
unique for a high level language.

Let's start with a discussion about precedence.

This really means that the compiler puts invisible parentheses into your
expression. Casting your mind back to Arithmetic in the primary school I
expect you remember the mnemonic "My Dear Aunt Sally". The 'C' language
does as well!
So the following expression is correct

    15 + 4 * 11 = 59

The compiler has rendered the expression as:

    15 + ( 4 * 11 ) = 59

Now the 'C' language has a much larger collection of operators than just
Multiply Divide Add Subtract, in fact much too big to try to remember the
precedence of all of them. So my recommendation is to ALWAYS put in the
parentheses, except for simple arithmetic. However, for the sake of
completeness as much as anything else, here is the list.

First up come what are called the primary-expression operators:

    ()        Function.
    []        Array.
    .         struct member ( variable ).
    ->        struct member ( pointer ).

The unary operators:

    *         Indirection via a Pointer.
    &         Address of Variable.
    -         Arithmetic Negative.
    !         Logical Negation or Not.
    ~         Bit-wise One's Complement.
    ++        Increment.
    --        Decrement.
    sizeof    Which is self-explanatory.

Now the binary operators:

  Arithmetic Operators.

    *         Multiply.                                     My
    /         Divide.                                       Dear
    %         Modulo, or Remainder of Integer Division.
    +         Addition.                                     Aunt
    -         Subtraction.                                  Sally

  The Shifting Operators.

    >>        Bit-wise Shift to the Right.
    <<        Bit-wise Shift to the Left.

  Logical Relation Operators.

    <         Less Than.
    >         Greater Than.
    <=        Less Than or Equal.
    >=        Greater Than or Equal.
    ==        Equal.
    !=        Not Equal.

  Bit-wise Boolean Operators.

    &         Bit-wise And.
    ^         Bit-wise Exclusive-or.
    |         Bit-wise Or.

  The Logical Operators.

    &&        Logical And.
    ||        Logical Or.

  The Assignment Operators. ( They all have the same priority. )

    =         The normal assignment operator.

  The Self-referencing Assignment Operators.

    +=  -=  *=  /=  %=  >>=  <<=  &=  ^=  |=

Some explanation is in order here. The machine instructions in your
computer include a suite of what are called "immediate operand"
instructions. These instructions have one of the operands in a register and
the other is either part of the instruction word itself ( if it is
numerically small enough to fit ) or is the next word in the address space
"immediately" after the instruction code word.
'C' makes efficient use of this machine feature by providing the above set
of operations, each of which translates directly to its corresponding
machine instruction. When the variable in question is a 'register' one, or
the optimiser is in use, the compiler output is just the one "immediate"
machine instruction. Efficiency Personified!!!

These two lines will make things clearer.

  a = 8;
  a += 2;         /* The result is 10 */

The exclusive-or operation is very useful: you can toggle any combination
of bits in the variable using it.

  a = 7;
  a ^= 2;         /* Now a is 5 */
  a ^= 2;         /* and back to 7. */

Naturally, you can use the other operations in exactly the same way. I'd
like to suggest that you make an utterly simplistic little program and have
a look at the assembler code output of the compiler. Don't be afraid of the
assembler codes - they don't bite - and you will see what I was on about in
the paragraph above.

Historical Note and a couple of Cautions.

In the Olden Days when 'C' was first written all the self-referencing
operations had the equals symbol and the operator around the other way,
so that one wrote =- where we now write -=. Until quite recently ( unix
System V release 3.0 ) the 'C' compiler had a compatibility mode and could
cope with the old style syntax.

A sample or test program is probably in order here.

/* ----------------------------------------- */

#include <stdio.h>

char *mes[] =
{
  "Your compiler",
  " understands",
  " does not understand",
  " the old-fashioned self-referencing style."
  };

main()
{
  int a;

  a = 5;
  a=-2;   /* Old style: a = a - 2, giving 3. New style: a = -2. */
  printf ( "%s %s %s\n",
           mes [ 0 ],
           mes [ ( a == -2 ) ? 2 : 1 ],
           mes [ 3 ]
           );
  }

/* ----------------------------------------- */

The 'C' compiler issued with unix System V release 3.2 seems to have
( thankfully ) dropped the compatibility mode. However a colleague, who was
using an old compiler, and I spent hours trying to find this strange bug!
The cure for the problem is either to put spaces on either side of the '='
sign or to bracket the unary minus to the operand.
  a=(-2);
  a = -2;

Either is acceptable, and might save you a lot of spleen if somebody tries
to install your work of art program on an ancient machine.

The other caution is the use of the shifting instructions with signed and
unsigned integers.

If you shift a signed integer to the right when the sign bit is set then in
all probability the sign will be extended. Once again a little demo
program. Please cut it out of the news file with your editor and play with
it.

/* ----------------------------------------- */

#ident "@(#) shifts.c - Signed / Unsigned integer shifting demo."
#include <stdio.h>

#define WORD_SIZE ( sizeof ( INTEGER int ) * 8 )
#define NIBBLE_SIZE 4
#define NIBBLES_IN_WORD (( WORD_SIZE ) / NIBBLE_SIZE )
#define SIGN_BIT ( 1 << ( WORD_SIZE - 1 ))

char *title[] =
{
  "Signed               Unsigned",
  "Signed                                   Unsigned"
  };

main ()
{
  INTEGER int a;
  unsigned INTEGER int b, mask;
  int ab, i, j, bit_counter, line_counter;

  a = b = SIGN_BIT;
  printf ( "%s\n\n", title [ ( WORD_SIZE == 16 ) ? 0 : 1 ] );
  for ( line_counter = 0; line_counter < WORD_SIZE; line_counter++ )
  {
    for ( ab = 0; ab < 2; ab++ )            /* 0: signed a, 1: unsigned b. */
    {
      mask = SIGN_BIT;
      for ( i = 0; i < NIBBLES_IN_WORD; i++ )
      {
        for ( j = 0; j < NIBBLE_SIZE; j++ )
        {
          printf ( "%c", ((( ab ) ? b : a ) & mask ) ? '1' : '0' );
          mask >>= 1;
          }
        printf ( " " );
        }
      printf ( "%s", ( ab ) ? "\n" : " " );
      if ( ab )
      {
        b >>= 1;
        }
      else
      {
        a >>= 1;
#if defined(FIX_COMPILER_BUG)
# if (INTEGER == long)
        a |= SIGN_BIT;  /* This is a work-around for the 3b2 compiler bug. */
# endif
#endif
        }
      }
    }
  }

/* ----------------------------------------- */

This little program might well produce some interesting surprises on your
machine in the same way it did on mine. I have an AT&T 3b2/400 and use the
K & R style compiler. Interestingly, the above program did what I expected
it to do when the integers were short, in that the sign bit is extended,
but when the integers are long the sign bit is NOT extended.
In this case the different behaviour is caused by the compiler always
issuing a Logical Shift instruction, when it should issue an Arithmetic
Shift instruction for signed integers and a Logical Shift instruction for
unsigned ones. In the case of the short int the variable is loaded from
memory into the register with a sign extend load instruction, which makes
the Logical Shift Right instruction work correctly for short ints, but not
for longs. I had to examine the assembler codes output by the compiler in
order to discover this.

Here are the compiler invocation lines.

cc -olong.shifts -DFIX_COMPILER_BUG -DINTEGER=long shifts.c

and

cc -oshort.shifts -DINTEGER=short shifts.c

Experiment with the "-DFIX_COMPILER_BUG" and see what your compiler does.

Copyright notice:-

(c) 1993 Christopher Sawtell.

I assert the right to be known as the author, and owner of the
intellectual property rights of all the files in this material,
except for the quoted examples which have their individual
copyright notices. Permission is granted for onward copying,
but not modification, of this course and its use for personal
study only, provided all the copyright notices are left in the
text and are printed in full on any subsequent paper reproduction.

--
+----------------------------------------------------------------------+
| NAME   Christopher Sawtell                                           |
| SMAIL  215 Ollivier's Road, Linwood, Christchurch, 8001. New Zealand.|
| EMAIL  chris@gerty.equinox.gen.nz                                    |
| PHONE  +64-3-389-3200 ( gmt +13 - your discretion is requested )     |
+----------------------------------------------------------------------+

From uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris Fri Nov 5 12:19:31 1993
Article: 31118 of comp.lang.c
Newsgroups: comp.lang.c
Path: uwe-bristol!gdt!niss!warwick!zaphod.crihan.fr!pipex!howland.reston.ans.net!spool.mu.edu!olivea!decwrl!decwrl!waikato!canterbury.ac.nz!equinox.gen.nz!equinox!gerty!chris
From: chris@gerty.equinox.gen.nz (Christopher Sawtell)
Subject: Notes for 'C' Programmers Part 5.
Message-ID: <1993Nov1.071315.1647@gerty.equinox.gen.nz>
Summary: A Lesson about the Pre-processor and Header files.
Keywords: c header-files pre-processor
Organization: Cogicorp
X-Newsreader: TIN [version 1.2 PL0]
Date: Mon, 1 Nov 1993 07:13:15 GMT
Expires: Wed, 1 Dec 1993 00:00:00 GMT
Lines: 332

Archive-name: c-notes/Lesson05.txt

Lesson 5.

The Pre-processor and Header Files.

The pre-processor is activated by a '#' character in column one of the
source code. There are several statements, viz:

#include
#define
#undef
#if
#else
#endif
#ifdef
#ifndef
#pragma

#include.

In the programming examples presented in the previous lessons you will
probably have noticed that there is this statement:

#include <stdio.h>

right at the start of the program text. This statement tells the
pre-processor to include the named file in your program text. As far as the
compiler is concerned this text appears just as if you had typed it
yourself! This is one of the more useful facilities provided by the 'C'
language. The #include statement is frequently combined with the #if
construct. In this program fragment the file "true.h" is included in your
program if the pre-processor symbol FLAG is true, and "false.h" included if
FLAG is false.
#if ( FLAG )
# include "true.h"
#else
# include "false.h"
#endif

This mechanism has many uses, one of which is to provide portability
between all the 57,000 slightly different versions of unix and also other
operating systems. Another use is to be able to alter the way in which your
program behaves according to the preference of the user.

Of course, you will be asking the question "Where is the file stored?".
Well, if the filename is delimited by the "<" and ">" characters as in the
example above the file comes from the /usr/include directory, but if the
name of the file is delimited by quotes then the file is to be found in
your current working directory. ( This is not quite the whole truth as 'C'
compilers allow you to extend the search path for the include files using
command line option switches. See your compiler manual for the whole
story. )

So, I would like to suggest that you have a look around the /usr/include
directory and its /sys sub-directory. You should use either your editor in
'view' mode or the pg utility. This will ensure that you can't have an
accident and alter one of the files by mistake if you are slightly silly
and just happen to be logged on as the super-user.

A typical file to examine is /usr/include/time.h.

It's quite small so here it is.

/*      Copyright (c) 1984 AT&T */
/*        All Rights Reserved   */

/*      THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF AT&T     */
/*      The copyright notice above does not evidence any        */
/*      actual or intended publication of such source code.
*/

#ident "@(#)/usr/include/time.h.sl 1.5 4.2 04/20/87 18195 AT&T-SF"
/* 3.0 SID # 1.2 */

struct tm {     /* see ctime(3) */
        int tm_sec;
        int tm_min;
        int tm_hour;
        int tm_mday;
        int tm_mon;
        int tm_year;
        int tm_wday;
        int tm_yday;
        int tm_isdst;
};

extern struct tm *gmtime(), *localtime();
extern char *ctime(), *asctime();
int cftime(), ascftime();
extern void tzset();
extern long timezone, altzone;
extern int daylight;
extern char *tzname[];

As you can see ( forgetting about the comments and #ident ) there are three
different uses for the file.

  a) The definition of data structures and types.
  b) The declaration of functions which use the data structures.
  c) The declaration of external data objects.

These lines of code are all you need in your program in order to be able to
use, in this case, the library routine to access the clock in the computer,
but of course the paradigm applies to all programs which are created by one
programmer and used by another member of the programming team. Note that,
by proxy, or whatever, the author of the library routines has in effect
become a member of your programming team.

You might care to write a program or two which use this header file, and
for those who are motivated it might be an idea to re-implement localtime
so that it understands Summer Time in the Southern Hemisphere. (!)

Using another totally trivial example in order to get the idea across,
please examine the hello world program printed immediately below. Note that
the #define line must NOT end with a semi-colon, otherwise the semi-colon
becomes part of the text which is substituted.

/* ------------------------------------------------------------ */
#ident "@(#) hw_uc.h UPPER CASE version."
#define HELLO_MESSAGE "HELLO WORLD...\n"
/* ------------------------------------------------------------ */
#ident "@(#) Hello World"

#include <stdio.h>
#include HW_H

#if !defined( HELLO_MESSAGE )
# error "You have forgotten to define the header file name."
#endif

char *format = "%s",
     *hello = HELLO_MESSAGE;

main()
{
  printf ( format, hello );
  }
/* ------------------------------------------------------------ */

You will no doubt notice that the symbol HW_H is used instead of a header
file name. This gives us the ability to force the inclusion of any file we
wish by defining the symbol HW_H to be the desired file name. It can be
done like this:

cc -DHW_H="\"hw_uc.h\"" hello.c

The compiler output is placed, by default, in the file a.out, so to execute
it issue the command:

a.out

Which, fairly obviously, produces the output:

HELLO WORLD...

As we are going to generate another version of the program we had better
move the executable image file to another file name:

mv a.out hello_uc

Now to produce the other version issue the command line:

cc -DHW_H="\"hw_lc.h\"" hello.c; mv a.out hello_lc; hello_lc

Which compiles the other version of the hello.c program, using this version
of the include file:

/* ------------------------------------------------------------ */
#ident "@(#) hw_lc.h Lower Case version."
#define HELLO_MESSAGE "Hello World...\n"
/* ------------------------------------------------------------ */

and then moves the executable image to a different file and executes it.
Note that more than one command per line can be issued to the shell by
separating the commands with the ';' delimiting character.

Here - Surprise, Surprise - is the output of the second version.

Hello World...

I'd like to suggest that you use your editor to cut these example programs
and the shell file below out of the mail file and have a play with them.

/* ----------------------------------------- */
# @(#) Shell file to do the compilations.

cc -o hello_uc -DHW_H="\"hw_uc.h\"" hello.c
cc -o hello_lc -DHW_H="\"hw_lc.h\"" hello.c
/* ----------------------------------------- */

#define

This statement allows you to set up macro definitions.
The word immediately after the #define, together with its arguments, is expanded in the program text to the whole of the rest of the line. #define min(a, b) ((a #define min(a, b) ((a