Having seen the basic data structures and their use let us now take a look through the code. The tiny shell has been designed in a basically top-down fashion and that is the order in which we'll work through it.
The top level design of any interpreter is basically a loop which just gets something to do and then goes away to do it. The tiny shell is no exception so that the main() function is particularly simple:
#include "def.h" main (void) { int i; initcold(); for (;;) { initwarm(); if (getline()) if (i = parse()) execute(i); } }
The first thing that happens is that main() calls a one-off initialization function called initcold(). This is followed by an endless loop which repeatedly prints a prompt, gets lines of input, parses them to extract the simple commands and their parameters, and then executes each of the simple commands, dealing with I/O redirection and pipes as appropriate.
In detail, the body of the loop first calls initwarm() to initialize things for each input command line and display the tsh: prompt. This is followed by a call to getline(), which copies the user's command line into the line[] array and returns a non-zero value if the length of the entered line was less than MAXLINE.
If a non-zero value is returned by getline(), then the parse() function is called next. This function has the task of parsing the contents of the line[] array and extracting the relevant information from it. This includes the simple command names and their associated parameters, the input and output file names for any I/O redirection, and the pipeline background (&) operator, if specified.
Assuming there are no errors, the parse() function returns a positive value which is the number of simple commands in the input command pipeline. On error parse() returns the value zero. If there have been no errors so far, then, finally, the execute() function is called with the count returned by parse() passed in as a parameter. The execute() function uses fork() and exec() to execute each of the simple commands in turn, creating pipes and opening I/O redirection files as required.
The contents of the initcold() initialization function are currently commented out so that the function actually does nothing. This is because the tiny shell has no built-in commands like the real shell. What this means is that shell commands like exit, which will terminate the real shell, cannot be used with the tiny shell in the same way. The easiest way to terminate the tiny shell is therefore to send it an interrupt (Ctrl-c) or quit (Ctrl-\) signal from the keyboard. Obviously, if the contents of initcold() were uncommented, then this would not be possible and the tiny shell would then be more difficult to terminate.
initcold (void) { /* signal(SIGINT, SIG_IGN); signal (SIGOUIT, SIG_IGN); */ } initwarm(void) { int i; backgnd = FALSE; lineptr = line; avptr = avline; infile[0] = '\0'; outfile[O] = '\0'; append = FALSE; for (i = 0; i<PIPELINE; ++i) { cmdlin[i].infd = 0; cmdlin[i].outfd = 1; } for (i = 3; i<OPEN_MAX; ++i) close(i); printf("tsh: "); fflush(stdout); }
The initwarm() function is executed at the start of each iteration of the 'get a command, execute a command' loop. Essentially, it just initializes the global variables, sets the infd and outfd file descriptors for each of the simple command structures in the cmdlin[] array to their default values, close()s all file descriptors other than the standard input, output and error output devices, and finally displays the tsh: prompt.
getline (void) { int i; for (i = 0; (line[i] = getchar())!='\n' && i<MAXLINE; ++i); if (i==MAXLINE) { fprintf(stderr, "Command line too long\n"); return(ERROR); } line[i+1] = '\0'; return (OKAY); }
This function just takes characters from the standard input device and copies them into the line[] array, until either a newline (\n) character is encountered or MAXLINE number of characters have been entered, in which case the line[] array is full.
If MAXLINE characters have been entered then the function returns ERROR; otherwise a string terminator (\0) is added after the last character entered, and the OKAY value is returned instead.
The syntax of a command pipeline looks as follows:
cmd [ < filename ] [ | cmd ]... [ or filename ] [ & ]
Here cmd means any simple command including its parameters, and or is either of the output redirection operators (> or >>). Square brackets are used to enclose optional items and the ellipses (...) mean that the preceding bracket contents may be repeated zero or more times.
Piecing it all together, what this says is that a tiny shell command pipeline starts with a simple command. This is then optionaUy followed by an input redirection operator and its associated filename. This in turn can optionally and repeatedly be followed by a pipe operator and then another simple command. After the last simple command, which may also be the first if no pipes are used, an optional output redirection operator and its associated filename may be specified. Finally, the whole pipeline may optionally be set to run in the background by specifying the & operator.
The parse() function needs to make sure that the syntax of its input conforms to this specification, whilst at the same time extracting the tokens belonging to each simple command and the file names for any input or output redirection that might be specified.
parse (void) { int i; /* 1 */ command(0); /* 2 */ if (check("<")) getname(infile); /* 3 */ for (i = 1; i<PIPELINE; ++i) if (check("|")) command(i); else break; /* 4 */ if (check(">")) { if (check(">")) append = TRUE; getname(outfile); } /* 5 */ if (check("&")) backgnd = TRUE; /* 6 */ if (check("\n")) return(i); else { fprintf(stderr, "Command line syntax error\n"); return (ERROR); } }
The numbered comments in this listing are described as follows:
Basically, this function copies the names of commands and their parameters one word at a time from the line[] array to the avline[] array. Each time a word is copied, a pointer is set up to point to the start of the word in one of the av[] arrays in the cmdlin[] array of structures:
command(int i) { int j, flag, inword; for (j = 0; j<MAXARG-l; ++j) { while (*lineptr==' ' || *lineptr=='\t') ++lineptr; cmdlin[i].av[j] = avptr; cmdlin[i].av[j+l] = NULL; for (flag = 0; flag==0;) { switch (*lineptr) { case '>': case '<': case '|': case '&': case '\n': if (inword==FALSE) cmdlin[i].av[j] = NULL; *avptr++ = '\0'; return; case ' ': case '\t': inword = FALSE; *avptr++ = '\0'; flag = 1; break; default: inword = TRUE; *avptr++ = *lineptr++; break; } } } }
Within the tiny shell in general, and this function in particular, the variable lineptr points to the current place of interest within the line[] array and the variable avptr points to the next character position to use in the avline[] array.
The command() function consists of a pair of nested for loops. The inner loop copies the characters of a word from the line[] array to the avline[] array and terminates the word in the avline[] array with a \0 byte. The outer loop is responsible for setting up pointers to the words in the appropriate av[] array elements.
In each complete pipeline there can only be one input redirection file specified, attached to the first simple command, and only one output redirection file can he specified, attached to the last simple command. Obviously, if there is only one simple command in the pipeline then it qualifies as both the first and the last command on the line and, consequently, it can have both its input and output redirected to files.
If there is more than one simple command in a pipeline, say j of them, then j-1 pipes will need to be created to connect the simple commands together. Bearing these points in mind we can now look at the code for the execute() function:
execute(int j) { int i, fd, fds[2]; /* 1 */ if (infile[0] !='\0') cmdlin[0].infd = open(infile, O_RDONLY); /* 2 */ if (outfile[0] !='\0') if (append==FALSE) cmdlin[j-1].outfd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0666); else cmdlin[j-1].outfd = open(outfile, O_WRONLY | O|CREAT | O_APPEND, 0666); /* 3 */ if (backgnd==TRUE) signal (SIGCHLD, SIG_IGN); else signal (SIGCHLD, SIG_DFL); /* 4 */ for (i = 0; i <j; ++i) { /* 5 */ if (i<j-1) { pipe (fds); cmdlin[i].outfd = fds[1]; cmdlin[i+1].infd = fds[0]; } /* 6 */ forkexec(&cmdlin[i]); /* 7 */ if ((fd = cmdlin[i].infd)!=0) close(fd); if ((fd = cmdlin[i].outfd)!=1) close(fd); } /* 8 */ if (backgmd==FALSE) while (wait (NULL) != lastpid); }
Remember that the execute() function takes a single integer parameter, which is the number of simple commands in the command pipeline. This value is assigned to the variable j. The numbered comments in the listing are explained as follows:
The forkexec() function has the task of creating a child process with the fork() system call Then within the child process it performs any required I/O redirection before using execvp() to execute the required command.
forkexec(struct cmd *ptr) { int i_pid; /* 1 */ if (pid = fork()) { /* 2 */ if (backgnd==TRUE) printf("%d\n", pid); lastpid = pid; } else { /* 3 */ if (ptr->infd==0 && backgnd==TRUE) ptr->infd = open("/dev/null", O_RDONLY); /* 4 */ if (ptr->infd!=0) { close(0); dup(ptr->infd); } if (ptr->outfd!=1) { close(1); dup(ptr->outfd); } /* 5 */ if (backgnd==FALSE) { signal (SIGINT, SIG_DFL); signal (SIGQUIT, SIG_DFL); } /* 6 */ for (i = 3; i<OPEN_MAX; ++i) close(i); /* 7 */ execvp(ptr->av[0] ptr->av); exit(1); } }
The forkexec() function takes as its parameter a pointer to the struct cmd associated with the simple command it is about to execute. Remember that this structure contains the file descriptors that the command is to use for its standard input and standard output (though the redirection has not taken place yet) and it also contains an array of pointers to command line parameters (including the command name itself) that will become the command's argv parameter.
The numbered comments in this function are described as follows:
check(char *ptr) { char *tptr; while (*lineptr== ' ' ) lineptr++; tptr = lineptr; while (*ptr!='\0' && *ptr==*tptr) { ptr++; tptr++; } if (*ptr!='\0') return(FALSE); else { lineptr = tptr; return(TRUE); } }
The check() function compares the string passed to it as a parameter with the characters in the line[] array starting at the position of lineptr. If a match is found then lineptr is moved on beyond the characters matched and the value TRUE is returned. If check() doesn't find a match then a value of FALSE is returned and lineptr is left unchanged so that checks can be performed for other strings starting in the same place in the line[] array.
getname(char *name) { int i; for (i = 0; i<MAXNAME; ++i) { switch (*lineptr) { case '>': case '<': case '|': case '&': case ' ': case '\n': case '\t': *name = '\0'; return; default: *name++ = *lineptr++; break; } } *name = '\0'; }
This function just copies a file name up to MAXNAME characters long into the char array pointed to by its name parameter.