In this assignment, you will implement pieces of a UNIX shell, and thereby gain familiarity with some UNIX library calls and the UNIX process model. By the end of the assignment, you will have a shell that can run complex pipelines of commands, such as:
cat /usr/share/dict/words | grep cat | sed s/cat/dog/ > doggerel.txt
The above pipeline takes /usr/share/dict/words (a file generally installed on UNIX systems that contains a list of English words), selects the words containing the string "cat", and then uses sed to replace "cat" with "dog", so that, for example, "concatenate" becomes "condogenate". The results are written to "doggerel.txt". (You can find detailed descriptions of each of the commands in the pipeline by consulting the manual page for the command, e.g. "man grep" or "man sed".)
Start by downloading the sh.c skeleton. You don't have to understand how the parser works in detail, but you should have a general idea of how control flows through the program. You will also see "// your code here" comments, which mark where you will implement the functionality that makes the shell actually work.
Next, compile the source code of the shell:
$ gcc sh.c -o sh
You can then run and interact with the shell by typing ./sh:
user@cs3224:~$ ./sh
cs3224> ls
exec not implemented
cs3224>
Note that the command prompt for our shell is set to cs3224> to make it easy to tell the difference between it and the Linux/OS X shell. You can quit your shell by typing Control-C or Control-D.
Note: This assignment does not involve modifying or using xv6 (although the code for sh.c is adapted from the xv6 shell). You should write, compile, and test your code on OS X or Linux, rather than in QEMU. Also note that while it may be tempting to just copy xv6's implementation, there are enough differences between the xv6 APIs and those in Linux/OS X that doing so would be a bad idea. You can look at how it works for inspiration, though.
Implement basic command execution by filling in the code inside of the case ' ' block in the runcmd function. You will want to look at the manual page for the exec(3) function by typing "man 3 exec". (Note: throughout this course, when referring to commands that one can look up in the man pages, we will typically specify the section number in parentheses. Thus, since exec is found in section 3, we will say exec(3).)
Once this is done, you should be able to use your shell to run single commands, such as
cs3224> ls
cs3224> grep cat /usr/share/dict/words
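As a rough illustration of the process model behind running a single command, here is a minimal standalone sketch. It is not taken from sh.c: the helper name run_and_wait is hypothetical, and choosing execvp (one member of the exec(3) family, which searches the directories listed in PATH) is an assumption made for the example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of sh.c): run the program argv[0] with the
 * NULL-terminated argument list argv in a child process, wait for it, and
 * return its exit status, or -1 if something went wrong. */
int run_and_wait(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program.
         * execvp searches PATH when argv[0] contains no slash. */
        execvp(argv[0], argv);
        /* exec only returns if it failed. */
        perror("execvp");
        _exit(127);
    }
    /* Parent: wait for the child to finish. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, calling run_and_wait with the argument list {"ls", "-l", NULL} would list the current directory and return the exit status of ls.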
Hint:
[...] PATH environment variable.
Now extend the shell to handle input and output redirection. Programs will be expecting their input on standard input and will write to standard output, so you will have to open the file and then replace standard input or output with that file. As before, the parser already recognizes the '>' and '<' characters and builds a redircmd structure for you, so you just need to use the information in that redircmd to open a file and replace standard input or output with it.
Hints:
[...] rcmd->fd.
To see where rcmd->fd is coming from, look at the redircmd function and remember that 0 is standard input and 1 is standard output.
Read the manual page for the open call; in particular, make sure you read about the case when you pass the O_CREAT flag.
When this is done, you should be able to redirect the input and output of commands:
cs3224> ls > a.txt
cs3224> sort -r < a.txt
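The idea of "replacing" a standard descriptor with a file can be sketched in a standalone program as follows. This is only one possible approach, assuming dup2(2) is used to do the replacement; the helper name run_redirected and its parameters are hypothetical and do not come from sh.c.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of sh.c): run argv in a child process with
 * descriptor fd (0 = standard input, 1 = standard output) replaced by the
 * file at path, opened with the given open(2) flags. Returns the child's
 * exit status, or -1 on failure. */
int run_redirected(char *const argv[], const char *path, int flags, int fd) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* The third argument (the permission bits) is only consulted when
         * flags contains O_CREAT, but it must be supplied in that case. */
        int file = open(path, flags, S_IRUSR | S_IWUSR);
        if (file < 0) {
            perror("open");
            _exit(1);
        }
        if (file != fd) {
            /* Make fd refer to the opened file, then drop the extra copy. */
            if (dup2(file, fd) < 0) {
                perror("dup2");
                _exit(1);
            }
            close(file);
        }
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Under these assumptions, "ls > a.txt" corresponds to flags O_WRONLY | O_CREAT | O_TRUNC with fd 1, and "sort -r < a.txt" corresponds to flags O_RDONLY with fd 0. The redirection happens in the child only, so the parent's own standard input and output are left untouched.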
The final task is to add the ability to pipe the output of one command into the input of another. You will fill out the code for the '|' case of the switch statement in runcmd to do this.
Hints:
The left command of the pipe is in pcmd->left and the right command in pcmd->right.
Note that fork(2) creates an exact copy of the current process. The two processes share any file descriptors that were open at the time the fork occurred. You can get a sense for this behavior by looking at a small test program like this one:
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
int main(void) {
    // Open the file before forking so that both processes share the descriptor.
    int filedes = open("myfile.txt", O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    if (filedes < 0) {
        perror("open");
        return 1;
    }
    int rv = fork();
    if (rv == 0) {
        char msg[] = "Process 1\n";
        printf("Hello, I'm in the child, my process ID is %d\n", (int)getpid());
        // sizeof(msg) - 1 leaves out the terminating NUL byte.
        write(filedes, msg, sizeof(msg) - 1);
    }
    else {
        char msg[] = "Process 2\n";
        printf("This is the parent process, my process ID is %d and my child is %d\n", (int)getpid(), rv);
        write(filedes, msg, sizeof(msg) - 1);
    }
    // Each process closes its own copy of the descriptor.
    close(filedes);
    return 0;
}
If you put that code into a file, compile it, and then run the resulting program, you should see a result like:
user@cs3224:~$ ./a.out
This is the parent process, my process ID is 56968 and my child is 56969
Hello, I'm in the child, my process ID is 56969
user@cs3224:~$ cat myfile.txt
Process 2
Process 1
You can see that the parent and child processes each got a copy of "filedes", and that writes to it from each process went to the same underlying file.
You may find it helpful to re-read the first chapter of the xv6 book, which describes in detail how the xv6 shell works. Note that the code shown there will not work as-is -- you will have to adapt it for the Linux/OS X environment.
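To make the descriptor sharing concrete for pipes, here is a minimal standalone sketch of connecting two programs with pipe(2), fork(2), and dup2(2). It is one possible arrangement, not the structure required by sh.c; the helper name run_pipe is hypothetical, and the sketch assumes descriptors 0, 1, and 2 are already open when it is called.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of sh.c): run "left | right" and return the
 * exit status of the right-hand command, or -1 on failure. */
int run_pipe(char *const left[], char *const right[]) {
    int p[2]; /* p[0] is the read end, p[1] is the write end */
    if (pipe(p) < 0) {
        perror("pipe");
        return -1;
    }
    pid_t lpid = fork();
    if (lpid < 0) {
        perror("fork");
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (lpid == 0) {
        /* Left child: standard output becomes the write end of the pipe. */
        dup2(p[1], STDOUT_FILENO);
        close(p[0]);
        close(p[1]);
        execvp(left[0], left);
        perror("execvp");
        _exit(127);
    }
    pid_t rpid = fork();
    if (rpid < 0) {
        perror("fork");
        close(p[0]);
        close(p[1]);
        waitpid(lpid, NULL, 0);
        return -1;
    }
    if (rpid == 0) {
        /* Right child: standard input becomes the read end of the pipe. */
        dup2(p[0], STDIN_FILENO);
        close(p[0]);
        close(p[1]);
        execvp(right[0], right);
        perror("execvp");
        _exit(127);
    }
    /* The parent must close both ends; otherwise the reader never sees
     * end-of-file, because a copy of the write end would remain open. */
    close(p[0]);
    close(p[1]);
    int status;
    waitpid(lpid, NULL, 0);
    if (waitpid(rpid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Each child closes both pipe descriptors after dup2 because the duplicated standard descriptor is the only copy it needs. Leaving a stray write end open in any process is a common reason for a pipeline that appears to hang.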
Once this is done, you should be able to run a full pipeline:
cs3224> cat /usr/share/dict/words | grep cat | sed s/cat/dog/ > doggerel.txt
cs3224> grep con < doggerel.txt
You can now submit your modified sh.c on NYU Classes. Be sure to include a file named partner.txt listing who you worked with, if anyone.
Credits: This assignment is adapted from MIT 6.828 Homework 1