Operating systems
Second laboratory exercise
1. Introduction
The goal of this laboratory is to become familiar with various low-level Unix/POSIX system calls related to process creation and management.
You will write a simple shell, fsh (Fer SHell), that implements a few basic functionalities.
2. Examples of shell functionality
This exercise must be implemented in a UNIX-like environment (e.g. GNU/Linux, FreeBSD) using the C or C++ programming language (you are welcome to use any standard of these languages).
You are also welcome to use any functions provided by the standard C library except the system, execlp, execvp and execvP functions.
2.1. Accessing system call and library function documentation
During this exercise, you will need to use complex system calls whose behaviour needs to be analyzed beforehand to ensure that your shell functions properly.
The main source of information about system calls and library functions is the so-called man pages, accessed using the man command.
The man command takes a program or function name as an argument and displays the corresponding documentation page.
It is important to note that the man pages are divided into discrete sections, grouped by the nature of the content they describe.
For instance, the third section documents library functions, the second section documents system calls, and the first section documents executable programs and shell commands.
If invoked with only a name argument, the man command scans the sections in order and displays the first matching documentation page.
Unfortunately, this can lead to confusing situations because some programs and functions share names, causing man to display the wrong documentation page, as depicted in the following listing.
    $ man sleep
    SLEEP(1)                  User Commands                  SLEEP(1)

    NAME
           sleep - delay for a specified amount of time

    SYNOPSIS
           sleep NUMBER[SUFFIX]...
           sleep OPTION
    ...

    $ man 3 sleep
    sleep(3)             Library Functions Manual             sleep(3)

    NAME
           sleep - sleep for a specified number of seconds

    LIBRARY
           Standard C library (libc, -lc)

    SYNOPSIS
           #include <unistd.h>

           unsigned int sleep(unsigned int seconds);
    ...
Fortunately, the man command takes an optional section number argument, which restricts the search to the specified section.
Detailed information for any system call can be obtained using the man 2 <syscall_name> command.
Detailed information for any standard C library function can be obtained using the man 3 <function_name> command.
2.2. Basic command execution
Pseudocode 1 roughly covers the basic shell execution flow.
Before accepting user input, your shell must print "fsh> " to signal that it is ready to process the user's input.
Each command consists of a command name and, optionally, multiple arguments that are separated using one or more whitespace characters.
You don't need to implement deleting characters when typing commands.
    while(1){
        print "fsh> ";
        wait for user input;
        read user input and parse;
        command = find_command();
        if(command not builtin){
            create a new process;
            execute the command in the child process;
            wait until the child process has exited;
        }
    }
    fsh> /usr/bin/pwd
    /home/student
    fsh> /usr/bin/echo "hello"
    hello
    fsh> /usr/bin/ls -lh
    # directory contents
    fsh> /bin/asdf
    fsh: Unknown command: /bin/asdf
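Splitting a command line on whitespace, as described above, could be sketched as follows. This is only one possible approach: the function name parse_line and the MAX_ARGS limit are assumptions made for illustration, and the sketch does not handle quoting (the "hello" example above would keep its quotes).

```c
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 64 /* assumed upper bound on tokens per command */

/* Splits `line` in place on spaces, tabs and newlines.
 * Stores pointers to the tokens in `tokens`, terminates the array with
 * NULL (the form execv expects), and returns the number of tokens. */
int parse_line(char *line, char *tokens[], int max_tokens)
{
    int count = 0;
    char *token = strtok(line, " \t\n");

    /* Leave room for the terminating NULL entry. */
    while (token != NULL && count < max_tokens - 1) {
        tokens[count++] = token;
        token = strtok(NULL, " \t\n");
    }
    tokens[count] = NULL;
    return count;
}
```

Because strtok treats a run of delimiters as a single separator, multiple spaces between arguments need no special handling. An empty line yields a count of 0, which the shell loop can use to skip straight to the next prompt.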
2.2.1. Recommended solving steps
- Write the basic shell loop,
- Implement command and argument parsing,
- Thoroughly study the fork, wait, and execv system calls,
- Implement support for "recognizing" builtin commands (will be required for the next task),
- Use the system calls listed in the third step to implement basic command execution.
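The last step can be sketched roughly as follows. This is a minimal illustration rather than a reference solution: the function name run_command, the exit code 127 for a failed execv, and the use of waitpid in place of plain wait are all assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at argv[0] with the NULL-terminated argument vector
 * argv and waits for it to finish. Returns the child's exit status, or
 * -1 if the child could not be created or was killed by a signal. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace the process image with the requested program. */
        execv(argv[0], argv);
        /* execv returns only on failure. */
        fprintf(stderr, "fsh: Unknown command: %s\n", argv[0]);
        _exit(127); /* assumed exit code for "command not found" */
    }

    /* Parent: block until this particular child has exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than exit after a failed execv so that it does not flush stdio buffers inherited from the shell a second time.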
2.3. Basic builtin commands and signal propagation
2.3.1. The cd command
The cd built-in command provides basic filesystem navigation. We recommend using the chdir() system call (man 2 chdir) for implementing this command.
An example of a properly implemented cd command can be found in Listing 3.
Your shell must also print a suitable message for any errors encountered while navigating the filesystem (e.g. non-existent directories).
    fsh> /usr/bin/pwd
    /home/student
    fsh> cd Documents
    fsh> /usr/bin/pwd
    /home/student/Documents
    fsh> cd /nonexistent
    cd: The directory '/nonexistent' does not exist
2.3.2. The exit command
The exit built-in command must terminate the shell when invoked.
The shell must also terminate when the user signals end-of-file (EOF) using the CTRL+D keyboard shortcut.
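Both termination conditions can be detected while reading a line of input. The following sketch assumes input is read with fgets; the names read_command, READ_COMMAND and READ_EXIT are hypothetical, and the comparison only recognizes exit typed without arguments or surrounding whitespace.

```c
#include <stdio.h>
#include <string.h>

enum read_result { READ_COMMAND, READ_EXIT };

/* Reads one line from `in` into `buf` and strips the trailing newline.
 * Returns READ_EXIT when the shell should terminate, READ_COMMAND
 * otherwise. */
enum read_result read_command(FILE *in, char *buf, size_t size)
{
    /* fgets returns NULL on end-of-file (CTRL+D on an empty line)
     * and on read errors. */
    if (fgets(buf, (int)size, in) == NULL)
        return READ_EXIT;

    buf[strcspn(buf, "\n")] = '\0';

    if (strcmp(buf, "exit") == 0)
        return READ_EXIT;
    return READ_COMMAND;
}
```

In the shell loop this would be called with stdin, for example `if (read_command(stdin, buf, sizeof buf) == READ_EXIT) break;`.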
2.3.3. Signal propagation
Your shell must allow the user to terminate a running command using the SIGINT signal (CTRL+C keyboard shortcut).
The signal must not terminate the shell itself, only the process running in the foreground.
If the user sends a SIGINT signal when no process is running, the shell should simply print a new prompt.
We recommend using the sigaction system call (man 2 sigaction) to implement this feature.
Your shell should "intercept" the signal and handle it depending on its state.
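One way to make the handler depend on the shell's state is to keep the PID of the foreground child in a global variable. The sketch below rests on several assumptions: the names foreground_pid, handle_sigint and install_sigint_handler are hypothetical, the shell is expected to set foreground_pid after fork and reset it to 0 after waiting, forwarding the signal with kill is one possible design, and SA_RESTART is chosen so that an interrupted read or wait resumes automatically.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* PID of the foreground child, or 0 while the shell is idle.
 * Assumes a PID fits in sig_atomic_t, which holds on common systems. */
static volatile sig_atomic_t foreground_pid = 0;

static void handle_sigint(int signo)
{
    int saved_errno = errno; /* the handler must not clobber errno */
    (void)signo;

    if (foreground_pid > 0) {
        /* A command is running: pass the signal on to it. */
        kill((pid_t)foreground_pid, SIGINT);
    } else {
        /* Idle: print a new prompt. write() is async-signal-safe,
         * unlike printf(). */
        static const char prompt[] = "\nfsh> ";
        ssize_t ignored = write(STDOUT_FILENO, prompt, sizeof prompt - 1);
        (void)ignored;
    }
    errno = saved_errno;
}

/* Installs the handler; returns 0 on success, -1 on failure. */
int install_sigint_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handle_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGINT, &sa, NULL);
}
```

Only async-signal-safe functions such as kill and write are called inside the handler; the list of such functions is documented in man 7 signal-safety on GNU/Linux.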
2.3.4. Important remarks
The process group feature and the POSIX standard dictate that all signals originating from the keyboard must be propagated to all processes belonging to the foreground process group.
Since our shell is automatically assigned to this group and child processes inherit their parent's group, the signal is delivered to the shell and all of its child processes at once, which causes the shell to misbehave.
To solve this issue, we must explicitly place all newly-created child processes in a separate group using the setpgid(0,0) function (man 2 setpgid) before the child process invokes execv().
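The placement of the setpgid call can be illustrated with the sketch below. The function name spawn_in_own_group and the exit codes are assumptions. The additional setpgid(pid, pid) call in the parent is not required by this exercise; it is a common precaution so that the group is already set when the parent continues, regardless of which process runs first.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Forks a child, moves it into its own process group and executes
 * argv[0] in it. Returns the child's PID, or -1 if fork failed.
 * The caller is expected to wait for the child afterwards. */
pid_t spawn_in_own_group(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: leave the shell's process group before exec. */
        if (setpgid(0, 0) != 0) {
            perror("setpgid");
            _exit(126);
        }
        execv(argv[0], argv);
        perror(argv[0]); /* reached only if execv failed */
        _exit(127);
    }

    /* Parent: repeat the call to close the race with the child.
     * It may fail with EACCES if the child has already called execv,
     * which is harmless because the child set its group itself. */
    (void)setpgid(pid, pid);
    return pid;
}
```

Once the child is in its own group, a CTRL+C typed at the keyboard no longer reaches it directly, so the shell has to pass the signal on itself, for example with kill.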
2.3.5. Recommended solving steps
- Add cd and exit to the list of builtin commands,
- Implement cd and exit,
- Use setpgid to place all child processes in separate groups,
- Implement the SIGINT signal handler.
2.4. Refined command execution using environment variables
The basic command execution feature required the user to provide the full path for a command.
Since typing full program paths is a cumbersome process, all modern shells provide a more refined mechanism that executes programs using only their names.
After the user inputs the command name, the shell tries to find the corresponding program in a set of predefined directories listed in the PATH environment variable.
The value of the PATH environment variable consists of a series of directory paths separated using the ":" character.
Listing 4 shows an example of the PATH environment variable contents.
> echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/local/sbin
Your shell must fetch the contents of the PATH environment variable using the getenv function, process it, and use it to search the corresponding directories before executing a command.
    process_path() {
        ...
        char *contents = getenv("PATH");
        ...
    }
2.4.1. Important remarks
- We recommend using the access system call when searching directories,
- You must not perform a directory search when the user inputs a full path (i.e. if the command name starts with '/' or '.'),
- The shell should output a proper error message if the program cannot be found.
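A directory search that follows the remarks above could be sketched as shown below. The function name find_program and its parameters are assumptions. For simplicity, the sketch re-parses the PATH value on every call, whereas the shell could parse it once at startup. It works on a copy of the value because strtok modifies its input and the string returned by getenv should not be altered.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Looks for an executable called `name`. `path_value` is the contents
 * of PATH, e.g. the result of getenv("PATH"). On success writes the
 * full path into `out` and returns 0; otherwise returns -1. */
int find_program(const char *name, const char *path_value,
                 char *out, size_t out_size)
{
    /* Names starting with '/' or '.' are used as given, no search. */
    if (name[0] == '/' || name[0] == '.') {
        if (access(name, X_OK) != 0)
            return -1;
        snprintf(out, out_size, "%s", name);
        return 0;
    }
    if (path_value == NULL)
        return -1;

    char *copy = strdup(path_value);
    if (copy == NULL)
        return -1;

    int found = -1;
    for (char *dir = strtok(copy, ":"); dir != NULL;
         dir = strtok(NULL, ":")) {
        int n = snprintf(out, out_size, "%s/%s", dir, name);
        /* Skip candidates that did not fit into the output buffer. */
        if (n > 0 && (size_t)n < out_size && access(out, X_OK) == 0) {
            found = 0;
            break;
        }
    }
    free(copy);
    return found;
}
```

A call such as `find_program("ls", getenv("PATH"), buf, sizeof buf)` then yields the path to pass to execv, and a return value of -1 signals that the shell should print its error message.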
2.4.2. Recommended solving steps
- Study the getenv() function,
- Fetch and parse the PATH environment variable contents when the shell starts,
- Implement program search.
3. Remarks
- Make sure to check the return values of all system calls and library functions and implement proper error handling to facilitate solving this exercise.
- Full program paths may vary from system to system. Use the whereis command to find program paths on your system.
- To keep things simple, test your shell using commands that do not require additional user input.
- All functionalities listed in this exercise can be found in all modern shells. You are welcome to use the behaviour of any shell as a baseline for testing your implementation.