Operating systems
Second laboratory exercise

1. Introduction

The goal of this laboratory is to become familiar with various low-level Unix/POSIX system calls related to process creation and management. You will write a simple shell, fsh (Fer SHell), that implements a few basic functionalities.

2. Examples of shell functionality

This exercise must be implemented in a UNIX-like environment (e.g. GNU/Linux, FreeBSD) using the C or C++ programming language (you are welcome to use any standard of these languages). You are also welcome to use any functions provided by the standard C library except the system, execlp, execvp and execvP functions.

2.1. Accessing system call and library function documentation

During this exercise, you will use complex system calls whose behaviour you need to analyze beforehand to ensure that your shell functions properly. The main source of information on system calls and library functions is the so-called man pages, accessed using the man command. The man command takes a program or function name as an argument and displays the corresponding documentation page. The man pages are divided into sections, grouped by the nature of the content they describe. For instance, the first section documents executable programs and shell commands, the second section documents system calls, and the third section documents library functions. If invoked with only a name argument, the man command scans the sections sequentially and displays the first matching documentation page. Unfortunately, some programs and functions share names, so man may display the wrong documentation page, as shown in the following listing.

Accessing the sleep library function documentation.
$ man sleep
SLEEP(1)   User Commands   SLEEP(1)

NAME
       sleep - delay for a specified amount of time
SYNOPSIS
       sleep NUMBER[SUFFIX]...
       sleep OPTION
...
$ man 3 sleep
sleep(3)   Library Functions Manual   sleep(3)

NAME
       sleep - sleep for a specified number of seconds

LIBRARY
       Standard C library (libc, -lc)

SYNOPSIS
       #include <unistd.h>
       unsigned int sleep(unsigned int seconds);
...

Fortunately, the man command takes an optional section number argument that restricts the search to the specified section. Detailed information for any system call can be obtained using the man 2 <syscall_name> command, and for any standard C library function using the man 3 <function_name> command.

2.2. Basic command execution

Pseudocode 1 roughly covers the basic shell execution flow. Before accepting user input, your shell must print "fsh> " to signal that it is ready to process the user's input. Each command consists of a command name and, optionally, multiple arguments that are separated using one or more whitespace characters. You don't need to implement deleting characters when typing commands.

while(1){
 print "fsh> ";

 wait for user input;
 read user input and parse;

 command = find_command();
 if(command not builtin){
   create a new process;
   execute the command in the child process;
   wait until the child process has exited;
 }
}

fsh> /usr/bin/pwd
/home/student
fsh> /usr/bin/echo "hello"
hello
fsh> /usr/bin/ls -lh
# directory contents
fsh> /bin/asdf
fsh: Unknown command: /bin/asdf
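
The argument-splitting step described above could be sketched as follows. This is only one possible approach: the function name fsh_split and the FSH_MAX_ARGS limit are hypothetical and do not come from the assignment. The sketch splits on whitespace only and does not strip quotes, so an argument such as "hello" would still need extra handling to behave like the listing above.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

#define FSH_MAX_ARGS 64 /* hypothetical limit, not part of the assignment */

/* Splits `line` in place on runs of whitespace.
 * Fills argv with pointers into `line` and NULL-terminates it,
 * which is the layout execv() expects.
 * Returns the number of tokens stored (at most max_args - 1). */
int fsh_split(char *line, char **argv, int max_args)
{
    int argc = 0;
    char *save = NULL;

    for (char *tok = strtok_r(line, " \t\r\n", &save);
         tok != NULL && argc < max_args - 1;
         tok = strtok_r(NULL, " \t\r\n", &save)) {
        argv[argc++] = tok;
    }
    argv[argc] = NULL;
    return argc;
}
```

Because the tokens point into the original buffer, the line must stay alive (and unmodified) for as long as argv is in use.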

2.2.1. Recommended solving steps

  1. Write the basic shell loop,
  2. Implement command and argument parsing,
  3. Thoroughly study the fork, wait, and execv system calls,
  4. Implement support for "recognizing" builtin commands (will be required for the next task),
  5. Use the system calls listed in the third step to implement basic command execution.
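
As a rough illustration of how fork, execv, and waitpid fit together in the last step, consider the sketch below. The function name fsh_run and the exit code 127 for a failed execv are assumptions made for this example, not requirements of the exercise.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] (a full path) with arguments argv in a child process
 * and waits for it to finish.
 * Returns the child's exit status, or -1 if the child could not be
 * created, waited for, or did not exit normally. */
int fsh_run(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fsh: fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace the process image with the requested program. */
        execv(argv[0], argv);
        /* execv only returns if it failed. */
        fprintf(stderr, "fsh: Unknown command: %s\n", argv[0]);
        _exit(127); /* hypothetical exit code for "command not found" */
    }

    /* Parent: block until this particular child has exited. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("fsh: waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than exit so that it does not flush stdio buffers inherited from the shell (for example a pending prompt).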

2.3. Basic builtin commands and signal propagation

2.3.1. The cd command

The cd built-in command provides basic filesystem navigation. We recommend using the chdir() system call (man 2 chdir) for implementing this command.

An example of a properly implemented cd command can be found in Listing 3. Your shell must also print a suitable message for any errors encountered while navigating the filesystem (e.g. non-existent directories).

fsh> /usr/bin/pwd
/home/student
fsh> cd Documents
fsh> /usr/bin/pwd
/home/student/Documents
fsh> cd /nonexistent
cd: The directory '/nonexistent' does not exist
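
A minimal sketch of such a built-in is shown below, assuming a hypothetical helper named fsh_cd that receives the first argument of the command. Note that chdir must be called in the shell process itself; if it ran in a child process, the shell's working directory would not change. The wording of the messages other than the one in the listing is an assumption.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Changes the shell's working directory to `dir`.
 * Returns 0 on success, or -1 after printing an error message. */
int fsh_cd(const char *dir)
{
    if (dir == NULL) {
        /* Assumption: the exercise does not say what a bare "cd" does. */
        fprintf(stderr, "cd: missing directory argument\n");
        return -1;
    }
    if (chdir(dir) != 0) {
        if (errno == ENOENT)
            fprintf(stderr, "cd: The directory '%s' does not exist\n", dir);
        else
            fprintf(stderr, "cd: %s: %s\n", dir, strerror(errno));
        return -1;
    }
    return 0;
}
```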

2.3.2. The exit command

The shell must terminate when the user invokes the exit built-in command or signals end-of-file (EOF) using the CTRL+D keyboard shortcut.
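
One way to detect both conditions is sketched below, assuming input is read with the POSIX getline function. The helper names fsh_read_line and fsh_is_exit are hypothetical. When CTRL+D is pressed on an empty line, the read reports end-of-file and getline returns -1.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Reads one line from `in` into *line (grown by getline as needed)
 * and strips the trailing newline.
 * Returns 1 if a line was read, 0 on end-of-file, -1 on a read error. */
int fsh_read_line(FILE *in, char **line, size_t *cap)
{
    ssize_t n = getline(line, cap, in);
    if (n < 0)
        return feof(in) ? 0 : -1;
    (*line)[strcspn(*line, "\n")] = '\0';
    return 1;
}

/* Returns nonzero if the parsed command name is the exit built-in. */
int fsh_is_exit(const char *command_name)
{
    return command_name != NULL && strcmp(command_name, "exit") == 0;
}
```

In the main loop, the shell would leave the loop when fsh_read_line returns 0 or when fsh_is_exit is true for the first token of the parsed line. The buffer allocated by getline should be released with free before the shell terminates.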

2.3.3. Signal propagation

Your shell must allow the user to terminate a running command using the SIGINT signal (CTRL+C keyboard shortcut). The signal must not terminate the shell itself, only the process running in the foreground. If the user sends a SIGINT signal when no command is running, the shell should simply print a new prompt. We recommend using the sigaction system call (man 2 sigaction) to implement this feature. Your shell should "intercept" the signal and handle it depending on its state.

2.3.4. Important remarks

The process group feature and the POSIX standard dictate that all signals originating from the keyboard must be propagated to all processes belonging to the foreground process group. Our shell is automatically assigned to this group, and by default so are all of its child processes, so a keyboard signal reaches the children directly instead of going through the shell, which causes the shell to misbehave. To solve this issue, we must explicitly place each newly created child process in a separate group by calling setpgid(0, 0) (man 2 setpgid) before the child process invokes execv().
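
The placement of the setpgid call can be sketched as follows. The function name fsh_spawn_in_own_group is hypothetical. The extra setpgid call in the parent is not required by the text above; it is a common idiom that makes the group change visible to the parent as soon as fork returns, regardless of which process runs first.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Forks and executes argv[0] in a new process group of its own.
 * Returns the child's PID to the caller, or -1 if fork failed.
 * The caller is expected to wait for the child afterwards. */
pid_t fsh_spawn_in_own_group(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fsh: fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: leave the shell's group before running the program. */
        if (setpgid(0, 0) != 0)
            perror("fsh: setpgid");
        execv(argv[0], argv);
        fprintf(stderr, "fsh: Unknown command: %s\n", argv[0]);
        _exit(127); /* hypothetical exit code */
    }
    /* Parent (optional, see above): the same request from this side.
     * It fails harmlessly if the child has already called execv. */
    (void)setpgid(pid, pid);
    return pid;
}
```

Once the child is in its own group, CTRL+C no longer reaches it directly, so the shell's SIGINT handler has to forward the signal to it.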

2.3.5. Recommended solving steps

  1. Add cd and exit to the list of builtin commands,
  2. Implement cd and exit,
  3. Use setpgid to place all child processes in separate groups,
  4. Implement the SIGINT signal handler.

2.4. Refined command execution using environment variables

The basic command execution feature required the user to provide the full path of a command. Since typing full program paths is cumbersome, all modern shells provide a more refined mechanism that executes programs using only their names. After the user inputs a command name, the shell tries to find the corresponding program in a set of predefined directories listed in the PATH environment variable. The value of the PATH environment variable consists of a series of directory paths separated by the ":" character. Listing 4 shows an example of the PATH environment variable contents.

> echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/local/sbin

Your shell must fetch the contents of the PATH environment variable using the getenv function, process it, and use it to search the corresponding directories before executing a command.

process_path() {
    ...
    char *contents = getenv("PATH");
    ...
}
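
The processing step might be sketched like this. The name fsh_split_path and the FSH_MAX_DIRS limit are hypothetical. The string returned by getenv should not be modified, so the sketch works on a copy. As a simplification, empty PATH entries (which conventionally mean the current directory) are skipped.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

#define FSH_MAX_DIRS 64 /* hypothetical limit, not part of the assignment */

/* Splits a PATH-style string on ':' into at most max_dirs directories.
 * The entries in dirs point into a private copy of `path`, returned
 * through *copy_out; the caller frees it when dirs is no longer needed.
 * Returns the number of directories, 0 if path is NULL, -1 on
 * allocation failure. */
int fsh_split_path(const char *path, char **copy_out,
                   char **dirs, int max_dirs)
{
    *copy_out = NULL;
    if (path == NULL)
        return 0;

    char *copy = strdup(path);
    if (copy == NULL)
        return -1;

    int n = 0;
    char *save = NULL;
    for (char *dir = strtok_r(copy, ":", &save);
         dir != NULL && n < max_dirs;
         dir = strtok_r(NULL, ":", &save)) {
        dirs[n++] = dir;
    }
    *copy_out = copy;
    return n;
}
```

A caller would pass getenv("PATH") as the first argument once, when the shell starts, and keep the resulting array for later searches.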

2.4.1. Important remarks

  • We recommend using the access system call (man 2 access) when searching directories.
  • You must not perform a directory search when the user inputs a full path (i.e. if the command name starts with '/' or '.').
  • The shell should output a proper error message if the program cannot be found.
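
The remarks above could translate into a search routine along these lines. The function name fsh_find_program and its parameters are hypothetical, and the list of directories is assumed to have been extracted from PATH beforehand. Note that access with X_OK also succeeds for searchable directories, so a stricter implementation might additionally check the file type.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Resolves `cmd` to an executable path and writes it into `out`.
 * Names starting with '/' or '.' are used as given, without a search;
 * any other name is looked up in dirs[0..ndirs-1], in order.
 * Returns 0 if an executable was found, -1 otherwise. */
int fsh_find_program(const char *cmd, char *const dirs[], int ndirs,
                     char *out, size_t outsz)
{
    if (cmd == NULL || cmd[0] == '\0')
        return -1;

    if (cmd[0] == '/' || cmd[0] == '.') {
        int n = snprintf(out, outsz, "%s", cmd);
        if (n < 0 || (size_t)n >= outsz)
            return -1; /* path does not fit into the buffer */
        return access(out, X_OK) == 0 ? 0 : -1;
    }

    for (int i = 0; i < ndirs; i++) {
        int n = snprintf(out, outsz, "%s/%s", dirs[i], cmd);
        if (n < 0 || (size_t)n >= outsz)
            continue; /* candidate too long, try the next directory */
        if (access(out, X_OK) == 0)
            return 0;
    }
    return -1;
}
```

When the function returns -1, the shell would print its error message instead of creating a child process.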

2.4.2. Recommended solving steps

  1. Study the getenv() function,
  2. Fetch and parse the PATH environment variable contents when the shell starts,
  3. Implement program search.

3. Remarks

  • Make sure to check the return values of all system calls and library functions and implement proper error handling to facilitate solving this exercise.
  • Full program paths may vary from system to system. Use the whereis command to find program paths on your system.
  • To keep things simple, test your shell using commands that do not require additional user input.
  • All functionalities listed in this exercise can be found in all modern shells. You are welcome to use the behaviour of any shell as a baseline for testing your implementation.

Created: 2023-03-10 Fri 11:50
