| title | CS 537 Project 3 |
|---|---|
| layout | default |
Important
- This project will be checked for memory leaks. Make sure you free all the memory you allocate.
- This project will have a code review. Make sure you understand the code you write and are able to explain it.
In p2 you implemented a system call. In this project you will utilize system calls to allow users to interact with the operating system. The shell is a program that reads commands from the user and executes them. These commands can be built-in commands or external programs.
Learning Objectives:
- Understand how processes are created, destroyed, and managed.
- Know how processes communicate among each other via pipes.
- Understand environment variables and their role in process execution.
Your submission:
- Should (hopefully) pass the tests we supply.
- Will be tested in the CS537-Docker environment. We recommend you use this environment for your testing too.
- Update the README.md inside solution directory to describe your implementation. This file should include your name, your cs login, you wisc ID and email, and the status of your implementation. You are encouraged to use it to document your design decisions and any issues you faced.
- Provided Code: You are provided with a parser (parser.c and parser.h). You can use this parser to parse your input (it should simplify things for you significantly). It is recommended that you go through the parser code to understand how it works. Feel free to modify it if needed.
In this project you will build a command line interpreter (CLI) or shell. The shell operates in two modes:
- Interactive Mode: When run without arguments, your shell displays a prompt (
wsh>) and waits for user input. - Batch Mode: When run with a single filename argument, the shell reads commands from the file without printing a prompt. You may not assume that the batch script ends with an
exitcommand. The shell should exit when it reaches the end of the file.
A shell is a program that acts as a command line interpreter. It reads commands from the user (via the keyboard or a script) and executes them. It serves as the primary interface between the user and the operating system.
Basic Shell Loop: The core logic of a shell is a simple loop:
- Read: Get the next command line from the user/file.
- Parse: Break the command line into arguments, handle quotes, pipes, and variables.
- Execute: Run the parsed command.
Example Session:
wsh> ls
Makefile wsh.c wsh.h
wsh> echo hello "world"
hello world
Parsing command lines can be tricky (e.g., handling "quoted arguments", escaped\ characters, and | pipes). To let you focus on process management and system calls (the main learning objectives), we provide a parser in parser.c and parser.h. You just need to call parse_input() after reading the command line and then execute the returned structure.
The solution folder already contains a simple parser in parser.c and parser.h.
-
Inspect
parser.hto understand the structures (command_line,pipeline,command). -
The parser handles:
- Splitting lines into pipelines (separated by
;). - Splitting pipelines into commands (separated by
|). - Splitting commands into arguments.
- Handling quotes (
'and"). - Handling escapes (
\). - Variable Substitution: It calls your
get_variablehook (feature discussed later).
Sample Command:
ls -l- Parser Output: The parser converts this string into a
commandstructure whereargv[0]is"ls"andargv[1]is"-l". Matches quotes and handles spacing for you.
- Splitting lines into pipelines (separated by
Implement the main loop in wsh.c that:
- Prints the prompt
wsh>(if interactive). - Reads a line of input.
- Parses the input.
- Executes the parsed commands.
Behavior: When you run ./wsh, it should print the prompt wsh> and wait for the user to type a command. If the user types ls, it reads that line and proceeds to parse and execute it. More on how to execute commands in the next task.
You must support running External Programs (e.g., ls, cp, cat, echo, etc.).
-
Use
fork(),execv(), andwaitpid()(orwait). -
Path Resolution:
- Environment Variables: The shell maintains a list of environment variables, which are key-value pairs (e.g.,
USER=johndoe). A specific environment variable,PATH, contains a list of directories separated by colons (e.g.,/bin:/usr/bin). - Searching: When the user types a command like
ls, you must search each directory (in-order) in thePATHchecking if there is an executable file namedls. The first match is the one that should be executed. - System Calls: For this project, you should not use
execvp(which searches for you). Instead, useexecvcombined with your own search logic (useaccess()withX_OKto check existence/permissions, checkman accessfor more information). - If
PATHis not set, initialize it to/bin. - NOTE: If the executable starts with a
/, treat it as an absolute path and it should be executed directly without searching thePATH.
Sample Command:
ls- Behavior: The shell searches the
PATH(e.g.,/bin:/usr/bin) for an executable namedls. It finds/bin/ls, fork/execs it, and waits for it to finish. The user sees the file listing of the current directory.
- Environment Variables: The shell maintains a list of environment variables, which are key-value pairs (e.g.,
Implement the following built-in commands, executed directly by the shell - not in a child process. Built-in commands are prioritized over external programs, i.e. if a user types a command that is the same as a built-in command, the built-in command should be executed not the external program.
-
exit:- Behavior: Terminates the shell by calling the
exit()system call. - Arguments: Ignores any arguments.
- Behavior: Terminates the shell by calling the
-
cd [dir]:- Behavior: Changes the current working directory by calling the
chdir()system call. Supports both absolute (starts with/) and relative paths (does not start with/, can use.and..). - No Args: Defaults to the path specified by the environment variable
HOME.- Error: If
HOMEis not set, printcd: HOME not setto stderr and return 1.
- Error: If
- Error: If
chdirfails (e.g., dir does not exist), callperror("cd")and return 1.
- Behavior: Changes the current working directory by calling the
-
env [VAR=val]:- No arguments: Prints all environment variables. (Hint:
man environ) - With arguments:
env VAR=val: SetsVARtoval.env VAR: SetsVARto empty string.- Error: If
setenvfails, callperror("setenv").
Sample Command:
cd /tmp- Behavior: The shell parses
cdand/tmp. It identifiescdas a built-in. It callschdir("/tmp"). Subsequent commands (likels) will now run in/tmp.
- No arguments: Prints all environment variables. (Hint:
-
What are they?: Environment variables are variables inherited by child processes. When your shell forks a child process, it gets a copy of your shell's environment variables.
-
Management:
- Use the
envbuilt-in (Task 3) to set variables. - You must use the
setenv()system call to modify variables in your shell (so they can be inherited by future children).
- Use the
-
Substitution (
$VAR):- Hook: Implement
get_variable(const char *var)inwsh.c. When the parser encounters a$VAR, it will call this function to get the value ofVAR. - Edge Case: If
getenvreturns NULL, return an empty string"".
Sample Command:
echo $USER- Behavior: The parser detects
$USER, callsget_variable("USER"), receives "pavan" (for example), and replaces$USERwithpavan. The shell then executesecho pavan.
- Hook: Implement
- Syntax:
command1 | command2 | ... | commandN. - Chain Length: Pipelines can contain more than two commands.
- Example:
cat file.txt | grep "search" | wc -l(3 commands). - Limit: You may assume a maximum of 128 commands in a single pipeline.
- Example:
- Execution:
- Call
pipe()to create a pipe. fork()for each command.- Use
dup2()to redirect stdout of left command to write-end of pipe. - Use
dup2()to redirect stdin of right command to read-end of pipe. - Close unused file descriptors!
- Call
- Simultaneous Exection: All commands run at the same time.
- Waiting: The shell must wait for all child processes in the pipeline to finish.
- Edge Case:
forkorpipefailure -> callperrorand exit with 1. - Built-in Commands: If a built-in command (e.g.,
cd,exit) is part of a pipeline, it must be executed in the child process. This means it will not affect the parent shell (e.g.,cdin a pipeline will not change the shell's cwd).
For more information and examples, see Supporting Pipelines.
**Sample Command**: `ls | wc -l`
* **Behavior**: The shell creates two child processes connected by a pipe. `ls` writes its output to the pipe's write-end. `wc -l` reads its input from the pipe's read-end. The result is the count of files in the directory.
You must handle the following specific error conditions:
- Command Not Found:
- If
execvfails, printCommand not foundto stderr andexit(1)(in the child process).
- If
- Batch Mode Errors:
- File Not Found: If
fopenfails, callperror("fopen")and exit with 1. - Invalid Usage: If >1 argument provided, print
Usage: ./wsh [file]to stderr and exit with 1.
- File Not Found: If
- System Call Failures:
- Check return values of
malloc,fork,execv,pipe,dup2. - On failure, use
perror("function_name")(e.g.,perror("fork")).
- Check return values of
- Empty Lines/Comments:
- The parser handles these, checking for empty strings or lines starting with
#. You should just ensure your loop handlesparse_inputreturning NULL gracefullly.
- The parser handles these, checking for empty strings or lines starting with
- Arbitrary Line Length:
- You should support lines of any length. Do not use a fixed-size buffer. Either use
getlineor dynamically allocate memory to store the input line.
- You should support lines of any length. Do not use a fixed-size buffer. Either use
You are expected to write test scripts to verify that your implementation works correctly.
We have provided a testing script tests/run-tests.sh.
To run the provided tests:
- Navigate to the
tests/directory. - Run
./run-tests.sh.
It is recommended that you test your implementation incrementally as you go. Look out for memory leaks using valgrind.
- Step 1: Get
wshto compile and run the main loop (prompt/exit). - Step 2: Implement basic program execution (argument parsing is done for you!).
- Step 3: Implement
cdandexit. - Step 4: Implement
envandget_variablehook. - Step 5: Implement
PATHresolution. - Step 6: Implement Pipelines (
|). - Test incrementally!
- Memory Management: This project will be checked for memory leaks using
valgrind. Make sure to free all allocated memory. Usevalgrind --leak-check=full ./wshto check for memory leaks. - Error Handling: Print errors to
stderr. Useperrorfor system call failures. - Parser: Understand the parser before implementing your solution. It does most of the string processing work for you.
- Development Environment: Use the provided Docker container to ensure your code runs in the expected environment.
For detailed information on the system calls and concepts used in this project, refer to the following chapters from OSTEP:
- Chapter 5: Process API - Essential reading for understanding
fork()andexec()family of system calls.- Covers how processes are created with
fork() - Explains the
exec()family (execv(),execvp(), etc.) for running programs. For this project you should useexecv(), notexecvp(). You are expected to implement the path resolution yourself. - Includes the
wait()andwaitpid()system calls for process synchronization - Code examples demonstrating these concepts
- Covers how processes are created with
- Man Pages (Essential reading for pipes):
man 2 pipe- Thepipe()system call for creating a unidirectional data channelman 7 pipe- Overview of pipes and FIFOs, including I/O semantics and capacityman 2 dup2- Duplicating file descriptors to redirect stdin/stdout
Don't forget to use the manual pages for detailed API documentation:
man 2 fork- Process creationman 3 exec- Execute a program (see alsoexecv,execvp)man 2 pipe- Create a pipeman 7 pipe- Overview of pipes and FIFOs, including I/O semantics and capacityman 2 dup2- Duplicate file descriptorsman 2 wait- Wait for process to change stateman 3 getenv- Get environment variableman 3 setenv- Set environment variable