Introduction to Software Development Tooling

Lecture Notes: Command Line

Lecture 1

Module overview

In the first of our four modules, you’ll learn to effectively use the Linux command line. You’ve all likely encountered a command line at some point, be it in the “Command Prompt” app on Windows, the “Terminal” app on macOS or Linux, or elsewhere. You may have run simple commands there, perhaps to walk the filesystem with ls, dir or cd; to compile programs with gcc, clang, or cl.exe; or to run scripts with python, or ruby, or perl.

But the command line is far more powerful than these simple commands might lead you to believe. There’s a rich suite of standard tools, accessible via the command line, for working with files and interacting with your operating system. In this module, we’ll teach you about these tools and about the operating system features and abstractions that they expose.

We’ll also teach you how to use the shell, which is the program that interprets the commands you type. The shell makes it easy to interact with files and programs on your computer by letting you find and rerun past commands, chain together commands to perform complex operations, and even write scripts to automate running long sequences of commands.

Learning this material will let you make more effective use of the tools you’ll learn about in this course as well as command-line tools you’ve already encountered like compilers. Command-line tools are ubiquitous in all areas of software engineering, and you’ll frequently interact with tools that can’t be used any other way1. Even when a tool has a graphical interface (or a third-party wrapper that adds one), it’s rare for that interface to expose the full set of functionality that’s available from the command line.

Bash

Every major OS has a shell (and some have multiple!), and they all provide the general functionality mentioned above. However, the specific syntax and commands available vary quite significantly. As such, we needed to pick a specific shell to teach for this module. The shell we chose is called Bash, and it’s the default shell used by the vast majority of Linux distributions as well as by WSL, Windows’ Linux compatibility layer.

There are a number of things that make Bash a good shell to learn. First and foremost, it’s what’s called a POSIX shell. POSIX is the IEEE standard that defines UNIX-like operating systems, and both Linux2 and macOS follow it. Part of POSIX defines what syntax and commands a shell needs to support. Most of the shell features we cover in this course are part of POSIX, meaning they won’t apply just to Bash but to any POSIX shell you encounter—for example, zsh, which is macOS’s default shell and a popular alternative to Bash on Linux.

Bash does have some nice quality-of-life features that go beyond what POSIX mandates, though. For example, POSIX doesn’t say anything about interactive shell usage (the process of entering commands at a prompt), so things like command history and search aren’t something you’ll find in more barebones POSIX shells. We’ll also cover some Bash-specific command syntax that you’ll likely encounter when reading shell scripts, as the vast majority of scripts you’ll find in the wild are written for Bash.

GNU + Linux

As you’ll see shortly, most of the command line’s power comes from programs you run. Those programs are not part of the shell but are instead provided by your operating system. All POSIX operating systems (which are the only kind Bash will run on) provide a standard set of tools with names and functionality specified by POSIX. However, just like POSIX shells, there are many different implementations of these tools and most have extra features beyond what POSIX mandates. And that’s not to mention the numerous non-POSIX tools that come with any given operating system.

In this course, we’ll constrain our studies to the tools you’ll find on desktop and server Linux distributions3. These distributions rely on GNU Coreutils to provide their POSIX tools. Many other programs and libraries on these distributions also come from GNU, which is why some people refer to them as GNU+Linux4 instead of simply Linux.

We’ve chosen GNU+Linux because it’s freely available to anyone, will run on almost any hardware (unlike, say, macOS), and is what many real-world servers run. Much of what you will learn is transferable to other POSIX operating systems like macOS, but some of it (for example, the exact directory hierarchy) is not.

Anatomy of a shell prompt

The first thing you see when you open a command line is what’s known as a prompt. On the instructor’s computer, it looks like this:

max@cedar:~$

The prompt is printed by the shell (which is Bash for all examples in this course) and serves both to inform you that the shell is ready to accept a new command and to orient you with basic information about the state of the command line.

In the example above, the prompt consists of three dynamic parts: the first, max, is the username of the current user. The second part is the name (or hostname) of the computer you’re using. In this case, that’s one of Max’s computers, which are all named after tree species. The third part, ~, is the current working directory of the shell.

Each of these pieces of information serves a purpose: the hostname and username together tell you where you’re executing the command. Accidentally running a command on the wrong computer (for example, if you forget you’ve run ssh) or as the wrong user (if you forget you’ve run su or sudo) can be catastrophic—imagine accidentally rebooting a server that dozens of people are using instead of your local workstation—and so nearly every shell prompt you see will include this information prominently.

The punctuation that separates this information is considerably less important and has no standardized meaning. The @ (“at”) between the username and hostname is conventional—the same user@host form you may recognize from email addresses and ssh—as is ending the prompt with $ (“dollar”). But the Bash prompt is customizable, and other systems you encounter will use different punctuation and include slightly different information5. To illustrate, let’s look at a different prompt. Here’s the Bash prompt from the computer I’m writing these notes on, which runs Arch Linux:

[thebb@stingray ~]$ 

You can see that the username and hostname are still there, still separated by an at sign but now enclosed in square brackets. After them, set off by a space instead of a colon, is the same tilde (~) character from the first prompt. The tilde indicates my working directory, which is the place in the filesystem I’m currently “at”. You’ll learn a more rigorous definition for this later, but for now you can think of it as the place where commands look for files by default. (For example, running ls with no arguments shows you the files in your working directory.) The tilde character is shorthand for my home directory, which is a place for me (as the user “thebb”) to put files without worrying about interference from other users on the system.

Most prompts you’ll encounter will show your working directory, as it’s another piece of information that can affect how commands behave. For systems like Tufts’, whose default prompt doesn’t, you can run the pwd (short for “print working directory”) command to see where you are. Most Bash prompts also end with $; by convention, a # in its place instead signals a root shell.

Entering commands

But the prompt isn’t just a pretty thing to look at: it’s also a rich interface for composing and editing commands. Have you ever wondered why you can use the left and right arrow keys to move the cursor around while typing a command but can’t do the same inside a program you’ve written? It’s because interactive shells like Bash know how to6 handle the special character sequences that your terminal sends when you press an arrow key and respond to them appropriately.

The features don’t stop there! We’ll talk about lots more features of the Bash prompt in a future lecture, but there are two essential ones that you should know about now, as they’ll save you oodles of typing. These two features are history navigation and tab completion.

History navigation refers to the ability to populate the prompt with a previously-run command by pressing the up arrow. Each time you press it, you’ll go back by one command; once you’ve found the one you want (and maybe edited it slightly), just press Enter to run it! If you go too far, the down arrow will bring you back, and you can press Ctrl-c to go straight to a fresh prompt no matter how far back you are.

Tab completion is how you ask the shell to figure out what you want to type before you’re done typing it. For example, let’s say you want to list some programs installed on your computer in the /bin directory. Type ls /b (without pressing Enter) and then press the Tab key to trigger tab completion. If you did it right, nothing will happen immediately. This is because there are probably two directories that begin with b in / and Bash doesn’t know which one you want, which it signals by ignoring your first press of Tab. However, if you press Tab again, Bash will show you a list of every option it knows about7:

bin/     boot/

Now, add i to your command so you have ls /bi and press Tab again. This time, Bash can see there’s only one choice that matches and so it will immediately fill in the n/, no confirmation required. Tab completion also works with command names (although since POSIX commands are almost universally four letters or less, it’s not usually as useful there) and some other command-specific things as well. If ever in doubt, just try it!

Anatomy of a command

Now that you’re a pro at typing in commands, let’s talk about the things you can type! Although the shell provides the interface you use to enter and edit commands, it’s not what implements the commands themselves (with a few exceptions). As its name implies, the shell is a thin layer through which you can access the functionality of your operating system and other programs on your computer. As such, most commands you’ll run instruct the shell to execute some other program. Here are three examples of such commands, all of which run the program /usr/bin/ls.

$ ls
file1    file2    file3
$ /usr/bin/ls
file1    file2    file3
$ ls /
bin  boot  dev  etc  home  lib  lib64  lost+found  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
$ 

Each line starting with $ in this example begins with the shell’s prompt; the text after the $ is a command we typed. The other lines, however, were printed by the various invocations of ls. After running a program, the shell gets out of the way until that program completes, leaving the program free to print output and read input without interference. Once the program exits, the shell prints a new prompt and is ready for another command8.

A command is made up of multiple parts. The first word of each of these three commands (i.e. ls or /usr/bin/ls) is known as the command name and tells the shell what program to run. All following words (like / for the third command) are known as arguments and get sent to the program for interpretation. In programming terms, you can think of each command like a function invocation, where the first word is the function’s name and the rest make up the argument list.

So how does the shell know what program to run? For the /usr/bin/ls command, this might seem obvious (and certainly will once you’ve read the next section): the shell takes the command name and executes the program at that file path. But where does it look to find just ls? Since there’s no file called ls in the working directory, the shell must find the ls program somewhere else.

As it turns out, command names without slashes in them get special treatment: instead of parsing them like a file path, the shell looks in a set of predefined directories (typically including /usr/bin/) for files with matching names and runs the first one it finds. Because system tools like ls are in these directories, you don’t need to remember or type their full file paths9. If you ever want to know what program a command runs, you can use the which command with the command name as an argument:

$ which ls
/usr/bin/ls
$ 

Name parsing is the same for every command, but argument parsing is anything but. Because a command’s arguments are interpreted by that command and not by the shell, every command you use will accept different arguments and assign those arguments different meanings. Some of these meanings are easy to guess—for example, ls can take as an argument a file path to list; if you don’t give it one, it lists your working directory. But many aren’t, and in those cases you’ll need to find them out some other way.

This is where man pages come in. Short for “manual page”, a man page holds documentation for a command that’s accessible directly from the command line—no Google needed! To access a man page, run the man command and give it the command you want to learn about as an argument (for example, man ls). This will open up a full-screen view of the man page, which you can navigate with the arrow or PgUp/PgDn keys and leave by pressing q. Most man pages follow a common structure, but we’ll leave the details of that structure for later, after you’ve learned a bit more about some common conventions for arguments.

The Linux filesystem

Many of the examples so far have revolved around files and directories: the shell prompt shows your working directory, ls lists files in a directory, the shell runs programs stored in files, and so on. You’re most likely already familiar with the basics of files and directories: files hold data and can be arbitrarily nested within directories, which hold files and other directories. But there are some POSIX- and Linux-specific details about how files work that you might not know about.

On POSIX operating systems, all files live somewhere within a special directory called the root directory. The root directory is referred to by a single slash. (This is in contrast to an operating system like Windows, where multiple directory trees exist with names like C:\ and D:\.) When you run pwd, you see where your working directory lives under the root directory:

$ pwd
/h/utln01
$ 

The path /h/utln01/ refers to a directory called utln01, inside a directory called h, inside the root directory /. This style of path—relative to the root—is called an absolute path and always starts with a slash. The other type of path you’ll encounter is called a relative path and never starts with a slash. Relative paths are interpreted relative to your working directory and so can mean different things at different times:

~ $ cd /
/ $ ls /comp/50ISDT/examples/file-zoo/
directory1  file1  file1-link  file2  file3  missing-link
/ $ ls comp/50ISDT/examples/file-zoo/
directory1  file1  file1-link  file2  file3  missing-link
/ $ cd comp
/comp $ ls /comp/50ISDT/examples/file-zoo/
directory1  file1  file1-link  file2  file3  missing-link
/comp $ ls comp/50ISDT/examples/file-zoo/
ls: cannot access  comp/50ISDT/examples/file-zoo: No such file or directory
/comp $ ls 50ISDT/examples/file-zoo/
directory1  file1  file1-link  file2  file3  missing-link
/comp $ 

In this example, we’ve added the working directory to the prompt to make each command’s working directory clearer. We start in our home directory ~ and immediately move to the root with cd /. While in the root, an absolute path refers to the exact same place as a relative path with the same components. But as soon as we switch into a subdirectory (comp in this case), that’s no longer the case: ls comp/50ISDT/examples/file-zoo/ is now equivalent to ls /comp/comp/50ISDT/examples/file-zoo/, which refers to a directory that doesn’t exist.

In addition to /, there are some other special directory names you should know about. Every directory has a hidden subdirectory named .., which refers to its parent directory, as well as one named ., which refers to itself11. These are most often useful in relative paths, for example ../../somedir/, but are also perfectly legal to use in absolute paths: /h/utln01/ is the same as /h/../h/utln01/.

Finally, we mentioned the home directory shorthand ~ earlier in the lecture. Although this behaves similarly to the special directories mentioned above (~/file1, for example, means file1 inside your home directory), it’s actually implemented very differently. While / and .. and . are implemented by the Linux kernel and work as part of any path, ~ is implemented by the shell and so only works when you’re running a shell command. It won’t work, for example, from an fopen() call in C. To see this for yourself, you can put some of these special directory names inside single quotes when passing them to shell commands. As you’ll see next lecture, quotes (both single and double) suppress some of the shell’s processing, including ~ expansion:

$ ls '/comp/50ISDT/examples/file-zoo'
directory1  file1  file1-link  file2  file3  missing-link
$ ls '/comp/50ISDT/examples/file-zoo/directory1/..'
directory1  file1  file1-link  file2  file3  missing-link
$ ls '~'
ls: cannot access ~: No such file or directory
$ 

Lecture 2

Quoting

We mentioned at the end of last lecture that the shell’s expansion of the special home directory shorthand ~ can be suppressed by putting the ~ inside single quotes. This isn’t the only piece of processing that quotes prevent: many shell features are triggered by command names or arguments that include special characters. Often you’ll want those characters to be taken literally instead, though—especially if you don’t even know that the feature in question exists! This is where quotes come in.

Any piece of a Bash command that’s enclosed in single quotes will be preserved exactly as typed when the command gets processed, except that Bash will remove the quotes themselves. Notably, the quotes don’t have to be around an entire command name or argument: you can type a single word that consists of both quoted and unquoted bits. A useful command for seeing how the shell processes a word is echo. echo prints out all the argument values it receives separated by spaces, so it lets you clearly see which arguments the shell has preprocessed or expanded:

$ echo ~
/h/utln01
$ echo '~'
~
$ echo ~/foobar
/h/utln01/foobar
$ echo '~/foobar'
~/foobar
$ echo '~'/foobar
~/foobar
$ echo '~/fo'obar
~/foobar
$ 

Quotes can be used anywhere in a command, even in its name:

$ echo 'Hello, world!'
Hello, world!
$ 'ec'ho 'Hello, world!'
Hello, world!
$ 

And ~ is far from the only character that would otherwise have special meaning. POSIX specifies the full set of characters that the shell cares about to be |, &, ;, <, >, (, ), $, `, \, ", ', *, ?, [, #, ~, =, %, space, tab, and newline. Bash also treats ! specially12. We’ll talk about what many of these characters mean to the shell next lecture, but for now just be aware that if you include them in a command unquoted, that command may not behave as expected.

One notable thing about this list of characters is that it includes the single and double quote characters themselves. And that makes sense, since we’ve just seen that the shell treats quoted strings specially. But what if you want to pass a literal single or double quote to a command? Since single-quoted strings undergo no processing at all, there’s no way to include a single quote inside one of them; the shell will interpret any it finds to be the ending quote.

Luckily, there are ways to get around this. The shell has two other common ways to prevent processing of one or more characters. The first is to precede any character with a backslash (\). The backslash causes the character immediately following it to be treated as if you’d enclosed it in single quotes. This is known as escaping that character. Since the backslash needs no closing delimiter, it can escape ' (\') and even itself (\\)!

$ echo \'
'
$ echo can\'t
can't
$ echo \\
\
$ echo \a\b\c\d
abcd
$ echo \~
~
$ 

The second way is to enclose the characters in double quotes. Double quotes behave similarly to single quotes, except that certain special processing is still allowed. One such piece of processing is the backslash escape, meaning you can use a backslash to include a literal double quote inside a double-quoted string. (We’ll go over the other pieces when we discuss variable and command substitution in a later lecture.)

$ echo "can't"
can't
$ echo "The last command printed \"can't\""
The last command printed "can't"
$ 

One of the most common uses of quotes is to include spaces, tabs, or newlines in an argument value. The shell typically treats whitespace as a separator between arguments, and quotes/backslashes suppress this behavior. In the echo examples so far, we haven’t really cared whether we’re passing a single argument containing multiple words or multiple arguments each containing one, since echo joins all its arguments with spaces before printing them anyway. But the distinction is much more important when we’re trying to specify something like a filename.

To illustrate, let’s look at an example directory on the course server with a file named hello (containing “1”), one named world (containing “2”), and one named hello world (containing “3”). When viewing these files using the cat command (which prints the contents of one or more files), quotes make a big difference:

$ cd /comp/50ISDT/examples/file-zoo/directory1/
$ ls
hello  hello world  world
$ cat hello world
1
2
$ cat "hello world"
3
$ 

Argument parsing and flags

Besides the shell’s preprocessing we’ve just discussed, there’s no one set of rules about how arguments work: every program decides how to process its own arguments. However, there are some common conventions for arguments that the vast majority of commands follow. Knowing these conventions, which we discuss here, will help you interpret help text and man pages for many tools.

As an aside: there’s a family of C library functions called getopt (man 3 getopt) that many software packages use to parse arguments, which has solidified these conventions somewhat. Similar libraries, inspired by getopt, exist in other languages—for example, Python’s argparse and Rust’s clap.

Part of these conventions is the notion of flags. Flags are optional arguments that, when provided, alter the behavior of a piece of software. Some flags operate alone and indicate that a program should (or shouldn’t) behave in a certain way. For example, ls’s -l flag indicates that ls should use its long listing format. Other flags require an argument (to the flag!): in gcc -o filename, for example, filename is the argument to -o and tells gcc where to put the output file.

Because command line users usually value brevity, flags are often written as just a single (sometimes cryptic) letter or number. Flags like these are by convention prefixed with a single dash (-) and can often be coalesced together behind that dash if you want to specify multiple. For example, a pair of flags -l -v could also be written as -lv. Note that short-form flags that take arguments cannot be followed by other short-form flags in this way: consider, for example, that gcc -ohello means -o hello and not -o -h -e -l -l -o.

Many programs augment these short-form, single-letter flags with corresponding long-form versions that consist of whole words and are more descriptive. Long-form flags are typically prefixed with -- and cannot be coalesced.

To illustrate, let’s look at a sample program, wc. wc, which stands for “word count”, counts the number of words in its input file. Running wc myfile prints the number of words in myfile. But the authors also taught wc to count other things, and they exposed that functionality using flags. wc -c, for example, will count bytes, while wc -l will count lines. wc -w is another way to ask for the default behavior of counting words. The authors also added long-form variants—--bytes, --lines, and --words, respectively. In general, short-form flags are handy at the command line, but long-form ones are better for shell scripts (which we’ll talk about later) and documentation because they better convey meaning.
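
Here’s how those flags look in practice (an illustrative sketch; notes.txt is a hypothetical file):

$ wc notes.txt
  4  23 142 notes.txt
$ wc -l notes.txt
4 notes.txt
$ wc --lines notes.txt
4 notes.txt
$ 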

Although each program parses its own flags, many different programs recognize --version/-v and --help/-h. Because the command-line ecosystem is written by thousands of people, and everyone writes software differently, not all programs adhere to this convention. Some use -v to mean --verbose, or use -version (one hyphen!) as the long form, or something else entirely. As usual, your best bet is to look at the manual pages.

Let’s learn about some common commands to both apply our new knowledge of argument parsing and get acquainted with useful CLI tools.

Common commands

You’ll find yourself using some tools more frequently than others. Here is a shortlist of tools you will likely use often, alongside their descriptions, and some common invocations. Refer to their man pages for more information.

Many of these commands can read input from a file whose name is given as an argument, and that’s what we’ll focus on in this section. However, those same commands can also read input directly from what you type into the terminal (also known as stdin). If you ever forget to give a command a filename and it appears to hang forever, it’s probably waiting for you to type something. You can get back to the prompt by pressing Ctrl-c. We’ll talk about more powerful ways to make use of this mode later in this module.

ls

Although many of our examples have already used ls, they’ve thus far shown only its default behavior, which is to print the name of each file in the directory it’s passed (or the working directory if run with no arguments). ls has quite a few flags which alter this behavior, though.

The most common flag you’re likely to see is -l (which stands for “long listing” and has no long-form version). This flag causes ls to list not only the name of each file but also its type, link count, permissions, owner, group, size, and last modification date. Files are printed one per line in this mode to make room for the extra information:

$ ls -l
drwxrwsr-x. 2 thebb01 ta50isdt 4096 Sep  8 00:41 directory1
-rw-rw-r--. 1 thebb01 ta50isdt   11 Sep  8 00:16 file1
lrwxrwxrwx. 1 thebb01 ta50isdt    5 Sep  8 00:10 file1-link -> file1
-rw-rw-r--. 1 thebb01 ta50isdt   13 Sep  8 00:17 file2
-rw-rw----. 1 thebb01 ta50isdt   20 Sep  8 00:24 file3
lrwxrwxrwx. 1 thebb01 ta50isdt    6 Sep  8 00:11 missing-link -> foobar
$ 

Some of these fields are of little use and so we’ll skip discussing them, and others are complex enough that we’ll discuss them separately later on. The main things to notice for the moment are the following:

- The first character of each line indicates the file’s type: - for a regular file, d for a directory, and l for a symbolic link.
- Symbolic links also show the name of the file they point to after an arrow (->).
- The fields just before each name show the file’s size in bytes and the date it was last modified.

Another common flag of ls is --all/-a, which causes it to show files whose names begin with a dot (.). Although such files have no special meaning to Linux, ls hides them by default. This behavior started out as a bug but became a feature when people realized it could be used to hide things like configuration files that you don’t normally care about.

mv

Rename a file, or move it elsewhere in the filesystem with mv source destination. Overwrites the destination by default; be careful! If source and destination both exist and are different types, mv will complain. If the destination exists and is a directory, mv will instead put the source inside the destination directory.
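
For example (an illustrative sketch; the file and directory names are hypothetical):

$ mv draft.txt final.txt   # rename draft.txt to final.txt
$ mv final.txt backups/    # move final.txt into the existing directory backups/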

cp

Copy a file or copy multiple files into a directory. Overwrites the destination by default; be careful! cp will not copy directories without --recursive/-r.
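
For example (again with hypothetical names):

$ cp final.txt final-backup.txt   # copy one file to another
$ cp -r backups/ backups-old/     # copy a whole directory tree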

mkdir

Make a directory with the name specified, like mkdir foo. If you want to make nested directories in one command or avoid an error if the directory already exists, use --parents/-p.
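
To see why --parents is handy, consider this sketch:

$ mkdir foo/bar/baz
mkdir: cannot create directory 'foo/bar/baz': No such file or directory
$ mkdir -p foo/bar/baz   # succeeds, creating foo, foo/bar, and foo/bar/baz
$ 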

rm

Remove one or more files. Be careful! This deletion is irreversible. Specify the filenames to be deleted in a list: rm file1 file2 file3 and so on. To remove directory contents recursively, use --recursive/-r. To avoid error messages when files and directories don’t exist, use --force/-f.

cat

Join files together and print to stdout. This is useful when sticking two files end-to-end (e.g. cat fileA fileB will print first fileA then fileB to stdout) or just showing the contents of one file.

grep

While programming it is often useful to find where a word, phrase, or regular expression occurs in a file or folder. grep can do all of that.

grep -r "functionName" projectFolder/ looks for the string “functionName” recursively (the -r) in all of the files in projectFolder/. It will print the matching lines in the format filename:line.

If you want to see what line each match was found on, you can add the -n flag. Combined with -r, the output format becomes filename:linenumber:line; for a single file, it’s just linenumber:line. If you want to see some context around a matching line, you can also pass -A NUM (shows NUM lines of trailing context), -B NUM (shows NUM lines of leading context), or -C NUM (shows NUM lines before and after).

You may want to eventually search for patterns of text instead of just small strings. Imagine you want to find all calls to the function “myfunction”. You could search grep "myfunction(.*)", which would look for a call to “myfunction” with any number of characters between parentheses. This is called a regular expression search.

Sometimes you might want to find all the lines that do not contain a pattern, because the pattern is very frequent. In this case you can do grep -v "pattern" file, where -v stands for “invert”.
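
Here’s how those flags look in practice (an illustrative sketch; greetings.txt is a hypothetical file):

$ grep -n "hello" greetings.txt
1:hello world
4:well hello there
$ grep -v "hello" greetings.txt
goodbye world
see you later
$ 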

NOTE: By default, grep uses “basic” regular expressions, which are more limited than what people usually mean when they say “regular expression”; the -E flag enables the more familiar “extended” syntax. Read more about this on the GNU manual and this nicely written DigitalOcean tutorial.

find

Searching files by their contents is all well and good but it’s also useful to search for files by their attributes. To find a file by name, you can run find myfolder -name filename. The filename can also be a pattern with find’s limited pattern support. For example, you can find files whose names end in “ed” by running find -name "*ed". find supports many other predicates—you should read the man page to get some ideas.

find also supports a limited number of operations on the files it finds, such as -delete. In the event that you want to delete the files matching your search, you can add -delete to your find command. For more complicated actions, the xargs program can help you run a command for every file found.
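
For example, this sketch deletes every file ending in .tmp under the current directory (a hypothetical cleanup):

$ find . -name '*.tmp' -delete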

You may have noticed that find does not follow the expected short/long flag convention: multi-letter options like -name and -delete get only a single hyphen. The simplest, and somewhat dissatisfying, answer is that the authors of find hand-wrote their own argument parser instead of using the more standard getopt library. The course staff is not sure what sequence of events led to them writing their own parser.

sed

To replace text and text patterns in files and streams, use sed. For example, to replace the word “hello” with “goodbye” in a file original.txt, use sed 's/hello/goodbye/' original.txt. This will print the result to stdout. To instead modify the file in place, use the -i flag. Note: doing sed COMMAND original.txt >original.txt will not work because the > causes the shell to overwrite your file before sed even runs. We will talk more about this when we get to our section on pipelines.
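
One detail worth knowing: the s command replaces only the first match on each line. To replace every match, add the g (global) flag. A sketch (original.txt’s contents here are hypothetical):

$ cat original.txt
hello hello everyone
$ sed 's/hello/goodbye/' original.txt
goodbye hello everyone
$ sed 's/hello/goodbye/g' original.txt
goodbye goodbye everyone
$ 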

Although the above usage is probably the majority of use, sed supports some regular expressions and other commands (other than s). Take a look at the COMMAND SYNOPSIS section of the sed manual pages for more information.

cut

If the input data is separated logically into columns, it’s possible to use cut to only print the selected columns. The data need not be space separated; it’s possible to specify a delimiter.

For example, to print column 2 of a file with comma separated columns, use cut -f2 -d',' myfile. It’s also possible to specify ranges of columns. Read the man pages for more information.
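
Here’s that invocation in action (an illustrative sketch; grades.csv is a hypothetical file):

$ cat grades.csv
alice,A,93
bob,B,85
$ cut -f2 -d',' grades.csv
A
B
$ 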

sort

To sort a file or stream’s lines, use sort. The default behavior is to sort lexicographically—in alphabetical order—so it will not sort numbers as you expect. For that, you want sort --numeric-sort, or sort -n. It also can reverse the sorting order with --reverse/-r.
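
To see the difference (an illustrative sketch; nums.txt is a hypothetical file):

$ cat nums.txt
10
2
9
$ sort nums.txt
10
2
9
$ sort -n nums.txt
2
9
10
$ 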

Depending on your system’s implementation, there may be some other fun options, such as a stable sort, a merge of two already sorted lists (to be used in merge sort), or even sorting in parallel.

head

To keep only the first part of a file or stream, use head. It is useful for examining only the first line of a file, or the first ten lines, or any number of lines (--lines=NUM, -nNUM), really.

tail

The opposite of head! To keep only the last part of a file or stream, use tail. It also takes --lines/-n, but has additional features, too: if the file is growing, or there is more information coming in the stream, you can use --follow/-f to make tail continually print output.

Software engineers often use tail -f to observe a continually growing log file, maybe of a server, or a build process.

man

To help you make better use of your tools, package maintainers write manual pages for their software. To read a manual page for a particular piece of software, use man PROGRAM, like man ls. Some software is available in two forms: for example, printf is both a program in GNU coreutils and a C standard library function. Since the manual pages are separated into sections, you can refer to them separately: man 1 printf for the coreutils command and man 3 printf for the stdlib function. To read more about sections, check out man man.

Manual pages are available in a centralized place (like /usr/share/man) for package managers and install scripts to write to and you to read from.

ln

Create a symbolic link to a file, when used with the --symbolic/-s flag. The syntax is the same as cp’s—ln source destination—but instead of copying a file, it creates a special kind of file at the destination that forwards all accesses to the source. Symbolic links can be created to both files and directories, and you can generally treat the link just as you would the original file when using it in commands. ln with no flags creates hard links, which are a different and lesser-used type of link that we won’t discuss in this course.
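
For example, the file1-link you saw in earlier file-zoo listings could have been made like this (a sketch):

$ ln -s file1 file1-link
$ ls -l file1-link
lrwxrwxrwx. 1 thebb01 ta50isdt 5 Sep  8 00:10 file1-link -> file1
$ 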

diff

Sometimes you have two files and you don’t know if they are different. Or perhaps you know that they are different and you don’t know what is different about them. The diff command will print out a list of differences between two files (or directories) in a regular format:

LINE1cLINE2
< LEFT-FILE-LINE
---
> RIGHT-FILE-LINE

Here, LINE1 and LINE2 are the affected line numbers in the first and second file, and the c between them stands for “changed” (you’ll also see a for “added” and d for “deleted”). diff also sets its exit code depending on the result: 0 if the files are the same, 1 if they differ. This can be useful in shell scripts, so you can do things like diff fileA fileB || echo "different" or diff fileA fileB && echo "same". We’ll have more to say about this in our section on pipelines.

which

which helps find commands. If the command exists as a program on disk, which will print its path. To find all the matching programs, use -a.

Some shells implement which as a built-in, which lets it report on shell built-ins and aliases as well. Bash, however, uses the system which binary, which only knows about programs on disk and so can’t report on built-ins or aliases.

See also the POSIX utility type.

top and htop

top and htop are interactive commands. Instead of running in a pipeline—consuming input from stdin and printing to stdout—they are meant to be used directly by the user. top prints live statistics about running programs, and is helpful for getting an overview of the pressures on your system—memory, CPU, etc. htop is a colorful variant with some more information about individual CPU cores and graphs.

These tools read from /proc, a virtual filesystem that presents information about running processes as if it were files.

tmux

tmux is another interactive program. It stands for “terminal multiplexer”, which is a fancy way of saying that it allows you to run multiple programs in the same terminal—kind of like in The Matrix. It is very useful for systems administrators to see live updating commands like top, some kind of live log, and maybe also have an editor running, all at once.

It also allows you to detach and reattach to the session you started, so you can persist your work across logins to a server. Note that it does not survive system restarts.

The keybindings are customizable, so read the manual pages for the default bindings.

See also the screen command, which is similar.

less

less is a pager. Its job is to display its input inside a viewer that you can scroll around and search in. It is useful for output you don’t want to dump straight to your terminal, such as large files or noisy programs.

For live-updating streams, you can use less +F. Note that unlike most other programs, this option (a sub-command of less) is given with +, not -.

uniq

uniq filters repeated lines in its input. If you have adjacent lines in a file and they are the same, uniq will make sure that only one of them remains in the output.

A common failure with uniq is feeding it unsorted input: duplicate lines that aren’t adjacent won’t be removed. For example, uniq will turn:

A
B
B
C

into:

A
B
C

but leave the following unchanged:

B
A
B
C

In order to remove the duplicate line from the last example, pipe your input through sort first (or use sort -u, which sorts and removes duplicates in one step).

vi

vi is a POSIX-specified text editor that is available on almost every system you will use. Unlike text editors such as Notepad and Kate, Vi is a modal editor, meaning that at any given time it is in one of several modes: INSERT, NORMAL, etc. The default mode is NORMAL mode, which means that opening it and immediately trying to type text will not work. To enter INSERT mode, type i, and to go back to NORMAL mode, press Esc. To quit Vi, type :q in NORMAL mode and press Enter.

Most newer systems include Vim (Vi-iMproved), instead of plain Vi. Check out this getting started guide to read more. We won’t get too deep into text editing in this course.

File ownership and permissions

Nearly every command we’ve just shown you manipulates files in some way: some read files, some create, delete or move them, and a few (like sed and vi) can write to them as well. So what’s stopping you from using these commands to alter another user’s personal files in their home directory? Or to read a confidential system file13 such as /etc/shadow, which holds the hashed passwords of users on most Linux systems14? Let’s see what happens when we try!

$ cat /etc/shadow
cat: /etc/shadow: Permission denied
$ 

Unlike other files we’ve seen (e.g. /comp/50ISDT/examples/file-zoo/file1), /etc/shadow can’t be read by cat. To understand why, you need to understand the concept of file permissions. Let’s take another look at some bits of ls -l that we glossed over earlier:

$ ls -l /comp/50ISDT/examples/file-zoo/file1 /etc/shadow
-rw-rw-r--. 1 thebb01 ta50isdt   11 Sep  8 00:16 /comp/50ISDT/examples/file-zoo/file1
----------. 1 root    root     1195 Dec 20  2019 /etc/shadow
$ 

You know that the first character in each line indicates the file’s type, but what about the rest of that field (rw-rw-r--. and ---------., respectively)? These characters encode the file’s permissions, which control who is allowed to access it and in what ways. The first nine of these characters encode nine individual permission bits. When all nine bits are set, ls will show “rwxrwxrwx.”15. If one or more bits are unset, the corresponding characters are replaced with a dash, as in rw-rw-r--..

So what do these nine bits actually control? As the characters imply, the nine bits are split into three groups, each group having a read permission (r), a write permission (w), and an execute/traverse permission (x). The first of the three groups specifies what the file’s owner is allowed to do. The second specifies what members of the file’s group are allowed to do. And the third specifies what everyone else is allowed to do.

That’s a lot of information, so let’s dig into it bit by bit. First, let’s talk about owners and groups. Every file in Linux has as its owner exactly one user on the system. New files are owned by whoever runs the program that creates them (except in the case of setuid—see the footnote above). A file’s owner can’t be changed once it’s been set, not even by that owner (with one exception, described below). ls -l shows a file’s owner in the third field: the files in our example are owned by thebb01 and root, respectively. If you own a file, the first group of rwx bits tells you how you can access it.

Every file on Linux also belongs to exactly one group. Groups are named, just like users, and every user is a member of one or more groups. (You can run groups to see which groups you’re in.) ls -l shows a file’s group in the fourth field. In our example, the files belong to the ta50isdt and root groups, respectively. (The group named root is distinct from the user named root.) If you are a member of a file’s group, the second group of rwx bits tells you how you can access it.

Finally, if you are neither a file’s owner nor a member of its group, the final group of rwx bits tells you how you can access it.

Let’s next talk about the bits themselves. If the group of bits your user is subject to has an r, it means you can read the file in question (and for directories, list their contents). If it has a w, it means you can write to that file (and for directories, add, remove, and rename the contents). If it has an x and is for a regular file, it means you can execute that file as a program. If it has an x and is for a directory, it means you can access files within that directory (a.k.a. traverse the directory). In the case where you can traverse but not read a directory, you aren’t allowed to list its contents, so you must already know the name of the file you want to access.

The final character in the mode string, if present, indicates that the file is subject to extra access checks beyond the user/group/other permissions just described. A . indicates that a Linux-specific framework called SELinux, which lets the system administrator set access rules on files that even their owner can’t change, is in use. A + usually indicates the presence of an access control list (ACL), a more granular but infrequently-used way of specifying permissions. We won’t cover ACLs or SELinux in this course, but you can read man acl, man getfacl, man setfacl, and man selinux to learn about them on your own.

We can now go back to our original listing (reproduced below) and make sense of it:

$ ls -l /comp/50ISDT/examples/file-zoo/file1 /etc/shadow
-rw-rw-r--. 1 thebb01 ta50isdt   11 Sep  8 00:16 /comp/50ISDT/examples/file-zoo/file1
----------. 1 root    root     1195 Dec 20  2019 /etc/shadow
$ 

We can see that, for /comp/50ISDT/examples/file-zoo/file1, its owner (thebb01) is allowed to read and write but not execute, members of its group (ta50isdt) are allowed to do the same, and everyone else can read but not write or execute. But for /etc/shadow, no one is allowed to do anything! Not even its owner (root) can read or write to it.

This latter setup isn’t quite as perplexing as it sounds for two reasons: firstly, a file’s owner is always allowed to change that file’s permissions (see man chmod for how). So if root wanted to read or write /etc/shadow, it could first grant itself permissions, then perform the operation, then take the permissions away again.
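
That dance might look something like this sketch (the # prompt conventionally indicates a root shell; output abridged):

# chmod u+r /etc/shadow
# cat /etc/shadow
<hashed password entries>
# chmod u-r /etc/shadow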

However, it turns out that not even this is necessary, and that’s because the user root is special. Also known as the superuser, root on Linux is a user account with ultimate administrative privileges. One of the privileges unique to root16 is that any file access by root bypasses all permission checks: root can read or write any file on the system without having to change its permissions first. root is also the only user that can change the ownership of an existing file.

Because of all these extra powers, it’s incredibly easy to accidentally make a system unusable (for example, by deleting core system files) when operating as root. As such, most system administrators generally use a standard user account and use the sudo and su commands to run individual commands as root when needed. Note that the root account has no special relation to the root directory you learned about last lecture.

Lecture 3

Last lecture, we talked about argument parsing and some common tools that you’ll likely encounter during your illustrious career in computing. This lecture, we’ll talk about some shell syntax that lets you combine these and other tools together in powerful ways. We’ll also introduce some more shell features that make it easier to find, edit, and run commands.

For a fun bit of history, take a look at this video, which depicts some of the original authors of UNIX first introducing concepts we’ll cover today.

More ways to find previous commands

In our very first lecture, we showed you how you can use the up and down arrow keys to cycle through past commands. Although that’s probably the most commonly-used history navigation shortcut, the shell has other features that can make history navigation even more efficient.

The first of these features is history search. If you know a piece of a command you want to rerun but can’t quite remember when you last ran it, press Ctrl-r. You will see your prompt replaced with a new (reverse-i-search)`': prompt. Within this prompt, you can type any snippet of a command from your history, and Bash will find the most recent command matching that snippet. To find older commands matching the same snippet, you can keep pressing Ctrl-r.

Once you’ve found what you’re looking for, you can either run it immediately by pressing Enter or bring it into a normal prompt for editing by pressing Esc, Tab or a left/right arrow key. If you can’t find what you’re looking for and want to get back to a blank prompt, you can always press Ctrl-c.

The next feature is history expansion. Just as Bash expands ~ to your home directory, it also expands the special sequence !! to the last command you ran. You can use this to rerun the last command verbatim (e.g. to repeatedly compile a program) or to add something to the beginning or end of it (e.g. to rerun the previous command as root with sudo !!).

History expansion isn’t confined to the last command you ran. Remember how Bash prompts are customizable? You can add a history event number to your prompt by changing the $PS1 shell variable. (See the PROMPTING section of man bash for details.) By prefixing the history event number of a given command with a single !, you can rerun that command. Here’s an example:

vm-hw03{thebb01}1013: gcc -Wall -Werror -o hello hello.c
vm-hw03{thebb01}1014: ./hello
Hello wordl!
vm-hw03{thebb01}1015: vim hello.c  # fix the typo
vm-hw03{thebb01}1016: !1013
gcc -Wall -Werror -o hello hello.c
vm-hw03{thebb01}1017: !1014
./hello
Hello world!
vm-hw03{thebb01}1018: 

Note that Bash prints a line after each command with a history expansion, prior to the command’s output, showing what was actually run. This is for clarity and also happens with !!.

Variables in the shell

We now change focus from shell features that help you run simple commands at an interactive prompt to ones that let you express complex relations and interdependencies between commands. Although you can use these features interactively, they really shine as part of shell scripts (which we’ll cover next lecture).

Let’s start with variables. Any programming language needs a way to store data, and the shell is no exception. Every running shell holds a set of variables, which persist between commands but go away when the shell exits. Each variable has a name made up of letters, numbers, and underscores. (By tradition, variable names are all uppercase, but this isn’t enforced anywhere.) To create or change a variable, separate the name and desired value with an equals sign (=). You must not put spaces around the =:

$ FOOBAR=somevalue
$ 

To read a variable, prefix its name with a dollar sign ($) and use it in a command:

$ echo $FOOBAR
somevalue
$ 

The shell expands $VARNAME to the contents of VARNAME, just like it expands ~ to your home directory. Variable names can be used inside double quotes but not inside single quotes, and the $ can be escaped with a backslash just like any special character:

$ echo $FOOBAR
somevalue
$ echo "$FOOBAR"
somevalue
$ echo '$FOOBAR'
$FOOBAR
$ echo \$FOOBAR
$FOOBAR
$ 

Shell variables hold strings. The value you assign to a variable is substituted textually when you use that variable in a command. As such, you can put variable expansions nearly anywhere:

$ COMMAND=ls
$ DIRECTORY=/comp/50ISDT/examples/file-zoo/
$ "$COMMAND" "$DIRECTORY"
directory1  file1  file1-link  file2  file3  missing-link
$ 

Because of this direct textual expansion, variable names should almost always be used inside double quotes. Consider what would happen if the variable contained a string with spaces! Well, we’re right back to our old “hello”/”hello world”/”world” example from before:

$ cd /comp/50ISDT/examples/file-zoo/directory1/
$ FILENAME="hello world"
$ cat $FILENAME
1
2
$ cat "$FILENAME"
3
$

Sometimes, it can be ambiguous where a variable name ends and subsequent text begins. In those situations, you can make it clear by enclosing the name in curly braces:

$ "$COMMAND" -l "${DIRECTORY}file3"
-rw-rw----. 1 thebb01 ta50isdt 20 Sep  8 00:24 /comp/50ISDT/examples/file-zoo/file3
$ 

If you’re done with a variable and want to get rid of it, you can use unset. Note that it’s not an error to access a variable that doesn’t exist, although next lecture we’ll show you how to change that to make debugging scripts easier:

$ unset FOOBAR
$ echo "$FOOBAR"

$ 

Environment variables

The variables we created in the preceding example exist only within the shell. But Linux itself also has a concept of variables. These variables, called environment variables, provide another way of passing data to programs. Like arguments, environment variables can be read by programs and used to make decisions. Unlike arguments, environment variables are passed implicitly: new programs automatically inherit the environment of their parent unless the parent explicitly decides otherwise. This makes the environment a good place to hold system or user configurations that many programs care about.

To see the environment variables in your shell session (which will be inherited by any command you run), run env:

$ env
HOSTNAME=vm-hw01
SHELL=/bin/bash
USER=thebb01
PATH=/h/thebb01/local/bin:/comp/105/bin:/comp/105/submit/bin:/usr/lib64/qt-3.3/bin:/usr/condabin:/usr/sup/bin:/usr/bin:/usr/sup/sbin:/usr/sbin:/h/thebb01/bin:/usr/cots/bin:/bin:/opt/puppetlabs/bin
PWD=/h/thebb01
EDITOR=vim
LANG=en_US.UTF-8
HOME=/h/thebb01
$ 

Many systems have a much bigger environment than this, but I’ve omitted most of the variables so we can focus on these. Some of these variables hold system information: HOSTNAME is the computer’s name and LANG is the language that programs should prefer. Others hold information about my user: USER and HOME are my username and home directory; SHELL is the shell that I use by default (but not necessarily the currently-running one); EDITOR is my preferred text editor. PWD is my working directory and holds the same value that pwd prints.

Notably, PATH is how the shell knows where to look for commands. As we mentioned in the first lecture, commands without a / in their name execute a program with a matching name from one of several system directories; PATH holds a :-separated list of those directories. In the middle of my PATH, you can see /usr/bin/, which is where ls and most of the other commands we’ve used so far live.
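
Because PATH is an ordinary environment variable, you can extend the search path yourself. A common idiom (a sketch, using the export command described below) prepends a personal bin directory:

$ export PATH="$HOME/bin:$PATH"

Any program you put in ~/bin can then be run by name, just like ls.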

It’s no coincidence that env formats its output to look like shell variable assignments. Every environment variable is accessible as a shell variable, and you can read and modify them as such:

$ echo "I am $USER, my home is at $HOME, and this place is called $HOSTNAME"
I am thebb01, my home is at /h/thebb01, and this place is called vm-hw01
$ 

However, the reverse is not true: a shell variable is not part of the environment automatically, but you can add it using export (and remove it using export -n):

$ FOOBAR=somevalue
$ env | grep FOOBAR
$ export FOOBAR
$ env | grep FOOBAR
FOOBAR=somevalue
$ 

Chaining commands together

Sometimes you’ll find yourself with a problem that can’t be exactly solved by any one program. Luckily, the shell offers a number of powerful operators that let you run multiple programs with a single command, connecting the inputs and outputs of those programs in various ways. Before we get into the specifics of these operators, let’s talk about what inputs and outputs a program can have.

There are three ways the shell can send input to a program. We’ve already discussed the first two, arguments and environment variables. The third is standard in (a.k.a. stdin). stdin refers to text that a program reads while it’s running. For those who’ve written C++ programs, this is where cin gets its data from. Many of the commands we’ve already shown, like cat and grep, will default to reading from stdin if no filename is given as an argument.

There are also three ways a program can send output to the shell—standard out (a.k.a. stdout), standard error (a.k.a. stderr), and its exit code. stdout and stderr are both text streams that programs can write to while they’re running (cout and cerr in C++), and the contents of both are printed to the terminal by default (but we’ll see how that can change shortly). stdout is used for normal output of a program, such as filenames located by find or lines matched by grep. stderr is reserved for error and diagnostic messages, such as those printed when a program doesn’t have permission to access a file. In the following example, cat prints Hello, world to stdout and cat: file3: Permission denied to stderr:

$ cd /comp/50ISDT/examples/file-zoo/
$ cat file2 file3
Hello, world
cat: file3: Permission denied
$ 

The final way of producing output, the exit code, is a number that every program returns when it exits. This number is traditionally used to report whether the program succeeded, indicated by a value of zero, or failed in some way, indicated by any nonzero value. (Individual programs assign their own meanings to different failure values.) You don’t see the exit code of commands you run interactively, as the text they print is usually enough to tell whether they succeeded or failed. However, the shell always keeps track of the last command’s exit code. You can view it through the special shell variable $?:

$ cd /comp/50ISDT/examples/file-zoo/
$ cat file2
Hello, world
$ echo $?
0
$ cat file3
cat: file3: Permission denied
$ echo $?
1
$ 

Here, cat returned a success code of zero when it completed normally but a failure code of one when it couldn’t read its input file. You can try out other programs on your own to see what codes they return in different situations.

test

There is a command, test, dedicated to producing exit codes for use in conditionals. It comes bundled with a bunch of different predicates. (A predicate is a function that takes one or more inputs and produces a boolean result.) If the predicate returns true, the exit code is zero, and if it returns false, the exit code is one. This is different from C, where true is one, but matches the POSIX convention of returning zero on success.

To test if a string is the empty string ""—if it has length zero—use test -z "$STRING".

$ test -z ""
$ echo $?
0
$ test -z "hello"
$ echo $?
1
$ 

To test if a string is not the empty string—if it has nonzero length—use test -n "$STRING":

$ test -n ""
$ echo $?
1
$ test -n "hello"
$ echo $?
0
$ 

test also provides a string equality predicate. To test if two strings are equal, use test "$LEFT" = "$RIGHT". Note that this uses one equals sign instead of the two you may be used to. To check inequality, use !=:

$ test "hello" = "hello"
$ echo $?
0
$ test "hello" = "world"
$ echo $?
1
$ 

Even though variables are always strings, the text in those variables can represent other types of data. To that end, test also provides predicates for numbers and files. For numbers, use -lt for “less than”, -ge for “greater than or equal”, and so on (see help test for a full listing):

$ test 5 -lt 7
$ echo $?
0
$ test 5 -lt 5
$ echo $?
1
$ 

To test if a file or directory exists, use test -e "$FILENAME". To test if it exists and is a file, use test -f. To test if it exists and is a directory, use test -d.
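
For example, using the file-zoo directory from earlier (where file1 is a regular file and directory1 is a directory):

$ cd /comp/50ISDT/examples/file-zoo/
$ test -f file1
$ echo $?
0
$ test -d file1
$ echo $?
1
$ test -d directory1
$ echo $?
0
$ 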

Running programs sequentially

Now that we’ve seen how programs can communicate with the shell, we come to our first few shell operators for combining multiple commands into a single, larger command. The first of these operators is the semicolon (;). By separating two commands with ;, you tell the shell to run the first one followed by the second, just as if you’d put them each on their own line. We can use this operator to rewrite the last example more concisely:

$ cd /comp/50ISDT/examples/file-zoo/
$ cat file2 ; echo $?
Hello, world
0
$ cat file3 ; echo $?
cat: file3: Permission denied
1
$ 

As the third command demonstrates, each command in the chain will run regardless of whether the one preceding it succeeded or not. This is sometimes desirable, as in the case of printing an exit code. But sometimes a later command depends on an earlier one, as in the case of making a directory and then creating a file there. For situations like this, you can use the && operator, which runs the second command only if the first is successful (and returns success only if both are). Here, the failure of the second mkdir prevents touch file2 from ever running:

$ mkdir dir1 && touch dir1/file1
$ echo $?
0
$ ls dir1
file1
$ mkdir dir1 && touch dir1/file2
mkdir: cannot create directory 'dir1': File exists
$ echo $?
1
$ ls dir1
file1
$ 

As you might expect, there’s also a || operator, which runs the second command only if the first fails and returns success if either succeeds. You can chain as many commands as you want together using any of these three operators, and they’ll be run left-to-right.
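
For example, here’s a quick sketch using the standard true and false utilities, which do nothing except exit with codes zero and one, respectively:

$ false || echo "recovered from failure"
recovered from failure
$ true || echo "this never runs"
$ 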

Pipelines

The next operator we’ll discuss is one of the hallmarks of POSIX shells. It’s the foundation upon which the UNIX Philosophy—to write small programs that do one thing well—is built. This operator is known as the pipe, and it’s denoted with a vertical bar (|).

When you separate two commands with |, the shell connects stdout of the first command to stdin of the second, forming a pipeline. Pipelines can be arbitrarily long, and they let you express complex data processing operations in terms of the basic operations provided by individual programs.

To illustrate this, let’s build a pipeline to find which header file in /usr/include/ has the most lines. (/usr/include/ is the standard location for system headers like stdio.h.) We’ll start with a command to find all the header files in the directory. We learned about find last lecture, so let’s use that:

$ find /usr/include/ -type f -name '*.h'
/usr/include/gdk-pixbuf-2.0/gdk-pixbuf-xlib/gdk-pixbuf-xlib.h
/usr/include/gdk-pixbuf-2.0/gdk-pixbuf-xlib/gdk-pixbuf-xlibrgb.h
/usr/include/gdk-pixbuf-2.0/gdk-pixbuf/gdk-pixbuf-loader.h
/usr/include/gdk-pixbuf-2.0/gdk-pixbuf/gdk-pixbuf-autocleanups.h
<lots more lines>
$ 

Here, we’re looking for files in /usr/include/ whose name ends in .h. As you can see, there are lots of them. We’re using find instead of ls because find also looks in subdirectories.

The next thing we’ll do is count the number of lines in each file. We know that wc can count lines in a file, but there’s a problem: wc as we’ve used it so far wants filenames as arguments, but if we add it to our pipeline it will get the list of files on stdin instead. As it happens, however, wc has an alternate mode that does nearly what we need17. From its man page:

--files0-from=F
       read input from the files specified by NUL-terminated names in
       file F; If F is - then read names from standard input

By passing --files0-from=-, we can have wc -l read a list of files to count from standard in! “NUL-terminated” is a concept you’ll see often when dealing with pipelines containing filenames: POSIX tools typically operate on a line-by-line basis, but this becomes problematic when working with filenames, since filenames are allowed to contain newlines18. To work around this issue, many programs that read or write lists of files offer an alternate mode where each entry is separated by the unprintable character \0, which can’t occur in filenames. As luck would have it, find offers such a mode with its -print0 flag.

Adding this flag to our find invocation and piping into wc yields the following:

$ find /usr/include/ -type f -name '*.h' -print0 | wc --files0-from=- -l
92 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf-xlib/gdk-pixbuf-xlib.h
233 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf-xlib/gdk-pixbuf-xlibrgb.h
119 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf/gdk-pixbuf-loader.h
37 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf/gdk-pixbuf-autocleanups.h
<lots more lines>
83 /usr/include/netrom/netrom.h
70 /usr/include/H5Object.h
4233248 total
$ 

We’ve made some progress! Now every filename is preceded by its line count. (Note that this command may take a while to run, since wc has to read thousands of files.)

Wait a minute, though—what’s this 4233248 total line? That’s not a file ending in .h! As it turns out, wc prints a total line count at the end of its output, and there’s no flag to disable it. Such an inconvenience is no match for the power of pipelines though: we can use head -n -1 (note: -1, not 1) to discard the last line of output from wc19:

$ find /usr/include/ -type f -name '*.h' -print0 | wc --files0-from=- -l | head -n -1
92 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf-xlib/gdk-pixbuf-xlib.h
233 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf-xlib/gdk-pixbuf-xlibrgb.h
119 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf/gdk-pixbuf-loader.h
37 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf/gdk-pixbuf-autocleanups.h
<lots more lines>
83 /usr/include/netrom/netrom.h
70 /usr/include/H5Object.h
$ 

But how do we pick the biggest one out of this list? We haven’t learned about any utility to do this directly, but we have learned about sort, which can ensure that the biggest number is at one end of the list. Let’s sort the list such that the biggest number is on top:

$ find /usr/include/ -type f -name '*.h' -print0 | wc --files0-from=- -l | head -n -1 | sort -rn
4233248 total
40496 /usr/include/php/Zend/zend_vm_execute.h
20054 /usr/include/opencv2/ts/ts_gtest.h
19634 /usr/include/epoxy/gl_generated.h
19455 /usr/include/openblas/lapacke.h
<lots more lines>
$ 

Now we have the line we care about at the very top, and all we have to do is get rid of all the other uninteresting lines. For that, let’s again use head:

$ find /usr/include/ -type f -name '*.h' -print0 | wc --files0-from=- -l | head -n -1 | sort -rn | head -n 1
40496 /usr/include/php/Zend/zend_vm_execute.h
$ 

It would probably be fine to call the pipeline finished at this point, but if we really want we can also get rid of the line count and leave just the filename using cut -d' ' -f2. The -d stands for --delimiter, which in our case is a space, and the -f stands for --fields, where we specify that we only want field number 2. (The fields are 1-indexed.)

$ find /usr/include/ -type f -name '*.h' -print0 | wc --files0-from=- -l | head -n -1 | sort -rn | head -n 1 | cut -d' ' -f2
/usr/include/php/Zend/zend_vm_execute.h
$ 

And we’re done! By combining six commands, we answered our question, and we now have a pipeline skeleton that can be modified in minor ways to answer all sorts of related questions, too. (What about the top 5 longest files? Top 10 shortest? And so on.) Hopefully, this example illustrated some of the flexibility the pipe operator brings.

Redirection

One limitation of pipelines is that they write their final output to the terminal. In the case of our example above, that’s fine because the output is just one file. But what if we wanted to save a report of how many lines were in each header file at a given time? On a sample server, find /usr/include/ -type f -name '*.h' -print0 | wc --files0-from=- -l outputs over 18,000 lines, so manually retyping, or even copy/pasting, the output would not be fun.

The shell’s redirection operators can help in situations like this. >, the output redirection operator, saves stdout to a given file. Similarly, <, the input redirection operator, copies a file’s contents to stdin.

Let’s split our pipeline from above into two commands using redirection:

$ find /usr/include/ -type f -name '*.h' -print0 | wc --files0-from=- -l >header-line-counts
$ ls
header-line-counts
$ cat header-line-counts
92 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf-xlib/gdk-pixbuf-xlib.h
233 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf-xlib/gdk-pixbuf-xlibrgb.h
119 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf/gdk-pixbuf-loader.h
37 /usr/include/gdk-pixbuf-2.0/gdk-pixbuf/gdk-pixbuf-autocleanups.h
<lots more lines>
$ head -n -1 <header-line-counts | sort -rn | head -n 1 | cut -d' ' -f2
/usr/include/php/Zend/zend_vm_execute.h
$ 

The filename always goes after the redirection character, meaning the arrow points in the direction data flows. For output redirection (>file1), the file gets written and so the arrow points towards it. For input redirection (<file1), the file gets read and so the arrow points away from it.

Be careful with the filenames you specify for output redirection! If you redirect into a file that already exists, that file’s contents will be completely replaced with no warning! Output files are overwritten before the command runs, so a command like sed 's/a/b/' file1 >file1 will empty your file! There is a variant of output redirection, >>, that appends to a file that already exists instead of overwriting it.

You may have noticed that the last command in our example can be written without input redirection at all, as head -n -1 header-line-counts | sort -rn | head -n 1 | cut -d' ' -f2. This is indeed true, and it’s why you’ll see output redirection used a lot more than input redirection: most tools can read from a filename given as an argument, so input redirection is usually unnecessary.

Output redirection, like pipelines, only redirects stdout by default. stderr is still sent to the terminal so you can see errors and so that the output file doesn’t contain error messages that might confuse a later tool. If you have reason to redirect stderr, you can do it with 2>. (2 is the number POSIX assigns to stderr; 1 is stdout, so 1> is the same as >.)
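
For example, we can capture the two streams from our earlier cat example in separate files (here we redirect into your home directory, since you likely can’t write to the file-zoo itself):

$ cd /comp/50ISDT/examples/file-zoo/
$ cat file2 file3 >~/out 2>~/err
$ cat ~/out
Hello, world
$ cat ~/err
cat: file3: Permission denied
$ 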

Job control

One thing common to every command you’ve seen so far is that, once you run it, you can’t run anything else until it finishes and the shell prints a new prompt. In fact, we told you that this was core to how shells work when we talked about REPLs in Lecture 1.

As it turns out, though, Bash and other POSIX shells provide a way to opt out of this behavior. By suffixing a command with an ampersand (&), you can tell Bash to start that command and then immediately print a new prompt without waiting for it to finish. Although it’s hard for us to show this in a pasted transcript, you can try it out for yourself using the sleep command! sleep takes a number as its only argument and waits that many seconds before exiting. Try running sleep 3; you’ll see that it takes three seconds for a new prompt to appear. Now try running sleep 3 &; this time, a new prompt should appear immediately and you’ll see a message like this:

[1] 8328

This message is from the shell’s job control subsystem, which is responsible for keeping track of and reporting the state of background jobs like the one you just created. It includes two pieces of information: the first, [1], is the job ID that the shell assigned to the newly-created job. The second piece of information, 8328, is the process ID of the command you just ran20.

Once a job is in the background, it will run to completion on its own time. You will be able to see its output21, but you won’t be able to give it input because anything you type will go to the shell or foreground job instead. You can bring the last background job you ran back into the foreground by typing fg, at which point it will accept input again.

You can also move foreground jobs to the background. To do this, first press Ctrl-z. This will temporarily pause the job and bring you back to a prompt. You can then either leave the job stopped until you bring it back to the foreground with fg or tell it to continue as a background job with bg.

To see a list of jobs that haven’t finished yet, use the built-in jobs command:

$ jobs
[1]+  Running                 sleep 3 &
$ 

Each line shows the ID, state (“Running” or “Stopped”), and command for a single job. The last and second-to-last jobs to run are annotated with + and -, respectively. When a background job finishes, the job control subsystem will notify you of that fact by printing a message in much the same format before your next prompt:

[1]+  Done                    sleep 3
$ 

The fg, bg, and jobs commands can all take a job specifier as an optional argument, which if present will tell them which job to act on. %% and %+ both refer to the current job, while % followed by a job ID refers to that job. For more ways to refer to a given job, as well as details on how Bash leverages features of the Linux kernel to implement job control, see the JOB CONTROL section of man bash.
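
For example, you can start two background jobs and bring a specific one back with fg (the process IDs shown here are made up; yours will differ):

$ sleep 30 &
[1] 8328
$ sleep 60 &
[2] 8329
$ fg %1
sleep 30

Once the remaining seconds elapse, sleep 30 exits and you get your prompt back; sleep 60 is still running in the background.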

Lecture 4

Command substitution

Last lecture, you saw how the shell will substitute a variable name, like $FOOBAR, with that variable’s value when you use it as part of a command. Another similar feature, command substitution, lets you include a program’s standard output as part of a command. To do so, put the command you want to run inside the parentheses in $()22. For example:

$ echo "The last word in the dictionary is $(tail -n 1 /usr/share/dict/words)."
The last word in the dictionary is ZZZ.
$ 

Here, we asked the shell to run tail, which extracted the last line of /usr/share/dict/words23, and substituted its output directly into a string that then got passed to echo for printing.

Like variable substitutions, command substitutions can and should be put inside double quotes but don’t work inside single quotes. The risk of using an unquoted command substitution is the same: if the command’s standard output contains spaces, it will be treated as multiple words by the shell unless quoted.
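
Here’s a quick illustration: when the substitution is unquoted, the shell splits its output into separate words, and echo then joins those words with single spaces, destroying the original whitespace:

$ echo "$(echo 'one   two')"
one   two
$ echo $(echo 'one   two')
one two
$ 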

You can use all the syntax you’ve learned so far inside a command substitution, just as if you were writing a standalone command. For example, we can pass the output of last week’s whole shell pipeline to head:

$ head "$(find /usr/include/ -type f -name '*.h' -print0 | wc --files0-from=- -l | head -n -1 | sort -rn | head -n 1 | cut -d' ' -f2)"
/*
   +----------------------------------------------------------------------+
   | Zend Engine                                                          |
   +----------------------------------------------------------------------+
   | Copyright (c) 1998-2013 Zend Technologies Ltd. (http://www.zend.com) |
   +----------------------------------------------------------------------+
   | This source file is subject to version 2.00 of the Zend license,     |
   | that is bundled with this package in the file LICENSE, and is        |
   | available through the world-wide-web at the following url:           |
   | http://www.zend.com/license/2_00.txt.                                |
$ 

Glob patterns

We spoke about patterns briefly in Lecture 2 when describing arguments for tools like grep and sed. Those tools incorporate extremely powerful (and confusing) regular expression languages that allow you to express a huge variety of different patterns. What we have not discussed yet is the shell’s own less-powerful (and less-confusing) pattern language.

The piece of this language you’ll see used most often is the * character. If you include it in a word, the shell will interpret that word as a file path where the * represents any set of zero or more characters. Such a pattern is known as a wildcard or glob, and, if any files match it, the shell will substitute it with all of them:

$ cd /comp/50ISDT/examples/file-zoo/
$ ls
directory1  file1  file1-link  file2  file3  missing-link
$ echo *
directory1 file1 file1-link file2 file3 missing-link
$ echo file*
file1 file1-link file2 file3
$ 

You can see that echo * behaves much the same as ls with no arguments, as it matches all files in the current directory. (*, like ls, doesn’t match files beginning with . by default.)

Unlike variable and command substitutions, * does not work inside either single or double quotes. However, you don’t need to worry about quoting it, as the shell correctly handles expanding paths that contain spaces even when the glob is unquoted:

$ ls -l directory1/hello*
-rw-rw-r--. 1 thebb01 ta50isdt 2 Sep  8 00:43 directory1/hello
-rw-rw-r--. 1 thebb01 ta50isdt 2 Sep  8 00:43 directory1/hello world
$ 

Globs behave somewhat unexpectedly when they don’t match anything: instead of expanding to an empty string as you might expect, they remain completely unchanged! You should be careful of this behavior when writing scripts. The shopt -s nullglob and shopt -s failglob commands make the behavior more consistent; consider using one of them in conjunction with set -euo pipefail (see below) when writing glob-heavy scripts:

$ echo foobar*
foobar*
$ shopt -s nullglob
$ echo foobar*

$ shopt -s failglob
$ echo foobar*
-bash: no match: foobar*
$ echo "foobar*" # Quotes still prevent processing
foobar*
$ 

Shell scripts (and when not to use them!)

As we’ve hinted, typing commands at a prompt isn’t the only way to use Bash. You may sometimes find yourself in a situation where you frequently rerun the same sequence of commands, perhaps with minor variations. Or perhaps you want to let others run those commands without having to remember them or understand exactly how they work. This is where shell scripts come in.

A shell script is a text file containing commands. When you ask the shell to run a script, it interprets and executes each line in sequence, just as if you’d typed the lines one after another at a prompt. You’ve already seen shell variables, and we’ll learn about a number of other shell features today—like conditionals, loops, and functions—that give shell scripts a similar level of expressiveness to normal programming languages like C, C++, or Python.

Before we talk about those features, though, a word of warning: although you can in theory solve any programming problem with a shell script, that doesn’t mean you should. The shell lacks a number of features, like data types and variable scoping, that are crucial to writing scalable and maintainable programs, and as such any shell script that grows past a few tens of lines quickly becomes incomprehensible. Shell scripts work best as lightweight “glue” between other software that already exists.

If you find yourself wanting to do any of the following things in a shell script, there’s a good chance that your problem has outgrown the capabilities of the shell. In these cases, you should move at least part of your solution into its own program written in some other language:

Non-textual I/O

Not all software deals with text. If you need to process structured data that’s kept in a binary format (i.e. not something you can process using tools like cut and grep), pick another language. If you have input or output that can’t be represented as text (e.g. audio or image data), pick another language.

External libraries

Shell scripts excel at interacting with command-line programs. Pipelines, redirection, and argument substitution make shell scripts the easiest way to solve problems in terms of programs that already exist. But if you need to interact with a piece of software that isn’t exposed through a command-line utility—for example, a database like PostgreSQL or MariaDB24—pick another language.

Graphics

Graphical user interface (GUI) libraries like Qt and GTK provide bindings for languages like C, C++, and Python that allow you to build complex visual interfaces with buttons, lists, tables, and images. These libraries, like the databases mentioned above, are not directly accessible via shell scripts. In general, if your program needs to use the mouse, pick another language.

Tip: If you need to work with images, videos, or other binary data, pick a language with a good binary data library. People often reach for C or C++ in these cases, but languages like Python and Erlang provide just as good (and sometimes better) tooling! People use both format-specific libraries (such as libjpeg, for working with JPEG images) and format-agnostic libraries (like Python’s struct module or Erlang’s binary pattern matching).

Data structures

Even if your data is textual, you should probably pick another language if you need to store and later query that data as opposed to processing it all in one pass like a pipeline does. Because shell variables aren’t typed, you can’t build any of the data structures you might have learned about in an introductory course in a shell script. The best you can do is organize your data as files on disk. Languages like Python and Ruby, on the other hand, likely have the data structures you need built-in. And if they don’t, you can build those structures.

Complex logic

If you need to do math, nest conditionals more than a couple levels deep, or express any logic more complex than a few if statements, pick a different language. The shell is fine for simple string and numeric comparisons, but larger boolean expressions get tricky fast. There’s a reason that compile-time type checking, run-time type errors, and the like exist in other programming languages: they help people catch bugs. We’ll talk more in depth about this in our last module.

If you find yourself writing a bunch of logic in Bash, stop. Think hard about the problem you’re trying to solve. Does a command-line tool exist that can solve that problem for you? If so, run it from your script instead of implementing the logic yourself. If not, consider writing such a command-line tool in something like C++ or Python for your script to run.

Creating and running scripts

Let’s make a first shell script. Open up your editor and type the following into a file named myscript.sh:

echo "Hello, world!"
echo "I am in a script and I am being run by $USER."

Save it. If you try to run it like a program you compiled in your CS courses—by running ./myscript.sh—you will get the following error:

bash: ./myscript.sh: Permission denied

This is because you don’t have execute (x) permission on the file, which you can verify with ls -l. If you would like to run the script without execute permission, you will have to explicitly pass it to another program that can run it… like a shell. Try running bash myscript.sh.

If you add execute permissions (chmod +x myscript.sh), you will be able to run your script using ./myscript.sh. But what shell is running this file? We will find out more about this later (or read ahead to the #! section).25

NOTE: We have had reports of strange shell script behavior from students writing their programs on Windows in an editor like Sublime Text. This is due to the way Windows and Unix handle line endings: Windows editors often end each line with CRLF (\r\n), while Unix tools expect a bare LF (\n). If you are on Windows, your best bet is to write your program in a text editor in WSL or a Linux VM.

Comment your code

Bash scripts, as we mentioned, are harder to write and read than programs in other programming languages. Make sure to comment on the intent of any particularly tricky areas. Comments begin with the # character and continue until the end of the line.

# I am a comment
echo "foo" # I am another comment

Control flow in the shell

So, you’ve chosen to use Bash to solve your problem; after you pass this class, we’ll trust your judgement. Let’s learn the missing pieces of shell syntax needed to write programs! Most of these will likely be familiar from your CS curriculum, so we’ll focus more on “how” than on “why” for each.

In this section, we’ll try to stick to syntax that’s part of POSIX, so that your programs don’t depend on Bash specifically. Bash does actually have some non-POSIX features that address a few of the limitations we mentioned in the last section, but those features aren’t nearly enough to make it competitive with a language like Python. As such, we believe that all our advice above still stands and that, if you find yourself reaching for a Bash-specific feature, you probably shouldn’t be using a shell script in the first place.

if

The basic structure for if statements in Bash is as follows:

if CONDITION; then
  CONSEQUENT
elif OTHER-CONDITION; then
  OTHER-CONSEQUENT
else
  ALTERNATIVE
fi

As with other programming languages, the elif (else if) and else components are optional. So the minimal if statement would look like:

if CONDITION; then
  CONSEQUENT
fi

Any command can be a condition, and its exit code determines which branch runs: an exit code of zero selects the consequent, while a nonzero exit code selects the elif or else branch, if any. As we talked about earlier, you can use programs like test for your conditions. For example, to check if two strings are equal, you can use:

if test "$LEFT" = "$RIGHT"; then
  echo "they are the same"
fi

POSIX also provides a more natural-looking way of doing conditionals in your program. POSIX defines test to be equivalent to [, so you can instead write:

if [ "$LEFT" = "$RIGHT" ]; then
  echo "they are the same"
fi

Note that the spaces around the brackets [ and ] are required, just as they are for any command—[ is just a regular command with an unusual name:

$ ls -l /bin/\[
-rwxr-xr-x 1 root root 59736 Sep  5  2019 '/bin/['
$ 

(Though it is often implemented as a shell built-in, too.)

Loops

Because a programming language wouldn’t be complete without a friendly loop, Bash includes not one but several looping constructs. In this section, we will present while, for, and until. These can be used like other programming languages’ loop constructs, but they can also be used in conjunction with pipelines.

Let us begin with the while loop. The basic structure for while in Bash is as follows:

while CONDITION; do
  LOOP-BODY
done

All of the same kinds of conditions you might use with if work with while, too.

For example, to count up to four from zero, you can do:

i=0
while [ "$i" -lt 5 ]; do
  echo "$i"
  i="$(($i+1))"
done

The $(($i+1)) is yet another type of expansion, called an arithmetic expansion. The shell will evaluate whatever’s between the set of double parentheses as a mathematical expression and substitute the result. We won’t cover this in detail, as you should in general avoid arithmetic in shell scripts, instead delegating to other commands when possible. For example, a command called seq can replace our whole loop:

seq 0 4

These scripts are equivalent. In fact, seq is even more flexible, allowing you to format the numbers, choose a separator, and pad with leading zeroes. It also supports an arbitrary increment. Go check out the manual page for more information.
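
For example, here’s a quick sketch of GNU seq’s -s (separator) and -w (equal-width padding) flags:

$ seq -s, 0 2 10
0,2,4,6,8,10
$ seq -w 8 10
08
09
10
$ 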

Now, onto for loops. Unlike in C, POSIX for loops do not have the for (INIT; CONDITION; POST) structure. They are instead based on iterating over sequences and are of the form:

for VAR in SEQUENCE; do
  LOOP-BODY
done

Let’s take a look at an example:

for i in $(seq 99 -1 1); do
    echo "$i bottles of beer on the wall..."
done

Note that in this case the command substitution is not quoted because we want to treat every separate line from seq as a different input to the for loop.

In this case, the sequence is a newline-separated list of numbers counting down from 99 to 1. The for loop binds each number to the variable $i for use in the body, and we sing a little song.

Last, POSIX specifies a funny little loop called until. This is an inverted while loop, so instead of using the negation operator ! to write something like:

while ! CONDITION; do
  LOOP-BODY
done

we can instead use:

until CONDITION; do
  LOOP-BODY
done

This is intended to remove visual clutter. The course staff has not often seen it used in real-world shell scripts, however.

You can also pipe to loops, but that is outside the scope of this course’s material and definitely falls into a “more advanced shell scripting” course.

Referring to script arguments

Scripts can read their arguments from the special shell variables $1 to $N, where N is a rather large number. ($0 holds the name of the script itself.) For argument indices larger than 9, however, you must use curly braces, like ${10}.

For example, the following script:

# myscript.sh
echo "$1, world!"

will print out “Hello, world!” when run like so:

$ bash myscript.sh Hello
Hello, world!
$

Defining functions

It may be the case that you require a level of abstraction in your shell scripts that is somewhere between 1) writing a whole other shell script to call from your main script and 2) copy/pasting lines of code. For this, Bash allows you to define functions of your own. The syntax is rather terse:

FN-NAME () {
  FN-BODY
}

There are no static types. There are no argument declarations. Despite this, functions can read their arguments from the special shell variables $1 to $N, just like scripts do. ($0 still refers to the script, not the function.) As with scripts, argument indices larger than 9 require curly braces, like ${10}.

Here is a function to write a greeting to the person specified:

greet() {
    echo "Welcome to CS 4973, $1!"
}

Function invocations look like normal command invocations—unlike other programming languages, parentheses are not required:

greet "max"
# => Welcome to CS 4973, max!

Now that you are an expert shell script programmer (TM), you may find it educational to take a look at some of the shell scripts on your system and figure out what they do. To do that, we can list all of the available shell scripts and pick randomly:

$ grep -lrF '#!/bin/sh' /usr/bin > scripts.txt
<you may see some permissions errors>
$ vim $(sort -R scripts.txt | head -n 1)
<vim opens>
$

How long is the script? Is it well-commented? Does it follow the shell best practices we outline here?

Error handling (set -euo pipefail is your friend)

Error handling in shell scripts is somewhat fraught. Normally in a programming language when there is an error, you find out right away—or it is explicitly squashed. For example, in C, your program might segfault. Or, if you are luckier, it might print an error message and exit(). Or in C++, Python, and other programming languages that support it, it might raise an exception.

In Bash, by default, things just kind of go… sideways. Exit codes are the only method of error reporting, and you have two standard options: zero and non-zero. But a non-zero exit code does not necessarily mean that a command has failed or that the error should be propagated up the call stack.

Consider the case of searching for a string with grep. If a match is found, grep will exit with 0. If a match is not found, grep will exit with 1. You probably don’t want your shell script crashing if a match is not found, so the shell surfaces that exit code for use in conditions. And that’s it.
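
For example, piping some text into grep and checking $? each time:

$ echo "alice" | grep alice ; echo $?
alice
0
$ echo "alice" | grep carol ; echo $?
1
$ 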

Unfortunately, the same happens for commands that really truly have an error, like reading from a file that does not exist. If the entirety of your shell pipeline, for example, relies on reading from a file called contact-list, and that does not exist, the shell will happily continue trying to execute the rest of your shell script anyway—often with unexpected results.

Fortunately, there is a magic incantation you can put at the top of your shell scripts: set -euo pipefail. This magic incantation is not actually magic, but instructs your shell to enter a particular mode. We’ll go over it piece by piece.

First, set -e exits the shell immediately if a command exits with a non-zero exit code. This helps avoid the aforementioned fiasco.

Second, set -u changes the default behavior of reading undefined variables. By default, reading from an undefined variable returns the empty string, but with set -u, this is treated as an error. This enforces some amount of rigor for ensuring your variables are defined.

Third, set -o pipefail changes how a pipeline’s exit code is computed. By default, a pipeline returns the exit code of its final command, no matter what happened earlier in the pipe; with pipefail, it instead returns the exit code of the last command that failed, or zero if every command succeeded. This helps set -e work in more cases; otherwise, a failure in the middle of a pipeline would not cause the entire shell script to exit early.

All of these put together produce: set -euo pipefail.26
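
To see how the pieces work together, here’s a minimal sketch of a script (the contact-list file and the $CONTACT_DIR variable are hypothetical names we’ve made up for illustration):

#!/bin/bash
set -euo pipefail

# pipefail: if cat fails because contact-list is missing, the pipeline as a
# whole reports failure, and set -e then aborts the script right here...
cat contact-list | sort >sorted-contacts

# ...so this line never runs against a half-written sorted-contacts.
echo "Sorted the contact list."

# set -u: reading a variable that was never defined (here, $CONTACT_DIR) is a
# fatal error instead of silently expanding to the empty string.
ls "$CONTACT_DIR"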

We have listed one common incantation to ease your shell script debugging, but set has many more options than these three; check out the manual page.

These options are useful both for you, the novice shell programmer, and for professionals. Recently, the video game launcher Steam had a shell script bug that destroyed data in rare cases.

#! lines and how the kernel interprets them

As we mentioned earlier, file extensions have little meaning on a Linux system. Linux reads, writes, and executes files of every extension identically; extensions, when present, only offer a hint to human readers about what’s inside.

This makes file types seem unknowable. So how does anyone get anything done if every file is completely opaque?

As it turns out, not all is lost, and files are not as opaque as they seem. There is a notion of “magic numbers” (see man magic) that file formats can use to identify themselves to external viewers. For example, the magic number for executable files on Linux (ELF) is hex 7f 45 4c 46, or 7f followed by “ELF”. This allows utilities like file (man file) to figure out whether a file is an ELF binary by looking at the first couple of bytes27.

There is another kind of magic number, hex 23 21 (“#!”, pronounced any number of ways, but commonly “shebang” or “hash bang”), that denotes that a file is a script. It does not mean that the file is necessarily a shell script, but instead allows the programmer to specify an arbitrary interpreter for that particular file.

For example, if you execute a file with ./myscript, and it begins with #!/bin/bash, that means that the file should be treated as a Bash script and executed using /bin/bash. It is executed as if you manually typed /bin/bash ./myscript. If a file starts with #!/usr/bin/python, the file should be treated as a Python script and executed using the specified Python interpreter.
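
For example, save the following two lines into a file called greet (a name we’ve made up for this sketch) and run chmod +x greet. Running ./greet will then print “Hello from ./greet!”, because the kernel sees the shebang and executes /bin/bash greet on your behalf:

#!/bin/bash
echo "Hello from $0!"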

The Linux kernel reads the shebang line to find the interpreter. Then, it runs that interpreter with the script’s path as an argument. Magic.

You may be wondering: but what do Bash and Python do about the line with the funny # character? Why isn’t that a syntax error? For both of those languages, # denotes the beginning of a comment, which is ignored.

Shellcheck

You may find Shellcheck helpful. It statically analyzes your shell scripts for potential bugs and lets you know about the problems. We will talk more about tools like this in the fourth module, Correctness.

Lecture 5

Note: these lecture notes are incomplete and will be updated soon

The POSIX programming interface

So far in this module, we’ve shown you how to use the POSIX shell and utilities to work with files, build pipelines, and run programs from the command line. As we mentioned last lecture, though, there are many problems that the shell isn’t well-suited to solving. When you encounter one of these problems, you may decide to solve it by writing your own tool in a language like C, C++ or Python. But what if you need to run a program from your tool? Or list a directory? Or create a symlink? While you now know how to do these things from the command line, you may not know how to do them from programs you write.

In this lecture, we’ll show you a set of POSIX APIs that tools like ls, cat, and the shell itself use under the hood, and we’ll show you how to use those APIs in your own programs to interact with POSIX concepts like files, streams, and processes. We’ll also talk about how those APIs are exposed in various higher-level programming languages.

The volume of POSIX we mentioned in Lecture 1 is titled “Shell & Utilities,” and it specifies much of the command-line interface you’ve just seen. This lecture, we’ll concern ourselves with a different volume, titled “System Interfaces.” This volume defines a set of system calls (a.k.a. syscalls) that a POSIX-compliant operating system kernel must provide to programs running on top of that kernel.

What is a kernel?

An operating system kernel (kernel for short) is a piece of software, generally written in C, that stays loaded in memory the entire time your computer is running. The job of a kernel is to mediate access to shared hardware resources, including processor (CPU) cores, memory, disks, network interfaces, and other peripherals like keyboards and mice. Every modern operating system has a kernel, and it’s the first thing that runs when the operating system starts.

The defining feature of a kernel is that it runs in kernel mode, which is a generic term28 for a processor state that allows access to pieces of the hardware that normal programs (which are said to run in user mode, also known as userspace) can’t interact with directly.

For example, code running in kernel mode is allowed to configure the Memory Management Unit (MMU), a translation layer inside the processor that can reject or rewrite any memory access the processor makes. Whenever the kernel passes control to a userspace program, it first configures the MMU to hide any memory regions that belong to other programs, to hardware29, or to the kernel itself. Since the program runs in user mode, it’s stuck with this MMU configuration and has no way to read or write the hidden memory regions.

This is one example of how the kernel uses its special privileges to ensure that a single buggy or malicious program can’t take the entire system down. Another example, which you’ve already worked with extensively, is the filesystem. Files and directories are a creation of the kernel, designed to impose structure and ownership on the billions of identical bytes that make up a hard drive. By using those bytes to represent a tree of distinct objects, each with its own name, attributes, and permissions, the kernel allows the disk to be shared between programs with no risk of unwanted interference.

All the programs you’ve used, like the shell and utilities, are userspace programs. But that doesn’t mean you haven’t interacted with the kernel. Every time one of those programs accesses a file, prints output, reads input, sets up a pipe, or executes another program, the kernel is what performs that operation. And the list doesn’t stop there.

System calls

A program asks the kernel to do these things by making a system call. The exact mechanism by which a syscall happens depends on your computer’s processor architecture, but the effect is always the same: the syscall causes the processor to switch from user mode to kernel mode and run a piece of the kernel known as the syscall handler. The syscall handler has access to a set of parameters passed by the program, the first of which it uses to determine which syscall to run and to find the C function in the kernel that implements that syscall. Any arguments to the function are set from the other userspace parameters. Once the syscall implementation returns, the handler saves its return value somewhere the program can see, switches back to user mode, and resumes the program.

Syscalls in many ways resemble C function calls: they take arguments, return a value, and invoke a specific piece of functionality. But, unlike functions, syscalls are part of the kernel: they run in kernel mode, can see the kernel’s memory, can access hardware, and are implemented by the kernel, not by the program that calls them.

Because syscalls are part of the kernel, they adhere to the kernel’s security and synchronization guarantees. For example, the open syscall defined by POSIX (which we’ll dive into with an example shortly) validates that the program calling it has permission to access the file it’s asking for. If not, it returns a failure code30.

As we mentioned, POSIX specifies a set of syscalls that compliant kernels must implement. Examples of such kernels are Linux, XNU (macOS’s kernel), and BSD. Software written to run on one of these kernels can usually be compiled to run on a different one with minimal code changes31. That’s only if the software restricts itself to POSIX syscalls, though: software that relies on kernel-specific syscalls (or proprietary userspace libraries) isn’t so easy to port32.

This interface is also what Microsoft implemented when they built the first version of Windows Subsystem for Linux. Although the current version, WSL2, runs a Linux kernel in a virtual machine, the first version exposed POSIX syscalls straight from the Windows kernel (which is named NT), letting it run Linux applications natively.

We’ll illustrate this point with an example. The cat utility uses three syscalls: one to open the file it’s given, one to read from that file, and one to write the contents to standard out. Let’s focus on the first two of these, as the filesystem abstraction is one of the core interfaces a kernel provides. Let’s talk about some of the things your kernel has to do to complete those two syscalls, and why those things couldn’t be done by cat itself:

Firstly, the kernel needs to figure out where the file being read is stored. Every file on Linux lives somewhere under /, but that does not mean that every file lives on the same piece of hardware. The filesystem you see on Linux can be split across an arbitrary number of physical storage devices, each of which is mounted at a particular location. (You can run mount to see all active mounts on a system.) To figure out where to look for a certain file, your computer has to split its path into a mount and a path within that mount.

Once it’s found the mount, your computer needs to look up what filesystem that mount uses. This is a different meaning of “filesystem” than before: here, we’re using it to refer to a scheme for representing a hierarchy of files, along with metadata like permissions, in a format suitable for the storage device that holds those files. Most filesystems expect a storage device that holds a huge linear array of bytes, like a hard drive or SSD (Linux calls these “disks”). Examples of filesystems that Linux supports are ext4, FAT, and btrfs. Each uses different data structures to represent a file tree.

Your computer then has to traverse the filesystem’s data structures to figure out exactly which bytes on the disk hold pieces of the file we’re catting. But to read those data structures—and the file itself—it needs to know how to read data from the disk. This differs based on the specific disk in use. Most modern hard drives use a bus called SATA for data transfer, while many SSDs use a different bus called NVMe. Flash memory chips on phones use a bus called eMMC, except newer ones which use one called UFS. Bus standards like these specify how data is transferred over one or more physical wires, and there are dozens of them. But even knowing a disk’s bus isn’t enough to read from it, as your computer also needs to know which specific instructions or memory regions control that bus, and these differ from system to system.

All in all, there are thousands of possible combinations of filesystems, disk buses, and bus controllers that a computer might be using, and that’s just to read a file. Once we add on the extra step of printing cat’s output to a screen, which involves GPUs, framebuffer devices, and video interfaces, the combinations are easily in the millions.

Surely cat cannot directly be responsible for this entire chain of operations? If it were, that would mean that any programmer who wanted to read a file would have to reimplement those operations by reading tens of thousands of pages of filesystem specifications and (often confidential) hardware documentation. Computing as we know it would be impossible.

Luckily, you as a software engineer don’t have to worry about doing any of these things on Linux. That’s because it’s the job of a kernel like Linux to do nearly everything we’ve just described for you. Linux exposes a set of operations, known as syscalls, which programs like cat use to directly manipulate abstractions like files. In this case, cat uses the open() syscall to locate a file and then the read() syscall to read data from that file.

Let’s take a look at a minimal version of cat in C that does not do any error checking. We won’t use syscalls directly, but instead use a suite of C functions specified by POSIX that wrap syscalls. The functions are close enough to making an actual syscall that the difference is not worth talking about at length.

This program has three parts: first, it opens a file using open(). This syscall gives you a file descriptor—just a number—that you can pass to later syscalls to refer to the file for the duration of your program.

Second, we have a loop. This loop reads chunks from the input file specified by the user into an array and then writes those chunks to stdout. POSIX provides a constant, STDOUT_FILENO (defined in unistd.h), that names the file descriptor for stdout.

Third, we close the file using close().

#include <fcntl.h>   /* open, O_RDONLY */
#include <unistd.h>  /* read, write, close, STDOUT_FILENO */

int main(int argc, char *argv[]) {
  if (argc != 2) return 1;
  const char *filename = argv[1];
  int file = open(filename, O_RDONLY); /* returns a file descriptor */
  int nread;
  char buffer[100];
  while (1) {
    nread = read(file, buffer, sizeof buffer); /* 0 means end-of-file */
    if (nread <= 0) break;
    write(STDOUT_FILENO, buffer, nread);
  }
  close(file);
  return 0;
}

A fuller version of this program would check for errors when applicable. For example, what if the filename the user provided does not correspond to a real file? Or you don’t have permission to read it? You can read the manual pages to learn how these functions surface errors and how to handle them.

The kernel’s responsibilities

How a program interacts with the operating system

Viewing syscalls with strace

It’s all well and good to speak in abstract about what system calls do, but it’s another to see them happen right in front of you. Let’s take another look at the implementation of cat from above:

#include <fcntl.h>   /* open, O_RDONLY */
#include <unistd.h>  /* read, write, STDOUT_FILENO */

int main(int argc, char *argv[]) {
  if (argc != 2) {
    return 1;
  }
  const char *filename = argv[1];
  int file = open(filename, O_RDONLY);
  int nread;
  char buffer[100];
  while (1) {
    nread = read(file, buffer, sizeof buffer);
    if (nread <= 0) {
      break;
    }

    write(STDOUT_FILENO, buffer, nread);
  }
  return 0;
}

We can compile it using gcc cat.c -o cat and run it with ./cat somefile. We’re telling you that it uses the syscalls open, read, and write to do its job, but how can you verify that is what’s actually happening?

Thankfully, there is a tool called strace that will tell you what syscalls a program is making as it is running. Let’s use it to see what syscalls are actually happening. You’ll see a lot of output that is not immediately relevant to the code above, but occurs when the program is starting up. We have omitted that from the snippet:

cedar% strace ./cat somefile
<other syscalls from process start>
openat(AT_FDCWD, "somefile", O_RDONLY)  = 3
read(3, "hello\nworld\nfoo\nbar\n", 100) = 20
write(1, "hello\nworld\nfoo\nbar\n", 20hello
world
foo
bar
) = 20
read(3, "", 100)                    	= 0
exit_group(0)                       	= ?
+++ exited with 0 +++
$

Lo! Your program calls openat, read, and write. (On modern Linux, the open() wrapper function is implemented in terms of the more general openat syscall, which is why strace shows openat rather than open.)

strace is a powerful tool for understanding what software is doing. The course staff has used it to figure out why a program is hanging—for example, maybe something went wrong with a file read, and strace shows it waiting for a read() to finish.

Some engineers at large tech firms agree that it is useful, but warn that it greatly slows down the program it is tracing. Brendan Gregg, for example, wrote an article about how slow it can get. For large production workloads, you may want to reach for a newer tool like bpftrace.

Lecture 6

Note: these lecture notes are incomplete and will be updated soon

Pádraig Brady

Slides here

Where can you find command lines?

Command lines show up in many places beyond POSIX systems, but most have nowhere near the feature set of a POSIX shell and are limited to running individual commands with arguments.

Your Bash skills won’t transfer directly to any of these, but it’s a good bet that typing “help” at an unfamiliar command line will give you some information.

Alternative shells on Linux

Z shell (zsh) is another shell, similar to Bash. Its syntax is almost 100% compatible with Bash, and it differs mostly in the configuration of its interactive features. It provides simpler theming and configuration support than Bash does.

Fish is a shell that is not POSIX compliant. Its goal is extreme user friendliness at the cost of compatibility. It comes with colors, a very helpful default prompt, fancy auto completion, and more shell built-ins.

Oil is a new shell written by Andy Chu whose first focus is compatibility with Bash and second focus is building a new, better language out of the lessons learned. Remember all the shell gotchas? Oil is trying to fix them. It works not only on text, but on JSON objects, too.

PowerShell, originally designed for Windows, also runs on Linux and macOS. It is a completely different language and works not just on text, but on .NET (a runtime by MS) objects.

Why can these shells run on Linux, BSD, and macOS but not on Windows?

With the exception of PowerShell, these shells will not easily run on Windows. It’s not because Windows is fundamentally deficient as an operating system, or because the programmers turned their noses up at Windows in particular, but because the shells are written for an API called POSIX.

We’ve talked about POSIX before, in particular as a specification for shell behavior and C function behavior. Because Linux, the BSDs, and macOS all provide these POSIX APIs, the shells can run on them with minimal—if any—changes.

Windows, unfortunately, does not provide this set of APIs specified by POSIX. For example, there is no readdir function; Windows provides its own API. This means that in order to get the shells running on Windows, one of two things would have to happen.

The first option is to rewrite each shell to target the Windows APIs. This is a hard sell to the developers, since they mostly do not run Windows. Also, it would be hard to port the exact behavior expected of the shell given the completely different underlying API.

The second option is to provide a POSIX (or POSIX-esque) environment on Windows and then target that. This is what projects like Windows Subsystem for Linux (WSL) and Cygwin do. While WSL provides a Linux-compatible Application Binary Interface (ABI), Cygwin provides a POSIX-compatible source-level API.

WSL1 and WSL2 do this differently; WSL1 provides a shim layer while WSL2 runs a real lightweight Linux VM.

How does using the shell on macOS differ from Linux?

Mainly in directory structure. Directory layout is mostly not specified by POSIX and can be vastly different between OSes.

macOS puts everything in directories with much friendlier names, like /Users/, /Library/, /Volumes/, etc. Application config files can be found in ~/Library/Application Support/ (per-user) or /Library/Application Support/ (system-wide).

The Windows command line (DOS shell, PowerShell) and graphical shell

Take a look at cmd.exe and Windows commands.

Take a look at PowerShell.

PowerShell is in many ways a much more advanced shell than the POSIX shell. After all, it was designed from the ground up circa 2002, while Bash has its roots in the original Bourne shell, which first shipped 23 years earlier in 1979.

PowerShell commands (called cmdlets) receive and output structured data rather than simple text, meaning PowerShell can process types of data that are hard to represent or delineate as text:
PS> (ls | Sort-Object -Descending -Property LastWriteTime)[0].name
Documents
PS> 

Is there anything about the Windows kernel that makes it less suited to a command line interface? Anything about the Linux kernel that makes it less suited to a graphical interface?

No! That’s just how the userspaces evolved. Windows evolved from DOS, which was entirely command-line based; only “recently” did it gain a graphical interface (Windows 3.1). And even today, power users use PowerShell reasonably heavily on Windows, as .NET objects are first-class citizens and .NET is well-integrated into Windows.

Running POSIX environments on Windows with Cygwin and WSL

As we briefly mentioned above, it’s possible to run software that expects to be operated in a POSIX environment on Windows. It requires an intermediate layer, either at compile time (Cygwin) or at runtime (WSL), but it can be done.

Running Windows apps on Linux and macOS with WINE

Given the broad interest in running programs that target POSIX on Windows, you might wonder if there is a demand (and solution) for doing the reverse. There is, and it is called WINE! WINE is similar to WSL1 in that it translates Windows API calls into POSIX calls. Unlike WSL1, though, WINE is a fully userspace compatibility layer. This allows you to run full Windows applications, including graphical applications, unmodified, on Linux and macOS.

Lesson: APIs are APIs

It does not matter what system you are running as long as the system provides the expected set of APIs. This is the beauty of implementation vs interface; you need not concern yourself with the implementation as long as you can use its interface.

This is why it’s possible to run programs that target POSIX nearly unmodified on WSL, and why it is hard to port scripts to shells like Fish. This is why people who create operating systems and programming languages focus so much on backwards compatibility.

This applies not only to operating systems and low-level tooling but also to programming at large: you will face very similar challenges when switching your cloud hosting provider as when porting your code between Windows and POSIX.

Below are some sample implementations of cat. None of them have good error checking because they are just to highlight the different APIs. This is not a good thing. Do not copy and paste without adding error checking.

Using POSIX syscalls

The first one uses POSIX functions and syscalls. These functions and syscalls are available on all BSD and Unix-like machines—systems that conform to POSIX.

#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
  if (argc != 2) {
    return 1;
  }
  const char *filename = argv[1];
  int file = open(filename, O_RDONLY);
  int nread;
  char buffer[100];
  while (1) {
    nread = read(file, buffer, sizeof buffer);
    if (nread <= 0) {
      break;
    }

    write(STDOUT_FILENO, buffer, nread);
  }
  return 0;
}

Windows APIs

The second one uses Windows-specific APIs. These functions are only available on Windows.

#include <windows.h>

int main(int argc, char *argv[]) {
  if (argc != 2) {
    return 1;
  }
  char buffer[100];
  HANDLE f = CreateFileA(argv[1], GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
  HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE); /* named out, not stdout, since stdio.h reserves stdout as a macro */

  while (1) {
    DWORD bytes_read = 0;
    ReadFile(f, buffer, sizeof buffer, &bytes_read, NULL);
    if (bytes_read <= 0) {
      break;
    }

    WriteFile(out, buffer, bytes_read, NULL, NULL);
  }
}

Using fopen

The third one uses functions guaranteed to exist by the C programming language standard. That means all platforms, POSIX and Windows, that support C will be able to run this C program. This is powerful!

#include <stdio.h>

int main(int argc, char *argv[]) {
  if (argc != 2) {
    return 1;
  }
  const char *filename = argv[1];
  FILE* file = fopen(filename, "r");
  int nread;
  char buffer[100];
  while (1) {
    nread = fread(buffer, /*size=*/1, /*nmemb=*/sizeof buffer, file);
    if (nread <= 0) {
      break;
    }

    fwrite(buffer, /*size=*/1, /*nmemb=*/nread, stdout);
  }
  fclose(file); /* close the stream, mirroring close() in the POSIX version */
  return 0;
}
  1. You almost certainly rely on command-line tools any time you write a program, even if you don’t realize it. To be sure, there are IDEs that let you program without ever seeing a command line; on Windows and macOS, such environments (in the form of Visual Studio and Xcode) are in fact the sanctioned way to develop native applications! Behind the scenes, however, both of these tools invoke command-line tools in order to compile your code, run tests, process resource files, sign and package your application for distribution, and so on. Knowing how to find and run these commands directly will help you figure out what’s happening when things go wrong and will give you the freedom to go beyond the IDEs in cases where they can’t do exactly what you need. 

  2. The majority of Linux distributions are not actually POSIX-certified, but nevertheless are generally accepted to be POSIX compliant in all the ways that matter. macOS, on the other hand, is officially certified since version 10.5. 

  3. For example, Ubuntu, Debian, Fedora, Red Hat Enterprise Linux, Arch Linux, or Gentoo. 

  4. It’s possible to run the Linux kernel with no GNU project code at all: BusyBox, for example, implements most POSIX tools (including a shell!) in a single tiny program that can fit on even the most space-constrained systems; Android uses the Linux kernel combined with its own BusyBox-inspired implementation of POSIX tools and a custom Java runtime; FreeBSD and other UNIX derivatives have their own sets of POSIX tools, many of which are easily ported to run on Linux (and which are also the basis for macOS’s tools). 

  5. Because prompts vary between people, computers, shells, and operating systems, in this course we will simply use $ to indicate a shell prompt unless we have a good reason otherwise. To learn more, look up the PS1 and PS2 shell variables. 

  6. See man 3 readline for more information on this wondrous ability. 

  7. This means you don’t have to abandon your command to run ls every time you forget a file name! You can tab complete without losing what you’ve already typed. 

  8. This flow of you typing a command at the shell prompt, that command taking control of the terminal and running to completion (optionally printing output or reading input in the process), and then the shell printing a new prompt is known as a read-eval-print loop, or a REPL for short. Some programming languages also have REPLs—modes where you can enter one statement at a time instead of running a whole file at once. REPLs are common for interpreted languages like Python and Ruby and much less common for compiled languages like C and C++. 

  9. This behavior is not only convenient but also improves security. If the shell looked in the working directory first for every command, an attacker could write a malicious program named ls, cat or similar and place it in a publicly-readable directory. Then anyone who went to that directory and typed ls or cat would unknowingly invoke the attacker’s program! 

  10. A trailing slash is optional for paths that refer to a directory and forbidden for paths that refer to a file. For most purposes, the slash makes no difference to how a path gets treated. (One exception to this is when the final entry in the path is a symbolic link to a directory; we’ll introduce symbolic links later.) Our convention in these notes is to always include a trailing slash in directory paths to make it easier to distinguish them from file paths. 

  11. You may wonder why . needs to exist at all, since adding or removing it from any path doesn’t change that path at all. One reason is to get around the shell’s special treatment of command names without slashes: if you really do want to run a program in your current directory, prefixing its name with ./ is an easy way to do that. Another reason is that some programs assign special meaning to the empty string (for example, as an indicator that you want them to use a default value). To explicitly signal to these programs that you’re talking about the current directory, you can use . as a path. For instance (see the sketch below): 
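
    Both uses, sketched briefly (myscript.sh and the --output-dir flag here are purely hypothetical):

    $ ./myscript.sh               # run a program from the current directory
    $ some-tool --output-dir .    # name the current directory explicitly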

  12. This is related to the history event number we mentioned last time. 

  13. The location of system files is no secret: directories like /usr/bin/ and /etc/ will exist on nearly every Linux system you’ll encounter and will contain many of the same files. These common paths are, like POSIX, a historical artifact that was later standardized. The standard is called the Filesystem Hierarchy Standard (FHS), and it defines the paths where different kinds of system artifacts should live. Most Linux distributions—and programs written for Linux—at least loosely respect FHS. (For a fun distro that doesn’t, check out NixOS.) 

  14. See man 5 shadow and man 5 passwd for more information on these files specifically, and OWASP’s password storage cheat sheet for a decent overview of password hashing in general. 

  15. Occasionally, you might see an s, S, t, or T in place of an x. These characters indicate special behavior of the file beyond its basic permissions. When an s replaces an x in the owner or group permissions, it means that, when the file is executed, the resulting program will run as its owner or group, respectively, rather than as those of the user who ran it. An S indicates the same thing but replaces a -. Look up the setuid and setgid bits for more information on how this is useful.

    When a t replaces an x or a T replaces a - in the other permissions for a directory, it means that files inside that directory can be moved or removed by their owner as well as by anyone with write permission for the directory. Normally, only the latter is true. 
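
    For example, the passwd program is typically setuid root so that it can update the system’s password files; the s appears in place of the owner’s x (the size and date shown here are just placeholders):

    $ ls -l /usr/bin/passwd
    -rwsr-xr-x 1 root root 59976 Feb  6  2024 /usr/bin/passwd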

  16. On Linux, the various special powers of the superuser can actually be granted and revoked more granularly using a system called capabilities (man capabilities), but it’s generally still the case that programs run by root have every capability and others don’t have any. 

  17. If you ever encounter a command that doesn’t have such a mode and can only take filenames as arguments, worry not! There is a special tool called xargs (man xargs) designed specifically for using such tools in pipelines. In the case of piping from find specifically, you can also use find’s -exec flag (man find). 
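
    For example, both of these count lines in every .h file that find produces (the directory and pattern are just for illustration):

    $ find . -name '*.h' | xargs wc -l
    $ find . -name '*.h' -exec wc -l {} +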

  18. And a lot of other unexpected characters, unfortunately. Don’t treat your filenames as nicely encoded strings; instead, treat them as byte arrays. 

  19. Unfortunately, this has an annoying edge case. If find produces zero or one filenames to pass to wc, wc won’t print the total line. And in the general case, we can’t use grep -v to filter out lines that contain total either, since a file could be named total. In this case, however, we’re exclusively looking for files that end in .h, so adding grep -v 'total$' to the pipeline would be a more robust solution. 
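
    Sketching that out, assuming the hypothetical pipeline from the previous footnote:

    $ find . -name '*.h' | xargs wc -l | grep -v 'total$'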

  20. Process IDs (a.k.a. PIDs) are what the Linux kernel uses to keep track of running programs, and every program you run (regardless of &) has one. The shell prints the PID prominently for background jobs because commands that aren’t part of the shell don’t know about job IDs but might still want to interact with the process as it runs. 
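
    For example (the PID you see will differ from run to run):

    $ sleep 60 &
    [1] 14310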

  21. If you run a background job that prints output, that output will end up interspersed with the output of whatever’s in the foreground. This can be confusing, especially if you’re running something that expects to have full control of the terminal, like vi, in the foreground. To prevent this, you can redirect the background job’s output to a file. Tip: the special file /dev/null will discard any data written to it and so can be a redirection target for output you don’t care about. 
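
    For example, to run a hypothetical noisy_command in the background while discarding everything it writes to stdout and stderr:

    $ noisy_command >/dev/null 2>&1 &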

  22. An alternate way to do the same thing is to put the command name between backticks (``). You may see this style in old shell scripts, but it’s rarely used in new scripts because it’s not nestable. 
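
    For example, the first two commands below are equivalent, but only the $() form nests cleanly:

    $ echo "Today is `date`"
    $ echo "Today is $(date)"
    $ echo "This directory is named $(basename "$(pwd)")"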

  23. /usr/share/dict/words comes with most Linux distributions and holds a list of commonly-used English words. It’s used by some programs for spell checking, so onomatopoeias like “ZZZ” are included. 

  24. Both these databases do actually come with command-line tools, but those tools are designed for administrators to interactively configure and debug the database and don’t provide a means for efficiently running queries and returning data in a format easily usable by scripts. 

  25. As it turns out, if you run your executable script with ./myscript.sh and there is no shebang, the kernel will refuse to execute it. However, your shell (Bash, Zsh, whichever) can choose to execute the file itself when the kernel refuses. Bash and Zsh both guess that such a file is a shell script and attempt to execute it with Bash or Zsh, respectively. So it’s a one- and sometimes two-step dance. 

  26. There is another helpful option, set -x, that prints out every command before it executes, including commands run from invoked functions. This is useful for debugging, or if you feel particularly nosy.
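
    For example, in an interactive shell (the + prefix marks each traced command):

    $ set -x
    $ echo hello
    + echo hello
    hello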

  27. This is still just a guess, but it is a more educated guess. You could very well decide to write those bytes into a file and use them for some other purpose—bytes are bytes are bytes are bytes, after all. But it is a convention to use these bytes to denote an ELF binary. 
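
    On a Linux system, you can inspect those bytes yourself; the first four bytes of any ELF binary are 0x7f followed by the letters “ELF”:

    $ head -c 4 /bin/ls | xxd
    00000000: 7f45 4c46                                .ELF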

  28. Each processor architecture has its own terminology for execution states like kernel and user mode. AMD64, the architecture that processors from Intel and AMD implement, calls kernel mode “protection ring 0” and user mode “protection ring 3.” ARMv8, the architecture that processors from Qualcomm, Samsung, Apple, and many others implement, calls kernel mode “EL1” and user mode “EL0.” (Both these architectures also have extra, even-more-privileged modes that can be used to run a hypervisor. A hypervisor is a kernel that mediates hardware access between other kernels instead of between userspace applications. This is how virtual machines work.) 

  29. Devices like disk controllers, GPUs, and network cards are generally accessed by reading and writing to special address ranges, separate from the ones that correspond to RAM. The MMU lets a kernel wall these address ranges off from userspace programs in exactly the same way it walls off forbidden areas of RAM. 

  30. If you manage to find a way to alter the kernel’s code or data either before it gets booted or while it’s running, you could remove this permission check, at which point the syscall would happily read the contents of any file for you, regardless of whether your process has permission. Because of this, kernel bugs that allow programs to run code in kernel mode are some of the most severe security vulnerabilities that can exist. 

  31. Homebrew is a project that takes advantage of this fact to make a number of tools that were originally written for Linux available on macOS. 

  32. For example, nearly every modern kernel provides a set of syscalls to draw arbitrary pixels to the screen. Generally, these syscalls are paired with complex graphics and windowing libraries in userspace that let programs present a GUI. But POSIX predates graphical interfaces, meaning there’s no standardization of this functionality across POSIX operating systems. If you want to write a cross-platform graphical application, you should use a library like Qt or GTK. 