Mike’s Place

Unix Basic Concepts

First off, I’d like to say that this document, while more or less finished, has not been fully proofed for typographical errors. Be kind, thanks.

This aims to be generally useful for anyone coming from a background that is not Unix. I do assume herein that you’re coming from something else, but not what that something else is. To put it differently: the information in this page will be absolute gobbledygook if you’re not familiar with analogous concepts in some other multiuser environment. (Yes, there are multiuser environments which are neither Unix nor Windows; among them are MVS, OpenVMS, BeOS, and clones thereof, and probably others.)

Please note: This document really just scratches the surface. Even so, it is not meant to be read and digested all at once. All operating systems/environments, even ones such as Unix, where the individual components are simple, have many concepts associated with them. About the only people who can reasonably be expected to read this document all the way through and have it be a leisurely read are those who are already seasoned Unix system administrators. Keep this in mind while reading!

It is particularly important to understand that you’re not going to come out of this with the ability to suddenly use or manage a Unix system. However, it will give you a starting point for additional learning and ultimately, the ability to move forward from being a non-user of Unix, to an administrator and/or user of Unix systems.

I have tried to keep each section small enough that it represents approximately one day’s worth of reading and additional research. Please tell me if I did this poorly.

Also note: The documentation (“man”) pages linked from this page are for Linux. Every version of Unix has its own man pages; if you’re using a system that is not Linux, refer to that system’s installed man pages instead.

Part One: Philosophy

To start, Unix is not (in the modern age) a single operating system, or even a particular group of them. It is more of a philosophy of system design, with commonly shared attributes.

As popularized in a paper by Eric S. Raymond, there are three categories. A genetic Unix derives from AT&T Unix source code. A trademark Unix (macOS, for example) is a system which has passed the conformance tests and been certified by the trademark holder. The third category is neither genetic nor certified: these are the so-called “functional” Unix systems, which is where Linux and several others fall.

This more-or-less highlights the common thread between them, in that all these varied systems can be called, in some way, a Unix system:

These systems all have in common a relatively “simple” system design. By “simple,” I do not mean to say that they are easy to use: on the contrary, when it comes to computing, typically, more technologically complex systems are more user-friendly, and the inverse is also commonly true. Simple refers to the design of the system from a technical standpoint.

The common recipe for a Unix system has generally been:

It makes sense, then, to summarize the philosophy of Unix system design as, “A Unix system comes with many simple tools with relatively clear divisions of responsibility between them; tools operate on files, and everything is a file.” As with anything in life, there are exceptions to this, but for the most part you will not come across them until later.
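As a rough illustration of the “everything is a file” idea, here is a minimal sketch in Python. The helper name read_all is my own invention for this example. It shows the same read call working on a regular file and on a pipe, since to the program both are just file descriptors.

```python
import os
import tempfile


def read_all(fd):
    """Read from a file descriptor until end-of-file, whatever it refers to."""
    chunks = []
    while True:
        chunk = os.read(fd, 4096)
        if not chunk:  # an empty read means end-of-file
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo():
    # A regular file on disk.
    with tempfile.TemporaryFile() as f:
        f.write(b"from a regular file\n")
        f.flush()
        f.seek(0)
        from_file = read_all(f.fileno())

    # A pipe: not a disk file at all, but it is read the same way.
    read_end, write_end = os.pipe()
    os.write(write_end, b"from a pipe\n")
    os.close(write_end)  # closing the write end lets the reader see end-of-file
    from_pipe = read_all(read_end)
    os.close(read_end)
    return from_file, from_pipe


if __name__ == "__main__":
    print(demo())
```

The function never needs to know what kind of object is behind the descriptor, which is the property that lets simple tools be combined.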

Part Two: System Startup

Generally, Unix systems have a fairly simple procedure for starting up. The kernel is loaded into memory by a bootloader, and then started. The kernel performs early hardware initialization, and then starts the init program, which always has the process identifier 1 and is treated specially by the kernel.

The init program is responsible for performing the rest of the startup process, which includes:

Most Linux systems use systemd as their init system, which is loosely modeled after the macOS init system, while most other Unix systems use the classic sysvinit or BSD init implementations.

After system startup is finished, and assuming that nothing in the startup process failed, the system is said to be running in “multiuser mode,” which means that it will now allow logins on terminals, at the console, and possibly over the network if the appropriate dæmons have been started.

Part Three: System Concepts

Before getting into how to use the system, we need to talk about the concepts which we need to know about the system, particularly in the realm of its administration. This might seem backwards, but really it isn’t: a regular user who does not manage their own system does not need to know about most of these things, whereas a system administrator (which includes the use of a system that is not managed by someone else!) needs to have a quite different view of the system. Normal users can just start running programs, and they can do no damage (if the system administrator has done their job correctly, that is).

Users and Groups

Unix, being a multiuser operating system, has the concepts of users and groups.

A user has a login, which is the user’s “short” name; a user ID (also known as a UID) which is a numeric identifier for a user, and one or more group IDs (also known as GIDs), representing the group(s) to which the user belongs. Unlike some other operating systems, user and group IDs are simple integer values (often 16-bit or 32-bit values, meaning that they range from 0–65,535 or 0–4,294,967,295). This is a bit of a difference if coming from a system such as Windows, where a user or group ID value looks something like S-1-5-21-992878714-4041223874-2616370337-1001.
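To make these identifiers concrete, the following is a small Python sketch using the standard os, pwd, and grp modules. The function names are made up for illustration. It reports the login, UID, and GIDs of whoever runs it.

```python
import grp
import os
import pwd


def describe_current_user():
    """Collect the login name, UID, and group IDs of the calling process."""
    uid = os.getuid()
    try:
        login = pwd.getpwuid(uid).pw_name
    except KeyError:
        # A UID is just a number; it need not have a user database entry.
        login = None
    gid = os.getgid()
    return {
        "login": login,
        "uid": uid,
        "gid": gid,
        # getgroups() may or may not include the process's own GID.
        "all_gids": sorted(set(os.getgroups()) | {gid}),
    }


def group_name(gid):
    """Translate a numeric GID to a name, or return None if it has none."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return None


if __name__ == "__main__":
    info = describe_current_user()
    print(info)
    print([group_name(g) for g in info["all_gids"]])
```

Note that the names are looked up separately from the numbers; the system itself works with the integers.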

User and group ID values are used in permission checks which are performed by the system, as will be seen below.

The Unix Filesystem

Every family of operating system has a different way of addressing filesystems, which usually reside on block devices. On Unix, there is a single hierarchical filesystem tree, to which all filesystems are attached. To contrast with a couple of different strategies:

On Unix, the root directory (spelled /) is, as the name implies, the top of the tree. Every other filesystem present must be attached to a directory in the tree, which becomes a mount point. The mount(8) and umount(8) (note the spelling!) commands are used by a system administrator or a script to mount and unmount a filesystem, respectively.

When a Unix system starts, the fstab(5) file is used to determine which filesystems are to be mounted by default. On systems which use systemd, files named ARBITRARY_NAME.mount (see systemd.mount(5)) may be used for the same purpose, in addition to fstab. (Even on systems using systemd(1), many administrators use the fstab file because it is what they know, and systemd will honor it for backwards compatibility reasons.)
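To give a feel for what fstab contains, here is a minimal Python sketch that parses the whitespace-separated fields described in the Linux fstab(5) page. The sample entries, the FstabEntry type, and the parse_fstab function are all hypothetical, invented for this example; check the man page on your own system for the exact format it accepts.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    device: str          # what to mount (fs_spec)
    mount_point: str     # where to attach it in the tree (fs_file)
    fs_type: str         # filesystem type (fs_vfstype)
    options: List[str]   # comma-separated mount options (fs_mntops)
    dump: int            # fs_freq, defaults to 0 when absent
    fsck_pass: int       # fs_passno, defaults to 0 when absent


def parse_fstab(text):
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed line; a real tool would report it
        dump = int(fields[4]) if len(fields) > 4 else 0
        fsck_pass = int(fields[5]) if len(fields) > 5 else 0
        entries.append(
            FstabEntry(fields[0], fields[1], fields[2],
                       fields[3].split(","), dump, fsck_pass)
        )
    return entries


# Hypothetical sample content, not taken from any real system.
SAMPLE = """\
# device     mount point  type   options     dump  pass
/dev/sda1    /            ext4   defaults    0     1
/dev/sda2    /home        ext4   rw,noatime  0     2
tmpfs        /tmp         tmpfs  nosuid,nodev
"""

if __name__ == "__main__":
    for entry in parse_fstab(SAMPLE):
        print(entry.mount_point, "<-", entry.device, entry.options)
```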

Every file that can be accessed exists within the context of a filesystem, and every filesystem (including the root filesystem) has a mountpoint.
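One way to see this relationship is to walk upward from any path until a mount point is reached. The sketch below does that in Python with os.path.ismount; the function name find_mount_point is my own. It is an approximation, since ismount relies on comparing device numbers and may not notice every kind of mount.

```python
import os


def find_mount_point(path):
    """Walk up from path until reaching a directory that is a mount point."""
    path = os.path.realpath(path)  # resolve symbolic links first
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path  # the loop always ends, because "/" is a mount point


if __name__ == "__main__":
    print(find_mount_point(os.getcwd()))
```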

Directories are simply files which contain other files; on some Unix systems, they can even be opened as if they were a normal file, and modified directly, whereas on others (such as Linux) directories cannot be opened like normal files.

File Permissions

Every file in a Unix system has a set of permissions associated with it; it has an owner UID and GID value, as well as a set of allowed actions for the owning user, the owning group, and “everyone else”. This permissions model dates back to the first multiuser Unix systems. A directory listing shows these for each file, as in the example below for a single file on my system:

-rw-r--r-- 1 sf users 1335 Oct  1 16:39 GNUmakefile

Despite a user and group name being displayed, the filesystem actually only records the UID and GID values, as shown below:

-rw-r--r-- 1 1000 989 1335 Oct  1 16:39 GNUmakefile

This means that if a user is deleted and a new user is created and assigned the same UID value, that new user will receive all of the old user’s permissions. This is important to note because on some systems the lowest available UID is assigned when a user is created, while on others the next value above the highest one in use is assigned. (That is, some systems treat UIDs as reusable values, whereas others treat them as serial numbers.)
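The fact that only numbers are stored can be checked directly. In this Python sketch (function names are mine, for illustration only), os.stat returns the numeric owner and group, and turning the UID into a name is a separate lookup that can fail if no such user exists any more.

```python
import os
import pwd
import tempfile


def owner_ids(path):
    """Return the (UID, GID) pair stored in the file's metadata."""
    info = os.stat(path)
    return info.st_uid, info.st_gid


def owner_name(uid):
    """Look the UID up in the user database, as a directory listing would."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # No user has this UID right now: fall back to the bare number.
        return str(uid)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        uid, gid = owner_ids(f.name)
        print(uid, gid, owner_name(uid))
```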

The first part of the directory entry listing shows the permissions on the file; in the case shown above:

The third slot in each grouping will read x if the given entity (owner, group, or other) has permission to execute the file or, in the case of a directory, to search it (that is, to access entries inside the directory by name; listing the contents of a directory requires read permission instead). This is a difference from operating systems and environments which use file extensions and/or MIME types to determine whether a file is executable. It is even possible to have a compiled program which is executable by a given user but cannot be read by that user; scripts are the exception, because the interpreter has to read them.
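The permission string in a listing is just a rendering of bits in the file’s mode. As a minimal sketch, the Python standard stat module can produce that string and test the individual bits; describe_mode is a hypothetical helper written for this example.

```python
import stat


def describe_mode(mode):
    """Split a file mode into read/write/execute flags for each class."""
    bits = {
        "owner": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "other": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    }
    return {
        who: {
            "read": bool(mode & r),
            "write": bool(mode & w),
            "execute": bool(mode & x),
        }
        for who, (r, w, x) in bits.items()
    }


if __name__ == "__main__":
    mode = stat.S_IFREG | 0o644       # a regular file, owner rw, others r
    print(stat.filemode(mode))        # the string form used in listings
    print(describe_mode(mode))
```

Here 0o644 is the octal spelling of the same permissions that appear as rw-r--r-- in a listing.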

There are also additional permissions (setuid, setgid, and sticky bit), as well as the ability to use ACLs, but this is a topic best gotten into after basic familiarity with the overall system has been achieved. Furthermore, the meaning and semantics of these extra permissions can differ between Unix implementations.

File Types

On Unix, there are a few different types of files (for additional detail, see the Wikipedia article):
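A program can ask the system which type a given path is. The following Python sketch (the function name file_type is mine, and the set of types it checks is the one the standard stat module knows about) classifies a path without following symbolic links.

```python
import os
import stat
import tempfile


def file_type(path):
    """Name the type of the file at path, without following symlinks."""
    mode = os.lstat(path).st_mode
    checks = [
        (stat.S_ISREG, "regular file"),
        (stat.S_ISDIR, "directory"),
        (stat.S_ISLNK, "symbolic link"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISBLK, "block device"),
        (stat.S_ISFIFO, "named pipe"),
        (stat.S_ISSOCK, "socket"),
    ]
    for test, name in checks:
        if test(mode):
            return name
    return "unknown"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "plain.txt")
        open(target, "w").close()
        link = os.path.join(d, "link")
        os.symlink(target, link)
        for p in (d, target, link):
            print(p, "->", file_type(p))
```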

Networking

Networking is similar to other platforms. The names of network devices are different for each Unix system, and on some (such as Linux) they can be renamed. On most Unix systems, the ifconfig(8) utility is used to manage configuration details; on Linux, the ip(8) command is preferred because it offers cleaner syntax and supports various functions that are not supported by the ifconfig command on that platform.

Most Unix systems have the following network capabilities:

Processes

In Unix, every currently executing program is known as a process. Each process has several attributes:
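As a small illustration, a running program can query some of its own attributes. This Python sketch collects a few that every Unix process has; the selection and the function name are my own choices, and there are more attributes than are shown here.

```python
import os


def process_attributes():
    """Return a few attributes of the process that calls this function."""
    return {
        "pid": os.getpid(),                # this process's identifier
        "parent_pid": os.getppid(),        # the process that started it
        "uid": os.getuid(),                # user the process runs as
        "effective_uid": os.geteuid(),     # user used for permission checks
        "gid": os.getgid(),                # group the process runs as
        "working_directory": os.getcwd(),  # base for relative paths
    }


if __name__ == "__main__":
    for key, value in process_attributes().items():
        print(key, value)
```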

Part Four: Introductory man Pages

Each section of the Unix manual has a page named intro:

Viewing man pages on the Web can be useful; however, the best reference is always the one at hand on an installed system. Those man pages will always be the correct version for the system which is executing.

It is important to be in the habit of referring to man pages, particularly if you frequent different versions or implementations of Unix.

Conclusion

I hope that this serves as a helpful overview of the most basic concepts of a Unix system. With any luck, I’ll continue this as a series; however, note that any continuation of this will focus specifically on Linux systems. Since knowledge tends to be portable between Unix systems except for the very low-level details, this will still be useful. However, there are other resources which may also help:

Furthermore, for non-Linux systems:

Lastly, for Unix systems generally, there is the Unix & Linux StackExchange, where you can search for questions, and if you have a question that hasn’t been asked, you can ask yours.

And of course, the search engines.

Thanks for reading.
