Mike's Place » What the Hell is an Event Loop Anyway?

One of the more formative moments in my career as a programmer and as a coder was the insight that I gained when I truly understood what an event loop is. In fact, until I did, I thought the same thing that some others out there do: that glib, Qt, APR, etc. all provide “event loops”. But they do not. What they do provide is a useful abstraction of the underlying event loop, so that programmers can focus on their logic without mucking about with the system.

At least, that’s the theory anyway.

Quick Overview

An “event loop” is fundamental to most programs, even if it is not structured as such. (Though, for the sake of convenience and legibility, it almost always is structured as a loop.) In concept, it is simple:

The operating system observes an event from a peripheral device or another program.
The operating system provides a mechanism to the application program such that it can receive notifications of these observations, called events.
The application program handles the event, in whatever way it deems fit.

This is the basis for nearly all (useful) programs which are not structured as linear sets of instructions and subroutine/function calls. Certainly any program that interacts with a GUI or a terminal in a complex manner will use such a thing.

Some Terms, Too

Before digging too much into this, it’s important to remember that there are two things in computer programming which are separate from each other, though it can be incredibly easy to blur the lines sometimes:

A thing itself.
A representation of a thing which is more convenient to use, called an abstraction.

In discussions, it is of course extremely important to know which one is the thing being talked about. We (programmers) often get into the bad habit of calling an abstraction “the thing itself,” when it is not. For some of us (programmers) this gets in the way, because we’re pedants. But that’s just an irritant, when discussions go sideways.

It is absolutely crucial though to be aware of the difference internally. Not having that means that no matter what your success rate is, you’re writing programs without a full understanding of the environment in which they run.

To put it another way: a Java programmer who understands only the invariants and agreements made by the JVM will become unable to cope when the JVM breaks on one platform (but not others!). A C programmer who also knows Java, on the other hand, will be able to peel away the layers of abstraction all the way down to native, unmanaged code and be able to at least identify (if not work around completely) the issue. To the Java programmer, Java is “the thing,” while to the C programmer, Java is an abstraction.

(Yes, even C has the notion of an abstract machine. Keep in mind, though, that it is an abstract machine, not a virtual one. C programs run on whatever CPU, real or virtual, they were compiled for, using the rules of the ABI chosen by the compiler implementation. And this will be the only mention of the C abstract machine in the entire post.)

An Overview in Code

At a high level, an event loop on a POSIX system looks something like this (see poll(2) for more information on the poll system call):

#include <stdbool.h>

#include <unistd.h>
#include <poll.h>

int
main(int argc, char *argv[]) {
        // See 'man 2 poll' or the link above for more information.
        struct pollfd pfd[1] = { 0 };
        int npfd   = 1;

        pfd[0].fd     = STDIN_FILENO;
        pfd[0].events = POLLIN;

        // The quit flag tells the loop when it is time to exit.
        bool quit = false;
        while(!quit) {
                int rc = poll(pfd, npfd, -1);
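                if(rc > 0) {
                        // Data (or end-of-file) is available now.
                        char buf[4096] = { 0 };
                        ssize_t br = read(pfd[0].fd, buf, sizeof(buf));
                        if(br <= 0) {
                                // End-of-file or a read error: leave the loop.
                                quit = true;
                                continue;
                        }

                        // Need to do something with the 'br' bytes of data
                        // that we read...
                } else if(rc < 0) {
                        // poll() itself failed: leave the loop.
                        quit = true;
                }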
        }
}

This is fairly simple, and it even looks like a loop. The basic function also appears to be simple, because very little is called out for: the program receives a notification that data is available (because POLLIN is the only thing we’re checking for), and the program is expected to read that data and then do something with it.

On Windows, of course, everything looks different. Because everything is different there. Remember, instead of a POSIX base, Windows started with VMS-like primitives and placed a ton of API libraries on top of it to try to smooth it over (and replaced the DECwindows system with its own GUI, based on the GDI API which was created there; now I don’t know much about DECwindows, but since it used Common Desktop Environment built on Motif, I suspect it was somewhat X-like in nature.)

Here’s what the “typical” event loop appears as there:

#include <stdbool.h>
#include <windows.h>

int WINAPI
WinMain(HINSTANCE ci, HINSTANCE pi, LPSTR cmdline, int cshow) {
        MSG  msg = { 0 };
        BOOL retval;

        while(true) {
                retval = GetMessage(&msg, NULL, 0, 0);
                if(retval > 0) {
                        TranslateMessage(&msg);
                        DispatchMessage(&msg);
                } else if(retval < 0) {
                        // This is an error condition.
                        break;
                } else {
                        // Program termination has been requested.
                        break;
                }
        }

        return (int) msg.wParam;
}

This is a little more complicated, because it is simpler. What do I mean? I mean that there is less visibility here than there is in the POSIX event loop manifestation built on the poll interface above. Whereas a POSIX system does not “do much” for you, the Windows system does a lot. There, the event loop is called a message loop.

What’s the difference?

In UNIX-like systems, the “event loop” is something of an abstract concept. The implementation of it is broken down into two components: the kernel’s implementation, and the system calls which are used to interact with that implementation. Every program, whether directly or through an abstraction, must interact with this event loop implementation if it wants to have any chance at all of being a good citizen. It was not always that way (those of you who remember the days before MMUs and privilege rings/layers know what I’m talking about). Since Windows NT, the basis for all versions of modern Windows, never operated without these sorts of mechanisms available, we do not consider it specially here; programs are dependent on the OS there, just as they are on POSIX systems.

The astute observer will note that this necessarily means that the term “event loop”, while generally accepted, is something of a misnomer. The event loop may be more correctly and less ambiguously referenced as an agreement between the operating system’s kernel (or the vendor’s provided API libraries) and the application program itself. In the case of POSIX systems, the agreement is virtually always with the kernel, while in Microsoft Windows systems the agreement is typically with user32.dll and not the operating system kernel itself, as documented at MSDN.

This also means that the real mechanism is “officially” hidden on Windows, since Microsoft does not support avoidance of this API. This does not mean that it is not possible; user32.dll obviously does it somehow. But Microsoft reserves the right to change the system call interface of the NT kernel between releases, so user applications which refuse to use this interface, and therefore refuse to opt in to the agreement, may work on only a subset of Windows releases.

What about without an OS?

So, before we can answer the question of the post, we must look at one more thing: why do we depend on the operating system for an event loop? What about applications running in microcontrollers, which very often have no operating system?

Here is where we realize that the operating system is itself an abstraction. But it happens to be such a useful and such a ubiquitous one that we nearly always choose to ignore that fact. It would complicate things even more if we did not.

For the sake of completeness, we’ll take a quick look at how this is done. Let us assume a program that keeps time, and transmits that time once per second through a UART device.

Here is the overview of the program:

The hardware begins, running the program’s initialization routine. This is where the program “wires up” its handlers to interrupt lines. An interrupt is an event in the hardware world.
The program goes to sleep (calling some sort of a wait-for-interrupt instruction, or, if the processor does not have one, it waits in a busy-loop and checks the interrupt line each time it runs through the loop).
The program uses two interrupt (event) sources: a timer and the UART.
- The timer will fire an interrupt each time it counts down to zero. When this happens, the program increments a counter; when that counter is the right value (whatever value happens to represent one second), the program will transmit the current time to the UART.
- The UART will fire an interrupt when data is present in its receive FIFO and when its transmit FIFO is empty.

So generally, flow will appear thus:

Startup.
Idle.
TIMER INT: INC value. If ready flag set and is one second, transmit.
UART INT: Did data arrive? Discard it. Is TX FIFO empty? Set ready flag.
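
To make that flow a little more concrete, here is a minimal sketch in portable C. It is a simulation, not firmware: the names timer_isr, uart_isr, struct clock_state, and the TICKS_PER_SECOND rate are all made up for illustration, and the handlers take their state as a parameter so that they can be called from ordinary code. On a real part they would be parameterless interrupt handlers touching volatile globals and device registers, wired up in whatever way that particular chip requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed timer rate: how many timer interrupts make up one second. */
#define TICKS_PER_SECOND 100u

/* Hypothetical program state shared by both handlers. */
struct clock_state {
        uint32_t ticks;     /* timer interrupts since the last full second */
        uint32_t seconds;   /* the "current time" being kept */
        bool     tx_ready;  /* set once the UART reports an empty TX FIFO */
        uint32_t sent;      /* how many times we "transmitted" */
        uint32_t last_sent; /* the last time value handed to the UART */
};

/* TIMER INT: INC value. If ready flag set and is one second, transmit. */
void timer_isr(struct clock_state *s) {
        assert(s != NULL);
        if (++s->ticks < TICKS_PER_SECOND)
                return;
        s->ticks = 0;
        s->seconds++;
        if (s->tx_ready) {
                /* Stand-in for writing the time into the TX FIFO. */
                s->tx_ready  = false;
                s->last_sent = s->seconds;
                s->sent++;
        }
}

/* UART INT: Did data arrive? Discard it. Is TX FIFO empty? Set ready flag. */
void uart_isr(struct clock_state *s, bool rx_has_data, bool tx_empty) {
        assert(s != NULL);
        (void) rx_has_data; /* received data is simply discarded */
        if (tx_empty)
                s->tx_ready = true;
}
```

Starting from a zeroed struct clock_state, one call to uart_isr with tx_empty set followed by TICKS_PER_SECOND calls to timer_isr leaves seconds at 1 and sent at 1. Another full second of timer interrupts without a UART interrupt in between advances seconds but transmits nothing, because the ready flag was consumed.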

Now, then, the ANSWER!

It should be pretty clear at this point that without hardware or operating system assistance, waiting on events will always reduce to a busy loop which records one (or more) observations with each pass. So we can say that the notion of an event loop is pretty sound. But the things we use to represent that notion are no longer naturally loops, due to the evolution of both hardware and common software practices. These days, when there is a loop involved, it is most often actually an asynchronous thing running in a userspace program, most often provided by a library that the author of the program did not write.
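
A rough sketch of what such a library-provided loop tends to look like follows. This is not how glib, Qt, or APR actually implement theirs; struct loop, loop_watch, loop_run, and drain_and_stop are hypothetical names, and the only real API in use is poll(2). The point is only that the program hands over callbacks and the library owns the loop.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_WATCHES 8

struct loop;
typedef void (*watch_cb)(struct loop *l, int fd, void *user);

/* Hypothetical "library" state: watched descriptors plus their callbacks. */
struct loop {
        struct pollfd fds[MAX_WATCHES];
        watch_cb      cbs[MAX_WATCHES];
        void         *users[MAX_WATCHES];
        nfds_t        count;
        int           quit;
};

/* Register interest in readability of 'fd'. Returns 0, or -1 if full. */
int loop_watch(struct loop *l, int fd, watch_cb cb, void *user) {
        assert(l != NULL && cb != NULL);
        if (l->count >= MAX_WATCHES)
                return -1;
        l->fds[l->count].fd      = fd;
        l->fds[l->count].events  = POLLIN;
        l->fds[l->count].revents = 0;
        l->cbs[l->count]         = cb;
        l->users[l->count]       = user;
        l->count++;
        return 0;
}

/* The loop the application author never writes: wait, then dispatch. */
int loop_run(struct loop *l) {
        assert(l != NULL);
        while (!l->quit) {
                int rc = poll(l->fds, l->count, -1);
                if (rc < 0) {
                        if (errno == EINTR)
                                continue; /* interrupted by a signal; retry */
                        return -1;
                }
                for (nfds_t i = 0; i < l->count; i++) {
                        if (l->fds[i].revents & (POLLIN | POLLHUP | POLLERR))
                                l->cbs[i](l, l->fds[i].fd, l->users[i]);
                }
        }
        return 0;
}

/* Example callback: read once, report the byte count, stop the loop. */
void drain_and_stop(struct loop *l, int fd, void *user) {
        char buf[4096];
        ssize_t n = read(fd, buf, sizeof buf);
        if (user != NULL)
                *(ssize_t *) user = n;
        l->quit = 1;
}
```

An application would zero-initialize a struct loop, call loop_watch for each descriptor it cares about, and then call loop_run. Everything after that happens inside callbacks, which is exactly why the loop itself is so easy to forget about.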

Now, then, we can fill out the definition of what’s going on when we talk about an “event loop” in the usual sense, that is, a program that is running on a modern, memory-protected, hardware-protected operating system:

An interrupt is fired.
A driver responds, handles the interrupt by putting something in a queue for userspace.
The operating system’s generic event handling infrastructure then acts to make the event available for userspace.
In the case of an exceptional circumstance, the operating system may do something more than “make the event available.” For example, if a program running on a POSIX system receives the SIGQUIT signal and has not installed a handler for it, it will by default terminate instead of receiving the event.
The program, waiting this entire time, and assuming it has not been terminated, now receives the event and processes it (either using its own code or library-provided code).

This same process works on Windows, too, but on that platform the initial steps of event handling are always performed by user32.dll and not by the program itself. But the underlying steps, starting from the interrupt and going all the way to the application, still occur.
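
The SIGQUIT case in the steps above describes the default disposition. A program that would rather receive the signal as an event can say so with sigaction(2). Here is a minimal sketch of that; watch_signal, take_signal, note_signal, and last_signal are names I made up, and a real program would usually feed the signal into its main event mechanism rather than poll a flag.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

/* The only state the handler touches; sig_atomic_t keeps that safe. */
static volatile sig_atomic_t last_signal = 0;

/* Runs in signal context, so it does nothing but record the event. */
static void note_signal(int signo) {
        last_signal = signo;
}

/* Ask the OS to deliver 'signo' to us instead of applying the default. */
int watch_signal(int signo) {
        /* SIGKILL and SIGSTOP cannot be caught by any program. */
        assert(signo != SIGKILL && signo != SIGSTOP);
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = note_signal;
        sigemptyset(&sa.sa_mask);
        return sigaction(signo, &sa, NULL);
}

/* Fetch and clear the recorded signal; 0 means "nothing arrived".
 * (A signal landing between the read and the clear would be lost;
 * good enough for a sketch.) */
int take_signal(void) {
        int signo = last_signal;
        last_signal = 0;
        return signo;
}
```

After watch_signal(SIGQUIT) succeeds, a SIGQUIT no longer terminates the process; the next take_signal() call returns SIGQUIT instead, and the one after that returns 0 again.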

So why isn’t it called something else?

Historic reasons, most likely.

There are a few takeaways, here.

First, an ancient operating system on ancient hardware literally used busy loops in order to handle things. If there weren’t any interrupts being given for events, this was the only way to do things. Games on the NES, for example, do not typically receive an interrupt when the user presses a button; a literal loop checks a hardwired memory location to determine its current state and then behaves accordingly.

Second, in modern systems, the loop went from a visible construct present in concrete instruction code within a single address space, to a behavioral loop. That is, the “loop” is in the behavior, and not the code which expresses it. The actual “loop”, then, looks like this:

Operating system observes and queues an event.
An application, which has registered an interest in that event, receives it.

This process repeats infinitely until the application stops listening for/acting on events, or the process terminates, or the operating system halts for any reason at all.

Whether there is physically an instruction loop present on the user side of the equation is irrelevant; it’s a red herring. The loop is often actually a loop in structure/appearance, but sometimes it is not. Since the “loop” part is not required, confusion on the topic is unsurprising.

So this is all useless? It’s just a loop, right?

There are valid methods for structuring an event-based program which do not involve any looping. By thinking about the event loop the way that we do, we might be inclined to make such a program more complicated than it truly needs to be.

Imagine a program whose only purpose is to wait for the top of a second, and transmit a UDP packet. It needs no loop. It only needs to:

Determine the current time.
Sleep until the right time.
Send the message.
Terminate.

This can be done quite easily on a Linux system by creating a timer (on Linux, see time(7) for a starting point), waiting on that timer, sending the packet, and falling off the end of main().
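
Here is one way that could look, as a sketch rather than a finished tool. I am assuming the Linux timerfd interface for the timer (timerfd_create(2) is one of several options that time(7) leads to), IPv4 only, and a destination given as a dotted-quad string. The function names next_top_of_second, wait_until, and send_at_top_of_second are mine. Note that there is no loop anywhere in it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/* Pure helper: the first whole second strictly after 'now'. */
struct timespec next_top_of_second(struct timespec now) {
        struct timespec t = { .tv_sec = now.tv_sec + 1, .tv_nsec = 0 };
        return t;
}

/* Block on a one-shot, absolute-time timer until 'deadline'. */
int wait_until(struct timespec deadline) {
        int tfd = timerfd_create(CLOCK_REALTIME, 0);
        if (tfd < 0)
                return -1;
        /* it_interval stays zero, so the timer fires exactly once. */
        struct itimerspec its = { .it_value = deadline };
        uint64_t expirations = 0;
        int ok = timerfd_settime(tfd, TFD_TIMER_ABSTIME, &its, NULL) == 0
              && read(tfd, &expirations, sizeof expirations)
                 == (ssize_t) sizeof expirations;
        close(tfd);
        return ok ? 0 : -1;
}

/* Determine the time, sleep until the right time, send the message. */
int send_at_top_of_second(const char *ip, uint16_t port, const char *msg) {
        assert(ip != NULL && msg != NULL);
        struct sockaddr_in dst = { 0 };
        dst.sin_family = AF_INET;
        dst.sin_port   = htons(port);
        if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
                return -1; /* not a valid IPv4 address */

        struct timespec now;
        if (clock_gettime(CLOCK_REALTIME, &now) != 0)
                return -1;
        if (wait_until(next_top_of_second(now)) != 0)
                return -1;

        int s = socket(AF_INET, SOCK_DGRAM, 0);
        if (s < 0)
                return -1;
        ssize_t n = sendto(s, msg, strlen(msg), 0,
                           (struct sockaddr *) &dst, sizeof dst);
        close(s);
        return n == (ssize_t) strlen(msg) ? 0 : -1;
}
```

A main() for the program described above would do little more than call send_at_top_of_second with a destination (say, the placeholder address "192.0.2.1" and whatever port the receiver listens on), turn the result into an exit status, and fall off the end.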

This is, by the way, what I mean when I speak (in general) about minimalism in programming: far too often, we’re shipping programs which are far larger than they need to be. This only increases the available attack surface of a program, particularly when the program is being run on a multi-tenant system.

Thanks for reading.
