Mike’s Place

Trust in Programs

Computer programs are only as “smart” as they are made to be. You already know this if you’ve written even a single computer program. It is also a fact that people often exploit against programs that are connected to the public network.

The Spectrum

As with anything, programmers fall along a spectrum. Most people sit somewhere in the middle of it, but at the edges you have two types of people:

Most of us are somewhere in between. Personally, I think that programmers who know C and C++ (both, not either!) are the most useful entities around, because they can switch between the absurdly low-level and the absurdly high-level in the same program, while precisely controlling the costs of the abstractions.

The Problems With the Left

There are several problems with being on the left side of this particular spectrum:

SpinRite, by Steve Gibson, is a “left-end” program on this spectrum; it requires MS-DOS, FreeDOS, or another compatible OS on which to run. It uses the BIOS to communicate with hardware, and it is limited to running on PC systems. To run on anything else, a virtual machine must be used, which eliminates all of the benefits of direct access.

On the other hand, ddrescue is a “right-end” program on this spectrum; it is free software, and it is written in C for POSIX-family systems. It works at least everywhere that Linux does, and allows an in-situ rescue operation to be performed without even rebooting the system. It has no assembly language instructions embedded within it outside of those generated by the compiler, meaning that it is portable to any operating system that supports direct (and even cached/indirect!) device access.
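To give a feel for how far plain POSIX C can go here, the following is a minimal sketch of a block-by-block copy that keeps going past unreadable regions. It is not ddrescue's actual algorithm; the function name rescue_copy and its parameters are hypothetical, chosen only for this illustration, and a short or failed read is simply assumed to mean "unreadable".

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy `total` bytes from in_fd to out_fd in chunks
 * of `block` bytes. Any part of a chunk that cannot be read is filled
 * with zeros and the chunk is counted as bad. Returns the number of bad
 * chunks, or -1 on an invalid argument, allocation or write failure. */
long rescue_copy(int in_fd, int out_fd, off_t total, size_t block)
{
    if (block == 0 || total < 0)
        return -1;

    unsigned char *buf = malloc(block);
    if (buf == NULL)
        return -1;

    long bad = 0;
    for (off_t pos = 0; pos < total; pos += (off_t)block) {
        off_t left = total - pos;
        size_t want = (left < (off_t)block) ? (size_t)left : block;

        /* pread() takes an explicit offset, so a failure on one chunk
         * does not disturb the position used for the next one. */
        ssize_t got = pread(in_fd, buf, want, pos);
        if (got < 0)
            got = 0;                      /* treat an error as "nothing read" */
        if ((size_t)got < want) {
            memset(buf + got, 0, want - (size_t)got);
            bad++;
        }

        if (pwrite(out_fd, buf, want, pos) != (ssize_t)want) {
            free(buf);
            return -1;
        }
    }

    free(buf);
    return bad;
}
```

Nothing in this sketch is specific to one processor or one kernel. It uses only calls that POSIX specifies, which is the kind of portability described above. A real rescue tool also retries, records a map of bad areas, and varies the block size, none of which is shown here.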

Personally, I have never had ddrescue fail to recover data from a drive that was still operational (spinning, without physical damage, and without controller board damage). Since this is the claim that Gibson makes of SpinRite, I see no reason to spend $100 on a single-platform application program when I have one that is objectively far better and costs nothing.

The Problems With the Right

Just as with being on the left, the right has its own problems:

Authors of new computer programs loudly cry “Written in Ruby!”, “Written in PHP!”, “Written in Python!”, etc., as if it is a great reason to use those programs. That simply isn’t true: it’s a great reason to not use them, because not only do those programs inherit the flaws of their runtime environment, but they are powerless to circumvent them.

Sittin’ in the Middle

The middle is a nice place to be, at least in most spectrums. This one is no exception. Here, the middle represents languages like C and C++, where the level of abstraction is completely controllable by the programmer. There are newer languages (such as Go and Rust) which compile to native code. Go carries a garbage-collected runtime, and Rust, although its runtime is minimal, constrains how low-level code may be written, making it difficult to say whether either of them offers similar flexibility.

What About Trust?

Every program ultimately results in the execution of a sequence of instructions on a processor, a machine which does nothing but tick, tock, and execute instructions. Most of the time, this happens within the context of an operating system, although sometimes it happens on bare metal (that is, the physical or virtual system itself). The thing about an operating system’s exposed API is that, in its documentation, a contract is specified. Most programs trust that the contract will be honored, either because the programmer makes that assumption, or because the programmer does not want to incur the overhead of eliminating that implied trust. Of course, most runtime libraries also place immense amounts of implied trust in the operating system, and so programs largely don’t bother trying to eliminate it for themselves, since it cannot be done for the program’s dependencies.
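As one small illustration of what eliminating that implied trust might look like, the following C sketch wraps a read()-style call and rejects any return value that the documented contract does not allow. The names checked_read, read_fn, and lying_read are hypothetical, invented for this example; lying_read merely simulates an operating system that breaks the contract. This is a sketch under those assumptions, not a complete defense: it checks only the return value, and it cannot tell whether the bytes placed in the buffer are genuine.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Any function with the same shape as POSIX read(). */
typedef ssize_t (*read_fn)(int fd, void *buf, size_t count);

/* Hypothetical wrapper: call do_read and verify its result against the
 * contract documented for read(): the result is -1 (error) or a byte
 * count between 0 and `count`. Anything else is reported as an error
 * with errno set to EIO instead of being passed on to the caller. */
ssize_t checked_read(read_fn do_read, int fd, void *buf, size_t count)
{
    ssize_t n = do_read(fd, buf, count);

    if (n < -1 || (n >= 0 && (size_t)n > count)) {
        errno = EIO;
        return -1;
    }
    return n;
}

/* Stand-in for a misbehaving system: claims to have read more bytes
 * than were requested, which the contract never permits. */
ssize_t lying_read(int fd, void *buf, size_t count)
{
    (void)fd;
    (void)buf;
    return (ssize_t)count + 1;
}
```

Passing the real read as the first argument, as in checked_read(read, fd, buf, sizeof buf), gives the ordinary behavior plus the range check. The cost is one extra comparison per call, which is the kind of overhead most programs decline to pay.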

At first, it would seem that the situation is already hopeless. An operating system has (if it is correctly implemented) complete control of the hardware, meaning that userspace program code runs subject to the terms and conditions of the operating system. If the program does something that the operating system does not like, it will murder the program, no questions asked.

Yet the situation is not hopeless. Given a suitable runtime library that itself does not trust the operating system, and yet presents a standard C API, it is possible to have a program that (mostly) does not trust its host operating system. There are a few caveats. The main one is that extremely low-level interfaces (e.g., where a program interacts with a GPIO line or a memory- or port-based I/O device) cannot have the implied trust removed from them; the operating system can easily step in and simulate the device, acting as a man in the middle, without the program being able to detect it directly.
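A single building block of such a runtime library might look like the sketch below: instead of assuming that freshly mapped anonymous memory has the documented properties (page-aligned and zero-filled), it checks them before handing the memory to the rest of the program. The name checked_anon_map is hypothetical, and the sketch assumes a system that provides MAP_ANONYMOUS, as Linux and the BSDs do. It demonstrates only the checking style; an operating system that wanted to interfere could still alter the memory after the check.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of private anonymous memory and
 * verify what the documentation promises about it. Returns NULL if the
 * mapping fails or if any promise turns out not to hold. */
void *checked_anon_map(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (len == 0 || page <= 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED || p == NULL)
        return NULL;

    /* Promise 1: the mapping starts on a page boundary. */
    if ((uintptr_t)p % (uintptr_t)page != 0) {
        munmap(p, len);
        return NULL;
    }

    /* Promise 2: anonymous memory starts out filled with zeros. */
    const unsigned char *bytes = p;
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] != 0) {
            munmap(p, len);
            return NULL;
        }
    }
    return p;
}
```

An allocator built on top of a helper like this could still present the usual malloc-style interface to the program, which is the point: the distrust lives inside the library, not in every caller. Scanning every byte is deliberately paranoid and slow, which hints at why such libraries are rare.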

There are really only two requirements for a program that does not trust its host operating system:

“But that’s impossible!” No, it isn’t. Not really. When you think really hard about it, you realize that there is precisely one component in the system that needs to see both the instruction and the data on which it is to operate, and that is the hardware itself. The operating system is just a facilitator to get your program running on the hardware for its allocated slice of time. In every operating system other than Microsoft Windows, a program starts like this:

In Microsoft Windows, the process is similar to the one shown above, but instead of starting in an empty virtual address space, several objects and modules are preloaded into the virtual address space. Additionally, the details of the dynamic linker on that platform are unknown (to me, that is; yes, I am aware that I could study them in ReactOS, but since I do not operate [primarily] on Microsoft Windows, such low-level details are not worthwhile for me to learn right now).

Other Strategies

As long as the two requirements above are met, the strategy used doesn’t matter. Here are a few other possibilities that can work to solve the problem:

Conclusion

It’s a lot of effort, and it’s probably not worth it for most programs. The lack of a standard C runtime (and honestly, who has time to make one from scratch that performs paranoid checking everywhere?) makes it an extremely unattractive option, and nonviable for nearly all programs. Even programs that require highly secure operational environments are likely to have those environments provided by administrative, and not technical, processes.

Thanks for reading.
