Mike's Place » Web Service Backend Protocols

There are several popular choices when it comes to which protocol to use for Web service backend software. In this post, I take a look at the available options and (spoiler alert) come to the conclusion that HTTP is the winner. Yes, HTTP. The very same protocol used to communicate with the Web browser should also be the one used to communicate between the Web server and the application itself.

The Choices

If I spent several weeks scouring the Internet, I could probably find all sorts of crazy, ad-hoc choices to add to this list. I’m pragmatic, so I’m not going to do that. Instead, I’m going to focus on the most commonly selected and supported choices:

CGI: Nearly as old as the Web server itself, CGI (the Common Gateway Interface) was used to run a program in response to an HTTP request. The program would be expected to return response headers (optionally including a Status header), followed by a blank line and the response body. This required a fork(2)+exec(2) every single time a request for a dynamic resource came in. If several thousand requests came all at once, the server would be unable to handle them all, and CGI was inherently machine-local, which limited scalability in a big way.
FastCGI: An alternative to CGI. Advantages include support for sockets (including persistent, multiplexed connections) and higher operational efficiency over plain CGI. However, FastCGI is complex when compared to CGI or even HTTP.
SCGI: Another alternative to CGI, it emphasizes simplicity (SCGI stands for “Simple CGI”). A single stream socket handles a single request/response cycle; there is no reuse of a connection, nor any multiplexing of requests and responses over the channel. SCGI therefore requires at least one socket per active request, which can be troublesome when the HTTP server is handling many thousands of concurrent requests.
HTTP: Of course, there is HTTP itself. After all, it is what the Web browser uses to request resources from the Web server, so why shouldn’t the Web server use it to talk to an application back-end? It already has the code to handle it, and as I’ll get to shortly, it has a significant number of advantages over the alternatives. Yeah, yeah, I know, with the spoilers.

We’ll analyze each of these in order.

Common Gateway Interface (CGI)

The Common Gateway Interface, most commonly known as CGI, is the O.G. of dynamic, out-of-process interfaces between Web servers and Web applications. Virtually every Web server that was around when Netscape Navigator was popular, and every one derived from one of those, has support for CGI. However, many new Web servers do not support CGI, since it often makes little sense these days.

CGI is conceptually quite simple:

A client (for example, a Web browser) issues an HTTP request to the Web server.
The Web server spools the request. Once it has been received, it parses the request and establishes the CGI Environment, typically by using fork(2) to create a child process, then by clearing the process environment and setting it to the values provided by the HTTP request itself before exec(2)’ing the CGI program.
If the HTTP request had a body, it is fed to the CGI program’s standard input.
The Web server then waits for a response from the CGI program. CGI does not define streaming outputs, however—so a Web server may spool, stream, or chunk a response from a CGI program at its own discretion.
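
The steps above can be sketched in a few lines of Python. This is only an illustration, not how any particular Web server does it: the function names are made up for this post, the meta-variable names follow CGI/1.1 (RFC 3875), and the CGI program is called in-process here, where a real server would fork(2) and exec(2) it. It also assumes the program ends its header lines with CRLF, which real CGI programs do not always do.

```python
import io


def build_cgi_environ(method, path, query, headers, body_length):
    """Map an HTTP request onto CGI meta-variables (names per RFC 3875)."""
    environ = {
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "SCRIPT_NAME": path,
        "QUERY_STRING": query,
        "SERVER_PROTOCOL": "HTTP/1.1",
    }
    if body_length:
        environ["CONTENT_LENGTH"] = str(body_length)
    for name, value in headers.items():
        key = name.upper().replace("-", "_")
        if key == "CONTENT_TYPE":
            environ["CONTENT_TYPE"] = value
        elif key != "CONTENT_LENGTH":
            # Every other request header is exposed with an HTTP_ prefix.
            environ["HTTP_" + key] = value
    return environ


def cgi_program(environ, stdin, stdout):
    """A toy CGI program: echoes the request body back as plain text."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length)
    stdout.write(b"Status: 200 OK\r\n")
    stdout.write(b"Content-Type: text/plain\r\n")
    stdout.write(b"\r\n")  # blank line separates headers from the body
    stdout.write(b"You sent: " + body)


def parse_cgi_output(raw):
    """Split CGI output into (status, headers, body), as a Web server would."""
    head, _, body = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.decode("latin-1").split("\r\n"):
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # The Status header is optional; 200 OK is assumed when it is absent.
    status = headers.pop("status", "200 OK")
    return status, headers, body


# One request/response cycle, with in-memory streams standing in for pipes.
env = build_cgi_environ("POST", "/cgi-bin/echo", "", {"Content-Type": "text/plain"}, 5)
out = io.BytesIO()
cgi_program(env, io.BytesIO(b"hello"), out)
status, headers, body = parse_cgi_output(out.getvalue())
```

Note that nothing in this exchange says when the server should start forwarding the body to the client, which is the streaming ambiguity mentioned in the last step.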

Functionally, CGI is a simple proxy between HTTP and the standard I/O of a program specially designed to speak the CGI protocol. It requires very little in the way of resources: when CGI was designed, a server often had just one single-core processor that ran no faster than 66 MHz, and often had less than 128 MiB of RAM. CGI programs were often written in C and compiled rather than written as interpreted scripts, which were mostly for system administration. However, there is a time cost involved in creating a child process, as well as in replacing the current process image with a new one, and this must happen on every request. CGI thus becomes problematic when there are more requests than available slots in the process table!

Even if CGI defined a method of streaming responses, the number of concurrent requests would still be limited to what could be serviced by a single system. That system would fail to service requests after running out of sockets or processes. Since CGI cannot do networking, there is no way to spread the load between systems.

Conclusion

CGI is fine for a site with light traffic. Any more than that, and the load (either from CGI or the application logic itself) will become too great for a small system. Even on a larger system, the Slashdot effect can cause a DoS by virtue of consuming all of the host system’s file descriptors or process table entries—or, if the request handling process involves a great deal of memory usage or I/O, through resource starvation.

Obviously, more concurrent requests can be serviced if a system has more processors, more memory, and more IP addresses—as well as an increased maximum process table size. However, CGI performance begins to suffer when:

There are too many processes for the hardware or scheduler (multitasking breakdown).
Each request requires one or more connections to another resource (such as a database or backend server), which must be set up and torn down at each request (and further limits the number of requests that may be serviced at a single time without queueing them up).
Requests run long enough that the system runs low on memory, process table entries, file descriptors, or some other finite resource that is required in order to continue servicing requests. While it is possible to write a CGI program to return a 202 Accepted, close its standard output, and continue processing in the background, it is also likely that the Web server will decide to terminate the CGI process once it has no more output to offer (because then what continued use is it to the Web server?), and so most CGI programs will simply defer the response until all the work is done.

FastCGI

One of the first proposed successors to CGI was FastCGI. The promise of FastCGI was scalable applications that could service requests, well, fast, and a protocol that was superior to CGI in every way. Unfortunately, that was not exactly the case. FastCGI replaced CGI with a protocol that was far more complicated than CGI was, when all that most environments really wanted was a network-enhanced CGI. FastCGI did that, but it also did much more, trying to provide functions itself which were better kept at the HTTP header layer.

In any case, the main user of FastCGI these days appears to be PHP, with some other environments supporting it for compatibility with deployments that already have it set up for some other purpose (such as PHP). It is complicated, though, so it’s really not a good idea to use it unless one is forced into that corner.

FastCGI does have advantages:

The FastCGI-enabled application server may be located anywhere in the network.
The application server is a long-running process (service or dæmon), so it does not have to warm up to handle a single request and then shut down. This allows the application server to benefit from a “warm” cache and persistent connections to resources such as databases.
Multiple connections between the HTTP server and the FastCGI server are possible, and each connection is capable of servicing more than one request, so socket setup/teardown does not have to occur frequently.
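
To make the multiplexing point concrete, here is a rough sketch of FastCGI’s record framing. The 8-byte header layout and the record type numbers are taken from the FastCGI specification as I understand it; the helper names are my own, and the sketch ignores everything else the protocol requires (begin/end request records, params, roles).

```python
import struct

FCGI_VERSION_1 = 1
FCGI_STDIN = 5   # record type: request body data
FCGI_STDOUT = 6  # record type: response data

# version, type, requestId, contentLength, paddingLength, one reserved byte
HEADER = struct.Struct("!BBHHBx")


def pack_record(rec_type, request_id, content):
    """Frame one record: 8-byte header, content, then zero padding."""
    if len(content) > 0xFFFF:
        raise ValueError("content must be split across several records")
    padding = -len(content) % 8  # aligning to 8 bytes is customary, not required
    header = HEADER.pack(FCGI_VERSION_1, rec_type, request_id, len(content), padding)
    return header + content + b"\x00" * padding


def unpack_records(stream):
    """Yield (type, request_id, content) for each record in a byte string."""
    offset = 0
    while offset < len(stream):
        _version, rec_type, request_id, length, padding = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        yield rec_type, request_id, stream[offset:offset + length]
        offset += length + padding


# Two responses interleaved on one connection, told apart by request ID.
wire = (
    pack_record(FCGI_STDOUT, 1, b"first ")
    + pack_record(FCGI_STDOUT, 2, b"second")
    + pack_record(FCGI_STDOUT, 1, b"reply")
)
responses = {}
for rec_type, request_id, content in unpack_records(wire):
    responses[request_id] = responses.get(request_id, b"") + content
```

Every byte on the connection has to be wrapped this way, and both ends have to keep per-request state, which is a fair preview of where the protocol’s complexity comes from.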

However, FastCGI’s disadvantages are worse, in my personal opinion:

The complexity of FastCGI demands complex implementation. This, in turn, increases the chance of problems which might result in failure. For example, FastCGI has the notion of “roles,” where an endpoint may be a responder, a filter, or an authorizer.
FastCGI allows a configuration where the Web server spawns the FastCGI program, and the FastCGI program can choose to exit after a single request, making it possible to adopt FastCGI and suffer worse performance.
FastCGI is not as flexible as bare HTTP, which is perhaps the biggest reason not to use it.
FastCGI specifies itself to be an emulation of CGI/1.1, so it inherits the uncertainty as to streaming replies.

Conclusion

FastCGI is best kept around for legacy purposes. It shouldn’t be configured for new applications—it should only be used to speak to already-deployed systems.

SCGI

SCGI, as implied by the name, is simple. So simple, in fact, that it can be described very concisely:

HTTP server parses the request, collecting the headers into a netstring and appending the request body, if there is one.
HTTP server sends the assembled request to the SCGI server.
HTTP server waits for SCGI server to send an HTTP response.
SCGI and HTTP servers close the connection.
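
As a rough illustration of how little there is to it, the request encoding can be sketched in Python. The wire format (a netstring of NUL-terminated name/value pairs, with CONTENT_LENGTH first and an SCGI header set to 1, followed by the body) is from the SCGI specification; the function names are mine, and error handling is minimal.

```python
def encode_scgi_request(headers, body=b""):
    """Encode a request: a netstring of headers, then the raw body."""
    # CONTENT_LENGTH must come first, and the SCGI header must be present.
    items = [("CONTENT_LENGTH", str(len(body))), ("SCGI", "1")]
    items += [(k, v) for k, v in headers.items() if k not in ("CONTENT_LENGTH", "SCGI")]
    payload = b"".join(
        name.encode("latin-1") + b"\x00" + value.encode("latin-1") + b"\x00"
        for name, value in items
    )
    # Netstring framing: <decimal length>:<payload>,
    return str(len(payload)).encode("ascii") + b":" + payload + b"," + body


def decode_scgi_request(data):
    """Decode a complete request into (headers, body)."""
    length, _, rest = data.partition(b":")
    size = int(length)
    payload, comma, body = rest[:size], rest[size:size + 1], rest[size + 1:]
    if comma != b",":
        raise ValueError("malformed netstring")
    fields = payload.split(b"\x00")[:-1]  # drop the empty piece after the last NUL
    pairs = zip(fields[0::2], fields[1::2])
    return {k.decode("latin-1"): v.decode("latin-1") for k, v in pairs}, body


request = encode_scgi_request(
    {"REQUEST_METHOD": "POST", "REQUEST_URI": "/deepthought"},
    b"What is the answer to life?",
)
headers, body = decode_scgi_request(request)
```

That is essentially the whole request side of the protocol; the response is just whatever the SCGI server writes back before the connection closes.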

There are no roles (as in FastCGI), there is no multiplexing, and there are no held-open connections.

SCGI is fine for use today, but it will always perform worse than an HTTP application server, because every single request is a single socket connection, which is SCGI’s biggest failing. If SCGI at least had connection reuse, it would be much more useful—and efficient, too.

Advantages:

Like FastCGI, can be anywhere in the network.
Has a very simple, easy to implement protocol.

Disadvantages:

Specifies no provision for streaming responses!
No connection reuse or request multiplexing.

Conclusion

SCGI is fine, but HTTP is better. For extremely high-traffic, high-throughput sites, SCGI may not be a viable option unless the hardware budget is large.

HTTP

Alright, so then we come to the main player: HTTP.

There is one major reason that it wins, even if we fail to consider any other options: it is the protocol spoken between the client and the Web server. If it is also the protocol spoken between the Web server and the application server, then it is truly lossless. The biggest failing of the other choices (CGI, FastCGI, and SCGI) is that they do not have 100% of the capabilities of HTTP. You know what does have 100% of the capabilities of HTTP? HTTP.

HTTP has all the advantages:

It is simple to implement. Not as simple as SCGI, but simple enough. It is also possible to simplify the implementation greatly by breaking a few rules, and then simply requiring that a proxy be used for accessibility by devices speaking the full version of the protocol.
It is well-defined and battle-tested.
Here in 2019, it is a universal protocol, in that it exists everywhere. It happened that way for a reason: HTTP (and HTTPS) are typically allowed through firewalls, and HTTP is robust enough to support just about any type of API. Natural selection at work, in the technical world.
It supports pipelining and connection reuse.
It supports streaming responses.
HTTP itself can be pipelined and streamed: an HTTP server can begin sending the request to the middleware or application server as soon as it has received enough to ensure that it meets the requirements (is authenticated, etc.); also, an HTTP server can start sending the reply as soon as it starts receiving it from the middleware or application server.
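
The streaming point in the last item can be sketched with a small generator. This assumes the response is relayed using HTTP/1.1 chunked transfer encoding, which is only one way a server might do it, and the function name is hypothetical. Each piece received from the application server is framed and handed on immediately, without waiting for the rest of the body.

```python
def stream_chunked(upstream):
    """Re-emit upstream body pieces as HTTP/1.1 chunks as soon as each arrives."""
    for piece in upstream:
        if piece:  # skip empty pieces: a zero-length chunk would end the body
            yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk marker, with no trailer fields


def slow_backend(log):
    """Stand-in for an application server that produces output in pieces."""
    for piece in (b"hello", b" world!"):
        log.append("backend produced %d bytes" % len(piece))
        yield piece


log = []
relay = stream_chunked(slow_backend(log))
first_chunk = next(relay)  # available before the backend has produced piece two
pieces_seen_so_far = len(log)
remaining = list(relay)
```

Because the relay is lazy, the first chunk is ready to send to the client while the backend has produced only one piece, which is exactly the behaviour that CGI, FastCGI, and SCGI leave undefined.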

These are some pretty significant benefits, with perhaps the largest consequence being that HTTP can serve as a one-size-fits-all solution for data transfer and network service APIs alike. It’s actually perfect, because a wide variety of middleware already knows how to work with HTTP, and HTTP is naturally amenable to extension in this way.

As an example, a proprietary authentication system can be integrated with Web infrastructure by writing a middleware which implements the following policies:

Each request must carry with it a set of authentication credentials.
Those credentials must match a set of credentials already stored in the authentication database.

Now, without modifying anything about the underlying resource server, we’ve added a layer of security to our operations. Perhaps at a later time, an object storage server can be interfaced to the Web via an OAuth2 middleware.
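
A minimal sketch of such a middleware, under some loud assumptions: the credentials arrive as HTTP Basic authentication, the “authentication database” is a plain dictionary of plaintext passwords (a real one would store hashes), header names are already normalized to the capitalization shown, and the handler signature is invented for this example rather than taken from any framework.

```python
import base64
import hmac


def check_basic_auth(headers, credentials):
    """Return True if the request carries credentials matching the store."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "basic":
        return False  # policy 1: every request must carry credentials
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:  # covers bad base64 and bad UTF-8 alike
        return False
    user, _, password = decoded.partition(":")
    expected = credentials.get(user)
    # policy 2: the credentials must match the stored ones (constant-time compare)
    return expected is not None and hmac.compare_digest(
        password.encode("utf-8"), expected.encode("utf-8")
    )


def auth_middleware(backend, credentials):
    """Wrap a backend handler, rejecting requests that fail either policy."""
    def handler(method, path, headers, body):
        if not check_basic_auth(headers, credentials):
            return 401, {"WWW-Authenticate": 'Basic realm="example"'}, b""
        return backend(method, path, headers, body)
    return handler


def resource_server(method, path, headers, body):
    """Stand-in for the unmodified resource server behind the middleware."""
    return 200, {"Content-Type": "text/plain"}, b"ok"


app = auth_middleware(resource_server, {"alice": "s3cret"})
token = base64.b64encode(b"alice:s3cret").decode("ascii")
accepted = app("GET", "/", {"Authorization": "Basic " + token}, b"")
rejected = app("GET", "/", {}, b"")
```

The resource server never sees a rejected request and does not need to know the middleware exists, because both sides of the middleware speak plain HTTP semantics.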

Conclusion

As I stated at the outset, HTTP is the clear winner here. It should probably be considered a best practice to build infrastructures which consist of end-to-end HTTP.

Thanks for reading.
