There are several popular choices when it comes to which protocol to use for Web service backend software. In this post, I take a look at the available options and—spoiler alert—come to the conclusion that HTTP is the winner. Yes, HTTP. The very same protocol used to communicate with the Web browser should also be the one used to communicate between the Web server and the application itself.
If I spent several weeks scouring the Internet, I could probably find all sorts of crazy, ad-hoc choices to add to this list. I’m pragmatic, so I’m not going to do that. Instead, I’m going to focus on the most commonly selected and supported choices:

- CGI
- FastCGI
- SCGI
- HTTP

The oldest of these, CGI, required the server to exec(2) a new process every single time a request for a dynamic resource came in; if several thousand requests came all at once, the server would be unable to handle them all. And CGI was inherently machine-local, which limited scalability in a big way. Its successors arose to address exactly these problems.
We’ll analyze each of these in order.
The Common Gateway Interface—most commonly known as CGI—is the O.G. of dynamic, out-of-process interfaces between Web servers and Web applications. Virtually every Web server that was around when Netscape Navigator was popular, and every one derived from one of those, has support for CGI. However, many newer Web servers do not support CGI, since it often makes little sense these days.
CGI is conceptually quite simple: the Web server services each request for a dynamic resource by calling fork(2) to create a child process, then by clearing the process environment and setting it to the values provided by the HTTP request itself before exec(2)’ing the CGI program. The request body, if any, is fed to the program on its standard input, and whatever the program writes to its standard output is returned as the response.
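To make the mechanics concrete, here is a sketch of the whole round trip. The CGI program is written in Python purely for brevity (historically it would have been compiled C), the environment variable names are the standard CGI/1.1 ones, and the server half is simulated with subprocess, which performs the same fork-and-exec dance on our behalf.

```python
import subprocess
import sys
import textwrap

# A minimal CGI program: it reads the request from its environment and
# writes a header block plus a body to standard output.
CGI_PROGRAM = textwrap.dedent("""\
    import os
    query = os.environ.get("QUERY_STRING", "")
    print("Content-Type: text/plain")
    print()  # a blank line ends the CGI header block
    print(f"method={os.environ.get('REQUEST_METHOD')} query={query}")
""")

def run_cgi(method: str, query: str) -> str:
    # The Web server's half of the bargain: build a fresh environment
    # from the HTTP request and exec the program -- one process per request.
    env = {
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "QUERY_STRING": query,
        "SERVER_PROTOCOL": "HTTP/1.1",
    }
    result = subprocess.run(
        [sys.executable, "-c", CGI_PROGRAM],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout

print(run_cgi("GET", "name=world"))
```

Every call to run_cgi pays the full process-creation cost, which is precisely the problem discussed next.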
Functionally, CGI is a simple proxy between HTTP and the standard I/O of a program specially designed to speak the CGI protocol. It requires very little in the way of resources—when CGI was designed, a server often had just one single-core processor that ran no faster than 66 MHz, and often had less than 128 MiB of RAM. CGI programs were written in C and compiled, not written as interpreted scripts—those were for system administration. However, there is a time cost involved in creating a child process, as well as in replacing the current process image with a new one, and this must happen on every request. CGI thus becomes problematic when there are more requests than available slots in the process table!
Even if CGI defined a method of streaming responses, the number of concurrent requests would still be limited to what could be serviced by a single system, and the system would fail to service requests after running out of sockets or processes. Since CGI cannot do networking, there is no way to spread the load between systems.
CGI is fine for a site with light traffic. Any more than that, and the load (either from CGI or the application logic itself) will become too great for a small system. Even on a larger system, the Slashdot effect can cause a DoS by virtue of consuming all of the host system’s file descriptors or process table entries—or, if the request handling process involves a great deal of memory usage or I/O, through resource starvation.
Obviously, more concurrent requests can be serviced if a system has more processors, more memory, and more IP addresses—as well as an increased maximum process table size. However, CGI performance begins to suffer when:

- requests arrive faster than child processes can be created and reaped;
- every request must repeat whatever initialization the application needs (such as opening a database connection) from scratch;
- the number of in-flight requests approaches the size of the process table, or exhausts the supply of file descriptors.

Nor can a CGI program usefully defer its work. Even if it were to respond immediately with 201 Created, close its standard output, and continue processing in the background, it is likely that the Web server would decide to terminate the CGI process once it has no more output to offer (because then what continued use is it to the Web server?), and so most CGI programs will simply defer the response until all the work is done.
One of the first proposed successors to CGI was FastCGI. The promise of FastCGI was scalable applications that could service requests, well, fast, and that it was superior in every way to CGI. Unfortunately, that was not exactly the case. FastCGI replaced CGI with a protocol that was far more complicated than CGI, when all that was really wanted by most environments was a network-enhanced CGI. FastCGI did that, but it also did much more, trying to provide functions itself which were better kept at the HTTP header layer.
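To give a feel for that complexity, here is a sketch of just the fixed binary framing FastCGI puts around every message, following the record format in the FastCGI 1.0 specification (the FCGI_* names come from that spec). This builds only the very first record a Web server sends; actual parameters and request bodies each need further records of their own.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_BEGIN_REQUEST = 1   # record type
FCGI_RESPONDER = 1       # role
FCGI_KEEP_CONN = 1       # flag: hold the connection open for reuse

def fcgi_record(rec_type: int, request_id: int, content: bytes) -> bytes:
    # Every FastCGI message is wrapped in this 8-byte binary header:
    # version, type, request id, content length, padding length, reserved.
    padding = -len(content) % 8  # pad to an 8-byte boundary, as recommended
    header = struct.pack(
        ">BBHHBx", FCGI_VERSION_1, rec_type, request_id, len(content), padding
    )
    return header + content + b"\x00" * padding

def begin_request(request_id: int) -> bytes:
    # The BEGIN_REQUEST body: a 2-byte role, a 1-byte flags field,
    # and 5 reserved bytes.
    body = struct.pack(">HB5x", FCGI_RESPONDER, FCGI_KEEP_CONN)
    return fcgi_record(FCGI_BEGIN_REQUEST, request_id, body)

record = begin_request(1)
print(record.hex())  # 16 bytes on the wire before a single header is sent
```

Compare this with CGI, where "framing" is simply the process environment and standard I/O.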
In any case, the main user of FastCGI these days appears to be PHP, with some other platforms supporting it for compatibility with infrastructure that has already been set up to speak FastCGI for some other purpose (such as PHP). It is complicated, though, so it’s really not a good idea to use it unless one is forced into that corner.
FastCGI does have advantages:

- it runs over a socket rather than standard I/O, so the application can live on a different machine and the load can be spread between systems;
- application processes are long-lived, so CGI’s per-request fork(2)-and-exec(2) cost disappears;
- connections between the Web server and the application can be held open, and even multiplexed, across requests.
However, FastCGI’s disadvantages are worse, in my personal opinion:

- it is a binary, record-framed protocol, far harder to implement and to debug than CGI’s environment-plus-standard-I/O model;
- it defines roles (responder, authorizer, filter) and request multiplexing that most deployments never use, but which every implementation has to account for;
- it tries to provide functions, such as authorization, which were better kept at the HTTP header layer.
FastCGI is best kept around for legacy purposes. It shouldn’t be configured for new applications—it should only be used to speak to already-deployed systems.
SCGI, as implied by the name, is simple. So simple, in fact, that it can be described very concisely: for each request, the Web server opens a connection to the application and sends the request headers as a netstring-wrapped block of null-terminated name/value pairs (beginning with CONTENT_LENGTH, and including SCGI with a value of 1), followed by the request body; the application writes its response back over the same connection, which is then closed.
There are no roles (as in FastCGI), there is no multiplexing, and there are no held-open connections.
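The Web server’s side of the protocol fits in a few lines. This sketch follows the SCGI protocol specification; the particular header names shown are just those of a typical GET request.

```python
def scgi_request(headers: dict, body: bytes = b"") -> bytes:
    # SCGI mandates that CONTENT_LENGTH come first, and that the
    # pseudo-header "SCGI" with value "1" be present.
    pairs = [("CONTENT_LENGTH", str(len(body))), ("SCGI", "1")]
    pairs += list(headers.items())
    blob = b"".join(
        k.encode() + b"\x00" + v.encode() + b"\x00" for k, v in pairs
    )
    # The header block is wrapped in a netstring: "<length>:<data>,"
    netstring = str(len(blob)).encode() + b":" + blob + b","
    return netstring + body  # the body follows, completely raw

req = scgi_request({"REQUEST_METHOD": "GET", "REQUEST_URI": "/hello"})
print(req)
```

That is the entire request side; the response is simply whatever bytes the application writes back before the connection closes.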
SCGI is fine for use today, but it will always perform worse than an HTTP application server, because every single request costs its own socket connection; that is SCGI’s biggest failing. If SCGI at least had connection reuse, it would be much more useful—and more efficient, too.
SCGI is fine, but HTTP is better. For extremely high-traffic, high-throughput sites, SCGI may not be a viable option unless the hardware budget is large.
Alright, so then we come to the main player: HTTP.
There is one major reason that it wins, even if we fail to consider any other options: it is the protocol spoken between the client and the Web server. If it is also the protocol spoken between the Web server and the application server, then the exchange is truly lossless. The biggest failing of the other choices—CGI, FastCGI, and SCGI—is that they do not have 100% of the capabilities of HTTP. You know what does have 100% of the capabilities of HTTP? HTTP.
HTTP has all the advantages:

- nothing is lost in translation, because no translation happens between the client-facing protocol and the backend protocol;
- connections can be kept alive and reused across requests, and responses can be streamed or chunked;
- standard HTTP semantics (status codes, headers, caching, content negotiation) are available end to end;
- the application server can live anywhere the network reaches, so load can be spread across machines trivially;
- an enormous ecosystem of proxies, load balancers, caches, and debugging tools already speaks it.
These are some pretty significant benefits, with perhaps the largest consequence being that HTTP can serve as a one-size-fits-all solution for data transfer and network service APIs alike. It’s actually perfect, because a wide variety of middleware already knows how to work with HTTP, and HTTP is naturally amenable to extension in this way.
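As a small illustration of one advantage SCGI in particular cannot match, here is connection reuse in action. This sketch uses only Python’s standard library, with a throwaway echo server standing in for an application server; two requests travel over a single TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 defaults to persistent connections

    def do_GET(self):
        body = self.path.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

# A throwaway "application server" on an ephemeral port.
server = ThreadingHTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One TCP connection, two requests: keep-alive in action.
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
results = []
for path in ("/first", "/second"):
    conn.request("GET", path)
    response = conn.getresponse()
    results.append((response.status, response.read().decode()))
conn.close()
server.shutdown()
print(results)
```

With SCGI, each of those two requests would have required its own socket; here the second request rides for free on the first one’s connection.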
As an example, a proprietary authentication system can be integrated with Web infrastructure by writing a middleware which implements the following policies:

- a request that does not carry a valid credential is answered immediately with 401 Unauthorized, and never reaches the resource server;
- a request that does carry a valid credential is forwarded to the resource server untouched.
Now, without modifying anything about the underlying resource server, we’ve added a layer of security to our operations. Perhaps at a later time, an object storage server can be interfaced to the Web via an OAuth2 middleware.
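A minimal sketch of such a middleware, using Python’s WSGI calling convention as a stand-in for "speaks HTTP on both sides". The bearer-token scheme and SECRET_TOKEN are hypothetical, purely for illustration; a real system would check the credential against the proprietary authentication backend.

```python
SECRET_TOKEN = "s3cret"  # hypothetical credential, for illustration only

def require_token(app):
    # Middleware: wraps any resource server ("app") without modifying it,
    # enforcing the authentication policy entirely at the HTTP layer.
    def guarded(environ, start_response):
        if environ.get("HTTP_AUTHORIZATION") != f"Bearer {SECRET_TOKEN}":
            start_response("401 Unauthorized",
                           [("Content-Type", "text/plain")])
            return [b"missing or invalid credential\n"]
        return app(environ, start_response)  # valid: forward untouched
    return guarded

# A stand-in resource server that knows nothing about authentication.
def resource_server(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]

app = require_token(resource_server)
```

Note that resource_server is completely unaware of the policy wrapped around it, which is the whole point.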
As I stated at the outset, HTTP is the clear winner here. It should probably be considered a best practice to build infrastructures which consist of end-to-end HTTP.
Thanks for reading.
If you appreciated this article (or anything else I’ve written), please consider donating to help me out with my expenses—and thanks!