Some thoughts on Web programming…

May 2nd
Posted by Michael Trausch  as computing, programming, random thoughts

I used to write a lot of projects in PHP whenever I needed to create a Web interface to something. However, I have recently spent some time learning other languages (C# and Java being two of them), and I am beginning to realize that I would rather use one of them for anything larger than the smallest of projects. Of course, their hosting environments differ widely, and I don’t know how they compare with the Apache PHP module in terms of performance, but I find that the way I think about programming is more easily expressed in either of those two languages than in PHP.

I used to think that language features found in the C family, notably strongly typed variables (variables whose types are declared in the source), were nothing but a painful nuisance. At least, I thought that way until a few weeks ago, when I started work on a PHP project after spending a lot of time working in Java for a class and in C# out of my own interest. And I realized something: languages that require strong typing of variables might be inconvenient sometimes, but they build a lot more documentation into the program simply by requiring it. In a language such as PHP, you cannot tell what a variable’s type is by looking at its name. If you follow naming conventions (and not all people do), you might be able to create little “hints” to the types, like $isAuthenticated for a boolean value or $startedWhen for a timestamp, date, time, or similar type. Hungarian notation might work for PHP, too, but I have always found it rather hard to read. Instead, properly written code in a strongly typed language makes it very clear what a variable is, what it does, and what it may contain. I find this fascinating, simply because I have always disliked what I perceived to be the inflexibility of strong data types for variables.
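
To make the point about declared types concrete, here is a minimal Java sketch of the two variables mentioned above. The class name SessionInfo and its methods are hypothetical, invented only for illustration, and the example assumes the java.time classes, which are newer than this post.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical class, for illustration only.
public class SessionInfo {
    // The declared types say what each field may hold; no naming hint is needed.
    private final boolean authenticated;
    private final Instant startedWhen;

    public SessionInfo(boolean authenticated, Instant startedWhen) {
        this.authenticated = authenticated;
        this.startedWhen = startedWhen;
    }

    public boolean isAuthenticated() {
        return authenticated;
    }

    // The signature alone documents that this takes a point in time
    // and returns a length of time.
    public Duration age(Instant now) {
        return Duration.between(startedWhen, now);
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2008-05-02T12:00:00Z");
        SessionInfo session = new SessionInfo(true, start);
        System.out.println(session.isAuthenticated());                        // prints true
        System.out.println(session.age(start.plusSeconds(90)).getSeconds());  // prints 90
    }
}
```

Passing a string where the Instant is expected fails at compile time, which is the kind of built-in documentation described above.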

Of course, this isn’t the only reason I would want to work with one of those two languages. I absolutely love the idea of generics (yes, I know that C++ has something similar in templates… I never did play with that language enough to “get it,” apparently). I like the simple power of the languages and their associated class libraries: you can create a dynamically sized array in a way that feels natural and non-hackish by using one of the generic classes known as “collections”, which goes a long way towards readability (and, as strange as it sounds, that power partly comes from the extra documentation provided by the strong types used with them). Given that “code is read much more often than it is written”, features like this, which reduce programmer time and remain readable, are things that I find useful.
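
As a rough illustration of a dynamically sized array built from a generic collection, here is a small Java sketch using ArrayList from the standard library. The GuestList class and its methods are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only.
public class GuestList {
    // List<String> states the element type up front; the list grows as
    // needed, and the compiler rejects anything that is not a String.
    private final List<String> names = new ArrayList<>();

    public void add(String name) {
        names.add(name);
    }

    public int size() {
        return names.size();
    }

    // Returns the longest name added so far, or "" if the list is empty.
    public String longest() {
        String best = "";
        for (String name : names) {   // no casts needed when reading elements
            if (name.length() > best.length()) {
                best = name;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        GuestList guests = new GuestList();
        guests.add("Paul");
        guests.add("Michael");
        System.out.println(guests.size());     // prints 2
        System.out.println(guests.longest());  // prints Michael
    }
}
```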

There is only one real downside to both C# and Java when compared to a language like C: they are far, far slower. What they essentially do is trade programmer time for system processing time. Depending on an application’s lifecycle, and with many applications running at the same time on a system, the added processing time could indeed exceed the time the programmer saved writing the program to begin with. The C# runtime tries to mitigate this by compiling the program’s bytecode “just-in-time” so that native code runs, which benefits code that executes repeatedly. Java does the same thing with HotSpot, though only in the areas of a program that the Java VM decides are worth compiling as it analyzes the program during execution. Still, they are slower and use more memory. PHP is probably slower than C as well. I think it would be interesting to compare PHP against C# and Java in a Web environment, though I have not done that yet, since I am not aware of any easy way to do it equitably.
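
One rough way to see whether repeatedly executed code benefits from just-in-time compilation is to time the same method several times within a single run of the VM. The sketch below is only that, a sketch: the WarmupTiming class, the workload, and the round count are all made up for illustration, the timings will vary from machine to machine, and it is not a rigorous benchmark.

```java
// Hypothetical class, for illustration only.
public class WarmupTiming {
    // A small, deterministic workload: the sum of (i * i) % 7 for i in [0, n).
    public static long work(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i % 7;
        }
        return total;
    }

    // Times a single call to work(n) and returns the elapsed nanoseconds.
    public static long timeOnce(int n) {
        long start = System.nanoTime();
        long result = work(n);
        long elapsed = System.nanoTime() - start;
        if (result < 0) {
            // Never true for this workload; the check only keeps the result in use.
            throw new IllegalStateException("unexpected negative sum");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // If the VM compiles work() after the first few calls, later
        // rounds may report shorter times than earlier ones.
        for (int round = 1; round <= 10; round++) {
            System.out.println("round " + round + ": " + timeOnce(n) + " ns");
        }
    }
}
```

Because everything happens inside one process, this measures the code itself rather than the cost of starting the VM.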

Anyway, how does this pertain to Web programming? I am beginning to think that I should use PHP only for small things that do not need to be very powerful or have long maintenance lifecycles. Why? There are a ton of PHP programmers out there, to be sure, and PHP runs “out of the box” on a lot of services, such as NearlyFreeSpeech.NET, whereas something like mod_mono or Tomcat does not. With the latter choices you might be required to manage more of an application server yourself, while with PHP you can usually manage just the application and leave the server management to a crew of people who want to do that. Where, then, is the benefit?

It would appear, at least to me, that the increased investment in initially setting up (and maintaining) a Web application server with an environment for either C# or Java would be paid back in that wonderfully limited resource known as programmer time, as well as in reduced frustration. Personally, I hate re-inventing the wheel. And while there are resources for PHP that will let you avoid that, they sometimes require extensions to be installed or, often, the use of ad-hoc class libraries written for PHP, such as those found at the PHP Classes Web site (which, IMHO, is a poorly designed site to begin with). Unlike PHP, C# and Java both come with very large, powerful, and, more importantly, standard class libraries, which greatly helps programmers with both reading and writing code. There is also a lot of support for third-party class libraries (which you can find all over the place), and third-party programs such as MySQL and PostgreSQL provide interfaces for both languages, too.

All told, learning these languages has me wondering whether I should use one of them instead. While I’ve really only scratched the surface of many of the other things involved (such as the details of creating Web applications in these languages and hosting them), it seems that, overall, there are fewer headaches to be had, code can be written and debugged faster and more efficiently, and there is more convention built into the languages to govern how code is read and written by other people.

4 Comments

  1. Paul McKibben  3rd May 2008  

    Thanks for your article. I agree with you that, between PHP and Java, for a large application, I’d rather write it in Java, at least on the backend. However, in the Java world, more and more people are talking about writing their client side code in one of several dynamic languages such as JRuby (Ruby on the JVM), Jython (Python on the JVM), or Groovy (which also runs on the JVM). They say they can write the client-side code in 1/10th the time in one of these languages, compared to Java (I’m assuming they mean JSP or servlets). For me, the jury’s still out. Maybe I’m old-school, but I share your opinions about the benefits of strong typing.

    However, I disagree with your statement about Java applications being “far, far slower” than C applications. Benchmarks taken over the last few years suggest otherwise. I realize that benchmarks don’t always mimic real-world applications, but the conventional wisdom these days (and my experience agrees) is that for typical applications, Java performance is comparable to native performance. Of course, for special circumstances where performance is critical, a fine-tuned native C or C++ application is the way to go. But most real-world applications are not fine-tuned.

    Some links concerning Java performance:
    http://kano.net/javabench/
    http://www.ibm.com/developerworks/java/library/j-jtp09275.html
    http://www.idiom.com/~zilla/Computer/javaCbenchmark.html

    Thanks again for your article!
    –Paul

  2. Michael Trausch  3rd May 2008  

    @Paul:

    My statements on speed were actually based on running the simplest program I could think of, repeatedly, to get a feel for how much time was spent running just a very simple program. Here are the timings for 500 runs each of “Hello, world” in C, C#, and Java, respectively:

    mbt@zest:~/test/hw$ time { for i in `seq 1 500`; do ./hw > /dev/null; done };\
     time { for i in `seq 1 500`; do ./hello.exe > /dev/null; done };\
     time { for i in `seq 1 500`; do java hello > /dev/null; done }
    
    real	0m1.424s
    user	0m0.512s
    sys	0m1.076s
    
    real	0m36.252s
    user	0m27.202s
    sys	0m8.637s
    
    real	1m11.315s
    user	0m39.978s
    sys	0m13.297s
    
    The programs used are (for C, C#, and Java, respectively):
    
    mbt@zest:~/test/hw$ cat hw.c
    #include <stdio.h>
    
    int main() {
      printf("Hello, World\n");
    }
    mbt@zest:~/test/hw$ cat hello.cs
    using System;
    
    public class HelloWorld {
      public static void Main(String[] args) {
        Console.WriteLine("Hello, World");
      }
    }
    mbt@zest:~/test/hw$ cat hello.java
    public class hello {
        public static void main(String[] args) {
            System.out.println("Hello, World!");
        }
    }
    mbt@zest:~/test/hw$

  3. Paul McKibben  4th May 2008  

    Your results make sense given the test case. The C program is very straightforward for the machine to execute. However, to execute the same program in the CLR or JVM requires that the CLR/JVM be started up.

    So like I said when I mentioned benchmarks, they don’t necessarily represent real-world programs. Your scenario is relevant to the startup time of a desktop application, and it is true the CLR or JVM contributes to the startup time. However, when using the term “performance”, many other factors are involved. For example, the processor time devoted to memory management is a big factor in most real-world programs, and Java running in the HotSpot JVM does very well compared to C (don’t know much about the .net/mono CLR, but I would assume it also does well). I think you’ll need something more than “Hello World” to get a good comparison.

    One conclusion to draw from your scenario: the JVM/CLR will be overkill for short, straightforward programs. You’d be better off writing a shell script, but you probably knew that already. :)

    Thanks again!
    –Paul

  4. Michael Trausch  4th May 2008  

    Indeed. Shell or Python, anyway; I have just under 100 small scripts that I have written and still fairly regularly use in my ~/bin directory, for tasks that I’ve automated. Some of them actually need to be developed into more fully-featured software, when I have the time.

    What do you suppose would make a good test case, for computing the resources it would take for various things? I would, of course, be interested in creating an equitable comparison case, without spending 3 years developing it… lol
