Feb 9th
Posted by Michael Trausch as Uncategorized

Today is my birthday. In Georgia, that means the deadline to pay ad valorem taxes, get the annual emissions inspection, renew tags and all of those sorts of things. I own two vehicles: a ’92 Dodge Caravan and a ’96 Saturn SL1. It should be pretty easy to get both cars to pass inspection, right? Apparently not.

This story starts several days ago, when we attempted to take both vehicles for inspection. The tester at the inspection station tried the Saturn first, and could not test it—there is a problem with the DLC (data link connector) that reports OBD-II data; things such as whether or not the computer is ready to report on the compliance of all of the car’s emissions control subsystems. The van, which has a significant chunk of its tailpipe missing, tested just fine on the third attempt (it kept jumping off of the dynamometer during the test). The Saturn, however, “could not be tested” according to the gentleman who attempted the test.

For those who don’t know, OBD-II is a feature found on 1996 and newer model year vehicles (and is required by federal law in the United States for all such vehicles). When it’s working correctly, it simplifies all sorts of diagnostics. Unlike older cars, where you have to guess and use process-of-elimination (combined with leaning on others who have experience) in order to diagnose many simple problems, a vehicle with OBD-II can “tell you where it hurts”, so to speak. The car can say “my forward oxygen sensor is bad”, or “there is something wrong in the area of the camshaft”. It isn’t perfect, and it cannot always tell you why the car is running poorly or about every single mechanical problem there is, but it sure does help someone like me who tries to work on the vehicles himself: simple problems are very quick to diagnose and often relatively easy to fix. Complex problems at least usually have their symptoms reported, which makes things easier to try to track down in most cases.

Well, funny story about that. For about three or four months, we were having problems with the Saturn wherein the coolant temperature gauge was acting up and misreporting nearly all the time. I attempted to fix the problem by replacing the following items:

  • The coolant temperature sensor (a thermistor). I’ve had to replace it before, and it’s not that difficult. This time, however, it did not fix the problem.
  • The connector for the coolant temperature sensor. Yet, the problem persisted.
  • The PCM, which is responsible for sending reference voltage and reading the return voltage and determining the coolant’s temperature from that. Net effect: the new PCM I guess had a more up-to-date set of software on it, and the engine ran better (and the car performs better now), but that did not fix the problem.

All that work, and the temperature gauge was still not reading anything but an error. Now, I’m not one to trivialize or magnify problems, but when you have an engine that is made of aluminum, it’s usually a good idea to at least know for sure that the thing isn’t overheating. But I was stumped. The only thing I could think of at this point was the wire that connected the thermistor to the PCM computer.

Well, when we went to Toledo, I took the car to Charlie’s Automotive (really awesome people there; they have seen our cars more frequently than anyone in GA has!) and explained to them what the problem was and everything that I had attempted to fix the problem. And as it turned out, it was indeed the wire. (No, I didn’t take the car there just for that; there was a steering problem as well that I needed them to fix… I trust them with my car more than any other shop I’ve ever been to.) Anyway, so all was well. Or so I thought.

Fast forward to now. The PCM that I replaced? It’s not sending anything to the OBD-II DLC interface, which is precisely the thing that has to work in order for me to pass emissions. Or so I thought. Then, I came upon this page at the Georgia Clean Air Force Web site. One of the questions from the drop-down menu was “What tests will be performed?”, and the answer to that question was this:

All 1996 and newer vehicles will receive a three-part inspection:

  • An OBD test to check your vehicle’s emission control performance history.
  • A fuel cap inspection to check for adequate seal.
  • A visual inspection of the catalytic converter to check for tampering or removal.

If an OBD test is unable to be performed on a vehicle, it may be necessary to perform a Two-Speed Idle test (TSI).

All 1995 and older model year vehicles will receive a three-part inspection:

  • A TSI test or an Accelerated Simulation Mode (ASM2) test – A dual-mode test including a 25/25 test = 25 percent load at 25 MPH and a 50/15 test = 50 percent load at 15 MPH.
  • A fuel cap inspection to check for adequate seal.
  • A visual inspection of the catalytic converter to check for tampering or removal.

An inspector can reject a vehicle for testing if it is considered unsafe to test. If the test has already begun when the safety problem is detected, the inspector may charge the full price of the test.

Note that the emphasis there is mine. So, having read this, I printed the page out, and I proceeded to get out and get my car passed so that I would be able to renew its tags. Or so I thought; reality, as is usually the case, is vastly different.

I got to the emissions testing place, where I was told that I have to have a letter from GCAF in order to have the alternative test done. They sent me to the GCAF office for Dekalb County—which is all the way across town, and if you know anything about getting around town in Atlanta during the day, you can imagine that was just a joyful experience. I was told that they’d give me a letter and that then I could have this test done.

So, I went there. It took me 45 minutes, even though it was only 15 or so miles. Now, this is normal here.

It occurs to me that we wouldn’t need emissions testing if the real underlying problems were solved.  Back to the emissions thing in a minute: there’s a way to fix that, too. Not that anyone who is in the position to make law in this state will give a rat’s rear end.

What problems, you ask?

  • Traffic law enforcement down here is a joke—you can break damn near any traffic law in the metro Atlanta area, and the probability that you will be caught is near zero. This is probably because the police in the metro area like to spend their time pulling people over who have done nothing wrong. Case in point (and this is only one of many examples): I was once pulled over in Marietta, GA, while driving my 1992 Dodge Caravan. The officer claimed I had a license plate light out. I was questioned as to why I was driving at that time of night (it was between 1 and 2 AM, if memory serves), whether the passengers in my vehicle were any relation to me, so on and so forth. It was myself, Erica, and two friends (I was the only white person in the car, and I was its driver). When I was let go by the cops (in response to my prompt that they need to “write a ticket, arrest me for questioning, or let me go”) I was livid. I proceeded to my next stop, to drop off one of my passengers. And I checked the light by the license plate. It wasn’t out. It is our belief that the police thought I was trying to sell the “services” of my passengers.
  • Repeat offenders—like, serious repeat offenders, with stacks of tickets both paid and unpaid—get to keep their license. I’m not sure why. The state law seems pretty clear to me, so the only thing I can think of is that people are getting reduced tickets for damn near everything. My guess is that this is because the police have no interest in going to court hearings to defend their tickets, and harsher tickets are more likely to be contested, so it’s easier to slap people on the wrist. Guess what? This encourages people to behave like fucking morons because there is no reason for them not to!
  • Lack of police presence on the interstates; I’ve lived here for years, and the only time I see police on the interstate in a traffic jam is when some idiot has wrecked. Instead, there should be a *lot* of police on the interstate during rush hours, and they should be pulling people off the road that are slowing things down. During traffic jams, people change their driving behavior in a way that is seemingly designed to ensure that nobody can go anywhere: people change lanes repeatedly (causing the flow to change and be inhibited), people drive on shoulders when they’re not supposed to; the list goes on and on.
  • Lack of the ability to truly investigate accidents that shut down the interstates. The interstate system should probably be covered in cameras. That way the idiot that caused the wreck that screwed everyone’s commute and cost metro Atlanta and its counties and cities buttloads of money can have their licenses suspended and car impounded if they were unlucky enough to have survived the wreck.

Waitaminute. What about the emissions?

Right, those things.

Well, you see, cars that were made before 1996 have a special sensor that is put into their tailpipe. They are then “driven” in-place on a dynamometer at a couple of speeds, and the actual pollutants coming out of the tailpipe are measured. Cars from model year 1996 onward rely on the honor system: if the car’s computer reports that all systems are ready and that all emissions components are working, then it’s a pass. It doesn’t matter how much smoke is coming out of the tailpipe. (Funny, yeah? I’ll bet asthma sufferers think that’s just fucking hilarious.)

However, sometimes things happen, and the computer in such cars doesn’t work. The dynamometer test should be allowed to be administered even to 1996 and newer cars. Why? It’s more truthful and accurate. And I, as a taxpayer, should not have to pay a bunch of people to do a bunch of busywork just to give me approval to have a more truthful, more effective test done on my car. How much sense does that make, really? None. There is no logical basis for requiring approval.

But, the lady that I talked to at GCAF disagrees.

See, it’s going to take several weeks for my replacement PCM (which AutoZone is replacing at no cost, under warranty) to arrive. Then I have to put it in the car, and then I have to go return the old one, and only then will I be able to get a computerized emissions test. Oh, no, wait; I have to break in the new computer by executing GM’s hokey-pokey drive cycle song and dance, or it won’t report that it is ready. It can take up to 5 cycles for the computer to report that it is ready. And because of the aforementioned traffic in the metro area, it’s almost impossible to execute a drive cycle perfectly, so it can take even more drive cycles before the computer is truly confident that it is aware of everything that’s going on.

What’s the solution, then?

The lady at GCAF says: fill out a form and request non-conforming status for your car. In anywhere from three days to four weeks (!) you will have an answer. In the meantime, drive your car illegally, if you have to drive, and just don’t get caught.

Wait, what‽

Yes, that lady, who works for the state, told me to break the law and not let the Executive Branch catch me doing it.

No fucking wonder people think they can get away with so much. The government workers here are aware that the police aren’t doing their enforcement duties, and telling people to take advantage of that fact.

It can therefore be no surprise that people actually do that. I mean, what’s the harm, right?

So, here’s the fix there: Let testing stations run the old test on a car that can’t run the new test. Maybe the car will pass, maybe the car will fail. In fact, for pre-2000 cars, it’s more likely that they’ll fail than pass, but that means that I can get a fail and then spend up to the $806 required to get a waiver.

According to the way things are currently, lack of testability is not a failure, it’s an aborted test. That means nothing I do changes anything.

The real problem with government? Too many convoluted rules, regulations, procedures, and waste. If people were truly interested in accomplishing anything, they’d use more effective techniques and strategies, and not just make laws that make it look like we’re doing something useful. That cannot happen when willful violators do not get punished, but the people who generally try to adhere to the law do get punished. What a joke.

And no, this is not a problem local to Georgia, and I’m sure it’s not local to the United States. Governments everywhere are abysmal failures: they fail to consider real impact, they fail to consider the utility of people who are willing to do good, they fail to consider that we’re all in this together, and they fail to realize that pissing money down the drain on things that don’t work only serves to steal that money away from things that could make a real, useful difference.

    Feb 3rd
    Posted by Michael Trausch as The Internet, computing

    This is the third post in a series on networking. If you are joining late, please read the first and second posts before going forward.

    Today’s post is going to build on the previous two posts, and we are going to discuss the protection of your network. I’d like to start by saying that network protection takes place at multiple levels. In order to secure a network, you must work at all levels. It is impossible to secure a network by looking only at the network’s core infrastructure, or by expecting that one can centrally secure everything. Security is relevant at all levels. In this post we will talk about security at many levels, including:

    1. An individual device, such as a workstation, server, or networked appliance.
    2. At the level of a single (physical) network segment, which includes all of the workstations, servers, and networked appliances on that segment. Recall that I discussed what physical and logical network segments are when I discussed bridges and switches in my first post in this series.
    3. At the level of a whole subnetwork. For simple networks, only one subnetwork will be present, and so the security of the subnetwork will be the security of the network as a whole. For larger or more complex networks, this is security at the level of each individual subnetwork.
    4. At the level of the whole network. This is effectively the same thing as security at the subnetwork level, although the business rules might be different (and thus, any firewalling or other security measures at this layer would also follow different rules). For example, subnets might be allowed to share privileged information or services amongst themselves, but such things might not be allowed to actually leave the network as a whole. Keep in mind that your network is just a subnetwork attached to whatever your upstream network is (which might or might not be the Internet itself, depending on your network structure).

    We will also discuss things that are found at various layers of the network, and different types of systems: Client/server applications, Web applications (which are just a specific type of client/server application, when you think about it), authentication and authorization systems, file shares, printers, and so forth.

    One At a Time: The Individual Device

    Managing a network—especially one that has many network nodes—can be a very time-intensive task. There is much to do. Many of the tasks required to manage individual workstations can be automated (which becomes more and more useful as you have to manage larger numbers of network nodes). We have really only three types of network nodes on our networks:

    1. Open systems, which we are able to work with in excruciating detail. These are devices that are running operating systems and software that we have the source code available for, including distributions of GNU/Linux, non-GNU Linux systems (such as those built around Busybox), FreeBSD, NetBSD, OpenBSD, and so forth. These are the systems that we truly have the power to manage literally every aspect of.
    2. Semi-open systems, which we can do a great deal with: we can install software and maintain software on such systems, and we can follow the vendor’s requirements and security update procedures. Microsoft’s Windows operating system family and Apple’s OS X line of operating systems are examples of such systems. Also, operating systems that are built on BSD-licensed free software operating systems, but themselves are closed (proprietary) forks without source code available fall into this category. (Of note: Microsoft does make source code for Microsoft Windows available to people who have enough money to pay for the privilege. However, as I understand the terms of their source licensing program, one is not allowed to make modifications to the source code, and therefore is not truly able to manage the system with the same level of power or flexibility as any of the systems in the “open system” category.)
    3. Black boxes, which are embedded network devices that we do not have the ability to work with at all. A limited configuration interface might be available. Such devices do not typically allow for the installation of new software, such as updated network stacks, unless the vendor decides to add support for such things. These are mostly network appliances, such as SIP phones and networked printers. There may be ways to work around some of the limitations in these devices, but this depends on the interfaces that the devices make available.

    Preferences and idealism aside, realistically speaking almost all networks have systems that are types 1 and 3, and many networks have type 2 systems as well. Most small office networks consist of mostly type 2 and type 3 systems, with type 1 systems usually running things “behind the scenes”, if they have type 1 systems at all. We will focus on types 1 and 2 in this section; most type 3 devices do not have security parameters to speak of and must be protected by the network itself. There are, of course, exceptions; we will not concern ourselves with these. We will also make the assumption that these type 1 and type 2 devices are computer systems of one sort or another: desktops, laptops, servers, netbooks, and some types of smartphones (such as those that run Android, which is a non-GNU distribution built around the Linux kernel).

    In order to secure a network, the first step is to ensure that all of the devices that are on the network are themselves secure. This implies that there must be some means by which devices can be rejected from the network if they are not, but we will cover that when we take a look at security at the subnetwork level. Individual device security includes the following items (but is by no means limited to only these items):

    1. The security of the operating system kernel itself. The operating system kernel provides critical system services; it is usually tasked with the management of hardware resources. It provides a “gateway” to all of the hardware devices in the computer itself, including audio, video, storage devices, RAM, I/O ports, network interfaces, communications buses, hardware cryptographic engines, and so forth. Most operating system kernels also provide the primitives required to handle networking, beyond talking to the network interface. The Linux, FreeBSD, OpenBSD, and NetBSD kernels all handle IPv4 and IPv6 as part of the operating system kernel itself, as well as things like virtual network adapters (which make tunneling possible) and IPsec. The operating system kernel does an awful lot for us! If the operating system kernel has a security vulnerability (particularly one that permits privilege elevation without proper credentials or that permits a user to crash the kernel, resulting in a denial of service) then it becomes possible for anyone to bring the whole system down, taking all of its software along with it.
    2. The security of the core system libraries and software; that is, the userland software that provides the other half of the operating system itself. For most systems that use the Linux kernel, this part of the operating system is provided by the GNU project: the C library, all of the core utilities, the basic shell, and several other things. There are also several userland utilities that are maintained by kernel developers that provide a means by which the kernel can be managed and maintained. All of these core items are part of the operating system (and the interface that it exposes to application software). Application software doesn’t worry about performing tasks that are handled by the operating system, such as managing network interfaces (though application software might be concerned with monitoring such things). What this means is that if there is a security problem in a function in the system’s C library, any application software (or indeed, any software at all, including core operating system management utilities) that uses that function is potentially vulnerable. This makes it important to keep an eye out for reported vulnerabilities in such things.
    3. Finally, there is the application software running on the server itself. If the software running on the server is not programmed with security in mind or has not been thoroughly audited (and even if it has!) it is entirely possible that there are security problems waiting to be discovered. If you are running software that you created yourself, then you need to be especially diligent and constantly review your applications for security problems. If you are running software provided by someone else, you should keep up on security vulnerability reports, such as those from CERT’s vulnerabilities database. In fact, if you manage any type of network at all, you should be subscribed to the feed (and regularly keep up on it) so that you’re informed about new vulnerabilities as information on them becomes generally available.

    There are also other mailing lists and feeds around the Internet that are focused on reporting security vulnerability information; I would recommend that you be subscribed to as many of them as you can be, and somehow filter them so that you see only the vulnerabilities that are relevant to the systems that you run.
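The filtering idea above can be sketched in a few lines of Python. This is only an illustration: the `Advisory` record, its fields, and the sample entries are all hypothetical, and real feeds each have their own formats that you would need to parse first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """A simplified, hypothetical advisory record (not any real feed's schema)."""
    identifier: str
    product: str
    summary: str


def relevant_advisories(advisories, installed_products):
    """Keep only the advisories whose product is something we actually run."""
    wanted = {name.lower() for name in installed_products}
    return [a for a in advisories if a.product.lower() in wanted]


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    feed = [
        Advisory("EXAMPLE-0001", "OpenSSH", "Example remote issue"),
        Advisory("EXAMPLE-0002", "SomeCMS", "Example injection issue"),
        Advisory("EXAMPLE-0003", "Linux kernel", "Example privilege issue"),
    ]
    for advisory in relevant_advisories(feed, ["openssh", "linux kernel"]):
        print(advisory.identifier, advisory.product, advisory.summary)
```

Matching on an exact product name is deliberately naive; it will miss advisories that name a product differently than your inventory does, so treat a filter like this as a convenience and not as a guarantee.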

    Hire (or Create!) Help If Needed

    If you run a network that has more than one or two dozen nodes, it starts to become difficult to manage it all by yourself; not necessarily because of the number of devices involved, but because of the number of activities that are involved in keeping them secure. Certainly if you manage everything manually it becomes extremely difficult. You have system logs to go through, problems to track down and fix, and intermittent issues that sometimes seem to come from nowhere. For a network of any size, here are some good things you can do in order to be aware of the “security health” of the systems on your network:

    • Run monitoring software so that you can track uptime, memory consumption, disk space usage, inode usage (when relevant) and other aspects of all of your individual workstations. Then set the monitoring server to call your attention to any situation that arises that requires human intervention. For example, you need to know if the filesystem on any given system is more than a certain percentage full (the right threshold depends on the system and on how quickly its data grows; some systems might be fine alerting you at 95% full, while for others you might need to know significantly earlier, say at 75% full). You can monitor many different aspects of a computer system, and these can all be early warning signs.
    • Aggregate your system logs and event logs in a single place. Have a monitoring system available that can issue alerts when extraordinary events take place. Those extraordinary events might be early indications of a security issue. Learn what’s “normal” for your particular network (every network is different!) such that you can filter out normal system activity. But do not throw away log entries. Archive all of them, somehow or another, since they may be important for forensic analysis later. Aggregation of log files also makes it much more difficult for an attacker to cover their tracks by, for example, overwriting incriminating log entries on a system; perhaps a system’s individual log files are tampered with, but the aggregated copy should be in a safe location and not use the same authentication credentials as other systems on the network, in order to raise the difficulty even further.
    • Have regularly scheduled “downtime windows” that can be used for things like applying patches. Humans aren’t perfect, and are sometimes even malicious; thus it is necessary from time to time to apply patches to the software that you run. If all the software you are using comes from a distribution (such as Ubuntu) then this process is made significantly easier, because you can perform all patches and issue a reboot command in a matter of minutes.
    • If you have many machines, keep a local mirror of the package archives for the operating system distributions that you have deployed. If you use both Debian and Ubuntu on your network, for example, and you have more than half a dozen or so machines, then it’s a good idea to have a local copy of the package archives for both. You will need a decent amount of space (between 30 and 70 GB for each supported release), but it will significantly improve the time it takes to roll out updates and deploy new systems, as well as save bandwidth (since you will only have to download the updates once, instead of n times, where n is the number of systems you have deployed).
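As a rough sketch of the per-system disk space threshold described in the monitoring item above, here is what such a check might look like in Python. The paths and threshold values are assumptions for illustration; a real deployment would normally use an established monitoring package and feed its alerting system, not a print statement.

```python
import shutil


def percent_used(path):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def needs_alert(percent, threshold):
    """True when usage has reached or passed the threshold for this system."""
    return percent >= threshold


def check_paths(thresholds):
    """Check each path against its own threshold; return (path, percent) pairs to alert on."""
    alerts = []
    for path, threshold in thresholds.items():
        percent = percent_used(path)
        if needs_alert(percent, threshold):
            alerts.append((path, percent))
    return alerts


if __name__ == "__main__":
    # Hypothetical thresholds: alert at 95% for the current directory's filesystem.
    for path, percent in check_paths({".": 95.0}):
        print(f"ALERT: {path} is {percent:.1f}% full")
```

Keeping the threshold per path, instead of one global number, mirrors the point that a slow-growing system can wait until 95% while a fast-growing one might need a warning at 75%.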

    If you get to the point where you are spending all of your time performing maintenance of one sort or another, then it is time to either hire more help—or automate more maintenance. But be careful! There is only so much that can be automatically done without the supervision of a human being to notice that something just isn’t quite right…

    Create Update Alerts for “Helpless” Devices

    Nearly every network has at least one device (and frequently, many) that is “helpless”. That is, it is not able to check for updates for itself. That means that it is up to you to check for updates on its behalf. You can do this in one of a few different ways. I am a huge fan of saving myself time and automating things where possible, of course. However, that is not always possible; your automation might have to be as simple as “Create an issue in our issue management system every x days to remind me to check for updates for device d”, such that at least then you have the ability to be automatically reminded. You can then check and close the ticket when you find that no update is necessary, or close the ticket when you find an update and create a new ticket that has the goal of actually applying the update.
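The “every x days for device d” reminder is simple enough to sketch. The device names, dates, and intervals below are made up, and the step that would actually create a ticket is left out because it depends entirely on which issue management system you use.

```python
from datetime import date, timedelta


def reminder_due(last_checked, interval_days, today):
    """True when at least `interval_days` have passed since the last manual check."""
    return today - last_checked >= timedelta(days=interval_days)


def devices_needing_check(devices, today):
    """Return the sorted names of devices whose reminder interval has elapsed.

    `devices` maps a device name to (date of last check, interval in days).
    """
    return sorted(
        name
        for name, (last_checked, interval_days) in devices.items()
        if reminder_due(last_checked, interval_days, today)
    )


if __name__ == "__main__":
    # Hypothetical inventory of devices that cannot check for updates themselves.
    inventory = {
        "lobby-printer": (date(2010, 1, 1), 30),
        "sip-phone-12": (date(2010, 1, 25), 30),
    }
    for name in devices_needing_check(inventory, date(2010, 2, 3)):
        print(f"Open a ticket: check for updates for {name}")
```

A script like this could run daily from a scheduler; the part worth automating is the remembering, since the checking itself still has to be done by a person.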

    Audit Updates

    When you apply updates, if you have the ability to do so, audit them. See what vulnerabilities they fix. See if the vulnerabilities that are fixed are all of the known vulnerabilities. If you have the ability to do so, review the source code differences so that you can see what the changes are. You never know, you could find a security bug (or other bug) introduced in the patches.
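When source is available, reviewing the differences can start with an ordinary unified diff. Here is a minimal sketch using Python’s standard `difflib` module; the file name and the two versions of the text are invented for the example, and for a real update you would more likely use the project’s own version control tools.

```python
import difflib


def changed_lines(old_text, new_text, name="file"):
    """Return a unified diff between two versions of a file as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Invented before/after contents, standing in for a patched source file.
    before = "check_length(buf)\ncopy(buf)\n"
    after = "check_length(buf)\ncheck_bounds(buf)\ncopy(buf)\n"
    for line in changed_lines(before, after, name="example.c"):
        print(line)
```

The diff only tells you what changed; deciding whether a change actually closes the vulnerability, or opens a new one, is still a job for a human reader.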

    Automate Configuration When Possible

    Network security problems at the device level do not just come from bugs in software. They can also come from configuration problems. Perhaps a system (or a whole family of systems) is too permissive and does not correctly apply business rules. Perhaps the business rules themselves are flawed such that the translation from business rule to configuration is faithful, but creates a security problem. In any case, do not manage system configuration manually if you can possibly avoid it! If you have a medium-sized or large network, odds are that you’re going to have multiple systems that have similar configurations.

    Use a centralized system in order to deploy configurations to systems in your network. Also, this way, you only have to back up a single configuration database (this simplifies your needs for recovery later should you become the victim of an attack or some other form of disaster). It also gives you the ability to quickly get back on your feet when a system unexpectedly dies—and the ability to audit change control is extremely useful, as well.
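To make the idea concrete, here is a small sketch of generating per-host configuration from one shared template plus a central set of values. Everything in it (the template text, setting names, and host values) is hypothetical; dedicated configuration management tools do far more than this, including deploying the result and tracking changes.

```python
from string import Template


def render_config(template_text, common, overrides):
    """Fill a template with shared values, letting per-host overrides win.

    Raises KeyError if the template needs a value that nobody supplied, which
    is preferable to silently deploying a half-filled configuration.
    """
    values = {**common, **overrides}
    return Template(template_text).substitute(values)


if __name__ == "__main__":
    # A made-up configuration fragment with placeholders.
    template = "hostname = $hostname\nlog_server = $log_server\nallow_root_login = $allow_root_login\n"
    common = {"log_server": "logs.example.internal", "allow_root_login": "no"}
    hosts = {
        "web1": {"hostname": "web1"},
        "db1": {"hostname": "db1"},
    }
    for name, overrides in hosts.items():
        print(f"--- {name} ---")
        print(render_config(template, common, overrides), end="")
```

Because the shared values live in one place, changing the log server for every machine is a one-line edit, and backing up that one set of values backs up the configuration of the whole family of systems.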

    Disable and/or Remove Unnecessary Dæmons/Services

    Some operating systems come with a number of background processes enabled by default: dæmons on POSIX systems (including the UNIX family), services on others. These things can, for example, perform automatic system maintenance. Or they can open up ports and accept commands. Whatever they do, if you do not use the services in your own network, disable them. Make a note of anything that you disable, and have your monitoring system always validate that it remains disabled. An alarm should go off if it is ever re-enabled (unless you have to re-enable it for some reason, in which case you should of course tell your monitoring system not to alarm for it). This reduces the number of vectors available to mount an attack. If you do need such a service, but it does not need to be exposed to the world, use a firewall to keep it in (we will talk more about that when we get to subnet and network level security). If you don’t know what a service is or does, research it; ignorance is absolutely intolerable when it comes to working in security. If you don’t already do research on a regular basis and you are charged with the management of a network, that is something that you should seriously consider changing. Seriously.
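One simple way a monitoring check might validate that a disabled network service stays disabled is to see whether its port accepts connections. The sketch below assumes the services in question listen on TCP; the service names and port numbers are examples only, and a dæmon that opens no port at all would need a different kind of check.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def reenabled_services(disabled_ports, is_open):
    """Return the services from our 'disabled' notes that are answering again.

    `disabled_ports` maps a service name to the TCP port it would listen on.
    `is_open` is any callable taking a port number and returning True or False.
    """
    return {name: port for name, port in disabled_ports.items() if is_open(port)}
```

For example, `reenabled_services({"telnet": 23, "ftp": 21}, lambda port: port_is_open("192.0.2.10", port))` would return whichever of those two services has started accepting connections on that (placeholder) host; a non-empty result is what should set off the alarm.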

    Network Segments

    Security at the level of a network segment is often synonymous with security at the subnetwork level. Of course, as we have already read, multiple physical network segments can be joined together in two different ways (well, more, actually, but we’ll ignore this for now) such that the multiple physical network segments become a large logical network segment.

    At this layer, the most important thing is to know your network well. Understand the physical and logical topology of your network. If you find that there are groups of systems that do different things and they are all sharing one logical network segment, take the time to consider whether or not you should replace a bridge interconnection with a router.

    To a certain extent, the security of a network is the performance of the network. Denial of service can result in severe business losses, for example. While relatively uncommon, it only takes a single misbehaving (or penetrated) system in order to bring down a network segment. Consider using routing in place of bridging unless there is some significant advantage to bridging between segments.
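If you do decide to put a router between two groups of systems, each group needs its own subnetwork. As a small illustration using Python’s standard `ipaddress` module, here is one logical segment being split in two. The address range is just an example private network, and choosing the split is only the addressing half of the job; the router and its rules are the other half.

```python
import ipaddress


def split_segment(cidr, new_prefix):
    """Split one network into equal-sized subnetworks with the given prefix length."""
    network = ipaddress.ip_network(cidr)
    return list(network.subnets(new_prefix=new_prefix))


def same_subnet(subnets, host_a, host_b):
    """True if both hosts fall inside the same one of the given subnetworks."""
    a = ipaddress.ip_address(host_a)
    b = ipaddress.ip_address(host_b)
    return any(a in net and b in net for net in subnets)


if __name__ == "__main__":
    halves = split_segment("192.168.0.0/24", 25)
    for net in halves:
        print(net)
    # Hosts in different halves can only reach each other through the router.
    print(same_subnet(halves, "192.168.0.10", "192.168.0.200"))
```

Once the groups sit in separate subnetworks, traffic between them has to pass through the router, which gives you a place to apply rules and keeps a misbehaving system’s broadcast traffic confined to its own half.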

    Subnetworks and the Overall Network

    And here we are: security at the layer of a single subnetwork. If you have only one network with zero subnetworks, then you only have one subnetwork: the one that is given to you by your upstream network. Every network (except for the root network) is a subnetwork. Every subnetwork is a network. That’s all there is to it; so really, subnetworks and networks are treated the same way when it comes to evaluating their security.

    A network has one or more edges; there can be inner edges and outer edges. Outer edges are gateway points to parent networks; a router lives there and it is usually a default gateway. Inner edges are the same, except they point to a subnetwork that is logically “beneath” the current network. For packets to get to the inner subnetwork from the parent network they must be routed through the current network (in other words, the current network provides transit between the two networks). It is also possible for a network to provide transit between two (or more) outer edges. And just to blow your mind a little further: it doesn’t really matter much to the computer what we call an inner edge or an outer edge. It’s all the same to it; it’s just packets moving around from source to destination via hops along the way. The abstraction of a “tree” structure to the network, or the abstraction of “inner” vs. “outer”, is merely there to make it convenient for us simple-minded humans to talk about and really has nothing to do with how the system actually processes and routes packets.

    In order to secure a (sub)network, we must be sure that:

    • The only connections allowed from outside the network (e.g., originating from the other side of that network’s edge router or routers and terminating on the inside of that network’s edge router or routers) are those that are allowed by business rules.
    • The only connections allowed from inside the network to the outside (e.g., originating from the inside of the network’s edge router or routers and terminating on the other side of that network’s edge router or routers) are those that are allowed by business rules.
    • Services that are internal to the network are not exposed to the “outside world” (that is, not reachable from outside of the network’s edge router or routers).
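
    The three rules above can be sketched as a small policy check at the edge router. This is only an illustration: the network prefix, the `Rule` type, the `ALLOWED` set, and the `permitted` function are all hypothetical names invented for this example, and the addresses come from documentation space.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical "inside" network (documentation address space).
INSIDE = ip_network("192.0.2.0/24")


@dataclass(frozen=True)
class Rule:
    direction: str  # "in" (outside -> inside) or "out" (inside -> outside)
    protocol: str   # "tcp" or "udp"
    port: int       # destination port


# Hypothetical business rules: everything not listed here is refused.
ALLOWED = {
    Rule("in", "tcp", 443),   # one service deliberately exposed to the outside
    Rule("out", "tcp", 80),
    Rule("out", "tcp", 443),
    Rule("out", "udp", 53),
}


def permitted(src: str, dst: str, protocol: str, port: int) -> bool:
    """Return True if a new connection crossing the edge is allowed."""
    src_inside = ip_address(src) in INSIDE
    dst_inside = ip_address(dst) in INSIDE
    if src_inside == dst_inside:
        # The connection does not cross this edge, so this router has no say.
        return True
    direction = "out" if src_inside else "in"
    return Rule(direction, protocol, port) in ALLOWED


# An outside host reaching an internal-only service (SSH here) is refused.
print(permitted("198.51.100.7", "192.0.2.10", "tcp", 22))
```

    A default-deny set like this covers the third rule automatically: an internal service is unreachable from outside unless someone deliberately adds a rule for it.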

    When you have a network that has multiple subnetworks, it can be the case that services on one subnetwork should be accessible to a “sibling” subnetwork in the same administrative domain (and sharing the same parent network as implied by the term “sibling”). In that case, some of the responsibilities for the rules listed above are delegated to the parent network’s edge router(s). Of course, it can get to be far more complex than that, even. Sibling subnetworks could have rules that only permit a pair of siblings to share access to particularly exposed services such that while the sibling networks can access the private services on each other’s networks, connections are disallowed from other sibling networks or even the parent network.

    Another way to say it: the design of a network structure can be as simple or as complex as required to support business rules. As long as the logical structure of the network (which, as you will recall, need not precisely match the physical structure of the network) is compatible with the rules that are to be enforced, everything is nice and easy. Where things get difficult is when the design of the network is insufficient to satisfy all business rule requirements. One example of how this can happen is normal business growth; e.g., when new business rules are introduced that are incompatible with the present design of the network. In that case, part of the requirement for adopting the new business rule(s) is a redesign of the network’s logical structure, which might be simplified by using tunnels, for example, to create another subnetwork altogether.

    Summary

    The tools that we have in our virtual toolbox for network management are many—and they vary widely. In our arsenal, we have:

    • The wires themselves (and their physical layout)
    • Workstations
    • Servers
    • Physical network interfaces
    • Bridges
    • Switches
    • Routers
    • Firewalls
    • Network prefixes
    • Logical (virtual) network interfaces
    • Tunnels
    • Encryption
    • Subnetting
    • ALGs (Application Layer Gateways) and proxy servers

    … and more.

    Alone, each one of these tools does nothing significant. Together, they do absolutely amazing things—and can do so in ways that are infinitely flexible (and yet surprisingly simple).

    What’s Next?

    I plan on writing at least another couple of posts in this series. I believe that I am mostly done with the theory component, and so the next post or two will deal with actual networks. In the next post, I’m going to discuss the layout of the network I spend the most time on: my own. That’s the only thing that I know for sure at the moment. From there, it’s pretty much anyone’s guess.

    If I have left you behind, or if you are confused, please feel free to comment and ask questions on any of the three posts in this series. I will get back as soon as I can after seeing the question.

    Until next time.

    Jan 31st
    Posted by Michael Trausch  as The Internet, computing

    In my last post, I talked about the underpinnings of networking at the lower layers. This post is going to talk about NAT: network address translation. NAT is almost as widespread as IPv4 networking itself, and is used on nearly all home and small-to-medium-sized business networks, with good reason: having more than one IPv4 address carries with it a not-insignificant monetary cost. This entire post is going to be about what NAT is, what function it performs in a network, and the alternatives to NAT which can be used on both IPv4 and IPv6 networks.

    What is NAT?

    NAT, or network address translation, is a mechanism that attempts (and fails, in many cases) to provide transparent access to the Internet for multiple IP-networked devices that cannot all have public IPv4 addresses. For example, it is used in homes and many small businesses to provide Internet connectivity to multiple computers on a single connection to the Internet with a single public IP address. NAT was invented in the early 1990s (approximately 1993, if memory serves) in an attempt to delay the exhaustion of the IPv4 address space. It was effective, too, for that purpose: we probably would have run out of IP addresses ten years ago had NAT not come into existence.

    NAT has one major advantage: it enables an entire network to share a single IP address, thus conserving address space. However, NAT comes with a number of disadvantages at different levels. Some of these disadvantages are:

    • Increase in network operations overhead. Most types of NAT require the ability to maintain additional tables in memory which support the task of both address and port number translation.
    • Having a NAT increases the complexity of a network at the IP layer. This will be discussed in more detail in a few minutes.
    • Any NAT device will perform more slowly and consume more resources than a plain router—for most computer systems, this is not a major problem. However, for networks with high-bandwidth Internet connections which are using embedded devices for routing (such as a simple, consumer-grade wireless router) this can be a problem. It can also be a problem for any network that has very old routing equipment that has been retrofitted with the ability to perform NAT, as such devices were not designed with enough processor to handle the additional overhead when the network is at full load.
    • In network operating systems which behave as routers, NAT is often implemented in the same area as the firewall. For example, in the Linux kernel, the iptables command is used to set up NAT networking on an IPv4 network, and iptables is the front-end to the Linux kernel’s built-in firewall capabilities. This increases the complexity of the firewall code itself, and can make it more difficult to maintain in general, as well as more difficult to audit for security problems.
    • End-to-end communication is broken. The Internet was originally designed with the concept of end-to-end communication; that is, one system on the Internet can converse with another system on the Internet directly. Not only does this simplify network design, but it simplifies the design of network applications, as well (particularly those that require bidirectional communication but do not always maintain a persistent connection and require the ability for either side to reconnect upon some sort of a trigger). Some protocols (such as SIP) have worked around this, but such workarounds can be high-cost as well as brittle.

    The removal of NAT from protocol stacks therefore yields a number of benefits, including removing all of the disadvantages above. With the transition to IPv6, NAT devices are no longer necessary. Their additional complexity can go away, and networks all around the world will operate more efficiently and with less latency than they do now. It still takes I/O and processor resources to perform normal routing, but nowhere near as much as it does to perform NAT.

    Over the years, multiple types of NAT implementations have been created. I am not going to go into a terribly detailed analysis of them all, but they are:

    • Full-cone NAT (or, 1:1 NAT). This type of NAT provides no conservation of the address space; one external IP address maps exactly to one internal IP address. While there are uses for this type of NAT, I can think of no use for it that is not better served by another type of device, such as a load balancer, monitoring and failover, or plain routing.
    • Address-restricted cone NAT. This is one of the two most common types of NAT. When a system on the inside (usually using RFC 1918 IP address space) sends an IP packet to the outside, the NAT remembers the IP address, protocol, and port of the internal system and relays it to its destination. The destination system may reply by sending packets from any of its own ports to the NAT on the source port that it sent the original packet from.
    • Port-restricted cone NAT. This is the other of the two most common types of NAT. When a system on the inside sends an IP packet to the outside, the NAT remembers the same things as for address-restricted cone NAT, but replies from the destination must come from the same port as the packet was sent out to.
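    • Symmetric NAT. This type of NAT is similar to port-restricted cone NAT, except that the NAT creates a separate external mapping for each distinct destination, so the same internal address and port can appear to two different destinations as two different external ports.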
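
    The difference between these types is mostly a question of which inbound packets the NAT will accept on an existing mapping. The sketch below assumes the conventional definitions of the four types; the function name and the string labels are made up for illustration, and real implementations mix these behaviors.

```python
def inbound_allowed(nat_type, mapping_dest, packet_src):
    """Decide whether an inbound packet may use an existing mapping.

    mapping_dest: (ip, port) that the inside host originally sent to.
    packet_src:   (ip, port) that the inbound packet comes from.
    """
    dest_ip, dest_port = mapping_dest
    src_ip, src_port = packet_src
    if nat_type == "full-cone":
        # Any outside host may send to the mapped external port.
        return True
    if nat_type == "address-restricted":
        # Only the host that was contacted, but from any of its ports.
        return src_ip == dest_ip
    if nat_type in ("port-restricted", "symmetric"):
        # Only the exact (ip, port) that was contacted. A symmetric NAT
        # filters the same way; it differs in how mappings are allocated.
        return (src_ip, src_port) == (dest_ip, dest_port)
    raise ValueError(f"unknown NAT type: {nat_type}")


# The inside host talked to 203.0.113.5 port 80; the reply comes from port 8080.
contacted = ("203.0.113.5", 80)
reply_from = ("203.0.113.5", 8080)
for kind in ("full-cone", "address-restricted", "port-restricted", "symmetric"):
    print(kind, inbound_allowed(kind, contacted, reply_from))
```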

    It is important to consider that a single NAT implementation may combine behaviors from one or more of these types, and some implementations are extremely configurable in terms of what method or methods are used to perform the functions of NAT. It is also important to realize that NAT breaks many legal network behaviors, depending on the application and the type of NAT in use. Various workarounds have been developed in order to traverse NAT devices for some protocols, and sometimes protocols will change in order to add NAT traversal as a core feature, but the overall effect is that NAT (often significantly) reduces network efficiency.

    What is NAT not?

    NAT is not a security mechanism.

    Let me repeat that: NAT is not a security mechanism.

    One more time: NAT IS NOT A SECURITY MECHANISM.

    I am uncertain where people have gotten the idea that somehow NAT was designed to increase security. It was not. It was designed to help conserve the increasingly scarce resource of IPv4 address space, nothing more. It is not a security tool, and it does not provide any additional security over a properly maintained IP firewall—using a firewall is essential with or without a NAT in place if you need to do any sort of packet filtering at all.

    The idea that NAT is a security mechanism probably came from the notion that one cannot see the addresses on the inside of a NAT. However, there are many mechanisms by which a sufficiently interested attacker would be able to determine things such as an approximate (or even an accurate!) count of how many devices are behind the NAT and what their IP addresses are through the use of various protocols, application tricks, and security exploits. For that matter, it is trivial to do things such as set up IPv6 on a NAT’d network and give all the systems on the NAT’d network a globally reachable IPv6 address, all without the cooperation of the NAT device. Only a firewall can stop such a thing, and only if you know what it is that you are trying to stop. And there are some things that even a firewall cannot protect you from (such as trojan horses and intentionally malicious employees and ex-employees). Security is an insanely complex problem to solve, and NAT is not a tool in a security professional’s toolbox.

    How does NAT work?

    A NAT device has a fully functional IP stack, and operates at a combination of OSI layers 3 (Network Layer) and 4 (Transport Layer)—mostly layer 4. In comparison, a router is an OSI layer 3 device. (In case you haven’t memorized the OSI model yet, refer back to my previous post which shows it.) Let’s say that you have a computer system that is on an IPv4 network, and that IPv4 network is using NAT. When you came to my blog to pull up this post, your Web browser performed the following tasks:

    1. It looked in its DNS cache to see if the hostname mike.trausch.us was there. If it was, it used the IP address from the cache; if not, it asked your computer to find the IP address for mike.trausch.us.
    2. Then, it asked the operating system to open a TCP socket connected to the IP address for mike.trausch.us on port 80. Since you are reading this, it is probably safe to say that it succeeded, and it was given a socket to work with.
    3. It then asked my Web server for the post, which my Web server kindly gave to you. There are a few back-and-forths that I am omitting here for the sake of clarity.
    4. The connection from your computer to my Web server was then closed.

    In a normal (that is, non-NAT’d) network, this was all nice and direct, and the edge router on your network made it possible for your packets to get here. Even better, in such an event, the chatter is directly between your computer and my Web server. However, on a NAT’d network, this is not the case. Instead, this is (some of) what happens:

    1. Your browser looked in its DNS cache to see if the hostname mike.trausch.us was there. If it was, it used the IP address from the cache; if not, it asked your computer to find the IP address for mike.trausch.us.
    2. Then, it asked the operating system to open a TCP socket connected to the IP address for mike.trausch.us on port 80.
    3. Your computer’s operating system opened the socket, but not to my Web server. It just thinks it did. When the packet went out that was supposed to start the TCP handshake, it ended up at your NAT.
    4. Your NAT sees the packet, and makes a note of what the source IP address (your private IP) and source port were.
    5. The NAT notes the (source IP, source port) pair, and notes the destination address (e.g., the IP address for mike.trausch.us in this instance), the protocol (in this case, TCP) and sometimes also the port number (in this case, TCP port 80).
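    6. It then rewrites the packet’s source address and port to its own external IP address and a port of its choosing, and forwards the packet to my Web server on port 80.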
    7. My Web server receives the packet, which at this point appears to come from your external IP address, and probably a different port from the source port on your computer.
    8. My Web server sends a return packet, directed at your NAT device’s IP address and the port that it sent your packet from.
    9. When your NAT receives the packet, it looks in its table of entries to see if it has a mapping for the IP address and port it received a packet on.
    10. It then forwards my acknowledgement to your computer.
    11. All of the rest of the steps are the same, but with the NAT intercepting, looking up, and rewriting every single packet before it is shipped to its destination (either your computer or my Web server).
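
    The bookkeeping in those steps can be sketched as a pair of lookup tables. This is a toy model, not how any particular NAT is implemented: the `Nat` class, its method names, and the starting port number are all assumptions made for the example.

```python
import itertools


class Nat:
    """Toy port-translating NAT (illustrative only)."""

    def __init__(self, external_ip, first_port=40000):
        self.external_ip = external_ip
        self._ports = itertools.count(first_port)
        self.out = {}   # (proto, inside_ip, inside_port) -> external port
        self.back = {}  # (proto, external port) -> (inside_ip, inside_port)

    def outbound(self, proto, src, dst):
        """Rewrite the source of a packet leaving the network."""
        key = (proto, *src)
        if key not in self.out:
            port = next(self._ports)
            self.out[key] = port
            self.back[(proto, port)] = src
        return (self.external_ip, self.out[key]), dst

    def inbound(self, proto, src, dst):
        """Rewrite the destination of a reply, or return None to drop it."""
        inside = self.back.get((proto, dst[1]))
        if inside is None:
            return None  # no mapping: the packet has nowhere to go
        return src, inside


nat = Nat("198.51.100.1")
# Your computer (10.0.0.5) opens a connection to a Web server on port 80.
print(nat.outbound("tcp", ("10.0.0.5", 51515), ("203.0.113.80", 80)))
# The server replies to the NAT's external address and port.
print(nat.inbound("tcp", ("203.0.113.80", 80), ("198.51.100.1", 40000)))
```

    Every packet in either direction costs a table lookup and a header rewrite, which is the extra work a plain router never has to do.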

    It is a lot more complex to do all of this, obviously.

    And for larger networks, it won’t scale at all: multiple external addresses will have to be used to represent the whole network, because each external (IP, port) combination on a NAT device can only be mapped to one internal system and one (IP, port) combination at a time. What that means is that for larger NAT’d networks, you might have a different “public” IP address for every connection, or, if you have a large network and only one external IP address to NAT with, you might actually wind up sometimes not being able to connect to anything at all, because the mapping table in the NAT is full.
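
    To put a rough number on when the table fills, here is a back-of-the-envelope sketch. It assumes each mapping consumes one external port, that only ports 1024 through 65535 are handed out, and that every inside host holds the same number of connections open at once; none of those assumptions comes from a specific NAT implementation.

```python
# Ports 1024-65535 on one external address, for one protocol (assumption).
USABLE_PORTS = 65535 - 1023


def max_hosts(connections_per_host, external_ips=1):
    """Upper bound on inside hosts before the mapping table can fill up."""
    return (USABLE_PORTS * external_ips) // connections_per_host


print(max_hosts(100))                  # one external address
print(max_hosts(100, external_ips=4))  # four external addresses
```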

    So I Have to Learn to Use a Firewall?

    Yes. Well, no. Well, sort of.

    Consumer devices for use in home networks that support native IPv6 will most likely be running their own firewall with a reasonably sane default set of rules (and hopefully, the ability to change those rules!). For example, such a device might not allow inbound packets to protected ports (those below 1024) or to ports well-known to be open and accessible by default on operating systems in the Microsoft Windows family. Of course, it will also be up to people to not install services on their computer systems that are configured to service the world. That’s not terribly hard, given that we have the loopback network (127/8) that is reserved for use locally (such IP addresses aren’t even allowed to reach the physical network outside of a sole system). This means that a system can run services that aren’t to be exposed to the world (for testing, or in order to protect them from access without first using an SSH tunnel, or for any manner of other things).
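
    Keeping a service off the network entirely is usually just a matter of which address it binds to. A minimal sketch using Python’s standard `socket` module follows; the function name is made up for the example.

```python
import socket
from ipaddress import ip_address


def local_only_listener(port=0):
    """Listen on the IPv4 loopback address only (port 0 lets the OS choose)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Binding to 127.0.0.1 rather than 0.0.0.0 means only programs on this
    # same machine can connect; other hosts cannot reach the socket.
    server.bind(("127.0.0.1", port))
    server.listen()
    return server


server = local_only_listener()
host, port = server.getsockname()
print(host, port, ip_address(host).is_loopback)
server.close()
```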

    Very small, client-system-only networks will most likely use consumer-grade devices, as well, and need not worry about it for the same reasons that home networks won’t likely need to worry about it.

    Power users, network administrators, and everyone else in between can instead just configure a firewall. Whatever device is providing core routing functionality more likely than not has the ability to perform firewalling as well—and if not, it’s easy to obtain an operating system and a computer that can perform the task. After all, Linux, the BSD family, and most other UNIX and UNIX-like operating systems not only can function as routers, but can firewall (and Linux and the BSDs are free). If you are the administrator of a network that has more than ten nodes on it, or the administrator of an any-sized network that has more than zero server systems on it, you should know how to use a firewall both in general and the particular implementation that you have.

    What About RFC 4193 (IPv6 Private Address Space)?

    RFC 4193 does indeed provide for private address space in IPv6. That does not necessarily mean that it has to have NAT. Private address space can be used to ensure that one or more subnetworks have absolutely zero Internet connectivity (or can use something such as an IPv6-enabled SOCKS server in order to have strictly controlled connectivity). However, more often than not, servers that require such security need not use address space reserved for it by RFC 4193, as it would unnecessarily complicate the network. Instead, one could use a single subnet out of their allocation of subnets, treat that subnet as “dark” or private, and configure the firewall to prohibit all (direct) communication with that subnet. If you have a correctly configured network this is trivial.

    Private address space in IPv6 can also be useful to create disconnected islands, or testbed networks. However, in production networks, I would expect to see a business that has a /48, for example, simply devote a single subnetwork to private use.
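
    Both ideas are easy to play with using Python’s `ipaddress` module. In this sketch the /48 is carved out of documentation space and merely stands in for a real allocation; which /64 gets designated as the “dark” subnet is an arbitrary choice for the example.

```python
from ipaddress import ip_address, ip_network

# RFC 4193 unique local addresses live in fc00::/7.
ULA = ip_network("fc00::/7")


def is_unique_local(addr):
    """True if addr falls inside the RFC 4193 range."""
    return ip_address(addr) in ULA


# Hypothetical /48 allocation (2001:db8::/32 is documentation space).
allocation = ip_network("2001:db8:1234::/48")

# A /48 contains 2**(64 - 48) subnets of size /64.
subnet_count = 2 ** (64 - allocation.prefixlen)

# Pick one /64 to treat as private; the firewall would then refuse all
# direct traffic between it and the outside.
dark_subnet = next(allocation.subnets(new_prefix=64))

print(is_unique_local("fd12:3456:789a::1"), subnet_count, dark_subnet)
```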

    There are other means by which one can protect their network without NAT; see RFC 4864 (“Local Network Protection for IPv6”) for more information.

    So, no NAT in the future?

    That’s right. We are heading to a world without NAT, a world where no NAT is needed, and a world where the overhead of the Internet as a whole will be reduced as a result. That pretty much wraps it up for today’s post. Questions? Comments? You know what to do with ’em!

    Jan 29th
    Posted by Michael Trausch  as The Internet, UNIX, computing, hardware

    There are a number of things that I did not address in my post yesterday. My only goal in that post was to illustrate a (rather simple) network that had four subnets. I probably should have posted this article first, but I did not really think about that. This post is going to be somewhat long and very link-heavy. I don’t expect that you’ll be able to make it through this post (and all it references) in a single reading session. It’s more like a crash-course in all that which is computer networking from the bottom up, to a certain extent. This post, I have decided, is going to be the first post in a series, working toward the goal of building a basic understanding of the underpinnings of computer networks up to and including both IPv4 and IPv6.

    Before you get too deep in here, note that you’re likely to encounter things that you know already in it. But, it should be useful for anyone attempting to learn networking from the ground up. Most people in IT who are relatively new to it (or have been shoehorned into a position of supporting a network) have only really learned about bits and pieces of it, often as a result of being practical about learning only what is needed to get things done. As a result, I don’t really know how much “below” IP people who read this post know about. Even if you know about all of this stuff, I’d appreciate a full proofreading. :-)

    All of the IP addresses that you will find in this post are pulled out of pools reserved for documentation and examples. None of the IP addresses you find in this post will work on the public Internet. It might be possible to use them in isolated network islands, but there is network space reserved for use on islands; see RFC 1918 for IPv4 and RFC 4193 for IPv6 for address space reserved by IANA for such purposes. The MAC addresses in this post technically come from Xerox’s pool of MAC addresses. Don’t use them anywhere. I could not find a MAC address prefix dedicated for documentation/example purposes. If someone happens to know of just that, please let me know.

    There are two network models: the four layer IP model and the seven layer OSI model. (I am calling it the IP model and not the TCP/IP model because I won’t carry forward the misnomer “TCP/IP”, as TCP is far from the only thing that runs on IP.) I use the seven layer OSI model in part because it is the model that I was taught when I learned about networking, but also in part because it is more descriptive and has a finer level of granularity: it is easier to talk about more layers that are smaller in scope than fewer layers that are larger in scope (at least, IMHO).

    The OSI Network Model

    The OSI model (Open Systems Interconnection model)—sometimes called the ISO network model since it was defined as an ISO standard—is the basis for the discussion that I will use in this post (indeed, it is the basis that I use for all of my discussions on networking). The OSI model has been around for a very long time, dating back to at least 1980 and probably even further. I wouldn’t know without a lot more digging, since I wasn’t around when it was created. ;-)

    The OSI model has seven layers. Conceptually, each layer only communicates with the one layer above and the one layer below it (with exception for layer 1, which has no layer below it, and layer 7, which is the top). The seven layers of the OSI model are:

    1. Physical layer. In your average Ethernet network, this is essentially the cable and the transceivers at either end of the cable. For dial-up networking, it consists of the serial lines at both ends, the modems at both ends, and the telephone network in between. This layer of the OSI model is concerned only with the movement of bits across the physical medium and passing data to and from layer 2. This is the hardware-specific layer and it is spoken of in terms of electrical connections, circuits, and signals.
    2. Data Link Layer. In your average Ethernet network, this is the frame-based protocol (also called Ethernet!) that is used. For dial-up networking and networking on cell phones, it is (most often) PPP. This layer receives bits from layer 1, interprets those bits as frames, strips the framing off, and sends the frame’s payload data up to layer 3. The inverse operation is performed when data is received from layer 3: framing is added and the bits are sent down to layer 1.
    3. Network Layer. No longer do we care about what type of physical network is in use: layer 3 is where IPv4, IPv6, ICMP, ICMPv6, IGMP, IPX, AppleTalk, and so forth all live. The network layer is responsible for packet routing, filtering, and delivery. It receives data from layer 2, strips the network protocol framing off, and sends the data up to layer 4; the inverse is taking data from layer 4, adding network framing, and passing the packet down to layer 2.
    4. Transport Layer. This is where TCP, UDP, SCTP, and others live. The transport protocol determines whether the transport is connection-oriented or not, whether it sends streams or datagrams (or can do either or both), whether or not reliability is provided, and so forth. This is the layer used by application software using the Berkeley sockets API when talking directly to a socket (in truth, the BSD sockets API also exposes a bit of the Network Layer, so it’s not a pure Transport Layer wrapper).
    5. Session Layer. This layer is not often discussed in relation to IP networks, because the IP model considers the duties of layer 5 to be part of its application layer. However, the presence of layer 5 is useful as an abstraction when talking about sessions and how they can be thought of, and libraries that provide a session support mechanism on top of the BSD sockets API effectively implement the OSI Session Layer for applications that talk over IP. NetBIOS and L2TP are two such protocols which could be considered to be in this layer (and L2TP is an example of “a network in a network” in terms of the OSI network model, but we will save discussion of recursive networking for later: even though we use it all the time in IT, most people don’t realize it and think the concept is more confusing than the application).
    6. Presentation Layer. As with the Session Layer, the IP model considers the duties of layer 6 to be part of its application layer. MIME, SSL, and TLS can all be considered to be OSI Presentation Layer components. The OpenSSL library effectively implements SSL (and TLS) as a layer by wrapping the BSD sockets API.
    7. Application Layer. This is the layer concerned with actual applications: DHCP, DNS, FTP, HTTP, POP3, IMAP, IRC, XMPP, Telnet, SSH… the list goes on and on. The application layer represents application protocols that we use every day for IP address provisioning, turning names into IP addresses, transferring files, browsing the Web, email, chat, remote control of computers, and so forth. Many of the things that we consider application protocols in the context of an IP network also implement things elsewhere: for example, SSH is an application protocol in IP networking that performs tasks at OSI layers 5, 6, and 7.

    As I mention in layers 5, 6, and 7 in that list: the IP model of networking essentially merges those three layers together into a single layer. It also merges layers 1 and 2 into a single layer. Because of that, it is harder to focus on a single precise task in the IP model. However, it is worth mentioning that even in the IP model, the Internet Layer (equivalent to OSI’s Network Layer) and the Transport Layer (equivalent to OSI’s Transport Layer) are separate things.
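
    The correspondence between the two models is small enough to write down as a lookup table. The name “Link” for the bottom layer of the four layer model is an assumption based on common usage; the rest follows the description above.

```python
OSI_LAYERS = {
    1: "Physical",
    2: "Data Link",
    3: "Network",
    4: "Transport",
    5: "Session",
    6: "Presentation",
    7: "Application",
}

# Four layer IP model -> the OSI layer numbers it covers.
# "Link" is the commonly used name for the bottom layer (assumption).
IP_MODEL = {
    "Link": (1, 2),
    "Internet": (3,),
    "Transport": (4,),
    "Application": (5, 6, 7),
}


def ip_layer_for(osi_layer):
    """Return the IP-model layer that absorbs the given OSI layer number."""
    for name, members in IP_MODEL.items():
        if osi_layer in members:
            return name
    raise ValueError(f"no such OSI layer: {osi_layer}")


for number, name in OSI_LAYERS.items():
    print(number, name, "->", ip_layer_for(number))
```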

    As you can imagine, the layers look like a stack. This is probably where the term “network stack” comes from; software implementations often use the layer model, resulting in a suite of software “stacked” one atop another. In theory any one layer’s implementation can be altered or completely changed, and as long as it preserves the interface it has with the one layer below it and the one layer above it, nobody should really notice. In practice, it works almost as smoothly: applications that use TCP for transport are able to work on IPv6 as well, usually with few changes to the source code in order to select an IP-version agnostic socket type (if, that is, any changes are required at all). This is also why IPv6 support is nearly ubiquitous (at least, in popularly used free software).
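
    As a sketch of what “IP-version agnostic” can look like in practice, the code below lets `getaddrinfo` pick the address family instead of hard-coding IPv4. The helper names `resolve` and `connect` are made up for the example; Python’s own `socket.create_connection` does much the same job.

```python
import socket


def resolve(host, port):
    """Return (family, sockaddr) pairs without assuming IPv4 or IPv6."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_UNSPEC,
                               type=socket.SOCK_STREAM)
    return [(family, sockaddr) for family, _type, _proto, _name, sockaddr in infos]


def connect(host, port, timeout=5.0):
    """Try each resolved address in turn; return the first connected socket."""
    last_error = None
    for family, sockaddr in resolve(host, port):
        sock = socket.socket(family, socket.SOCK_STREAM)
        try:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            sock.close()
            last_error = exc
    raise last_error or OSError("no addresses found")


# The TCP code path is identical; only the family chosen at layer 3 differs.
print(resolve("127.0.0.1", 80))
```

    Nothing in `connect` mentions IPv4 or IPv6 by name, which is the whole point: the transport-level code stays the same while the network layer underneath it changes.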

    An Overview of OSI Layers 1 and 2

    Before we can continue on to discussion of the many things that IP does for us, it would be helpful to take a closer look at the OSI Physical and Data Link layers. These are the layers upon which both IPv4 and IPv6 build in order to provide us with the convenient networking that we are so used to today. We will also take a look at some of the network equipment that operates at these layers, so that we can understand how it fits into our networks. For the purposes of this section, let us imagine two computers (we will call them A and B) which use a standard gigabit Ethernet (1000BASE-T) link to connect to each other and form a network. As we are limiting our discussion to OSI layers 1 and 2, in this example we will not care what is handling the jobs of layers 3 through 7.

    The Physical Layer provides the medium upon which communication can take place, as well as the electronics which directly use the medium. So, if we look at computer A, we have the following things that represent the Physical Layer:

    • A Category 5 cable. The cable is the actual transmission medium: data “flows” over the cable.
    • The Ethernet port (an 8P8C modular connector), which the cable is plugged into. The wires in the cable connect with wires in the port, which run to the Ethernet interface’s transceiver.
    • The Ethernet interface’s transceiver, which is responsible for transmitting (and receiving) binary digits (bits) to and from the wires in the cable.

    Operating systems are not concerned about the physical layer in a computer network; that is a job for hardware to handle all on its own. (This is not the case if the Ethernet interface offloads its processing to the host system’s CPU, but we will disregard this as an unnecessary complexity for the time being.) In this specific example, the role of the Physical Layer is to receive data from the Data Link Layer, serialize that data into bits, and transmit it over the cable. Likewise, bits that it receives from the cable are deserialized and sent up to the Data Link Layer. At the Physical Layer, the only thing defined is the connection between two (or more) nodes on a network segment as a bus or star network. Some examples of layer 1 network devices are ports, connectors, cables, terminators, and hubs (despite the link being to “Ethernet hub”, other types of networks can have hubs, too; they’re not specific to Ethernet networks). A hub is just a device that has more than two ports for network devices to plug into, and everything that is transmitted on the hub is sent to all of the ports at once; effectively, a hub is a repeater and connects all of the nodes together as if they were all sharing a single cable, but they are not. Bridges and switches, two types of popular networking equipment, are not OSI layer 1 devices as we will soon see.

    The Data Link Layer (or, layer 2 in the OSI model) can be thought of as providing two “sublayers”: Logical Link Control (LLC) and Media Access Control (MAC). The term MAC address refers to the address that a particular piece of network equipment uses at the MAC sublayer of the Data Link Layer. LLC provides the ability to multiplex many different network protocols over the same physical medium, while MAC provides an address space as well as the rules by which transmission and reception over a shared medium work. Think about a network segment as a town, and a MAC address as a street address within that town. It’s possible to have multiple street addresses that are identical as long as they are in different towns, but if two places have an identical address in the same town there are problems such as incorrectly delivered or lost mail. Instead of lost mail, a duplicate MAC address on a single network results in frames not being delivered, being delivered twice, or being delivered sometimes to one or the other of the nodes in conflict. The behavior in such a situation can be extremely erratic. It is only a common problem in environments where humans set MAC addresses, such as when using certain types of virtualization software.

    The Data Link Layer is the layer that handles addressing, framing, and sharing of the network segment. It is still not possible to do true internetwork routing using just the Physical and the Data Link layers. You can, however, do two new things at layer 2: network bridging, and packet switching. In fact, every network switch is logically just a multiport network bridge where all of the ports are bridged together; it is important to realize that. This brings with it two benefits: more collision domains (and thus ability to increase the usable bandwidth on a network segment given certain types of traffic patterns) and the ability to join multiple physical network segments together into a single logical network segment. Also, despite being layer 2 networking devices, switches and bridges are invisible to the networks. A bridge behaves as an intelligent, transparent proxy between two networks, turning two physical Ethernet segments into a single logical Ethernet segment; a switch is nothing more than a bridge with multiple ports. Yes, that does mean that if you have an 8-port switch (and you’re using all of its ports), what you have is 8 distinct physical Ethernet segments, but it behaves as if it were a single Ethernet segment, so it is a single logical Ethernet segment.
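
    One common way a switch manages to act as that “intelligent, transparent proxy” is by learning which MAC addresses live behind which port. The sketch below is a simplified model of that idea, with no aging of entries and no VLANs; the class name is made up, and the MAC addresses are locally administered values chosen for the example.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy multiport bridge that learns where each MAC address lives."""

    def __init__(self, port_count):
        self.ports = range(port_count)
        self.table = {}  # MAC address -> port it was last seen on

    def handle(self, in_port, src_mac, dst_mac):
        """Return the list of ports a frame is sent out of."""
        # Learn (or refresh) the sender's location.
        self.table[src_mac] = in_port
        out_port = self.table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcast or unknown destination: flood every other port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the segment the frame came from: do nothing.
            return []
        return [out_port]


switch = LearningSwitch(4)
a, b = "02:00:00:00:00:0a", "02:00:00:00:00:0b"
print(switch.handle(0, a, b))  # B is unknown so far, so the frame is flooded
print(switch.handle(1, b, a))  # A was learned on port 0
print(switch.handle(0, a, b))  # B has now been learned on port 1
```

    Once both addresses are known, traffic between A and B no longer touches the other ports, which is where the extra usable bandwidth comes from compared with a hub.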

    On today’s networks it is very rare to not have a switch somewhere, and while bridges are not exactly commonplace they are used quite a bit. Bridges and switches both make it much easier to manage a large network as if it were a single large network, but without many of the problems that large networks would generate without them. Prior to the widespread availability of inexpensive bridges and switches, the two primary networking tools were hubs and routers. Networks were forced to be compartmentalized in order to keep the size of their collision domains manageable. There were also limitations on the length of cable that could be present between two nodes that wanted to communicate with each other.

    Switches do not entirely eliminate all of these pains, though. There is a limit to the length of a cable in an Ethernet network, and there is a limit to how many switches may be present between two systems that wish to talk to each other on a logical subnet. If you find that you need to exceed those limitations, your only option is to use subnetworks and add a router to the setup.

    Enough about the Physical and Data Link Layers! Back to the Computers!

    At the beginning of the last section I mentioned two computers: A and B. We now have enough information to talk about how they communicate with each other. Computer A would like to say “Hello!” to computer B. Computer A needs to know computer B’s address on the network (or needs to know how to broadcast over the network) in order to do so. For now, we’ll just assume that we want to broadcast—it makes things more simple! In terms of the Data Link Layer, the broadcast MAC address is FF:FF:FF:FF:FF:FF, or the “all F” or “all ones” address. Any layer 2 packet sent to this address is received by all systems on the network (though it may not be processed by all of them). Let us assume that computer B is listening for broadcast messages, and whenever it sees “Hello!” sent to the broadcast address, it says “Hi there!” back, also sending to the broadcast MAC address on the network.

    Now, let’s say that computer A has the MAC address 00:00:01:00:00:01 and that computer B has the MAC address 00:00:01:00:00:02 (no, that would almost certainly not happen on a real network). Computer A knows that it wants to address computer B with “Hello!”, so it sends “Hello!” to address 00:00:01:00:00:02. The frame is transmitted on the wire, computer B’s network interface sees the frame and because of the address decides to read and process it. It then generates a new frame, addressed back to 00:00:01:00:00:01, which says “Hi there!”, and the exchange is complete.
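
    To make the addressing rule concrete, here is a minimal Python sketch of the decision a network interface makes for each frame it sees. The function name accepts is my own invention for illustration, and the sketch ignores multicast and promiscuous mode entirely.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

def accepts(nic_mac: str, frame_dest: str) -> bool:
    """Return True if a NIC with address nic_mac should process the frame.

    Simplified: a frame is processed when it is addressed to the NIC
    itself or to the broadcast address. Multicast is not modeled.
    """
    dest = frame_dest.lower()
    return dest == BROADCAST or dest == nic_mac.lower()

MAC_A = "00:00:01:00:00:01"
MAC_B = "00:00:01:00:00:02"

# A sends "Hello!" directly to B: B's interface reads it.
print("B processes unicast to B:", accepts(MAC_B, MAC_B))
# Any other interface on the segment would ignore that same frame.
print("A processes unicast to B:", accepts(MAC_A, MAC_B))
# A broadcast frame is read by everyone on the segment.
print("A processes broadcast:", accepts(MAC_A, "FF:FF:FF:FF:FF:FF"))
```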

    It is important to keep in mind that there is sufficient power at this layer to perform only simple networking: unicast or broadcast on a single network segment are the only possibilities. Therefore, it is not suitable for internetworking by itself. In order to achieve internetworking, we have to have the ability for computer A to talk to computer C, even when computer C is not on the local link, perhaps on the other side of computer B (in which case, computer B would be a router). However, a router is not a layer 2 concept: it is a layer 3 concept. And so, guess what we’re going to talk about now?

    OSI Layer 3: The Network Layer

    It is at this layer that it becomes possible to perform internetworking. That is what makes the IPv4 and IPv6 protocols so special: they enable the ability to have a network so large that no one can completely imagine it.  A network is composed of smaller networks that participate in providing transit for it, which consist of even smaller networks, which themselves may have networks (and this recursion can be nested arbitrarily deep; there is no limit on the depth of the tree that is the Internet). [question: would a diagram here showing the structure as a tree data structure be useful here?] This means that you may build networks consisting of different physical network types, and they can still be linked together and participate as a single, harmonious (we hope) network.

    In fact, this is precisely how you are able to be on the Internet reading this at all. Your computer has either an IPv4 or an IPv6 address on the public Internet, or it has a private IPv4 address on a network that is essentially transparently proxied to the Internet. Either way, your computer was able to reach my computer, and request that this blog post be transferred to you in order for you to display it. When I think about all of the things that make this wonderfully huge network actually work, it is hard to imagine—even though I have been on the Internet for roughly two-thirds of my life.

    Let us now expand our example network from earlier. Imagine computer A has two Ethernet ports. A cable is plugged into the first Ethernet port, and the other end of that cable is plugged into computer B. Yet another cable is plugged into A’s second Ethernet port, and is plugged into computer C. If you visualize the three computer systems, there is A, which is the vertex of two lines that are the cables in its two Ethernet ports, and computers B and C are at the ends of those two lines (said perhaps more simply: if B and C had a cable connecting the two to each other, we would have a triangle). We still have no connection to any other computer network, including the Internet, in this example.

    At this point, what we want is for computer B to be able to say “Hello!” to computer C. But we have an obstacle in the way: there are two network cards in computer A, and B is attached to one while C is attached to the other one. What we have is two network segments, and all of the following statements are true:

    1. Computer B is on one network segment, connected to A.
    2. Computer C is on one network segment, also connected to A.
    3. Computer A is on two network segments, one connected to computer B and the other connected to computer C.

    Since we are striving for an example in routing, computer B and computer C will be on different IP networks; computer A will be on both of these networks. This means that:

    1. The first Ethernet interface on computer A (which we will call eth0) will have the IP address 203.0.113.1/29.
    2. The second Ethernet interface on computer A (which we will call eth1) will have the IP address 203.0.113.9/29.
    3. Computer B’s eth0 will have the IP address 203.0.113.2/29 and it will be configured with a default gateway of 203.0.113.1.
    4. Computer C’s eth0 will have the IP address 203.0.113.10/29 and it will be configured with a default gateway of 203.0.113.9.

    We could say that we have two sibling networks. That is, we have two networks, neither of which is a physical subnetwork of the other one. Since we are disconnected from any other networks, this addressing scheme works just dandy for us. And, with a suitably sane operating system, we have nothing more to do, either: all three of these computer systems ought to be able to talk to each other just fine now, because:

    1. When we added the IP address 203.0.113.1/29 to eth0 on computer A, it should have put an entry in its local routing table that 203.0.113.0/29 was reachable via eth0.
    2. When we added the IP address 203.0.113.9/29 to eth1 on computer A, it should have put an entry in its local routing table that 203.0.113.8/29 was reachable via eth1.
    3. Computer B was configured with an explicit default gateway, and that gateway is set to point at computer A.
    4. Computer C was configured with an explicit default gateway, and that gateway is set to point at computer A.
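
    If you want to check the network arithmetic for yourself, Python's standard ipaddress module will do it. This is only a sketch; the dictionary keys are labels I made up for this example.

```python
import ipaddress

# The four interface addresses from the example, with their /29 prefix.
interfaces = {
    "A eth0": ipaddress.ip_interface("203.0.113.1/29"),
    "A eth1": ipaddress.ip_interface("203.0.113.9/29"),
    "B eth0": ipaddress.ip_interface("203.0.113.2/29"),
    "C eth0": ipaddress.ip_interface("203.0.113.10/29"),
}

# Each interface address implies the network it is attached to; this is
# the "connected" route that the operating system adds automatically.
for name, iface in interfaces.items():
    print(f"{name}: {iface.ip} is on {iface.network}")

net_b = interfaces["B eth0"].network
net_c = interfaces["C eth0"].network

# B and C are on sibling networks that do not overlap.
print("B and C share a network:", net_b.overlaps(net_c))
```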

    There is one catch: the Linux kernel will not behave as a router if it has not been configured to do so. That is, it won’t forward packets that arrive on eth0 but are obviously destined for the network on eth1 out through eth1; instead it must be told to do so. I suspect that other operating systems have similar settings that must be flipped in order for the system to become a router. Once that is enabled, the routing table is all that is necessary in order for the kernel to make routing decisions. An added bonus is that if the kernel receives a packet destined for a network that it knows it cannot connect to, it will send an ICMP reply indicating that the destination network is unreachable (i.e., there is no route to the requested node). So, if you’re trying this out at home, you may have to enable some form of IP forwarding for things to work, but they’ll work. Computer A will behave as a router.
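
    On Linux, the IPv4 forwarding switch is exposed as the file /proc/sys/net/ipv4/ip_forward, which contains 1 when forwarding is on and 0 when it is off. Here is a small Python sketch that reads it. The helper names are mine, and the script only reports the state. Turning forwarding on requires root, for example with sysctl -w net.ipv4.ip_forward=1.

```python
from pathlib import Path

IP_FORWARD_PATH = "/proc/sys/net/ipv4/ip_forward"  # Linux only

def parse_ip_forward(text: str) -> bool:
    """Interpret the contents of the ip_forward file ("1" means enabled)."""
    return text.strip() == "1"

def forwarding_enabled(path: str = IP_FORWARD_PATH) -> bool:
    """Return True if the kernel is currently forwarding IPv4 packets."""
    return parse_ip_forward(Path(path).read_text())

# Only try to read the file where it exists (that is, on Linux).
if Path(IP_FORWARD_PATH).exists():
    print("IPv4 forwarding enabled:", forwarding_enabled())
else:
    print("No ip_forward file here; this does not look like Linux.")
```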

    In terms of network diagnostic tools, B should be able to ping both A and C; C should be able to ping both B and A; and a traceroute between B and C should show A as the first hop and C as the second hop. We have a functional, routed, network! However, this also means that network broadcasts on computer B’s network will not reach computer C’s network (and vice versa). That is to say, computer C is outside of computer B’s broadcast domain: they can both broadcast such that computer A will see the broadcast on its respective interface, but they cannot broadcast to each other. (There is something called multicast, and if computer A were configured as a multicast router, systems on both networks connected to A could subscribe to multicast groups managed by A and thus could send things to the multicast group and computer/router A would pass them along to the other network. However, that’s pretty advanced and we’re going to ignore that for now.)

    Great, but how does all that work anyway?

    So we know now that computer B can talk to computer C—even though there is no direct physical connection between computers B and C. We know that this is done because A is set up to route packets between B and C. So, how does an Ethernet packet from B wind up getting to C? The answer is that it does not. Huh?

    Right. It does not. Because routing is a layer 3 thing, it is layer 3’s job to figure out what the next hop is for the IP packet. If computer B is attempting to send to computer C, it knows that it cannot do so directly. Computer B knows this because it is only attached to the 203.0.113.0/29 network, and 203.0.113.10 is not in 203.0.113.0/29. So it looks to its default route (that is, the system specified as the default gateway) and figures out what the next hop’s IP address is.

    Here, we are going to stop for a minute and figure out how layer 3 gets anything done. When two computers are on the same IPv4 subnet it is simple: if computer A wants to send to computer B, computer A makes an Address Resolution Protocol (ARP) request; this is a broadcast, asking the network “who has the IP address 203.0.113.2?” and expecting a response. It will try this a few times before giving up; if it does give up, then the host is deemed to be unreachable: either no systems on the network have that address, or something is malfunctioning and it is not receiving, not transmitting, or otherwise impaired. (An aside: this is one thing that works differently in IPv6; instead of using ARP, IPv6 uses Neighbor Discovery Protocol (NDP) to find a host on the local network.)

    So, when computer B attempts to talk to computer C, here is what happens:

    1. Computer B notices that 203.0.113.10 is not in its own network (which is 203.0.113.0/29).
    2. Computer B looks in its routing table, and it sees that it does not have a route to a network that contains 203.0.113.10.
    3. Computer B therefore looks again in its routing table to see if it has a default route. It does.
    4. Computer B looks up the address for the gateway specified in the default route (203.0.113.1).
    5. Computer B makes an ARP request to get the MAC address for the node with the IP address 203.0.113.1.
    6. Computer A, having the IP address 203.0.113.1, sees this ARP request and says “I have that IP address.”
    7. Computer B sees the reply from computer A, and makes a note of its MAC address.
    8. Now, computer B takes the IP packet as a sequence of bytes and sends it down to layer 2, and tells layer 2, “Give this packet to this MAC address,” and layer 2 does its thing. The resulting frame makes its way onto the wire destined for the MAC address of the interface eth0 in computer A.
    9. Computer A receives the layer 2 frame, reads the bytes from it, and passes it up to layer 3.
    10. Layer 3 (IPv4) on computer A sees that the destination of the IP packet is not computer A.
    11. Computer A now performs a similar lookup, except that it is looking for computer C. Its routing table says that 203.0.113.8/29 is directly reachable via eth1, so it sends an ARP request for 203.0.113.10 out of eth1 and makes a note of the MAC address in computer C’s reply.
    12. Computer A now sends the packet to computer C.
    13. Computer C, seeing that the packet is destined for itself, sends the packet further up the stack, eventually hitting the application layer, where a listening application may do something with the data.
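
    The routing decision in the steps above can be sketched in a few lines of Python. This is a simplification under my own assumptions: the routing tables are plain lists, the function name next_hop is invented for this example, and the "own network, then other routes, then default route" checks are folded into a single longest-prefix match, which gives the same answer here.

```python
import ipaddress

# Each route is (destination network, gateway). A gateway of None means
# the network is directly connected, so the destination is the next hop.
ROUTES_B = [
    (ipaddress.ip_network("203.0.113.0/29"), None),
    (ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("203.0.113.1")),
]
ROUTES_A = [
    (ipaddress.ip_network("203.0.113.0/29"), None),  # via eth0
    (ipaddress.ip_network("203.0.113.8/29"), None),  # via eth1
]

def next_hop(routes, destination):
    """Return the IP address to resolve with ARP, or None if unroutable."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, gw) for net, gw in routes if dest in net]
    if not matches:
        return None  # the "destination network unreachable" case
    # The most specific matching route wins; the default route (/0) is
    # only chosen when nothing else matches.
    _, gateway = max(matches, key=lambda route: route[0].prefixlen)
    return dest if gateway is None else gateway

print("B -> C goes via", next_hop(ROUTES_B, "203.0.113.10"))
print("A -> C goes via", next_hop(ROUTES_A, "203.0.113.10"))
print("A -> elsewhere:", next_hop(ROUTES_A, "198.51.100.1"))
```

    Computer B hands the packet to its gateway, while computer A, which has a connected route to computer C's network, resolves computer C's address directly.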

    That’s a lot, and it happens very quickly. Each of the computers will have an ARP cache, which stores the MAC address↔IP address mapping for a limited amount of time. When the entry goes stale, it will drop from the ARP cache, and the next time the computer has to talk to that IP address again it will issue another broadcast for the MAC.
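
    An ARP cache is really just a lookup table with an expiry time. Here is a toy version in Python. The class name, the 60-second lifetime, and the injectable clock are all assumptions made for illustration; real operating systems use their own timeouts and more elaborate entry states.

```python
import time

class ArpCache:
    """A toy IP-to-MAC cache whose entries go stale after ttl_seconds."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so that it can be tested
        self._entries = {}      # ip -> (mac, time the entry was learned)

    def learn(self, ip, mac):
        """Record (or refresh) the MAC address seen for an IP address."""
        self._entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        """Return the cached MAC, or None if unknown or stale."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if self.clock() - learned_at > self.ttl:
            del self._entries[ip]   # stale: the caller must ARP again
            return None
        return mac

cache = ArpCache()
cache.learn("203.0.113.1", "00:00:01:00:00:01")
print(cache.lookup("203.0.113.1"))   # known, so no broadcast is needed
print(cache.lookup("203.0.113.5"))   # unknown, so an ARP request is needed
```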

    Also remember that during all of this, layer 2 devices are invisible to the layer 3 protocol. Switches are smart enough not to interfere with the operation of things like ARP, by transmitting all broadcast packets out of all active ports (except for the port of origin); the same goes for all other types of MAC broadcast traffic. This is why DHCP works even when there is a switch between the computer requesting an address using DHCP and a system running a DHCP dæmon: the client initially sends the request to the special 255.255.255.255 IP address, and the IPv4 stack knows that 255.255.255.255 is the broadcast address for the local network, so it in turn tells layer 2 to send the packet to the special FF:FF:FF:FF:FF:FF broadcast MAC address. Any bridge that sees that MAC address as the destination will send the packet across the bridge; a switch, being just a special application of a bridge, does the same thing.
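
    Here is a rough Python model of how a switch decides where to send a frame. It is a sketch of the general learning-bridge idea, not of any particular product: the class and method names are mine, and things like table aging and VLANs are left out.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningSwitch:
    """A toy multiport bridge that learns which MAC lives on which port."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.table = {}  # source MAC address -> port it was last seen on

    def handle(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is transmitted on."""
        self.table[src_mac] = in_port          # learn where the sender is
        known_port = self.table.get(dst_mac)
        if dst_mac == BROADCAST or known_port is None:
            # Broadcasts and unknown destinations are flooded out of
            # every port except the one the frame arrived on.
            return [p for p in self.ports if p != in_port]
        if known_port == in_port:
            return []                          # already on that segment
        return [known_port]

switch = LearningSwitch(port_count=4)
mac_a, mac_b = "00:00:01:00:00:01", "00:00:01:00:00:02"
print(switch.handle(0, mac_a, BROADCAST))  # an ARP request is flooded
print(switch.handle(1, mac_b, mac_a))      # the reply goes only to A's port
print(switch.handle(0, mac_a, mac_b))      # B's port is now known as well
```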

    What next?

    I am going to leave this as it is tonight. This concludes the first part of my post. You should feel comfortable with the concept of how the lower layers of networking work, and with how the Internet Protocol adds functionality to those lower layers of networking. If you do not feel comfortable with that yet (and you have read all of the links to Wikipedia and done research from there on out), then I want to know what your question(s) are.

    I am not sure when I am going to write the next post in this series; it could be tomorrow or in a week. In the next post, I will add more complexity to the network. There will likely be diagrams, because it won’t be simple enough to unambiguously describe in plain text anymore. After the next post you should be able to feel comfortable with diagrams, though, at all of the first three levels of the OSI model (and realize that they can be, and very often are, very different from each other).

    Perhaps one key point to take from this post: Any time you have more than one logical network segment that is not connected by a bridge, you must have a router—and almost any computer operating system (certainly anything from the UNIX family) can be a router just fine.

    Jan 27th
    Posted by Michael Trausch  as The Internet

    I have been trying to help someone understand a bit more about IPv6 networking and how it would pertain to their network. However, I have had a bit of a problem in the explaining, and I think it is possibly because we have forgotten what the basic components are to an IP stack, insofar as we have gotten used to having things in IPv4 and lost the ability to think about network designs without those things (specifically, NAT and all of its variations). So, the goal of this post is to attempt to explain how things work with IPv6, specifically the loss of NAT as a network design tool.

    For this whole post, we are going to refer to an example network. At this time, the network is just an Ethernet network: it is not running any IP stack (that is, neither IPv4 nor IPv6). Here is a diagram of the network:

    [Figure: An example network.]

    We are going to look at this network in three different ways: as a classic IPv4 network (that is, how it would be if it were an IPv4 network pre-NAT, or approximately pre-1993 and how it is in extremely large networks that cannot use RFC 1918 space, such as Comcast’s and AT&T’s networks), as a NAT’d IPv4 network (as it would be in a typical home or small- to medium-sized business today, and sometimes is at the ISP level for those who have ISPs that cannot afford additional address space for their consumers) and as a present-day IPv6 network. We will examine the network in each scenario. First, the rules:

    1. Workstations run no services.
    2. Servers run only the services they are allowed to run.
    3. There are internal services that are to be protected.
      1. Servers may provide services which are only for the internal network and are not to be accessed by outsiders.
      2. Workstations may consume internal services provided by the internal servers.

    As a Classic Network

    Assume that this network has IPv4 and no NAT. For the sake of this example, this is a normal public network. All addresses are directly on the Internet. There is no NAT. Nothing’s “hidden”.

    This business is given the following information for its static IP address allocation:

    • IP network 198.51.100.64/26
    • Default gateway: 198.51.100.1
    • DNS server: 198.51.100.2
    • DNS server: 198.51.100.3

    So, now, we have 1 edge router, 4 servers, 6 workstations, and 3 sub-networks. The following addresses are for them all:

    • Edge router (and, in this case, firewall): 198.51.100.65
      • The edge router has another IP address on the Ethernet card facing the Internet service provider in 198.51.100.0/26, but it is irrelevant for our purposes.
    • Server A: 198.51.100.66, runs SSH (internal & external) and NFS (internal)
    • Server B: 198.51.100.67, runs WWW (internal & external) and NFS (internal)
    • Server C: 198.51.100.68, runs SMTP, POP3, IMAP (internal & external), only trusts internal hosts to relay without authentication
    • Server D: 198.51.100.69, runs a network backup server and works for all of 198.51.100.64/26 (internal).
    • Workstation A: 198.51.100.70
    • Workstation B: 198.51.100.71
    • Workstation C: 198.51.100.72
    • Workstation D: 198.51.100.73
    • Workstation E: 198.51.100.74
    • Workstation F: 198.51.100.75
    • Router for Windows Subnet: 198.51.100.76
    • Router for UNIX Subnet: 198.51.100.77
    • Router for Macintosh Subnet: 198.51.100.78

    Excellent. Also, the network that we’ve just defined is 198.51.100.64/28, and we are at 100% utilization on it. The Windows subnet has 198.51.100.80/28 for its network, UNIX has 198.51.100.96/28 for its network, and Mac has 198.51.100.112/28 for its network. Each of their routers has two addresses: the one listed above on the 198.51.100.64/28 network, and one on another Ethernet interface on its own internal network.
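
    The subnet split can be double-checked with Python's standard ipaddress module. This is just a sketch of the arithmetic; the labels in the list are mine.

```python
import ipaddress

# The /26 handed out by the ISP, carved into four equal /28 networks.
allocation = ipaddress.ip_network("198.51.100.64/26")
subnets = list(allocation.subnets(new_prefix=28))

labels = ["Main (edge, servers, workstations)", "Windows", "UNIX", "Macintosh"]
for label, net in zip(labels, subnets):
    # Two addresses per network are reserved: the network address
    # itself and the broadcast address.
    usable = net.num_addresses - 2
    print(f"{label}: {net} ({usable} usable host addresses)")
```

    Each /28 has 14 usable host addresses, which is exactly the number of devices listed on the main network, hence the 100% utilization.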

    Now, we have the rules that we talked about above: they must be followed! We configure the firewall, then, on 198.51.100.65:

    • For all systems: (TCP, UDP, and if used, SCTP) inbound packets are only allowed if they are related to outbound packets (e.g., systems can go out and connect and receive replies)
    • Servers
      • For server A: port 22 inbound is always allowed.
      • For server B: port 80 inbound is always allowed.
      • For server C: ports 25, 110, 143, and 587 inbound are always allowed.
      • For server D: No inbound ports are allowed.
    • Subnets manage their own firewalls for the purpose of this example, and so no firewalling for those networks is done here.
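
    Expressed as code, the policy above is little more than a table of allowed ports. The following Python sketch models only the hosts on the main network. The names are made up for this example, and the related_to_outbound flag stands in for the connection tracking that a real stateful firewall does.

```python
import ipaddress

# Inbound ports that are always allowed, per server (from the list above).
INBOUND_ALLOWED = {
    ipaddress.ip_address("198.51.100.66"): {22},                  # server A
    ipaddress.ip_address("198.51.100.67"): {80},                  # server B
    ipaddress.ip_address("198.51.100.68"): {25, 110, 143, 587},   # server C
    ipaddress.ip_address("198.51.100.69"): set(),                 # server D
}

def allow_inbound(dest_ip, dest_port, related_to_outbound=False):
    """Decide whether a packet arriving from the Internet may pass."""
    if related_to_outbound:
        return True  # a reply to something an inside system started
    allowed = INBOUND_ALLOWED.get(ipaddress.ip_address(dest_ip), set())
    return dest_port in allowed

print(allow_inbound("198.51.100.66", 22))    # SSH to server A
print(allow_inbound("198.51.100.69", 22))    # nothing reaches server D
print(allow_inbound("198.51.100.70", 80))    # workstations run no services
print(allow_inbound("198.51.100.70", 50000, related_to_outbound=True))
```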

    We have managed to accomplish for the network what NAT does. However, it is far more efficient: the router isn’t mapping IP address and port combinations from its one interface to an IP address and port combination on its other interface. It’s just routing packets, and the firewall (or “packet filter”) stops packets from passing that are not allowed. In other words, we can see from this design that NAT is absolutely not needed.

    However, there are some problems here. The size of the whole network (a /26) is only 62 nodes (before subnetting). It takes between 30 seconds and fifteen minutes to scan through that to see if any hosts are responding to things like ping or answering any services, depending on the congestion between the scan and the target and whether or not the scan is slowed down by the router, whether or not the scan is complete, and so forth. Average scan will probably be closer to 90 seconds. Because it’s only 62 nodes. Keep this in mind for later.

    As a NAT’d Network

    As a NAT’d network, the only thing that changes is that the network has one IP address: 198.51.100.65. That’s it. All addresses on the inside are different. The LAN has 172.16.0.0/16 now.

    • Edge router (and, in this case, firewall): 172.16.1.1 (/24)
    • Server A: 172.16.1.2, runs SSH (internal & external) and NFS (internal)
    • Server B: 172.16.1.3, runs WWW (internal & external) and NFS (internal)
    • Server C: 172.16.1.4, runs SMTP, POP3, IMAP (internal & external), only trusts internal hosts to relay without authentication
    • Server D: 172.16.1.5, runs a network backup server and works for all of 172.16.0.0/16 (internal).
    • Workstation A: 172.16.1.6
    • Workstation B: 172.16.1.7
    • Workstation C: 172.16.1.8
    • Workstation D: 172.16.1.9
    • Workstation E: 172.16.1.10
    • Workstation F: 172.16.1.11
    • Router for Windows Subnet: 172.16.1.12
    • Router for UNIX Subnet: 172.16.1.13
    • Router for Macintosh Subnet: 172.16.1.14

    The Windows network has 172.16.2.0/24, UNIX 172.16.3.0/24, Mac 172.16.4.0/24.

    More address space! Things are more comfortable now. But if someone wanted to do a port scan they still could. How, you ask? All that’s needed is one infected system. On a network of the size we’re talking about, we’ll have one eventually, no matter what the addressing model is. That problem must be mitigated by keeping up with all the internal systems’ security. It’ll happen with or without NAT; once one system is had, it’s not hard to scan, scan, scan. And even though we’re using more address space now, we’re only using 4 /24 networks, or roughly 1,000 addresses. Scanning the whole /16 is only 65K addresses. We can estimate 1 to 3 seconds to scan each address, so if they’re intelligent they’ll only be busy for anywhere from 17 to 51 minutes at ~1,000 addresses, or from 18 to 54 hours if they do the whole /16. So it’s VERY easy to map the classic network above. It is slightly more difficult to map this network, but still doable.
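
    The scan-time estimate is simple multiplication, shown here as a short Python sketch. It assumes, as the estimate above does, a sequential scan that spends 1 to 3 seconds on each address; the function and variable names are mine.

```python
def scan_time_seconds(address_count, seconds_per_address):
    """Total time for a sequential scan, one address after another."""
    return address_count * seconds_per_address

targets = {
    "four /24 networks": 4 * 2**8,   # 1,024 addresses
    "whole /16": 2**16,              # 65,536 addresses
}

for label, count in targets.items():
    fast = scan_time_seconds(count, 1)
    slow = scan_time_seconds(count, 3)
    print(f"{label}: {fast / 60:.0f} to {slow / 60:.0f} minutes "
          f"({fast / 3600:.1f} to {slow / 3600:.1f} hours)")
```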

    As an IPv6 Network

    Here it gets interesting. The ISP’s giving us 2001:db8:2:100::/56. Yay! That’s a lot of space.

    So, let’s do this: the edge + servers + workstations will be 2001:db8:2:101::/64, Windows network will be 2001:db8:2:102::/64, UNIX network will be 2001:db8:2:103::/64, and Mac network will be 2001:db8:2:104::/64. To assist, I’ve generated random MAC addresses and we’ll use those to create the IPv6 addresses, as is the case on a normal network.

    • Edge router: 2001:db8:2:101:d78f:ffef:d7b1:005f
    • Server A: 2001:db8:2:101:92ff:ffe0:7748:230a
    • Server B: 2001:db8:2:101:628f:ffe3:d9dc:bce3
    • Server C: 2001:db8:2:101:fc4f:ffe8:8cde:a303
    • Server D: 2001:db8:2:101:21ff:ffe8:4865:a520
    • Workstation A: 2001:db8:2:101:325f:ffe1:f700:691c
    • Workstation B: 2001:db8:2:101:babf:ffee:e73b:928e
    • Workstation C: 2001:db8:2:101:80ef:ffe5:2b5a:6820
    • Workstation D: 2001:db8:2:101:bf9f:ffe0:9f5e:256e
    • Workstation E: 2001:db8:2:101:b81f:ffe5:cd18:96fb
    • Workstation F: 2001:db8:2:101:3c6f:ffe0:c57a:7b3e
    • Router for Windows Subnet: 2001:db8:2:101:30df:ffe4:6a97:9e32
    • Router for UNIX Subnet: 2001:db8:2:101:152f:ffec:17d0:98f2
    • Router for Macintosh Subnet: 2001:db8:2:101:ff3f:ffee:36de:4f54

    Addresses on the subnetworks will be similarly random.

    Try scanning that. To map out this network, an attacker has to scan way more hosts than with IPv4: four networks that can contain as many as 2 to the 64th power hosts, in virtually any distribution. Actually, since the business has a /56, that means that an attacker has to scan 256 networks that can each have up to 2^64 hosts. Mapping that cannot be done; neither the bandwidth nor the time is available to do it!
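
    To put a number on "cannot be done", here is a back-of-the-envelope calculation in Python. The probe rate of one million addresses per second is purely an assumption picked for illustration, and the number of /64 networks is computed from the prefix length rather than typed in.

```python
import ipaddress

allocation = ipaddress.ip_network("2001:db8:2:100::/56")

# Every /64 inside the allocation is a separate network to search.
subnet_count = 2 ** (64 - allocation.prefixlen)
hosts_per_subnet = 2 ** 64

# Assumption for illustration only: one million probes per second.
PROBES_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600

years_per_subnet = hosts_per_subnet / PROBES_PER_SECOND / SECONDS_PER_YEAR

print(f"/64 networks in {allocation}: {subnet_count}")
print(f"Addresses in one /64: {hosts_per_subnet:,}")
print(f"Years to sweep one /64: {years_per_subnet:,.0f}")

# The /64 networks used in the example all sit inside the allocation.
for prefix in ("2001:db8:2:101::/64", "2001:db8:2:104::/64"):
    print(prefix, ipaddress.ip_network(prefix).subnet_of(allocation))
```

    Even at that generous rate, sweeping a single /64 takes hundreds of thousands of years, and that is before multiplying by the number of /64 networks.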

    And, to make it all fun and happy, the firewall configuration to keep this network safe matches Classic, but just with the IPv6 addresses used for the extra permissions, not IPv4 ones. So you get added security in that nobody can create a map of your network by way of scanning it (but you have the map) and they have to work a lot harder to find hosts to attack. The only systems that will likely get attacked regularly on the IPv6 Internet will be those that are generating public traffic, and a few unlucky ones that have been guessed or leaked.

    Also, since firewalling in IPv6 works exactly the same, it’s easy to keep the network secure: just enforce network security at the network edge. That’s what we have firewalls for.

    I’ll pick another IPv6 topic to write about soon (suggestions are welcome about specifics if desired). Questions on this topic are also welcome.

    Sep 2nd
    Posted by Michael Trausch  as Rant, computing, random thoughts

    Someone recently brought to my attention that I’ve offended/hurt someone by the (by now, months old) words on my blog.

    That is perfectly fine. I’m not here to be a politically correct person. This is my space on the Internet—and I’ll say what I like here. You know how I believe in freedom and its natural limitations? No? Well, I’ll say this until the day that I die: one person’s freedom ends where another person’s freedom begins. I’m free to say what I like, and you’re free to read it in any way you like or even not at all. Now, I won’t talk about other people by name on my blog unless I have permission to do so—especially if I’ve nothing positive to say, so don’t say that I’m not being at least a bit nice to you—but if I write a post here and it’s about you and you know that from nothing other than reading that post, maybe that is saying something more than what I have said here. If that has ever been the case in the past, or if it ever becomes the case in the future, then know this: it’s better you read what I have to say about you here, than wish that I were willing to spew my unfettered anger in your direction in person. And shoot, if you’re doing things that are illegal (such as running unlicensed proprietary software) and I haven’t turned your ass into law enforcement, you should be thanking whatever deity you believe in. Honestly.

    Anyone who knows me—even if they do not know me well—knows that there is absolutely nothing that will make me angry like willful ignorance.  “Ignorance is bliss,” as the saying goes, but ignorance when combined with the lack of desire to fix it in what you claim to be a domain of specialized knowledge which you possess is just plain inexcusable. If you manage a network, you should know the basics of how it all works. There are people that I know that manage Windows networks and know nothing—not even the high level overview—of how Windows networking actually works beyond the painted pixels on the screen. Guess what? That means that you really do not know what you’re doing.

    And lest anyone get offended or butt-hurt over being called ignorant, don’t. Let’s remember what ignorance actually means, and remember that it’s not an insult or a slight against anyone: everyone is ignorant about many things, even in their own fields of work and expertise. However, willful ignorance in one’s own field (that is, ignorance that you’re not willing to fix all on your own like a big boy or girl) is absolutely something that you should be offended at! It is simply not possible for a single human being to know everything, even in his or her own specialty. We have reference works, documentation, and vast seas of information in every field that I can think of, more than can fit in a human brain. But what matters is that you know what you know, and know what you don’t, and know how to find it out quickly and efficiently. And that means having a sort of self-initiative. And for that matter, I’ll even point people in the right direction, if they’re willing to do the legwork themselves to decipher the information once I’ve pointed them at it. I certainly don’t spoon-feed though, and if you expect that (in your own field, no less!) then I will stand by my assertion that you should not work in that field at all. And I will stand by that assertion whole-heartedly, no matter how much that gives a person pain.

    If you’re ignorant about something that you never do nor have a desire to do (say, you’re an auto mechanic and you don’t care to know how to sew or crochet) then that’s fine; that’s your choice! But if you work in a field, and you learn that there is something that you don’t know, then learn it. Or at least learn where you can learn it when you need to, and get a friggin’ overview in your head. Read any single RFC and you’ll realize that there is no way that any of us can memorize every single detail of every single specification for every single type of system that we manage. It’s just not possible without spending so much time studying as to make it impossible to get anything useful actually accomplished. But if you manage a mail server and you don’t know the first thing about SMTP or POP3 or IMAP or whatever-else protocols your mail clients and servers are using, yes, that’s a problem. You certainly do not need to be able to have a conversation with your SMTP server, but you should know how to look up just how to do that should you ever have to do any really low-level troubleshooting or log capturing. You shouldn’t need to know how to speak any application layer network protocol directly for that matter (though a lot of the text based ones are simple enough that you can learn them as needed over time). But you absolutely should know how to find the information that tells you how to speak those protocols if ever you have a need. And you should know enough to be able to make intelligent decisions on things like physical network infrastructure, management of your client and server operating systems, and so forth.

    As an example: I am nowhere close to an expert on Windows—and I know this. (I will say that I know an awful lot about the way Windows bootstraps itself, as I have had to fix systems with multiple infections by hand because there were no automatic tools available to fix the system… but that does not make me an expert on the whole system, and probably not even the bootstrapping process of the system.) But I will research any issues that I encounter while supporting Windows users and find out—empirically, if I must—how to fix the problem. It’s what I do. And there are many, many places where I can find that sort of information, including booting up a copy of Windows itself and trying to figure it out that way. It does probably take me a lot longer than it would take someone who knows the system in and out, and I’ll grant that. I am absolutely the strongest on POSIX/UNIX-family systems. But that doesn’t stop me from being able to learn it and handle it. Even if it does take longer.

    The difference between me and the unidentified person in my last post? I’ll spend any resources necessary—time, money, effort—to learn what I need to learn to get the job done. I don’t cut corners. My goal isn’t to get everything done sloppy and fast. Even if it takes me longer, I’d rather know and understand the problem—and its solution!—completely before moving forward with doing anything about it. Especially if I can find a short-term workaround that will enable me to come up with a quality solution. I eschew willful ignorance in my field. Do you?

    Jun 29th
    Posted by Michael Trausch  as Uncategorized

    It boggles the mind what some people will actually do—to what lengths they will go to try to “show up” someone or make themselves look better. What is even worse is when it is pure marketing bullshit.

    I am sure that there are people who do this in every single line of work. But what disgusts me more than anything is someone who thinks that just because they did something like serve in the military or get a degree of some sort, they know everything and everyone else has to prove that they have two brain cells to rub together. Do you know what someone with a Ph.D. is good at? Marketing. They have to be.

    You know what I hate with a passion? Marketing. Because most of the time it is hogwash.

    What prompts this, you might wonder?

    So today, I responded to an unexpected downtime call. Long story short (because details cannot be given out, for obvious reasons), this meant that I got up out of bed to answer it. No problem; it’s what I do. I go, I handle the problem, I encounter a couple of snags, find that I no longer have authorization to fix those snags, and go on about my day, providing a notification of the issues that I wasn’t able to fix—and why. As far as the “fix”, I had to switch to a backup connection on a mixed voice/data T1 (technically, DS1, but nobody calls it that except the engineers, I am pretty sure). No big deal; the equipment is set up to handle that, and it works just fine, albeit slower than anyone would like it to.

    Enter Fatuous. Fatuous is someone who is employed as an “Information Technology” support person. Of course, that is not its real name, but it is suitable nonetheless.

    Fatuous sends an email to the effect of “you cannot run data through the T1 because it will make voice calls suffer”. Let’s keep in mind here that this particular equipment does both voice and data, and it gives preference to voice calls. In other words, if all the circuits are busy handling voice calls, there is no bandwidth left over for data; the data waits, not the calls. Sounds simple, right? It should, because it is.
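
    To put numbers on it: a T1 carries 24 channels (DS0s) of 64 kbit/s each. The sketch below assumes, as a simplification, that each active call occupies exactly one channel and that data gets whatever channels are idle. The function name is mine, and real equipment may divide the channels differently.

```python
T1_CHANNELS = 24   # DS0 channels in one T1 (DS1)
DS0_KBPS = 64      # payload rate of each channel, in kbit/s


def data_kbps(active_calls):
    """Bandwidth left for data when voice calls are served first."""
    if not 0 <= active_calls <= T1_CHANNELS:
        raise ValueError("a T1 carries between 0 and 24 calls")
    # Voice takes its channels first; data only gets the idle ones.
    return (T1_CHANNELS - active_calls) * DS0_KBPS


if __name__ == "__main__":
    for calls in (0, 6, 12, 24):
        print(calls, "calls ->", data_kbps(calls), "kbit/s for data")
```

    Under that assumption, a busy phone system squeezes the data down to nothing, and never the other way around.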

    So, I explain this little fact, and I get a mail back—oh, yeah, and half the office is needlessly carbon copied on this. Great! Let us fill everyone’s inboxes with a bunch of technical jargon that they will not care to read and (probably) have no desire to understand. Well, whatever. So the mail basically says, “Cite something, you’re wrong.” Wait a minute, what? Fatuous seriously does not understand what a T1 is. Now, if you have worked in the IT industry for any period of time, even if you have never used one, you should really know what a T1 is. Especially if you are over the age of, say, 25 or 30. Particularly if you are over the age of 40, since it is quite likely that’s what they were using for high-speed interlinks then (never mind the fact that, at 1.544 Mbit/s, it is slower than a 2.5 Mbit/s ARCNET network; it was once considered high-speed connectivity, just as ARCNET was).

    Fine, so I explain what this means in a high-level, cursory overview that hopefully had words small enough for Fatuous to understand. In the meantime, I am raving mad. I have dealt with Fatuous enough that it is readily apparent that there is no salvation here: its ignorance is willfully incurable, and that is terribly sad. I am not sure what is worse: the fact that it has a job doing something I am way overqualified for, or the fact that it has a job doing something that I am way overqualified for and makes a fuck of a lot more money than I do.

    Sometimes, I really hate the universe.

    Jun 23rd
    Posted by Michael Trausch  as Uncategorized

    So, I have someone on Identi.ca (@flameeyees@identi.ca) discussing my views on FatELF with me. No biggie, but trying to continue the argument (pointless as it is) there is just too much work: the character limit does not permit real discussion on such a complex issue. So, permit me to address each of the issues raised as I understand them and rebut. Then conversation can continue, if at all desired (though seriously, I don’t know that *I* desire to do so).

    First point: FatELF would be useless because “you can do that already, write a cc frontend that compiles the same file multiple times, it’s _not_ hard, I’ve done it before”. Okay, so the proposed solution here is to write a compiler driver that will interpret arguments and, from a single Makefile, build for multiple platforms. There might even be something out there for that, but simply put, if GCC supported this feature intrinsically, then everyone would have it and it would be done in a standard way. Free software works better when everyone can agree on a single standard way of doing things, and not just a single standard template for how it might be done. Using addons to perform this function still yields multiple binaries that have to be shipped anyway, which is decidedly not the aim.

    Second point: “how is shipping one (fat) binary ‘better’ than shipping one auto-extracting auto-deciding archive?” Making the assumption that the toolchain and kernel all support the feature as a standard thing here, the difference is simple: the kernel ELF loader would be able to decide which sections of the ELF file should actually be loaded in memory, read only those sections, and go on about its business normally—the rest of the process would not need to change in any way. No temporary copies need to be made, no images need to be extracted, nothing like that has to be done. However, the inverse is quite a different story. Let’s make the assumption that you’re using a POSIX shell script, with the archive of all of the possible binaries appended to the POSIX shell script. First, the script has to be prepended to EVERY such archive (meaning that different versions of the script could exist, and as any programmer knows, DRY), and the script is not going to be trivial: it would have to have code to detect and support every single individual platform. Furthermore, it would require that the user have permission to extract the payload, make it executable, and run it. This is the same deficiency that makes gzexe impractical for everyday use; I know that at least on all the servers that I manage, /tmp is mounted read-write but with execution of scripts and binaries disabled. Finally, it would fail to properly work in the event that something needed to be setuid—that information would have to be in the payload itself, which is absolutely not portable from one system to another. It just cannot be made to work in a generic enough fashion to be reliable on all different types of platforms with different administrative decisions made in the management of those platforms, and in many cases would require an increased attack surface just to be made workable.
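
    To make the “read only those sections” idea concrete, here is a toy sketch in Python. The container layout (a magic number, a record count, then an index of machine name, offset and size) is invented for this example and is not the real FatELF format; the function names are mine as well. The point is only that a loader can consult a small index and touch just the one payload it needs, with nothing extracted to disk.

```python
import struct

MAGIC = b"FAT0"                      # made-up magic, not FatELF's real one
HEADER = struct.Struct("<4sH")       # magic, number of records
RECORD = struct.Struct("<16sQQ")     # machine name, payload offset, payload size


def pack_fat(binaries):
    """Build a toy fat image from a {machine_name: payload_bytes} dict."""
    items = sorted(binaries.items())
    offset = HEADER.size + RECORD.size * len(items)
    table, blobs = [], []
    for machine, payload in items:
        table.append(RECORD.pack(machine.encode("ascii"), offset, len(payload)))
        blobs.append(payload)
        offset += len(payload)
    return HEADER.pack(MAGIC, len(items)) + b"".join(table) + b"".join(blobs)


def select_payload(image, machine):
    """Return only the payload for `machine`, found via the index table."""
    magic, count = HEADER.unpack_from(image, 0)
    if magic != MAGIC:
        raise ValueError("not a toy fat image")
    for i in range(count):
        name, offset, size = RECORD.unpack_from(image, HEADER.size + i * RECORD.size)
        if name.rstrip(b"\0").decode("ascii") == machine:
            return image[offset:offset + size]
    raise LookupError("no payload for machine " + machine)


if __name__ == "__main__":
    image = pack_fat({"x86_64": b"<x86-64 code>", "ppc": b"<powerpc code>"})
    print(select_payload(image, "ppc"))
```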

    However, if FatELF (or, honestly, anything that is truly equivalent) were used, an administrator could copy the binary from one system (say, an x86) to another system (say, a PowerPC) that has all of the other dependencies filled for it, drop it on the filesystem, chown/chmod it once, and it would Just Work. setuid, if needed, would be honored by the kernel, and no extraction has to take place. No additional temporary disk space would be required, nor would it be necessary to incorporate any logic into the ad-hoc “loader” (if it could even be called that) to try to find a filesystem that is read-write with execution permitted for the current user, and therefore no special privileges from the user would be necessary.
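
    For comparison, this is roughly the hunt an ad-hoc self-extracting “loader” would have to go through before it could even unpack itself. It is a sketch under assumptions: it relies on os.statvfs and the os.ST_NOEXEC flag, which Python exposes on Linux/glibc systems, the function names are mine, and the candidate directories are only examples.

```python
import os


def flags_allow_exec(mount_flags):
    """True if the mount flags do not include 'noexec'."""
    return not (mount_flags & os.ST_NOEXEC)


def can_extract_and_run(directory):
    """True if we may create files in `directory` and execute them there."""
    # W_OK | X_OK on a directory: permission to create and reach entries.
    usable = os.access(directory, os.W_OK | os.X_OK)
    return usable and flags_allow_exec(os.statvfs(directory).f_flag)


def first_usable_dir(candidates):
    """Return the first directory that passes the check, or None."""
    for directory in candidates:
        if os.path.isdir(directory) and can_extract_and_run(directory):
            return directory
    return None


if __name__ == "__main__":
    # Example candidates only; a noexec /tmp is skipped by the check.
    print(first_usable_dir(["/tmp", "/var/tmp", os.path.expanduser("~")]))
```

    When nothing passes the check, the ad-hoc approach has nowhere to go, which is exactly the situation on a locked-down server.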

    In fact, the only way to solve the problem reliably at present would be to have something like /var/cache/adhoc-fat-binaries, and have all ad-hoc “fat binaries” be setuid 0 (or setuid to some user that has all necessary privileges to make something setuid 0 if necessary, probably only UID 0 has that privilege on most systems) so that it could (a) write to /var/cache/adhoc-fat-binaries and (b) set the setuid or setgid bits if necessary for the program to fulfill its function. And it deserves to be restated: we all know that having a single specific standard and adhering to it—even when the standard is less than ideal (and in some cases, like X11, falls quite short of ideal)—is far better than having 100 different and incompatible ways to do the same thing. It’s one of the things that we people in free software know pretty damn well.

    See, I don’t see something like FatELF being used for distribution binaries, or anything that would be distributed in an operating system distribution package, except perhaps in special situations where something like biarch is natively supported on the hardware and it would be feasible to permit that sort of flexibility. Instead, I see something like my current situation: I administer several machines for small businesses, and not all of them are the same hardware platform.  They are all the same operating system and many of them have the same libraries installed.  Some of them are 64-bit and some are 32-bit.  Some are x86, some x86-64, and some are neither. But I would very much like to write a single program, say “make” and copy the file to every machine so that it just works. For the moment, if I want something like that, I have to just use something like Java, C#, or a script. Or, if I need something setuid, I do it in C and compile it for every system, shipping the source code file to the systems instead. But it would be more efficient to not have to do that. That is why I would see FatELF being a “good thing”.

    I know that I am in the minority.

    That brings me to point three: “because in 99% of all usage, the kernel won’t _need_ it. And its cost in effort and overhead would be higher.” For this next part of my post here, I am going to be looking at the Linux kernel, version 2.6.34, which I have just downloaded from kernel.org, which is 64 MB compressed (using bzip2!) and takes up 442 MB when uncompressed, before touching any file in the tree. I am looking at the x86-64 configuration, because that is the system I am running on, and I have typed “make menuconfig”.

    Who needs any of the following options? I am willing to bet that the following options are not needed in 99% of all (desktop, server, and embedded, combined) usage:

    1. Processor type and features/Support for extended (non-PC) x86 platforms
    2. Processor type and features/Maximum number of CPUs
    3. Processor type and features/Memory model
    4. Processor type and features/Build a relocatable kernel
    5. Executable file formats / Emulations/Kernel support for ELF binaries
    6. Executable file formats / Emulations/Kernel support for MISC binaries
    7. Executable file formats / Emulations/IA32 Emulation
    8. Executable file formats / Emulations/IA32 Emulation/IA32 a.out support
    9. Networking support/Plan 9 Resource Sharing Support (9P2000) (Experimental)
    10. File systems/Second extended fs support
    11. File systems/Reiserfs support
    12. File systems/JFS filesystem support
    13. File systems/XFS filesystem support
    14. File systems/GFS2 file system support
    15. File systems/OCFS2 file system support
    16. File systems/Dnotify support
    17. File systems/Kernel automounter support
    18. File systems/Kernel automounter version 4 support (also supports v3)
    19. File systems/FUSE (Filesystem in Userspace) support
    20. File systems/FUSE (Filesystem in Userspace) support/Character device in Userspace support

    I can’t even go on. Twenty is enough; I think I have made my point. In 99%+ of all situations, these options are either always on or always off. They are rarely modified. And the kernel still supports a.out from IA32’s really old days‽ Seriously?

    What does this tell me? It tells me that FatELF—or anything else that came along and did something like what FatELF would do—has room in the kernel. And if it were for whatever reason incompatible with current ELF (as it would very likely be) then the kernel could still support “old” ELF, without any of the extra fields or sections.
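
    Supporting a new format next to the old one is mostly a matter of telling them apart by the first few bytes of the file, which is roughly how the kernel already chooses between its binary format handlers. A small sketch of that idea follows: the ELF magic and the EI_CLASS byte are real, while FAT_MAGIC and the function name are invented for the example.

```python
ELF_MAGIC = b"\x7fELF"   # the real ELF magic number
FAT_MAGIC = b"FAT0"      # invented for this example


def identify(header):
    """Classify an executable by the first bytes of the file."""
    if header.startswith(FAT_MAGIC):
        return "fat"
    if header.startswith(ELF_MAGIC) and len(header) > 4:
        # Byte 4 (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit ones.
        return {1: "elf32", 2: "elf64"}.get(header[4], "elf-unknown")
    if header.startswith(b"#!"):
        return "script"
    return "unknown"


if __name__ == "__main__":
    for sample in (b"\x7fELF\x02\x01\x01", b"FAT0\x02\x00", b"#!/bin/sh\n"):
        print(sample[:4], "->", identify(sample))
```

    Plain ELF files keep hitting the same branch they always did; the new branch costs nothing for anyone who never uses it.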

    And actually, there is a great deal of possibility around something entirely different altogether. FatELF isn’t the most technically elegant thing I can think of to solve the problems that it solves, but I have yet to see something else seriously proposed. I can think of something even better, actually. We are all taught that operating systems are here to abstract us from hardware, so that we can write applications and not have to worry about communicating with the hardware directly because the OS handles those details for us. Well, if that is the case, then why don’t operating systems also abstract the system’s processor? Why don’t we have operating system kernels that provide a virtual instruction set? Yes, I am talking about essentially moving the application VM into an operating system kernel, though ideally with some supporting utilities in userspace to do things like hold persistent JIT caches and so forth. However, that’s for another post, another time.

    Jun 18th
    Posted by Michael Trausch  as Uncategorized

    Ever since Canonical started with Ubuntu One (or “U1”), people have been crying and whining about it being a proprietary pile of goo. For someone who has been active in the free software world for a long time, and still has his brains about him, this makes no sense whatsoever.

    Yes, it is true that Canonical has not given up the source code for the U1 server-side software. And if you stop right there and proceed no further down the train of thought, you will probably come to the conclusion that it is proprietary software—and you would be wrong. Why?

    Because the U1 client is free software, that’s why! It seems to me that we have forgotten what client/server applications are. Remember that HTTP is a client/server application, as is SSH, as are so many other types of things. It doesn’t matter if both sides have source code available, when it comes to client-server stuff. In fact, it almost doesn’t matter if the server side is available, because by itself it is useless. However, if the client source code is available, that tells you a great deal: it says, “here is how to talk to the server, how to communicate with the server, and how to use the server’s facilities in code that is useful,” and from that you can create an interface. Once you understand the interface, you need only to write software that fulfills that interface, and voilà, you have yourself a free software server that satisfies the client’s needs.

    Then all you have to do is point the client to the new server software, installed somewhere on a server that you control, and you are all set. For that matter, I am somewhat surprised that nobody has done this just yet, because it seems to me that this would be a most excellent means to doing synchronization on systems in an office/workgroup setting. You could then set the Ubuntu One client to point to a server of your choosing and tell it, “sync all of the ~ directory” or “Sync all of ~/Documents” or whatever on every machine and you would have the ability to share and so forth… overall it seems like a great idea to me. Too bad I don’t have enough time to write it currently, though; I have a full plate of other projects (both in the realm of computing and not) that I have to get done first.
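
    In code, the idea looks something like the sketch below. Everything in it is hypothetical: the interface, its method names and the in-memory server are mine, and none of it reflects the actual Ubuntu One protocol. It only shows the shape of the argument: once the interface the client relies on is written down, any server that fulfills it will do.

```python
from abc import ABC, abstractmethod


class SyncServer(ABC):
    """Hypothetical interface that a file-sync client relies on."""

    @abstractmethod
    def put(self, path, data):
        """Store `data` under `path`."""

    @abstractmethod
    def get(self, path):
        """Return the data stored under `path`."""

    @abstractmethod
    def list_paths(self, prefix):
        """Return the stored paths that start with `prefix`, sorted."""


class InMemorySyncServer(SyncServer):
    """A stand-in implementation; a real one would persist to disk."""

    def __init__(self):
        self._files = {}

    def put(self, path, data):
        self._files[path] = bytes(data)

    def get(self, path):
        return self._files[path]

    def list_paths(self, prefix):
        return sorted(p for p in self._files if p.startswith(prefix))


def sync_up(server, local_files):
    """Client side: push every local file using only the interface."""
    for path, data in local_files.items():
        server.put(path, data)


if __name__ == "__main__":
    server = InMemorySyncServer()
    sync_up(server, {"Documents/notes.txt": b"hello", "Documents/todo.txt": b"write it"})
    print(server.list_paths("Documents/"))
```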

    Unless, that is, someone wants to offer me some serious cash to write a U1 server implementation. Cash always advances a project to the head of the queue…

    Jun 17th
    Posted by Michael Trausch  as Uncategorized

    A common piece of software to find in an office is a word processor—it’s also often one of the most expensive pieces of software that is found in the office! This makes little or no sense, when there are a multitude of alternatives and any business that wants to do so can make requirements that documents be submitted in some reasonably sane format such as PDF.

    First, let me start by saying that you should read this Ars Technica article on “The prospects of Microsoft Word in the wiki-based world” which is a great article on how unnecessary something like a word processor is in today’s world.  The article is nearly a year old, but it’s still a good read.

    Creating information from data does take some work and that work should go somewhere. A wiki seems a natural choice: everyone in a workgroup can collaborate using a wiki, and the data from the wiki is all in a single place so that it can be easily backed up. Of course, it takes some getting used to the notion that you are using a wiki and not a word processor, but that’s not all bad.

    Most things that you work on in a word processor these days are done only to send to someone else in an email anyway. And word processors do not guarantee that formatting is kept intact unless you send in a strong, font-embeddable format such as PDF anyway, so they have little point. At least with a wiki, you have the ability to quickly write and save, and others can come along and update, and so forth. If you need to, you can even upload things that people send you (graphics, documents, whatever) into the wiki and use the wiki to keep track of those things.

    In fact, I sometimes wonder why we don’t actually use things like wikis for information storage and retrieval more frequently. After all, once upon a time, a folder that held paper was an acceptable form of record keeping. The analog to that is a folder with text documents located in a filing cabinet, or a wiki (the cabinet) with articles and hierarchies (folders and files). Then you can have files that are as long as you need them to be without actually having to worry about how much physical space they take up. With the recent apparent movements from structured databases to unstructured information storage systems, that seems to make sense to me. At least then you’re still using a solid database that performs well, but instead of using a highly inflexible database schema that is application-specific, you can use the wiki instead.

    Well, anyway, just a few of my thoughts—2¢ worth of them anyway. Back to my regularly scheduled day, which today involves reading source code from the Mozilla project.

