Wednesday, 29 January, 2003
Exploring the .NET Framework
Perhaps the most frustrating part about learning any new development platform is learning where to find things. With the .NET platform, for example, I have this wonderful class library that has all manner of classes to access the underlying services. But without a roadmap, finding things is maddening. For example, if you want to get the current date and time, you access a static property in the System.DateTime class (System.DateTime.Now). If you want to pause your program, you call System.Threading.Thread.Sleep()—a static method. The class library is peppered with these static methods and properties, and ferreting them out is difficult. What I need is some kind of reference that lists all of the objects that provide similar useful static members, and a cross-reference (maybe an index) that will let me quickly find the appropriate static member for a particular purpose. As it is, I'm spending untold hours searching the documentation or just randomly picking a topic in the library reference and scanning it for interesting tidbits.
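For the record, here's roughly what those two calls look like in a throwaway console program. Nothing here beyond the two static members mentioned above:

```csharp
using System;
using System.Threading;

// Two of the static members mentioned above. Nothing is instantiated;
// you call straight through the class name.
class StaticMemberDemo
{
    static void Main()
    {
        Console.WriteLine("It is now {0}", DateTime.Now);   // static property on System.DateTime

        Thread.Sleep(2000);                                  // static method on System.Threading.Thread

        Console.WriteLine("Two seconds later: {0}", DateTime.Now);
    }
}
```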
That said, I'm having a ball learning C# and .NET programming. I'm still not as comfortable with it as I am with Delphi (hardly a surprise, considering I've been working with Delphi since its early beta days in 1994), but I can see C# becoming my language of choice very quickly. C# and the .NET platform have all the ease of use of Delphi, a comprehensive class library, good documentation, and a solid component model that's supported by the underlying framework (a luxury that Delphi didn't have). It's easy to like this environment, and it's going to be rather difficult for other compiler vendors to provide tools that programmers will prefer over C# and Visual Studio .NET. I'm interested to see what Borland comes up with. Whatever it is, it'll have to be pretty compelling to get me to switch.
Monday, 27 January, 2003
Movie Review: In The Bedroom
I find it difficult to believe that Debra and I just sat through In the Bedroom on DVD. My first (printable) response after the movie was over was "You've got to be kidding me." That pile of brooding, incoherent, disconnected images was nominated for 8 Academy Awards and countless "lesser" distinctions? How could anybody in that film be nominated for anything other than Best Comatose Performance?
I've long known that film critics and I rarely agree when it comes to drama, what with our widely differing opinions of such drivel as The Last Emperor, My Dog Skip, The Thin Red Line, and Titanic. But until recently I thought perhaps I just didn't get it. I've finally realized, though, that if you think of films like In the Bedroom as THE ROYAL NONESUCH, and film critics as the citizens of that little Arkansas town in The Adventures of Huckleberry Finn, then things make a lot more sense.
In the Bedroom is yet another film that suffers from (among many other things) the deadly sin of taking itself seriously. Don't waste your time or your money.
Friday, 24 January, 2003
More on Redundant Code
Something else interesting about the research I mentioned yesterday is that they've used the tool to find a large number of previously undiscovered bugs in the Linux kernel—primarily, if I'm reading the sketchy information in the Slashdot postings correctly, in kernel device drivers. That the bugs reside primarily in device drivers isn't terribly surprising. Device driver code is notoriously difficult to write for many reasons, and doubly so when the programmers don't take the time to read and understand the hardware manuals. It's harder still when the manuals don't exist and the programmer is working from knowledge gained by poking random data at the hardware interface to see what comes out.
That this analysis reveals so many previously undiscovered bugs both validates and refutes the open source mantra "given enough eyeballs, all bugs are shallow." Validation because somebody finally looked at the code, and refutation because it points out that not all code is equally examined. Some parts of the code get looked at by thousands of eyes, and other parts don't even get tested by the original programmer, much less reviewed by somebody competent. An automated auditing tool like this is useful, but it can't replace a competent programmer reviewing the code, because errors can still occur in modules that don't exhibit any of the redundancies or similar indicators. The idea behind open source is that "somebody will care." The reality is that lots of people care about certain parts of the project, but other parts are left wanting. That particular problem can only get worse as the kernel continues to grow.
Thursday, 23 January, 2003
Redundant Code as a Bug Indicator
Scanning Slashdot today during lunch, I came across this posting about two Stanford researchers who have written a paper (it's a PDF) showing that seemingly harmless redundant code frequently points to not-so-harmless errors. They used a tool that does static analysis of the source code to trace execution paths and such. The technology behind their tool is fascinating and something I'd like to study, given the time. But that's beside the point.
On the surface, the paper's primary conclusion—that redundancies flag higher-level correctness mistakes—seems obvious. After all, it's something that we programmers have suspected, even known, for quite some time. But our "knowledge" was only of the problems that arose specifically from particular redundancies like "cut and paste" coding errors—repetitions. The paper identifies other types of redundancies (unused assignments, dead code, superfluous conditionals). Some of these redundancies actually are errors, but many are not. The paper's primary contribution (in my opinion) is to show that, whether or not the redundancies themselves are errors, their existence in a source file is an indicator that there are hard errors—real bugs—lurking in the vicinity. How strong an indicator? In their test of the 1.6 million lines of Linux source code (2,055 files), they show that a source file that contained these redundancies was from 45% to 100% more likely to contain hard errors than a source file picked at random. In other words, where there's smoke (confused code, which most likely means a confused programmer), it's likely that fire is nearby. These methods don't necessarily point out errors, but rather point at likely places to find errors.
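To make those categories concrete, here's a hypothetical C# fragment of my own that exhibits each kind of redundancy the paper describes. (The paper's examples are C code from the kernel; this is purely illustrative, and note that none of these redundancies is, by itself, a bug.)

```csharp
// Hypothetical illustration of the redundancy categories; not taken from the paper.
class RedundancyDemo
{
    static int CountAboveThreshold(int[] items, int threshold)
    {
        int count = 0;
        int limit = threshold;           // unused assignment: 'limit' is never read again

        foreach (int item in items)
        {
            if (item > threshold)
                count++;
            else if (item <= threshold)  // superfluous conditional: always true in this branch
            {
                // nothing to do
            }
        }

        return count;
        count = 0;                       // dead code: unreachable after the return
    }

    static void Main()
    {
        System.Console.WriteLine(CountAboveThreshold(new[] { 1, 5, 9 }, 4));  // prints 2
    }
}
```

The code still produces the right answer; the point is that confusion like this tends to travel with real bugs.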
A production tool based on the methods presented in this research would be an invaluable auditing tool. Rather than picking a random sample of source files for review, auditors could use this tool to identify modules that have a higher likelihood of containing errors. Very cool stuff, and well worth the read.
Saturday, 18 January, 2003
The Fallacy of Affordable Health Insurance
The idea behind insurance is simple. A group of people agree to pool their funds to protect individuals in the group from financial ruin in the case of a catastrophic loss. The group's premium payments are invested, ideally at a profit, and any funds in excess of what is reserved for future losses are paid back to the group members (in the case of a mutual insurance company) or distributed to the company's stockholders. This works quite well in many situations because catastrophic losses are relatively rare. Life insurance works slightly differently in that everybody dies at some point. Insurance companies use well-researched statistics to project the insured's life expectancy and then structure a payment plan so that the insured's premiums, when invested at a reasonable rate, will return more than the policy's face value before the insured person dies. The reason you buy life insurance isn't to ensure that your estate will have $100,000 (or whatever sum) when you die at age 80, but rather to make sure that if you kick the bucket in your 50s, your dependents will have something to fall back on. If you could guarantee that you'd live to be 80, there would be no need for life insurance; you could do much better by investing the money yourself.
One other thing. Insurers base the price of the premiums (what individuals must pay) on two things: the computed probability of a loss, and the cost to fund the loss should it happen. That is, a 22-year-old with a drunk driving conviction and an $80,000 Porsche will pay a higher premium than a 40-year-old mother of three with a minivan and a clean driving record.
Okay, that's how insurance works. So what's my point?
My point is that "health insurance" as we've come to know it can't possibly work. It's a huge Ponzi scheme that at some point has to collapse in on itself. Remember, insurance works by spreading the cost of infrequent catastrophic losses over a large group of individuals. Critical care insurance can work in this way. But you can't fund day-to-day health care, which is what today's "health insurance" has become, using any kind of insurance scheme. The theory is that younger members of the group, who are supposedly in better health and need less medical care, help subsidize the payment of care for older members of the group. This all works fine as long as health care costs remain relatively fixed. But several things happen: better and more expensive medical technology (tests, treatments, drugs) becomes available, younger members insist on more coverage (lower deductibles, lower co-payments, wider coverage of services), life expectancies get longer, and increased scrutiny by an ever more litigious society requires that every possible test be run in every possible circumstance. Oh, and insurance companies don't have free rein to adjust their premiums based on an individual's health history, age, or habits. (Yes, smokers pay the same premiums as non-smokers.) Prices increase, soar, and then skyrocket.
Until relatively recently, employers have been picking up most of the increasing costs of health care "insurance." But employers are starting to realize that they can't continue to pay the ever-increasing premiums, and they're expecting employees to pay a little more out of their own pockets. This will work for a short while, but individual employees won't be able to afford it for long. At some point, government will step in and take over the whole health insurance Ponzi scheme. But even the Federal government has finite resources, which soon will be overwhelmed by the cost of everybody insisting on the absolute best possible care right now.
Do I have an answer? Of course I do. Scale back expectations, insist that people take responsibility for their own health (that is, eat right, exercise, cease self-destructive behaviors), have them shoulder the costs of their own day-to-day medical care, and use insurance as it's intended—to cover catastrophic losses like broken legs and serious diseases. It's a workable plan, in theory. Sadly, it requires more restraint and personal responsibility than most people today can manage.
Thursday, 16 January, 2003
Trainable Bayesian Spam Filters
My friend Jeff Duntemann posted a note yesterday in his web diary about using trainable Bayesian filters to catch spam. I still don't agree that filtering is the best way to combat spam, but it's probably the best we're going to get, all things considered. Blocking spam at the source (i.e., preventing it from entering the system in the first place) would be much more effective, but the design of the email protocols, along with resistance to change, prevents the implementation of an effective Internet-wide spam-blocking scheme. So we're left with filtering at the delivery end.
The nice thing about Bayesian filters, as Jeff points out, is that they are trainable. And the one that everybody's talking about (see Jeff's site for the link) has a 99.5% success rate, with zero false positives. It's impressive, and perhaps this is the way to go. But on the client? Like spam blocking, filtering should be done on the server. All it would take is some simple modifications to the email server and a few extensions to the POP and IMAP mail protocols, and then everybody could have spam filtering regardless of which email client they use. Filtering on the server would be much more efficient than having each individual client do the filtering. Plus, servers could implement blacklist filtering on a per-user basis and perhaps stop a large amount of unwanted email from ever being accepted.
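Just to illustrate the idea, here's a rough sketch of my own of the scoring step of such a filter, with made-up token probabilities. This isn't the filter Jeff links to, just the general naive-Bayes flavor of it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A rough sketch of the scoring step of a trainable Bayesian spam filter.
// The per-token probabilities would normally be learned from a corpus of
// spam and non-spam; the values below are made up for illustration.
class BayesFilterSketch
{
    static readonly Dictionary<string, double> SpamProbability = new()
    {
        ["viagra"] = 0.99, ["unsubscribe"] = 0.90, ["meeting"] = 0.05, ["delphi"] = 0.01
    };

    static double ScoreMessage(IEnumerable<string> tokens)
    {
        // Take the tokens that deviate most from neutral (0.5), then combine
        // them with the usual naive-Bayes formula.
        var interesting = tokens
            .Select(t => SpamProbability.TryGetValue(t, out var p) ? p : 0.4) // unseen tokens lean ham
            .OrderByDescending(p => Math.Abs(p - 0.5))
            .Take(15)
            .ToList();

        double spam = interesting.Aggregate(1.0, (acc, p) => acc * p);
        double ham  = interesting.Aggregate(1.0, (acc, p) => acc * (1.0 - p));
        return spam / (spam + ham);   // anything above ~0.9 gets filed as spam
    }

    static void Main()
    {
        var words = "please unsubscribe to stop receiving viagra offers".Split(' ');
        Console.WriteLine("Spam score: {0:F3}", ScoreMessage(words));
    }
}
```

The scoring is the same whether it runs in the client or on the server; the question is where it runs and who maintains the training data.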
Do I expect this to happen? Sadly, no. Even as outdated and inefficient as our mail protocols are, I don't expect them to be changed any time soon. We're left waiting for the established email clients to include this kind of feature, or for somebody to come up with a new email client that has a good interface, includes all of the features we've come to expect, and also has advanced spam blocking features. I think it's going to be a long wait.
Wednesday, 15 January, 2003
Good Deals on Good Wine
In case you haven't heard, there's a glut of wine on the market. Vintners are struggling, and consumers are enjoying record low prices on some very good wines. California growers enjoyed an excellent grape harvest this year, and now there's far more wine than the market can absorb. Wine producers are victims of their own success: California wine makers spent huge amounts of money marketing their products in the 80s and 90s, and they successfully increased demand for their product. That larger market, combined with the boom years of the 90s, led to many newcomers entering the business and established companies planting new vineyards. Somewhere along the way they crossed the line that separates meeting demand from overproduction. Even without the current economic downturn, they'd have too much wine on their hands. I don't know whether vintners are screaming for federal price supports or other relief yet, although it wouldn't surprise me.
I'm keeping the ranting to a minimum lately, so I'll just mention that it's usually a good idea to see which way the wagon's headed before you hop on it. But if you like wine, now would be a great time to head down to your favorite retailer and get some.
Monday, 13 January, 2003
Spirograph Revisited
I think every kid had a Spirograph when I was growing up. I know that I spent countless hours with my colored pens and those little plastic pieces, trying to overlap figures in different ways to make beautiful designs. I'd pretty much forgotten about Spirograph until 1995, when my friend and co-author Jeff Duntemann wrote a program he called "Spiromania" for our book Delphi Programming Explorer. I converted the program to C++ for The C++Builder Programming Explorer, and have played with it from time to time since, even toying with it when I was learning about writing Windows screen savers. I'm at it again, but this time with completely new code written in C#. It's a project for learning .NET programming.
One of the cool things about computers is that you can simulate things that you just can't do with a physical model. For example, the two figures shown above were created by simulating a circle of radius 60 rolling around a circle of radius 29, with the pen on the edge of the bigger circle. The only difference is that the figure on the left is drawn with smooth curves—which is what you'd get with a real Spirograph toy—and the figure on the right is drawn by plotting 5 points for each time the big circle goes around the little circle. (In actuality, the figure on the left also is drawn using straight lines, but the lines are sufficiently short to give the illusion of smooth curves.) In any case, the figure on the right would be impossible (well, okay, exceedingly difficult) to create using a physical Spirograph toy.
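For the curious, here's a small sketch of the point generation involved. This isn't Spiromania or my control, just the parametric equations for a pen on the rim of a circle of radius R rolling around the outside of a fixed circle of radius r:

```csharp
using System;
using System.Collections.Generic;

// Sketch of the point generation only. Assumes the rolling circle (radius R)
// travels around the outside of the fixed circle (radius r), with the pen on
// the rolling circle's rim (an epicycloid). The real toy rolls a wheel inside
// a ring, but the idea is the same.
static class SpiroSketch
{
    static int Gcd(int a, int b) => b == 0 ? a : Gcd(b, a % b);

    // stepsPerRevolution = 5 gives the angular figure on the right;
    // a few hundred steps per revolution approximates the smooth curve on the left.
    public static List<(double X, double Y)> Trace(int r, int R, int stepsPerRevolution)
    {
        int revolutions = R / Gcd(r, R);   // trips around the fixed circle before the pattern closes
        int totalSteps = revolutions * stepsPerRevolution;
        var points = new List<(double X, double Y)>(totalSteps + 1);

        for (int i = 0; i <= totalSteps; i++)
        {
            double t = 2.0 * Math.PI * revolutions * i / totalSteps;
            double ratio = (r + R) / (double)R;
            double x = (r + R) * Math.Cos(t) - R * Math.Cos(ratio * t);
            double y = (r + R) * Math.Sin(t) - R * Math.Sin(ratio * t);
            points.Add((x, y));
        }
        return points;   // connect consecutive points with straight lines to draw the figure
    }

    static void Main()
    {
        var smooth = Trace(29, 60, 400);   // figure on the left (more or less)
        var pointy = Trace(29, 60, 5);     // figure on the right
        Console.WriteLine("{0} points versus {1} points", smooth.Count, pointy.Count);
    }
}
```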
I've written a .NET custom control so I can drop these things on a form and fiddle with their properties. I'm working now on some animation and a little better user interface, and eventually will have a program that will allow you to create and manipulate multiple images, moving them around and overlapping them. It's a great way to learn a new programming environment.
Sunday, 12 January, 2003
More on Web Services
The basic idea behind Web services isn't really new. Over the years I've seen a few IT shops that had standard protocols for their disparate systems to communicate. They all had problems, though, because the protocols were mostly ad-hoc creations not subjected to rigorous design, and weren't very easy to modify or extend. And there was no possible way that systems from company A could talk with systems from company B.
Microsoft's innovation (and perhaps I'm stepping on a land mine using that word here) isn't so much the individual ideas of standard protocols, standard language, automatic data description and discovery, or any such technology. No, Microsoft's innovation here is in tying all of those individual technologies together, filling in the holes with very solid design work, and pushing for standardization so that any program running on any modern system has the same ability to query any other system for information.
You don't need .NET in order to write or access Web Services.
Naysayers and Microsoft bashers will complain (mostly unjustly, in my opinion) about the company shoving things down our throats, strong-arming the industry, or coercing standards organizations. The plain fact of the matter is that our industry has needed something like this for at least 10 years, and no "consortium" has even come close to providing it. Certainly, Microsoft hasn't acted alone in this—they've had the cooperation of many other companies—but without Microsoft, it wouldn't have happened. Microsoft has provided a huge benefit to the industry by spearheading the creation of an open Web services architecture and making it freely available to all. That they stand to make money from selling applications and development software based on Web services doesn't lessen the benefit they have provided. To the contrary, it should make them much more interested in ensuring that the standard is complete, consistent, and flexible. From what I've seen so far, I believe it is.
Saturday, 11 January, 2003
Using Web Services for Integration
One of the things that impresses me most about Microsoft's .NET strategy is not so much the technology behind it (although that is impressive), but rather the way they're going about selling it to businesses. Microsoft has identified a real concern in the business world: disparate systems that have to share information. For example, consider a medium sized bank that has offices throughout southern and central Texas. Among the systems that they support are:
- Central data processing for posting checks, deposits, loan payments, etc.
- Web site
- Online banking
- Automated clearing house for Fed transactions
- Word processing with central document storage
- Teller terminals
- Automated Teller Machines
- Customer Relationship Management system used by customer service representatives
- Credit rating and scoring system
- Human Resources
Those are just some of the internal systems. They also would like to interface with their suppliers and business partners. Some run on big iron, some on PCs, others on older systems that aren't even supported by their manufacturers anymore. Some of the systems are in a single location, and others are spread out over thousands of square miles. Software is a mix of pre-packaged applications, commercial applications with custom modifications, and in-house custom applications. Ideally, all of these systems could share data. That turns out to be very difficult, though, due to incompatible formats (ASCII versus EBCDIC, for example), incompatible communications protocols, or other problems.
It's certainly possible to make each system interact with all the others: the obvious approach is to teach each individual system what it needs to know about every other one. Assuming that each of the 10 systems above needed to interact with all 9 others, you would have to write 90 different interfaces. Even if you only had to write a quarter of those (23 or so), it'd be a daunting task.
Microsoft's idea (more on that tomorrow) is simplicity itself: write just 10 (maybe 11) interfaces. If you can define a standard communications protocol, and a way for all applications to describe the data that they provide, then all you need to do for System A to talk with System F is to tell System A what data it needs to obtain. It becomes almost a trivial matter to instruct System A to obtain the current balance information for a particular account.
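To give a flavor of what one of those interfaces might look like, here's a minimal sketch written as an ASP.NET Web service, the kind you'd drop into an .asmx file. The bank service, the method, and the return value are hypothetical; only the [WebMethod] plumbing is real:

```csharp
using System.Web.Services;

// Hypothetical sketch: the balance-lookup interface that, say, the teller
// terminals or the Web site would call over SOAP/HTTP. ASP.NET generates
// WSDL describing the request and response, which is the "describe the
// data" half of the idea.
[WebService(Namespace = "http://example.com/bank/")]
public class AccountService : WebService
{
    [WebMethod]
    public decimal GetCurrentBalance(string accountNumber)
    {
        // A real implementation would query the central data-processing
        // system; the caller neither knows nor cares how that happens.
        return 1234.56m;
    }
}
```

Each system publishes (and consumes) one interface like this instead of nine bespoke ones, and the standard protocol handles the rest.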
So why "maybe 11" interfaces? Ideally, each system would be able to communicate with each of the others using the standard protocol. If that's not possible, though (consider an old machine that has no ability to communicate via TCP—they do exist), than an intermediary system will have to serve as a proxy. The proxy machine accepts requests from the other systems on the standard protocol, and relays those requests to the orphan systems. Or, vice-versa.
This is the basic idea behind "Web Services." Although simple in concept, it still requires much thought and care in design and implementation. More tomorrow.
Thursday, 09 January, 2003
Solar Powered Cars?
While I'm on the subject of solar power, let's look at the possibility of solar-powered cars. From yesterday, we know we can get one kilowatt per square meter. A typical car is about 15 feet long and about 7 feet wide, or about 10 square meters. So on a sunny day I can expect about 10 kW to be hitting the surface of my car. Even assuming those fictional 100% efficient solar cells, we're only talking about 13 horsepower (roughly the power rating of my riding lawn mower). A figure of 4 horsepower, maximum, is more reasonable for real cells, and even that's probably on the high side. But reasonable performance for a practical car requires about 20 hp delivered to the drive train. There simply isn't enough sunlight hitting the car, even at 100% efficiency, to power a useful car, even if the solar cells could recharge batteries while the car is parked. See this web site for some ideas on the numbers, or do a Google search on "solar car" and see what the universities are doing.
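Here's that arithmetic as a quick snippet, for anyone who wants to fiddle with the assumptions. The 17 percent figure is the commercial-cell efficiency from yesterday's entry, and everything else is the rough estimate above:

```csharp
using System;

// Rough numbers only: a car-sized rectangle in full sun, converted to horsepower.
class SolarCarEstimate
{
    static void Main()
    {
        double areaSqMeters = (15 * 0.3048) * (7 * 0.3048); // ~9.8 m2 (15 ft x 7 ft)
        double incidentKw   = areaSqMeters * 1.0;           // ~1 kW per m2 in full sun
        double idealHp      = incidentKw * 1000 / 745.7;    // ~13 hp even at 100% efficiency
        double realCellsHp  = idealHp * 0.17;               // ~2.2 hp with 17% cells, so 4 hp really is generous

        Console.WriteLine("Ideal: {0:F1} hp, realistic: {1:F1} hp", idealHp, realCellsHp);
    }
}
```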
Wednesday, 08 January, 2003
More on Solar Energy
There's talk once again in the media about our dependence on oil in general, and on foreign oil in particular. The latest is The Detroit Project, an outgrowth of Americans for Fuel Efficient Cars. Their claim: driving an SUV is like sending a check to Osama bin Laden. The idea certainly isn't new—see my November 3, 2002 entry. But I think they're going a bit overboard.
With all the talk about our dependence on foreign oil, people are starting to bring up alternative energy sources again. Today I saw a post from somebody claiming that solar is the "one true way." There is an astonishing number of supposedly bright people who think that government and private industry have conspired to deprive us of this oh-so-wonderful power source. I never was much of a believer in conspiracy theories, so I thought I'd do some research and run some numbers.
A reasonably good summary of what I learned is in this paper on Photovoltaic Power Generation. According to my research, a good average figure for the amount of sunlight that hits the earth's surface is about 1 kilowatt per square meter (1 kW/m2). During daylight hours. On a clear day. The linked article has a map that shows the average daily sunlight, in kilowatt hours per square meter, for different parts of the country. For Central Texas, that number is between 5.0 and 5.6. Unfortunately, photovoltaic cells are not very efficient. The high end for new, clean, commercial cells is about 17 percent, which turns that 5.6 kWh/m2 into about 0.95 kWh/m2 of usable electricity per day. So how big an array would I need to power my house?
I pulled out our electric bills for the last year. We average about 3,000 kWh per month, or about 100 kWh per day. Dividing 100 by 0.95 comes out to about 105 square meters, or roughly 1,130 square feet. So if I covered half of my roof in solar cells, I could generate enough power to run my house. What would such a thing cost?
A good rule of thumb for the cost of photovoltaics is about $5.00 per watt. A system to power my house needs to generate 100 kWh per day, but it has to collect that energy during only about 5.7 hours of useful sun. That means I need a roughly 17.5 kW system, which will cost $87,500 for the photovoltaics alone. I still need to add a system to store the generated power so I can use it at night or when it's cloudy. The cost of the photovoltaic cells typically makes up between 25% and 50% of total system cost, so I'll have to spend at least another $87,500 on a storage system and associated hardware, bringing the total cost of my power system to more than I paid for my house. Not friggin likely. True, if the cells were 100% efficient, I could save about $70,000, but that does nothing for the storage system. Even if I could do it for $15,000, it doesn't make sense. At 5 cents per kilowatt hour (what I currently pay), it would take me 100 months (a little over eight years) to recover the cost, and by that time I'd have to replace the cells.
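For anyone who wants to check my arithmetic or plug in their own numbers, here it is as a throwaway console program. The inputs are the same rough estimates used above, nothing more:

```csharp
using System;

// Re-running the back-of-the-envelope numbers from this entry.
class SolarHouseEstimate
{
    static void Main()
    {
        double usablePerSqMeter = 5.6 * 0.17;               // ~0.95 kWh/m2 per day after cell losses
        double arrayArea        = 100.0 / usablePerSqMeter; // ~105 m2, roughly 1,130 sq ft
        double peakKw           = 100.0 / 5.7;              // ~17.5 kW, sized for ~5.7 hours of useful sun
        double panelCost        = peakKw * 1000 * 5.00;     // about $88,000 at $5/watt; the entry rounds to $87,500
        double monthlySavings   = 3000 * 0.05;              // $150 per month at 5 cents per kWh
        double paybackMonths    = 15000 / monthlySavings;   // 100 months, even for a $15,000 system

        Console.WriteLine("Array: {0:F0} m2, {1:F1} kW peak, about {2:C0} in panels",
                          arrayArea, peakKw, panelCost);
        Console.WriteLine("Payback on a $15,000 system: {0:F0} months", paybackMonths);
    }
}
```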
I'm all for cleaner energy, but not at those prices.
Tuesday, 07 January, 2003
Eight Legged Freaks
Debra came back from the video store with Eight Legged Freaks, a modern day big bug movie in the same vein as Them and Tarantula. It was predictably terrible, but then what do you expect from a movie starring David Arquette?
Included on the DVD, though, was the original short film, Larger Than Life, which writer/director Ellory Elkayem presented at the Telluride Film Festival and that eventually got him the contract for Eight Legged Freaks. Larger Than Life is a wonderfully well done short film, shot in black and white, featuring just four actors and a handful of creepy-crawlies. It, too, is predictable, but it has a certain charm (and remarkable brevity) that makes it worth watching. I wouldn't say that the short film is worth the DVD rental, but if you run across it in somebody's collection take the time to plug it in and watch Larger Than Life.
Monday, 06 January, 2003
Campaign Season Already?
Representative Dick Gephardt filed papers this evening to create a presidential exploratory committee. He joins Senators John Kerry and John Edwards, and Governor Howard Dean of Vermont in forming these "exploratory committees." In addition, Senators Tom Daschle and Joseph Lieberman both are expected to announce soon. That's six Democrats vying for the nomination, with 22 months to go before the election. (Note 01/07: Tom Daschle announced today that he won't be throwing his hat into the ring, so that leaves us with just five. For now.) The Democrats have a real problem here. Not only is the party searching for an identity, but the likely contenders for the Presidential nomination are practically indistinguishable from one another. The best that can be said for any of them is "at least he isn't Al Gore." The Democratic party is not standing on a firm platform these days. They're fractured, as the large field of contenders shows, and about the only thing the factions have in common is an urge to regain power. Unless the Democrats find a leader who can unify the party and project a coherent vision (and none of the likely contenders appears to be that person), the only votes they'll get in 2004 will be from hard-core Democrats who always vote the party line. And that isn't enough to win a Presidential election.
Saturday, 04 January, 2003
Charlie Update
It's not very often that Charlie the hyperactive slobber dog settles down enough that Tasha the timid poodle will consent to lie down anywhere near him. And even when he does crash, he usually goes right back into happy puppy mode whenever anybody gets near him. After a hard day of playing in the yard, he'd crashed in his favorite spot on the futon, and we laid Tasha down beside him. I was surprised that he remained still long enough for me to get the camera and take a few shots.
It's been almost six months since Charlie showed up, thirsty, hungry, and with a bad case of mange. If you had told me a year ago that I'd adopt a mangy pit bull puppy, I'd have asked what you were smoking. But he's turned out to be a very affectionate and well-behaved (although certainly not perfect—yet) dog. It's surprising how fast we got used to having him around. I suspect, though, that Tasha and Kameeke the cat don't feel quite the same way.
Friday, 03 January, 2003
Thoughts on Microsoft .NET
I've been studying Microsoft .NET for a month now, and I continue to be impressed. Microsoft put a lot of very serious thought and design work into .NET, and they have produced an incredibly useful development and execution platform. Finally, I have a unified model with which I can develop all types of Windows applications and components. Whether I'm writing a standalone Windows application, a console application, a Web site, an XML Web service, or a system service, I use the same development model. True, I could have done that in the past using straight Windows API calls, but I do like to actually finish projects from time to time.
The other nice thing? It's language independent. You can write modules in C#, Visual Basic, COBOL, FORTRAN, or any other .NET-supported language, and (provided you follow some simple rules about standard data types) they'll all work together. This is much different from using COM or ActiveX components, which were supposed to be language independent but rarely were.
Overenthusiastic .NET supporters will talk about platform independence, but don't let them fool you. While it's true that Microsoft has submitted the C# language and the Common Language Infrastructure (CLI—a subset of the .NET Framework) to international standards bodies, I find it doubtful that they will actually release a version of .NET for anything other than Windows operating systems. I suspect that the "platform independence" built into .NET is there to support future versions of Windows rather than Linux or some other operating system.
There's a lot to like about .NET, and in a month of studying it I haven't found any gaping holes that would make me not want to use it. This is the most exciting new thing to come along for developers in a very long time—way more exciting than anything I found in the Linux world (something I'll discuss another time).