Uptime

by Volker Weber

I am monitoring three servers every three minutes. All of them are complex, but in different ways. These are the last 20k probes:

Up: 20943 Down:62
Up: 20349 Down:752
Up: 20989 Down:2

The servers run Apache, Domino and IIS. Which is which?

Update: About 3 weeks later:

Up: 36215 Down:92
Up: 35363 Down:1283
Up: 36283 Down:2

Update: About 2 months later:

Up: 50875 Down:117
Up: 49675 Down:1983
Up: 50960 Down:4

Looks like the data is pretty stable.
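
For anyone who wants the counts above as percentages, here is a minimal sketch. It assumes each "Down" probe stands for roughly one three-minute interval of downtime, which is only an estimate: an outage shorter than the probe interval can be missed or overcounted. The function and variable names are made up for illustration; only the numbers come from the post.

```python
# Probe interval taken from the post: one probe every three minutes.
PROBE_INTERVAL_MIN = 3

# (up, down) probe counts per server, in the order listed in the post.
SNAPSHOTS = {
    "initial": [(20943, 62), (20349, 752), (20989, 2)],
    "about 3 weeks later": [(36215, 92), (35363, 1283), (36283, 2)],
    "about 2 months later": [(50875, 117), (49675, 1983), (50960, 4)],
}


def availability(up, down):
    """Share of successful probes, in percent."""
    return 100.0 * up / (up + down)


def estimated_downtime_minutes(down):
    """Rough downtime estimate: one failed probe ~ one probe interval."""
    return down * PROBE_INTERVAL_MIN


for label, servers in SNAPSHOTS.items():
    print(label)
    for number, (up, down) in enumerate(servers, start=1):
        print(
            f"  server {number}: {availability(up, down):.2f}% up, "
            f"about {estimated_downtime_minutes(down)} min down"
        )
```

If the percentages for each server barely move between the three snapshots, that is what "pretty stable" means here.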

Comments

I know your prejudices ;-)

Apache should be #3, Domino #1 and IIS #2.

Two faults?

Cem Basman, 2007-08-04

Two faults. You can't make only one. :-)

Volker Weber, 2007-08-04

IIS, Domino, Apache? Just a guess...

Martin Hiegl, 2007-08-04

I'd say IIS, Domino and Apache as well.

Matt White, 2007-08-04

Domino, IIS, Apache

Bruce Elgort, 2007-08-04

Domino - IIS - Apache

Andy Brunner, 2007-08-04

Bruce, Andy, as I understand it that's what Cem proposed and Volker called faulty ;-)

Martin Hiegl, 2007-08-04

IIS - domino - apache :-)

ingo harpel, 2007-08-04

@Martin,

Ah.... I now think ingo has got it.

Bruce Elgort, 2007-08-04

Interesting. Everybody assumes that Apache is the best one. And you are right. It is the machine which hosts this website. The complexity arises from the fact that it also hosts another [insert five digit number here] sites.

The other two are Domino and IIS. The Domino site is complex because it is tied into a Siteminder SSO solution. The Domino server is pretty new, the Siteminder environment has existed for at least five years. The IIS server hosts vast amounts of data on EMC storage.

So what is the difference? Why is one site performing very poorly and the other one so-so? Some of the missed probes can be explained by reboots required for keeping the Windows platform properly patched. Since there is no redundancy, there are short outages. And they are tolerable and planned.

The server which fails the most does so because of organizational complexity, not because of the platform. And this explains why I charge different rates, depending on the problem. See the FAQ.

It's IIS, Domino, Apache. Cem, you should know me better.

Volker Weber, 2007-08-04

:-)

Cem Basman, 2007-08-04

Yes, but what is the host OS?

IIS must be Windows. Apache, I guess not. Domino?

"Comparisons are odious."

-- Oscar Wilde

Chris Linfoot, 2007-08-04

Oops. Make that Shakespeare. The Wilde quote is a little different.

Chris Linfoot, 2007-08-04

All are very stable platforms, and all are very problematic platforms. It depends entirely on how well matched the task they are performing is to the purpose for which they were built, the load they are built for matched to the demand they are under, and of course the ability and budget of the people running them. Distance (in hops) from the monitoring software also plays a part.

Volker, your Apache site hosting a 5 digit number of sites probably gets serious attention to change control, network availability, and other such issues. Small sites actually are MORE prone to downtime than are large ones.

Andrew Pollack, 2007-08-05


I should also point out that the differences here are likely below the level of statistical significance unless your measurements cover years of consistent data collection.

Andrew Pollack, 2007-08-05

Andrew, one of the problematic sites is actually pretty major. It also has more change control around it than anything else on the list (backed-up and clustered too).

Whilst the downtime might be “statistically insignificant”, to that site’s users (and there are many), the downtime is anything but.

You touch on a key issue however:

… the ability and budget of the people running them…

Ain’t that the truth.

Ben Poole, 2007-08-05

Hmm. Andrew, if I'm not completely mistaken Domino was never built to be a web server. So you're quite right on that account. As for the rest of your musings: well, anything of significance you wanted to say? Or just training the "marketing weaselly speech" skills?

Stefan Rubner, 2007-08-05

Wow, Stefan - if I'm not confused, you may be the first person ever to suggest I'm not blunt enough in my speaking. ;-)

My point, and you can consider it significant if you like or not as it frankly matters little to me, is:

While interesting, monitoring three sites, each meant to represent a different platform, and gaining a 1% - 3% difference in uptime when not controlling for budget, use of redundancy in network paths and servers, site functionality, distance from the measuring tool in hops, planned vs. unplanned downtime, or frankly anything else does not make it a statistically meaningful measurement.

Now, monitoring several dozen of each kind of site in a roughly even distribution of functionality, complexity, distance, and budget over a period of several months would in fact be interesting (though difficult).

Andrew Pollack, 2007-08-05

Andrew, you're definitely not "not blunt enough" in your speaking ;)
Yet, you seem to assume that the points you mention haven't been taken into account already. Thus, you are arguing against the methods of measurement used where most likely all Volker wanted to state is that some sites obviously aren't using either the software or the infrastructure (or the resources) that would be needed for the task.
So, yes, I have to admit your speaking is perfectly blunt enough and thus I deem it to be a perfect match to your way of thinking ;)

Stefan Rubner, 2007-08-05

Fair enough, Stefan -- I am, after all, just a dumb fireman. ;-)

Andrew Pollack, 2007-08-05

Hm, seems like something has gone missing here. I was fairly sure there was a comment here questioning just which site this was targeted toward. It left me curious.

I don't deny the right of any blogger to remove a post, but if that's happened it would be helpful to have some kind of edit mark. My grasp on reality being as thin as it is, I'm tempted to be left thinking I imagined it.

Andrew Pollack, 2007-08-05

I remove comments on request of the author. Is that OK with you, Andrew?

Volker Weber, 2007-08-05


Of course it is, Volker. It's your blog, and it's his comments. FSM knows I've made a few comments over the years I would take back if I could.

Sadly for me, this one supported my point fairly well. ;-)

Andrew Pollack, 2007-08-06

When it was written, yes. But not after careful thinking. :-)

Volker Weber, 2007-08-06


I have a thing about hidden agendas, manipulation, and disingenuous argument. All three trigger me in a pretty negative way. Before I read that post, it didn't really stay in my mind more than fleetingly that there was perhaps a specific site you were making a private point about. I still hope that isn't the case, because either way I don't think this does the argument justice.

I surely don't mind throwing stones at something I don't like -- but I usually pick big stones and make no bones about throwing them fairly openly.

Andrew Pollack, 2007-08-06

Andrew, the only advice I have is not to jump to conclusions too quickly.

Volker Weber, 2007-08-06

Andrew, nice reference to FSM.

.::AleX::.

Alex Hernandez, 2007-08-06
