July 20, 2005
Just Try Again
It's funny because it's true:
A Software Engineer, a Hardware Engineer and a Departmental Manager were on their way to a meeting in Switzerland. They were driving down a steep mountain road when suddenly the brakes on their car failed. The car careened almost out of control down the road, bouncing off the crash barriers, until it miraculously ground to a halt, scraping along the mountainside. The car's occupants, shaken but unhurt, now had a problem: they were stuck halfway down a mountain in a car with no brakes. What were they to do?

"I know", said the Departmental Manager, "Let's have a meeting, propose a Vision, formulate a Mission Statement, define some Goals, and by a process of Continuous Improvement find a solution to the Critical Problems, and we can be on our way."
"No, no", said the Hardware Engineer, "That will take far too long, and besides, that method has never worked before. I've got my Swiss Army knife with me, and in no time at all I can strip down the car's braking system, isolate the fault, fix it, and we can be on our way."
"Well", said the Software Engineer, "Before we do anything, I think we should push the car back up the road and see if it happens again."
In all seriousness, I can't recall a single week that I haven't done this exact thing at least once: Geez, I dunno, just run it again and see if the problem recurs. I don't know if it's a sad indictment of the state of software engineering or a not-so-subtle hint that software engineers deal with thousands of variables in even the simplest of programs.
The problem with a lot of software is that it's a sealed box operating in the wild.
It's sealed, so the user (or sometimes the developer) can't just peer in and say, "Oh, I see what's going wrong there."
Also, when a program goes *bang*, unless I'm there seeing and working with the problem, it's a lot harder to figure out what's going wrong.
For a new website I've been developing for the last year, one of the key components is that every single error that occurs on the website is logged with as much detail as possible.
In addition, it will text message/page/call pre-selected person(s) for certain critical problems.
The site isn't launched yet, but I'm hoping this will result in a much better experience for all concerned!
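As a rough sketch of that "log every error with as much detail as possible" idea, here's what the capture step might look like in Python (the function name and usage are assumptions for illustration, not the site's actual code):

```python
import traceback

def format_error_detail(exc):
    # Capture as much detail as possible about an exception:
    # its type, its message, and the full traceback leading up to it.
    return "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))

# Usage: wrap the top level of the app so nothing escapes unlogged.
try:
    1 / 0
except ZeroDivisionError as e:
    detail = format_error_detail(e)  # write this string to the error log
```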
Peter Bridger on July 21, 2005 08:11 AM

While it may sound like an unreasonable and funny approach to working on a car, software isn't like a car. Take any analogy too far and it falls apart.
Re-running the software is a perfectly logical approach to troubleshooting. What was the cause of the problem? Does it happen every time? If so, why? If not, why not? Could it be some outside interference that only affected the program that one time, or is it something inherent to the program itself that will happen every time?
Once you do narrow down the cause, you can address it.
@Peter, one extension we've made to that "log every error" approach is to create customizable RSS feeds. All apps on a server log to a central reporter which sends out feeds. The feeds have minimal detail for security reasons, but the link takes you to a suitably secured page that displays the relevant info. Saves you from checking religiously and also reminds you to go look when needed.
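That central-reporter feed could be sketched roughly like this in Python (the entry fields, URL scheme, and function name are my assumptions, not the actual system described above):

```python
from xml.sax.saxutils import escape

def errors_to_rss(entries, base_url):
    # Each item carries minimal detail (just a short summary) for
    # security; the link points at a secured page with the full info.
    items = "".join(
        "<item><title>{0}</title><link>{1}/errors/{2}</link></item>".format(
            escape(entry["summary"]), base_url, entry["id"])
        for entry in entries)
    return ('<rss version="2.0"><channel><title>App errors</title>'
            "{0}</channel></rss>").format(items)

feed = errors_to_rss(
    [{"id": 42, "summary": "Unhandled exception in checkout"}],
    "https://example.com")
```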
Tom Clancy on July 21, 2005 09:54 AM

> While it may sound like an unreasonable and funny approach to working on a car, software isn't like a car. Take any analogy too far and it falls apart.
Right; there are no physical consequences to trying software again, which is why the joke is funny.
I do think we're (or at least I am) occasionally guilty of blindly trying again without doing any kind of postmortem.
Jeff Atwood on July 21, 2005 11:17 AM

I hope the software you run is never as dangerous as a car!
Terrier on July 21, 2005 11:26 AM

Sure, that's what they said about Skynet, too... ;)
Jeff Atwood on July 21, 2005 05:47 PM

> Once you do narrow down the cause, you can address it.
And if you can't reliably reproduce the problem, how can you be sure you've actually fixed it?
Bruce McGee on July 22, 2005 08:40 AM

While I agree with the posts here, I think we're missing a key issue. Another reason software developers like to see an error repeated is to make sure their users are actually reporting what they are seeing. I've been in countless situations where well-meaning users call/email to report an issue, only to have said issue be a non-issue. I'm sure most level 1 support folks can attest to trigger-happy users calling up when the slightest "out of the norm" thing happens.
zigzag on July 22, 2005 11:29 AM

I'll admit that my first attempt is often to reproduce the error in a controlled environment (my own). The more complex the problem, the less chance of this succeeding, though.
I have no problem admitting that I do this more out of laziness, when it's simpler to reproduce the error than to analyze the relevant code. Then again, there are probably more moving parts in your average enterprise app than there are in any car. A car might need only one or two engineers with an understanding of the whole system, but that's rarely the case for those of us in software.
The 'log everything' approach can work if you spend enough time refactoring (I know I never log _everything_ while designing the code; the 'should never fail' case always will), but it has the obvious problem that the log analyzer becomes a critical piece of software in itself, needed to wade through the mountains of information spewed from any long-lived app.
Since the systems I work on tend to be distributed workflows, I switched over to a multicast socket scenario. That way, I can (if I want) have a listener that records to a database, another that jumps in mid-stream to display current system activity on screen, and another that escalates conditions to email/pager notification.
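A rough sketch of that multicast sender in Python (the group address, port, and payload format are assumptions for illustration; the database recorder, live console, and pager escalator would each join the same group as listeners):

```python
import json
import socket

MCAST_GROUP = "224.1.1.1"  # hypothetical multicast group
MCAST_PORT = 5007          # hypothetical port

def format_event(source, level, message):
    # One log event, serialized as a JSON datagram payload.
    return json.dumps(
        {"source": source, "level": level, "message": message}
    ).encode("utf-8")

def send_event(payload):
    # Fire-and-forget multicast: any number of listeners can
    # subscribe without the sender knowing or caring about them.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    sock.sendto(payload, (MCAST_GROUP, MCAST_PORT))
    sock.close()

payload = format_event("order-service", "ERROR", "workflow stalled")
```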
But then of course, that logging system needs to be thoroughly tested....
Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.