Something Went Wrong. Never Mind What.

Albert Cory

May 3

The Problem


8 Comments

PeterL

May 4

I've seen code like this:

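```
import sys

def process_file(fname):
    try:
        f = open(fname, 'rb')
        process_opened_file(f)  # defined elsewhere
        f.close()
    except IOError as e:
        # Reports that something failed, but never prints e, the actual reason
        print('Could not read file:', fname)
        sys.exit()
    except:
        pass  # every other exception is silently swallowed
```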

(for the non-technical: this outputs a message if there's a problem opening or reading the file; and silently keeps going for any other exception)

The person who wrote this was lauded for how quickly they cranked out code that (mostly) worked. I was criticized for filing 300 bugs against that code when I had to support it at a customer's site.

[This code doesn't even output what went wrong with opening or reading the file; providing that extra information would take less than 10 more keystrokes]
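For comparison, here is a minimal sketch of what reporting the cause could look like. It is an illustration, not the code from that project: `process_opened_file` is a hypothetical stand-in that just counts bytes, and the non-zero exit status is an assumption about what a caller would want.

```python
import sys


def process_opened_file(f):
    """Hypothetical stand-in for the real processing step: counts bytes."""
    return len(f.read())


def process_file(fname):
    try:
        # The with-block closes the file even if processing raises.
        with open(fname, 'rb') as f:
            return process_opened_file(f)
    except OSError as e:
        # Printing e adds the reason, e.g. "No such file or directory".
        print('Could not read file:', fname, e, file=sys.stderr)
        sys.exit(1)
    # No bare "except: pass": any other exception propagates with a traceback.
```

The extra information is the `e` in the `print` call; dropping the bare `except` means unexpected errors stay visible instead of disappearing.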


Imperceptible Relics

May 4

I am curious if there is an error message that unpacks into like 3 or 4 tiers of error instructions.

Like, "Click here to read error message for end user" (e.g. layman's instructions: call this number and mention this error code, or contact the system administrator)

"Click here to read error message for developer" (boop beep boop beep jargon code)

"Click here to read error message for manager" (type passcode and contact Steve at ext. 211)

Or "press 1, 2, or 3 depending on the user"
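One way such a tiered message might be modeled is a single error object carrying a separate message per audience. This is only a sketch: the `TieredError` class, its field names, and the `E1042` code are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class TieredError:
    """Hypothetical error record with one message per audience."""
    code: str           # short code the end user can quote to support
    user_msg: str       # plain-language instructions
    developer_msg: str  # technical detail
    manager_msg: str    # escalation contact

    def message_for(self, audience):
        tiers = {
            'user': self.user_msg,
            'developer': self.developer_msg,
            'manager': self.manager_msg,
        }
        # Unknown audiences fall back to the least technical tier.
        return f'[{self.code}] {tiers.get(audience, self.user_msg)}'


err = TieredError(
    code='E1042',
    user_msg='Could not open your file. Contact the system administrator and mention this code.',
    developer_msg="OSError(2) in process_file: No such file or directory: 'data.bin'",
    manager_msg='Escalate to Steve at ext. 211.',
)
print(err.message_for('user'))
print(err.message_for('developer'))
```

Every tier shares the same error code, so a support call from the end user can be matched to the developer-level detail.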


Albert Cory

May 4

Good ideas, for a company that cares about it.


Tracy Carver

May 4

I made a comment on Nextdoor about this overlooking the hardware side of things. But in recent years I've certainly seen quite a bit of code running with little to no error checking or exception handling. In my old-school days of C, that was by and large never the case.


Albert Cory

May 4

Thanks, yeah, I don’t know much about hardware.

Nowadays, if you don’t have exception handling, the program just aborts with “uncaught exception.”


Tracy Carver

May 4

Hardware is often much less reliable than you'd think. (I don't know much about it either, but I've picked up a bit here and there.) Hard drives, for instance: real lifetimes can be shockingly short. On big filers I've worked on with RAID 5 configurations of, say, 20-30 HDDs, you might see 3 of those "brand new" disks fail in a 2-year period. Even fiber-optic network cables can go bad and cause a lot of weird "software" failures that are hard to isolate.
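To put those numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes failures are spread evenly over the two years and ignores replacements, so treat it as an approximation rather than a proper reliability model; the function name is made up for this sketch.

```python
def annualized_failure_rate(failed, total, years):
    """Average fraction of the original drives failing per year."""
    return failed / total / years


for total in (20, 30):
    rate = annualized_failure_rate(failed=3, total=total, years=2)
    print(f'{total} drives: {rate:.1%} per year')
```

That works out to roughly 5% to 7.5% of the drives per year, or about one drive in every 13 to 20 annually.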


Albert Cory

May 4

Google’s insight was: just use cheap HW that fails, rather than paying a lot for robust stuff. Replace it often.


Mike Kupfer

May 11

Replace it often, and design the software so that it handles failure gracefully. I've never used MapReduce, but when I read about it and how it makes it easy to recover from failures, I was quite impressed.
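The general idea of surviving a failure by re-running the lost work can be sketched in a few lines. This is a toy simulation, not MapReduce's actual implementation: `flaky_worker`, `run_with_retries`, the 30% failure rate, and the retry limit are all assumptions made up for illustration.

```python
import random


class WorkerFailed(Exception):
    """Raised when a simulated worker dies mid-task."""


def flaky_worker(chunk, rng, failure_rate=0.3):
    # Simulate cheap hardware: some attempts fail at random.
    if rng.random() < failure_rate:
        raise WorkerFailed(f'worker died while processing {chunk!r}')
    return sum(chunk)


def run_with_retries(chunks, rng, max_attempts=50):
    results = []
    for chunk in chunks:
        for attempt in range(1, max_attempts + 1):
            try:
                results.append(flaky_worker(chunk, rng))
                break
            except WorkerFailed as e:
                # Say what went wrong, then reschedule the same chunk.
                print(f'attempt {attempt} failed: {e}; retrying')
        else:
            # Only reached if every attempt failed.
            raise RuntimeError(f'giving up on {chunk!r} after {max_attempts} attempts')
    return results


rng = random.Random(0)
total = sum(run_with_retries([[1, 2], [3, 4], [5, 6]], rng))
print(total)
```

Because each chunk is independent and its result does not depend on which attempt produced it, a failed attempt costs only time; the final answer is the same.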


Life Since the Baby Boom
