8 Comments
PeterL

I've seen code like this:

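```
import sys

def process_file(fname):
    try:
        f = open(fname, 'rb')
        process_opened_file(f)
        f.close()
    except IOError as e:
        # e is never used, so the reason for the failure is lost
        print('Could not read file:', fname)
        sys.exit()
    except:
        # every other exception is silently swallowed
        pass
```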

(for the non-technical: this outputs a message if there's a problem opening or reading the file; and silently keeps going for any other exception)

The person who wrote this was lauded for how quickly they cranked out code that (mostly) worked. I was criticized for filing 300 bugs against that code when I had to support it at a customer's site.

[This code doesn't even output what went wrong with opening or reading the file; providing that extra information would take less than 10 more keystrokes]
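For illustration only, here is a minimal sketch of what a less lossy version might look like. It assumes the same `process_file` shape as above; the body of `process_opened_file` is a hypothetical stand-in, since the real one isn't shown. The only changes are printing the exception itself and dropping the bare `except`, so unexpected errors surface with a traceback instead of vanishing.

```python
import sys


def process_opened_file(f):
    # Hypothetical stand-in for the real processing step (not shown above).
    return len(f.read())


def process_file(fname):
    try:
        # The with-block closes the file even if processing raises.
        with open(fname, 'rb') as f:
            return process_opened_file(f)
    except OSError as e:
        # Printing the exception tells the reader *why* the read failed.
        print(f'Could not read file {fname}: {e}', file=sys.stderr)
        sys.exit(1)
    # No bare "except: pass" here: anything unexpected propagates normally.
```

In Python 3, `IOError` is an alias of `OSError`, so this catches the same failures as the original handler.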

Imperceptible Relics

I am curious whether there is an error message that unpacks into 3 or 4 tiers of error instructions.

Like, "Click here to read error message for end-user:" (e.g. layman's instructions: call this number and mention this error code, or contact the system administrator)

"Click here to read error message for developer:" (boop beep boop beep jargon code)

"Click here to read error message for manager:" (type passcode and contact Steve at ext. 211)

Or "press 1, 2, or 3 depending on the user."
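As a rough sketch of how such a tiered message could be modeled (every name here, including `TieredError`, `message_for`, and the sample error code, is made up for illustration and not taken from any real library), one error object could carry a separate text per audience:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieredError:
    """One error, with a separate message for each audience (hypothetical design)."""
    code: str
    end_user: str
    developer: str
    manager: str

    def message_for(self, audience: str) -> str:
        tiers = {
            "end_user": self.end_user,
            "developer": self.developer,
            "manager": self.manager,
        }
        # Unknown audiences fall back to the least technical tier.
        text = tiers.get(audience, self.end_user)
        return f"[{self.code}] {text}"


err = TieredError(
    code="E1042",  # made-up error code
    end_user="Something went wrong. Contact your system administrator and mention this code.",
    developer="IOError while reading the input file; see the traceback in the log.",
    manager="Escalate to the support contact for this product.",
)
print(err.message_for("developer"))
```

The shared error code is what ties the tiers together: the end user can read it out over the phone, and the developer can look up the detailed entry for the same code.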

Albert Cory

Good ideas, for a company that cares about it.

Tracy Carver

I made a comment on Nextdoor about this overlooking the hardware side of things. But in recent years I've certainly seen quite a bit of code running with little to no error checking or exception handling. In my old-school days of C, that was by and large never the case.

Albert Cory

Thanks, yeah, I don’t know much about hardware.

Nowadays, if you don’t have exception handling, the language runtime just aborts the program with “uncaught exception.”

Tracy Carver

Hardware is often much less reliable than you'd think. (I don't know much about it either, but I've picked up a bit here and there.) Hard drives, for instance, can have shockingly short real lifetimes. On big filers I've worked on in a RAID 5 configuration with, say, 20-30 HDDs, you might see 3 of those "brand new" disks fail within a 2-year period. Even fiber-optic network cables can go bad and cause a lot of weird "software" failures that are hard to isolate.

Albert Cory

Google’s insight was: just use cheap HW that fails, rather than paying a lot for robust stuff. Replace it often.

Mike Kupfer

Replace it often, and design the software so that it handles failure gracefully. I've never used MapReduce, but when I read about it and how it makes it easy to recover from failures, I was quite impressed.
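To make the "handles failure gracefully" idea concrete, here is a toy sketch of re-running a failed task on another worker. This is not MapReduce's actual code or API; `run_task`, `flaky_worker`, and `healthy_worker` are hypothetical names, and real systems also deal with scheduling, timeouts, and partial output.

```python
def run_task(task, workers):
    """Try the task on each worker in turn until one succeeds."""
    last_error = None
    for worker in workers:
        try:
            return worker(task)
        except Exception as exc:
            # A failed worker is expected; remember why and try the next one.
            last_error = exc
    raise RuntimeError(
        f"task {task!r} failed on all {len(workers)} workers"
    ) from last_error


def flaky_worker(task):
    # Simulates cheap hardware that has just died.
    raise ConnectionError("worker unreachable")


def healthy_worker(task):
    return task * 2


print(run_task(21, [flaky_worker, healthy_worker]))
```

The failure of one worker is absorbed by the job as a whole; only when every worker fails does the caller see an error, and that error carries the last underlying cause.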
