ED WEISSMAN
Programmer • Writer • Teacher

Eddie's Law of Depth

"Understanding is directly related to the square of the distance below the surface."

How far below the surface do you have to dive to get what you want?

As deep as it takes.

Things are not as they appear on the surface. Not with technology. Not in our domains. And definitely not with other people. It's always been like this, but now more than ever.

Few of us go more than one level deep. Forget "thin clients". Our real worry should be "thin I.T. people".

Days Since Last Bug: 0
dunningKruger
Why is Order Entry so slow? Someone speed it up!
garyRiggs
I recompiled the mods with PRECISION. But that didn't work.
floChart
I re-indexed the data base. But that didn't work.
TAROT AGILE VOODOO
eddieWeissman
I removed a "SLEEP 10" from the POST. That did it.

We see a simple form but don't grasp the 20,000 lines of server-side code that make it work. We see a business process but don't appreciate the 30 years of lessons that enabled the 42 things that must work 6 levels deep. We only see people superficially (I like him. She's stylish. What a dork. Not a team player.) without understanding who they are, what they've been through, and what they must do every day to keep things moving along.

If only we'd look a little deeper...

"Skimming the Surface" has 3 possible outcomes,
all of them bad.

1. We do too little.

I was called in to handle an "emergency". We had over 300 bugs in the past six months, in multiple systems, where a record entered into one system failed to process in another. A Sales Order from ecommerce wouldn't process in fulfillment. A Purchase Order Acknowledgement received via EDI wasn't noticed in Procurement. A Price entered in web-based Contracts didn't take in Order Processing (with devastating results).

How had they been handling these problems? By "forcing" the record in the receiving system and hoping that suddenly and miraculously it would never happen again. So I looked under the hood. In every case, invalid control characters had corrupted the data. The solution? A one-line function with a regular expression to remove illegal control characters in every API.

They were so busy clearing problems from the surface that they never dove down a single level for an easy prize. The error logs told them what to do if only someone had bothered to look.
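
For the curious, here is roughly what a one-line fix like that could look like. This is a minimal Python sketch, not the code from that shop: the name strip_control_chars and the definition of "illegal" (ASCII control characters other than tab, newline, and carriage return) are assumptions.

```python
import re

# Assumption: "illegal" means ASCII control characters other than
# tab (\x09), newline (\x0a), and carriage return (\x0d).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(value: str) -> str:
    """Return value with the disallowed control characters removed."""
    return _CONTROL_CHARS.sub("", value)


print(repr(strip_control_chars("SO-1001\x00\x1f")))  # 'SO-1001'
```

Applied to every inbound field at the API boundary, a filter like this stops the bad bytes before the receiving system ever sees them.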

2. We do too much.

How many times has this happened to you? (Tell the truth.) You have to maintain an old program and find the exact same logic in five different places. Exactly the same. How did that happen?

The obvious answer is that a previous programmer found what they were looking for elsewhere in the program, got lazy, copied it, pasted it, and no one in Peer Review, Code Review, or QA caught it. (Quelle surprise.) The less obvious answer is that a prior programmer never bothered to look beyond what they were doing.

Maybe they were lazy too. Or it was an emergency. Or they were in a rush for some other reason. But this is one of the many reasons we have so much crap to maintain, and maintain, and maintain some more. Because someone didn't dive more than one level deep.

3. We do both.

The first two scenarios are amateurs compared to this one. In order to do real damage, you have to be special enough to do both too little and too much at the same time...

The corporate office purchases some outrageously expensive best of greed software package for another division. Then some accountant thinks, "We could save $10 million by deploying SoftCrap in this other division."

So in order to save $10 million, your I.T. department drops other projects with real payback and spends two years and $2 million implementing software that doesn't do what's needed and causes your users to lose another $20 million.

On the surface, it appeared to make sense. Both divisions distribute hardware. But the first is commodity while the second is proprietary. The first is business-to-consumer (B2C) while the second is business-to-business (B2B). The first buys from the lowest bidder while the second uses Approved Sources. The first sells off the shelf while the second needs lot traceability and expiration dates. And so on, and so on, and so on... Mission critical domain issues invisible to the novice at the surface, but easily accessible to anyone who dives more than one level deep.

We hire programmers who don't know our stack and then we don't train them. We bring in business analysts and project managers with foodservice backgrounds to help run our medical equipment business. We have managers "managing" people doing what they have never done themselves.

And then we wonder why no one looks more than one level deep.

The Four Agents of Technical Depth

My favorite metaphor for Depth is the Prize on the Sea Floor. It represents some key to our business, in either the software, the domain, or the people. It can only be reached by those with sufficient skill, experience, and determination. Everyone else is a poser who talks about the prize, but never retrieves it.

[Figure: "How Deep Can You Dive?" Labels: shoreboss, scrumbell, junior programmer, senior programmer, and the prize ($$$).]


This picture displays the metaphor. But real life isn't so clear and the water is muddy. We call the agents all kinds of H.R. fluff: "business partner", "functional admin", "PM/BA", or "tech analyst". We call the prize "allocation algorithm", "supply/demand exceptions", "defense sector parameters", "substitution logic", or my favorite, "regression scripts". But no one really knows what any of this crap means, so shrouded in mystery, the prize remains on the sea floor until a programmer swims down and gets it.

Level 0 - The Shoreboss
He knows there's a body of water somewhere, but doesn't know where or why. It doesn't matter. He's the boss and doesn't care about the prize on the sea floor. He is the prize.
Level 1 - The Scrumbell
This Scrum Master, Project Manager, Business Analyst, or Functional Architect (whatever that is) doesn't know how to swim, so he never learns and stays in his boat. He covers up by pretending to know what's on the sea floor.
Level 2 - The Junior Programmer
If she adjusts her snorkel and mask right, she can see the prize on the sea floor. Not well enough to fully understand it, but she's learning and someday will swim to embrace it.
Level 3 - The Senior Programmer
Which is it? The programmer is a senior because she swims for the prize? Or she swims for the prize because she is senior?
Yes.


What do you do if you don't know the prize?

In every I.T. department I've ever worked in came the inevitable, "With continuous deployment, there are too many bugs. So from now on we're only going to deploy software to production in one release per month." This is how shorebosses who don't know how to build software respond to a problem with a bigger problem. (It's also like a restaurant that's only open one day a month to reduce food poisoning.)

In this case, the prize is "Building Software Properly". You clarify Business Requirements. You detail Functional Specifications. You identify Test Cases. You write a Project Plan. In short, you build software the way my buddy Pete and I tackled Chemistry Lab a million years ago.

But instead of doing this, we "solve" some other problem. Why? Like "Air Disasters", there are multiple causes. First, the shoreboss wasn't aware of the prize. But even if he was, he wouldn't have thought to ask a Senior Programmer to swim down and retrieve it.

This is why the Principle of Depth is so important. We live in a multi-level world. Single-level (and zero-level) thinking doesn't work anymore. Someone must be able to drill down multiple levels to get to the heart of the matter.

And that someone is usually one of us programmers. If shorebosses could learn just one thing, this may be it. Imagine how much better life would be if they did.

What's the difference between Depth and Pulse?

Get More Coffee
Set up Happy Hour
Hide from Nero
Bet on Bears
Bring some Donuts
Take Long Lunch
scrumBell
I have my finger on the Pulse!
I know what's going on!
I know that we need to do standard cost!
CLOSED TICKETS
- add Std Cost
- add Box IDs
- add WIP Report
- add AWS API
- add XML feed
- add Dropdown
- add colors
OPEN TICKETS
- fix Std Cost
- fix Box IDs
- fix WIP Report
- fix AWS API
- fix XML feed
- fix Dropdown
- fix colors
paulaNomial
So will we need to do roll-up or fold-in standard cost?
scrumBell
I don't know that. You mean I need to dig deeper?

Pulse is awareness. Depth is understanding.

Pulse is asking "What?" until you know. Depth is asking "Why?" until you grok it.

Pulse is horizontal. Depth is vertical.

Pulse is knowing that Spanish rice is on today's menu. Depth is knowing that lard is in the recipe.

Pulse is knowing that we no longer need overnight CRON jobs to make sure that the Business Intelligence Metrics database matches the Order Processing database. Depth is understanding that's because of the 14 indexes that we added to 8 tables, and what they all are, how they work, and the tradeoffs we made when we did it.

Pulse is knowing that we have Smart Part Numbers and they are maintained by the Quality Control Department. Depth is understanding that a Proprietary Part Number with an "05" in digits 3 and 4 can only be sold to the Department of Defense if it's zinc plated, heat treated by a certified subcontractor, and accompanied by an ARD8130 Certification document.

Pulse is knowing that Code Standards prohibit early exits from functions. Depth is understanding how quickly legacy software degraded in maintenance before we had that standard, what people did wrong, what we should be doing instead, and why it matters.

Pulse is addressed by getting up off your butt. Depth is addressed by looking far enough past your nose to learn something while you're up.

What makes us tick?
Get below the surface and find out.

alGorithm
We have no margin for error.
paulaNomial
We don't follow our plans.
garyRiggs
And we forget how to count.
oinksAlot
Then it must be the software. We have to replace it.

I was brought into Papermaker to maintain their legacy system while their Pig4 accounting firm found them a new ERP system. In order to fix their bugs and add critical enhancements, I had to learn what made them tick. In order to do that, I had to get wet and go and find out.

Papermaker made special absorbent paper for healthcare and foodservice in two shops. One shop used their magical formulas to make paper in large vats and spin them onto giant 12 foot rolls. The other shop slit and cut those rolls into specific sizes and boxed them.

After a little discovery, I figured out that Papermaker's production and distribution was one giant linear programming problem governed by nonnegotiable constraints. Without massive (and unrealistic) capital expenditures, they were able to produce enough paper to fill 12,000 cartons per day. Six massive machines each cut and slit 2,000 cartons per day. They had enough room to store 80,000 cartons. They were able to fit 1,200 cartons into each of ten 40 foot trailers per day. Customers ordered between 50 and 3,000 cartons per order, but would only take delivery within 3 weeks of their scheduled receipt date. And everything had to run 24/7 in order to meet their schedules and absorb their massive overhead.
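
The constraints are simple enough to check on the back of an envelope. Here is a small Python sketch that uses only the figures quoted above; the function names are made up for illustration.

```python
# Figures quoted in the paragraph above.
MACHINES = 6
CARTONS_PER_MACHINE_PER_DAY = 2_000
TRAILERS_PER_DAY = 10
CARTONS_PER_TRAILER = 1_200
STORAGE_CARTONS = 80_000


def slitting_capacity() -> int:
    """Cartons per day the six slitting machines can cut."""
    return MACHINES * CARTONS_PER_MACHINE_PER_DAY


def shipping_capacity() -> int:
    """Cartons per day that fit on the ten trailers."""
    return TRAILERS_PER_DAY * CARTONS_PER_TRAILER


def storage_days() -> float:
    """Days of full production the warehouse holds if nothing ships."""
    return STORAGE_CARTONS / slitting_capacity()


print(slitting_capacity(), shipping_capacity(), round(storage_days(), 1))
# 12000 12000 6.7
```

Slitting and shipping both balance at exactly 12,000 cartons per day, with less than a week of storage as a buffer. There is no slack anywhere in the chain.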

Are you starting to see how critical their decisions were? Which recipe of paper should be cooked in the vat? Which 12 foot rolls should be loaded onto which machine on which shift? Where should the slitting (lengthwise) blades be set? Where should the knives (widthwise) be set? What should be stored where? Which cartons should be loaded onto which trucks? And where should those trucks go?

(And perhaps the biggest question of all: How did they manage to stay in business by doing all of this for a hundred years without a computer?)

This was a process shop in the back with a job shop in the front and no margin for error. Other than a few trade secrets, it was a simple operation. Few Stock Keeping Units, few Customers, few Vendors, and few Orders. Simple reporting and critical decision making.

So what could go wrong? Two things, mainly. Last minute changes, for a hundred different reasons, caused both large (vertical blades) and small (horizontal knives) setup changes to the slitting machines. This threw off production schedules, which cascaded everywhere else. The other problem was a killer. If a supervisor miscounted or forgot to update his shift count, the next supervisor would produce the exact same product a second time. This wasted raw material on the wrong finished good, lost time on the slitter, and left nothing to put on the truck and ship to the customer. Sometimes it took a week to recover from a FUBAR like this.

Lack of discipline and inaccurate counting continuously threatened the future of this hundred year old concern. So what did their Pig4 accountants say? They blamed their software! The only solution? Millions of dollars for a new ERP system that would take them three years to implement and address none of their problems.

These Pig4 consultants were the worst kind of shorebosses. They were "above" going into the water. But they wouldn't have had to dive very deeply for the prize: understanding the business. It took a nerdy programmer like me less than a week to figure it out just by walking around, watching, and asking questions. It was so easy, I didn't even need scuba gear.

Insidious Bug or Comedy of Errors?
Get below the surface and find out.

MISSION STATEMENT
If you're reading this, you're the first person who ever did.
ORG CHART
I'm the boss. You're not.
OPEN DOOR POLICY
lol lol lol lol
neroFiddler
Why are orders going out without prices?
helenWaite
Maybe we forgot to run the Price Load job.
scrumBell
Or the users aren't keeping the Price file updated.
Days Since Last Bug: 0
dunningKruger
I thought we were having a clearance sale.

At Mediocre Products Inc., a user gave me a pdf of a Purchase Order that had been emailed to a Vendor. The problem? No prices. Yikes. Huge problem! Mediocre emails thousands of Purchase Orders to Vendors every day, full of data supporting critical legal and mission critical transactions. The fundamental data elements are Part Number, Quantity, and Price. How could the price be missing? And how could it only be missing from one (or a few) out of thousands of Purchase Orders?

Fortunately, I was able to reproduce the problem on the first try. I reprinted the Purchase Order and sure enough, no prices. I reprinted several others and there were prices.

The next step was to isolate the problem, debugging backwards. Output Record? Blank price. Variable feeding Output Record? Null value. Price on Purchase Order data base record? Fine. Hmmm. Next I examined the logic pulling the data from the data base and placing it in the output variable. It was looking for the Price in Column 22, the column for Foreign Currency Price. On an order to a California Vendor? OK, I was onto something.

I zeroed in on these two lines in the print program:

CurrencyCode = PORec[45]

if CurrencyCode = "USD" then PriceCol = 21 else PriceCol = 22

What was in Column 45 of this PO Record for this California Vendor? "USD" and a bunch of control characters. Hmmm. That would cause PriceCol to be 22 when we obviously want it to be 21. The Price was in Column 21 but we are pulling a null out of Column 22. Bingo.
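
The failure is easy to reproduce in miniature. The Python sketch below mirrors the two lines from the print program. The dict standing in for the PO Record, the sample price, and the specific control character (\xfd) are assumptions for illustration, since the story doesn't say which characters were present.

```python
# Hypothetical stand-in for the PO Record: a dict keyed by column number.
# Column 21 holds the domestic price, column 22 the (empty) foreign price.
# "\xfd" is an assumed control character.
po_rec = {21: "12.50", 22: "", 45: "USD\xfd\xfd"}


def price_col(currency_code: str) -> int:
    """Mirror of the print program's test: exact match on "USD"."""
    return 21 if currency_code == "USD" else 22


col = price_col(po_rec[45])
print(col, repr(po_rec[col]))  # 22 ''
```

An exact string comparison has no idea that "USD" followed by invisible junk was supposed to mean "USD", so the domestic order falls through to the foreign price column.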

The customers are screaming. The business is suffering. Now what?

Stupid way out: Get the Currency Code from the Vendor record, not the PO Record.

Lazy way out: Strip the control characters from PORec[45].

Right way out: Find out what's putting control characters into Column 45 of the PO Record.

The right way out can be very difficult with a large code base. First I isolated the 614 programs that had been promoted into production in the last 90 days. (I figured that the problem was new so the culprit program must be fresh.) I searched for the string "45". 42 hits. Nothing suspicious. Next I looked at data dictionaries and canned functions that provided potential synonyms for Column 45 of the PO Record. I found four possibilities. Then I searched the 614 programs for each of these. Nothing. Hmmm. Standards that no one follows. OK.

Then I simply scoured the list of 614 programs. One name caught my eye: "PoSplitter". Brand new. Written by a contractor who didn't know the whole application. Promoted 3 weeks ago. I read the whole program. No reference to "45", "Foreign Currency", or anything seemingly related. But one variable looked suspicious: DatasetCols. What was this? A list of columns in the PO Record that had matching multiple values, one for each Part on the PO. DatasetCols was a global variable passed down by a master routine. I read that routine and (bingo!) found 45 in the list of DatasetCols. I traced the mods back to 2005 when it was added to the list.

I double-checked the data dictionaries and the common functions. All said that Column 45 of the PO Record must be a single Foreign Currency Code defaulted from the Vendor Record and joined to a preset table. On the other hand, the master PO routine had it in a dataset list. A dataset list that had never been referenced by any other program until that contractor used it in PoSplitter. So, as soon as his program went into production, for every Purchase Order that was "split", Column 45 kept its original Foreign Currency Code along with a control character as a delimiter for each Part on the PO. Which in turn caused the PO Print program to fail to match "USD" and automatically default to Foreign Currency (note that this bug would never affect foreign orders).
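
To make the mechanism concrete, here is a hedged Python sketch of how a splitter that trusts a column list like DatasetCols could damage a single-valued column. Only the names DatasetCols and Column 45 come from the story. The split_po function, the \xfd delimiter, the part-number column 10, and the padding behavior are all hypothetical, not the contractor's actual code.

```python
VM = "\xfd"  # assumed delimiter between per-part values (hypothetical)


def split_po(record: dict, dataset_cols: list, keep: list) -> dict:
    """Build a new PO that keeps only the part positions listed in `keep`.

    Every column named in dataset_cols is treated as one value per part.
    """
    new_record = dict(record)
    for col in dataset_cols:
        values = record[col].split(VM)
        # Pad with empty values so each kept part has a slot.
        picked = [values[i] if i < len(values) else "" for i in keep]
        new_record[col] = VM.join(picked)
    return new_record


po = {10: VM.join(["P1", "P2", "P3"]), 45: "USD"}

# Column 45 wrongly listed as a dataset column: it picks up a delimiter.
bad = split_po(po, dataset_cols=[10, 45], keep=[0, 1])
print(repr(bad[45]))  # 'USD\xfd'

# Column 45 removed from the list: it passes through untouched.
good = split_po(po, dataset_cols=[10], keep=[0, 1])
print(repr(good[45]))  # 'USD'
```

The splitter does exactly what the list tells it to do. The bug lives in the list, which is why reading PoSplitter alone turned up nothing.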

The immediate solution:

1. Remove Column 45 from the variable "DatasetCols" in the master routine. Recompile all affected programs.
2. Clean up the data base.

We programmers are used to deep diving, looking for the prize that would explain how the software works and why we have this bug. That's what we do. Too bad more of us don't do enough of that before deploying to production.

Read the whole story, including the long term solution:

Insidious Bug or Comedy of Errors?


Why is everything so sloooowww???
Get below the surface and find out.
(Starting to notice a pattern?)

neroFiddler
Why is everything running so slow?
wandaWant
We must have lots of orders to re-process.
scrumBell
It always gets like this at month end.
Days Since Last Bug: 0
dunningKruger
Maybe because it's so cold outside.

RB Insurance Services had no I.T. department and just a standard software package with support from their vendor. It had been running slower and slower over time until they brought me in to find out why. I asked a dozen people from both RB and their software vendor for clues, but nothing helped.

Then I examined the database structure and saw that they ran a hashed location system. For each "row" to be stored, a "hash" was calculated on the key to determine which "frame" it should be inserted into. The number of frames per table was originally set by the vendor upon installation four years earlier. If the calculated frame was full, then the row would be stored at the end of the Overflow area.

When I ran the "File Stats Report", I discovered that the business had grown so much that every frame size was grossly undersized. Over 99% of the database was in Overflow!

FILE STATS REPORT

File Name         # Recs    Status
INVEN_DTL      1,233,956    OVERFLOW
INVEN_HDR        243,956    OVERFLOW
INVEN_LOG      5,968,203    OVERFLOW
INVEN_MSTR       294,158    OVERFLOW
INVOICE_ARCH  84,392,654    OVERFLOW
INVOICE_DTL    7,123,465    OVERFLOW
INVOICE_HDR    1,238,314    OVERFLOW
INVOICE_LOG   14,293,421    OVERFLOW
INVOICE_MSTR     823,570    OVERFLOW
INVOICE_XREF      94,450    OVERFLOW

Page 43 of 127

So I wrote a quick program to count every record in the database and calculate a new frame size for each table, and added another 50%. Then I rebuilt the database into these new frames. Both run times and overflow decreased by 99%.
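
The arithmetic behind this is worth seeing once. The Python sketch below simulates a hashed file with fixed-size frames and an overflow area, then resizes it to the record count plus 50%. The rows-per-frame capacity, the CRC32 hash, and the key format are assumptions; the vendor's real hashing scheme isn't described here.

```python
import math
import zlib

ROWS_PER_FRAME = 4  # assumed frame capacity; the real value is not given


def overflow_fraction(keys, frame_count: int) -> float:
    """Share of rows pushed to overflow because their hashed frame is full."""
    filled = [0] * frame_count
    overflow = 0
    for key in keys:
        frame = zlib.crc32(key.encode()) % frame_count  # stand-in hash
        if filled[frame] < ROWS_PER_FRAME:
            filled[frame] += 1
        else:
            overflow += 1
    return overflow / len(keys)


def resized_frame_count(record_count: int, headroom: float = 0.5) -> int:
    """Frames needed to hold the current record count plus headroom."""
    return math.ceil(record_count * (1 + headroom) / ROWS_PER_FRAME)


keys = [f"INV{n:07d}" for n in range(100_000)]  # made-up record keys
before = overflow_fraction(keys, frame_count=100)
after = overflow_fraction(keys, resized_frame_count(len(keys)))
print(f"overflow before: {before:.1%}, after: {after:.1%}")
```

With 100 frames of 4 rows each, only 400 of the 100,000 rows can ever fit, so at least 99.6% must overflow no matter how good the hash is. After the resize there is room for every row with 50% to spare, and most rows should land in the frame their key hashes to.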

This had gone on for four years! Everyone in the company had to sit and wait and wait and wait for their computer. All because no one from the software vendor or customer had bothered to dig one level below the surface for a simple report that would explain everything.

Sometimes the biggest prizes are the simplest and float just beneath the surface. But someone still has to look for them.

Find out what really keeps them up at night.
Get below the surface and find out.
(OK, we get it already!)

wandaWant
Everyone loves this software. And it's pretty.
alGorithm
It must be good if everyone else wants it.
oinksAlot
And we get a special deal with the publisher.
Call-In User 1
No. This won't work. Find me what will!

In-and-Out Cable Assemblies was a Contract Manufacturer. They had one customer who was also their only vendor. They dropped off packs of components, 400 In-and-Out workers assembled them on benches, and the customer picked them up. Because of the complexity of the finished products and the volume of orders, In-and-Out brought me in to find them a manufacturing software package.

Two prior consultants had recommended standard off-the-shelf job shop packages, but Bill Grant, the President, refused. I was Consultant #3.

I didn't understand Bill's problem with the first two engagements, so I went out with him after work for a beer (or two). We hit it off because we had a lot in common. We came from the same home town, knew some of the same people, and thought somewhat alike. We enjoyed our talks so much, Happy Hour became a regular thing.

It was the third Happy Hour when I finally figured it out. Bill was petrified that his business was dependent upon one customer who could leave him at any time. He had considered expanding contract manufacturing to other customers, but knew deep down that his future had to be in leveraging In-and-Out's custom assembly expertise into building inventory to sell to multiple markets. He also wanted something none of his competitors had, for an advantage.

Job Shop software would not support this. I only knew one software vendor who had everything we needed. Escom had both a job shop system and an ERP system that ran from the same database and had open hooks for the customer to add custom software. They also had a local Value-Added Distributor to serve In-and-Out.

I arranged a demo with In-and-Out's own products in both job shop and MRP modes. Bill fell in love with it and claimed, "This is exactly what I had in mind!" He bought it and we started work immediately.

Sometimes we have to deep dive for a prize inside another person's head (or heart).

And sometimes we have to do that during Happy Hour. Who said deep diving for the prize shouldn't be fun?

Break it Down
Get below the surface and find out.
(Enough already! We have our scuba gear.)

neroFiddler
Why does the overnight batch run 18 hours?
wandaWant
Because we have so much data.
xavier127.0.0.1
Because our server is too small.
Days Since Last Bug: 0
dunningKruger
It's a software package. There's nothing we can do.

Dummyprise, Inc. had implemented the world's largest, most sophisticated ERP system but it didn't work. Why? Because it took over 18 hours to run the overnight batch. So Monday was the only day with up-to-date data.

As usual, I asked everyone what they thought and where to look. No help. So I did the next best thing. I collected data. Every day I tracked the run times from the server logs and made a report like this...

[Chart: Dummyprise, Inc. Overnight Run Times. Run hours (scale 3 to 18) for Mon through Fri, broken down by engineering, inventory, supply demand, and accounting.]


Wow! A picture is worth a thousand minutes. It sure looks like the issue is with the Supply Demand update, doesn't it? So I shared this report with SAPU, our software vendor. They responded, "Oh yes. We have seen long Supply Demand run times like this with other customers. We have a fix."

That weekend, they applied their fix and got overnight run times under eight hours. Problem solved.

I just pulled the server logs and assembled the data to reveal the problem. This took almost no work. Just a little shallow dive below the surface. Once again, the power of Depth.
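
Assembling that kind of report takes very little code. Here is a minimal Python sketch, assuming a log with one line per job in the form "job start end" with ISO timestamps. Both the format and the sample numbers are made up for illustration; they are not the actual Dummyprise logs.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log format: "<job> <start ISO timestamp> <end ISO timestamp>"
SAMPLE_LOG = """\
engineering 2024-01-08T18:00 2024-01-08T19:30
inventory 2024-01-08T19:30 2024-01-08T22:00
supply_demand 2024-01-08T22:00 2024-01-09T10:00
accounting 2024-01-09T10:00 2024-01-09T12:00
"""


def hours_by_job(log_text: str) -> dict:
    """Sum elapsed hours per job name across all log lines."""
    totals = defaultdict(float)
    for line in log_text.splitlines():
        if not line.strip():
            continue
        job, start, end = line.split()
        elapsed = datetime.fromisoformat(end) - datetime.fromisoformat(start)
        totals[job] += elapsed.total_seconds() / 3600
    return dict(totals)


# Print the jobs from slowest to fastest.
report = hours_by_job(SAMPLE_LOG)
for job, hours in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"{job:<14}{hours:5.1f} h")
```

Sorting by elapsed time puts the worst offender on the first line, which is usually all anyone needs to see.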

I wish our shorebosses would stop trying to shift paradigms with key performance indicators to grab the low hanging fruit. If instead they would just take a look through their snorkel masks, imagine how much easier everyone else's lives would be.

"If you want something, Go Get It, PERIOD!"
- The Pursuit of Happyness

Hmmmm... At Papermaker, I was the only one who figured out what made us tick. At Mediocre Products, I was the only one who figured out the Foreign Currency logic. At RB Insurance, I was the only one who thought to run the File Stats Report. At In-and-Out Cable, I was the only one who knew to ask Bill Grant why he didn't like the first two options. And at Dummyprise, I was the only one who figured out which batch job was gumming up the works.

I must be the smartest person in the world!

Or maybe, just maybe, I was the only one who bothered to look more than zero levels below the surface.

This is the essence of Depth. I was too lazy to suffer with a problem for years, so I did a tiny bit of work up front to eliminate a giant bunch of work (and aggravation) later.

You don't have to be smarter than anyone else (although a little technical background helps). You don't need much more training or experience. You don't need to be assigned to the glamour projects. And you certainly don't need permission.

In order to get to the critical heart of the matter, you usually need only two things: To know that the prize is below the surface. And that you have to go get that prize.

That's it. Eddie's Principle of Depth. Grok it, go for it, and get your prize. And don't forget to dry off. It's wet down there.

Everything you need to know about
Eddie's Principle of Depth
is at the other end of this yellow line.

There. That wasn't so hard, was it?

Then why don't you deep dive like that at work?

Study Questions

1. When was the last time you had to deep dive when no one else did? What did you accomplish? How did you feel? What was different after that?

2. Did any of these stories strike a nerve or remind you of something interesting from your past? Did any seem strange or unrealistic? (Yes, they all happened for real.)

3. Who at your work could benefit from this essay? Leave a hard copy on their desk, but not under too much other stuff. They may never find it.

4. What's the biggest reason at your work that people don't do enough deep diving? What's the best way to fix that?

5. Which comic character most reminds you of someone at your work? Will you tell them? How?

Quiz

1. Which is not an Agent of Technical Depth?
 a. the junior programmer
 b. the scrumbell
 c. the shoreboss
 d. the senior dev
 e. the navy seal with a CS degree


2. What's the difference between deep diving and debugging?
 a. Deep diving is before. Debugging is after.
 b. No one notices when you deep dive. Everyone notices when you debug.
 c. Debugging is urgent. Deep diving is important.
 d. Debugging is (usually) structured. Deep diving is unstructured.
 e. all of the above


3. Why don't Pig4 consultants deep dive?
 a. If they did, their customers wouldn't want to do it anyway.
 b. They're better at marketing than actually solving problems.
 c. The fleabosses who hire them don't know any different.
 d. It doesn't matter. Their business is built-in from the auditing division.
 e. Pigs can't swim.


4. Why don't people notice the prize right in front of them?
 a. They're stupid.
 b. They're lazy.
 c. They're too busy.
 d. It's not their job.
 e. They don't want to make more work for themselves.


5. Why is finding the prize in someone else's head so hard?
 a. We'd rather work on code than people.
 b. We mistake people problems for technical problems.
 c. They don't see the benefit of sharing.
 d. Dear Abby already has another job.
 e. All of the above.

You May Also Like

Eddie's Law of Empathy
The best way to understand the other is to have been the other.

How Chem Lab Made Me a Better Developer
(It had nothing to do with chemistry.)

Eddie's Law of Pulse
Deep Awareness is inversely related to the square of the distance from the source.

Insidious Bug or Comedy of Errors?
Just another day in the nest

It Takes 6 Days to Change 1 Line of Code
I love programming. It's the process I can't stand.


edw519 at gmail