pytn-systematic-bias

Slides from my talk at 2014 PyTennessee on Systematic Bias and Software Development

On Github briandailey / pytn-systematic-bias

Systematic Bias and the Software Developer

2014 PyTennessee

Brian Dailey / @byeliad

Who Am I

  • Python developer
  • CTO & co-founder at Stratasan
  • Freelancer
Quickly get this out of the way.
This talk is mainly going over material from "Thinking, Fast and Slow" by Daniel Kahneman. I read the book about a year ago and couldn't help but think about the ways that it applied to the software development profession. Kahneman is a professor at Princeton and, together with Amos Tversky, did pioneering work on how we make decisions. The book summarizes much of the research over his career, along with other research in decision making.
"Before Kahneman and Tversky, people who thought about social problems and human behavior tended to assume that we are mostly rational agents." - David Brooks, NYT It's hard to summarize the effect Kahneman and Tversky have had on cognitive research. Kaheman - Nobel Prize in Economics in 2002
Software development is a lot of probabilistic decision making:

  • Is this project technically feasible at this cost?
  • Can we develop X in this timeline?
  • Will this library make it easier for us to do our job, or just add complications?
  • Will this person be a good fit for our team?
"Some predictive judgments, such as those made by engineers, rely largely on lookup tables, precise calculations, and explicit analyses of outcomes observed on similar occasions." Alas, this sounds nothing like software development as we know it.
"Some intuitions draw primarily on skill and expertise acquired by repeated experience." This sounds more like the software dev universe as we know it. We are often called to make forecasts. And often we rely on our intuition tuned by years of experience.

Systematic vs. Random Bias

Random bias we can do nothing about. Systematic bias we can make efforts to minimize and control for. Kahneman's underlying approach is to categorize the brain's activities into two systems: System 1 and System 2.

System 2

Let's talk about Kahneman's model for decision making. This is the conscious decision-making system. It is what most people will self-identify with as "you." It is your conscious thought: slow, deliberate, and arduous.

System 1

Fast, associative, automatic, and supple. It drives or confirms many of the decisions made by System 2. According to Kahneman, this is where the interesting things happen. I won't spend a ton of time detailing the differences between the two systems and how Kahneman arrived at this conclusion. Read the book. However, it's important to know that the decisions we make are often driven by a subconscious engine. This often causes us to make judgment errors based on weak evidence. Let's talk about some systematic biases. [5:00]

Mental Substitution

You like or dislike people long before you know much about them; you trust or distrust strangers without knowing why; you feel that an enterprise is bound to succeed without analyzing it. — Kahneman

Succinctly put, we often choose to skip the hard question and substitute an easier one.

Will this project succeed or fail?

Let's start with a sample question.

Hard

  • Who are the competitors?
  • What is the market size?
  • Is capital available to build & grow?
  • How much would it cost to maintain?
  • What can we charge for it?

Easy

  • Do I like the management?
  • Do I find the business idea appealing?
  • Have I heard about similar businesses that succeeded or failed?

Intensity Matching

If a question has an easily defined scale, we can borrow the scale from a similar question.

How happy are you these days?

How many dates did you have this month?

When asked these questions, college students showed no correlation between the two.

How many dates did you have this month?

How happy are you these days?

But when they were reversed, the correlation became quite strong. Students used the scale of the first question to answer the second one. [8:00]

Anchoring

This one has entered the popular vernacular because it's so fascinating (Gladwell mentions it).

Anchoring occurs when you "consider a particular value for an unknown quantity before estimating that quantity."

"How long do you think think this feature will take? 2 weeks?"
"How long do you think think this feature will take? 6 months?"
If you know it's complicated, you may short yourself by anchoring on the short timeline. If you know it's a trivial change, you may pad it given the second, longer timeline.

Anchoring

2 Factors

Adjustment is an effortful operation.

This gets into the idea of ego depletion. In research, subjects who had recently engaged in a cognitively demanding task were more influenced by anchors. Guess who engages in concentrated work most of the day?

Priming

The other factor in play is priming. Researchers found that if they primed subjects with a temperature (32F vs 70F) and flashed words quickly, words related to that temperature (winter, ice, frigid) were more easily recalled. The same thing worked with a price and recalling car brands. (p. 123)

Anchoring is one of the few systematic biases that can be quantified.

A 30-55% influence is most common. Examples of this in action: real estate agents swayed by a listing price (a 41% effect, vs. 48% for amateurs), eBay's "Buy It Now" button, artificial rationing at the grocery store.
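
As a rough illustration of how such a percentage can be computed, here is a minimal sketch of the anchoring index (the spread in mean estimates divided by the spread in anchors). The feature-estimate numbers are hypothetical, not from the studies cited above.

    # Minimal sketch of the anchoring index used to quantify the effect.
    # The estimate numbers below are hypothetical, not from the studies cited above.

    def anchoring_index(low_anchor, high_anchor, mean_low_estimate, mean_high_estimate):
        """0% means the anchors had no pull; 100% means estimates simply repeated the anchors."""
        return (mean_high_estimate - mean_low_estimate) / (high_anchor - low_anchor) * 100

    # One group hears "2 weeks?", another hears "26 weeks?" before estimating the feature.
    print(anchoring_index(low_anchor=2, high_anchor=26,
                          mean_low_estimate=4, mean_high_estimate=14))
    # -> ~41.7, squarely in the 30-55% range mentioned above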

It even works when the anchor is completely preposterous.

"Was Gandhi older or younger than 144 when he died?"

Anchoring is a well-known tactic among negotiators.

First-movers will often throw out a number to anchor the discussion.

How does one adjust for anchors?

In the example of the time estimate, some teams use planning poker. If the number is crazy, there are two techniques: storm out, or spend time thinking of all the ways the estimate should be at the other extreme. For example: how could this go wrong and take longer than 6 months?
"Many people find priming results upsetting [...] because they threaten the subjective sense of agency and autonomy." All the more reason to be aware of the effects! [15:00]

Availability Bias

AKA "WYSIATI"

When we judge the frequency of an event by the ease with which similar instances come to mind.

XUZONLCJM TAPCERHOB

How many instances do you need to recall? It doesn't have to be any at all. Kahneman used the test above: asked which of these letter sets can be used to construct words more easily, you almost immediately know that the second has more potential.
"You wish to estimate the size of a category of the frequency of an event, but you report an impression of the ease with which instances come to mind."

"How often do we see bugs in production code?"

We estimate the frequency of bugs by how easily instances come to mind. If we've had three bugs in the past week, you will estimate higher, even if the past month has had relatively few bugs overall.

Intensity Matching

We already hit on this briefly with mental substitution. We'll estimate higher if the bugs were in a sales demo rather than just affecting internal staff. Events in the news are easily recalled, and personal experiences more so than other people's stories or statistics.

An Interesting Flip

People are less confident in a choice when they are asked to produce more arguments to support it. "If I am having so much more trouble than I expected coming up with instances of my assertiveness, then I can't be very assertive." A professor used this to his advantage.
"Because we saw two major failures in Amazon Web Services in the past year, we believe Rackspace would be a better solution for hosting." Those failures were high profile outages widely covered in tech media. Would Rackspace outages gather that much attention?
"How many users have requested this feature?" Recency? Which users? [19:30]

Representativeness

This one is a mouthful, so you'll have to forgive me if I stumble on it multiple times.

Tom W

Tom W is of high intelligence, although lacking in true creativity. He has a need for order and clarity, and for neat and tidy systems in which every detail finds its appropriate place. His writing is rather dull and mechanical, occasionally enlivened by somewhat corny puns and flashes of imagination of the sci-fi type. He has a strong drive for competence. He seems to have little feel and little sympathy for other people, and does not enjoy interacting with others. Self-centered, he nonetheless has a deep moral sense.

I've committed the sin of a wall of text. Forgive me. Instead of reading it aloud, I'll let you read it yourselves. Key takeaways: this sounds like an orderly fellow, although perhaps a little socially awkward.

Tom W is a graduate student. Rank the following fields of specialization in the order of the likelihood that Tom W is a student in the field.

  • business administration
  • computer science
  • engineering
  • humanities and education
  • law
  • medicine
  • library science
  • physical and life sciences
  • social science and social work
The question requires you to construct a stereotype of grad students in the different fields. In the 1970s this experiment yielded computer science as the top answer, then engineering, business, and life sciences.

Base Rates

When evidence is weak, stick with the base rates. The "rules" for this are provided by Bayesian statistics: anchor on the base rates, then adjust based on the diagnosticity of the evidence. And question that diagnosticity!
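
To make "anchor on the base rate, adjust for diagnosticity" concrete, here is a minimal Bayes' rule sketch. The base rate and likelihoods are invented for illustration, not figures from the Tom W study.

    # Minimal sketch of anchoring on a base rate and adjusting by the
    # diagnosticity of the evidence via Bayes' rule. All numbers are invented.

    def posterior(base_rate, p_evidence_if_true, p_evidence_if_false):
        """P(hypothesis | evidence)."""
        hit = base_rate * p_evidence_if_true
        miss = (1 - base_rate) * p_evidence_if_false
        return hit / (hit + miss)

    # Suppose (hypothetically) only 3% of grad students are in computer science,
    # and the Tom W sketch is twice as likely to describe a CS student as anyone else.
    print(posterior(base_rate=0.03, p_evidence_if_true=0.6, p_evidence_if_false=0.3))
    # -> ~0.058: weakly diagnostic evidence barely moves a small base rate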

A Thought Experiment

Someone at a Python conference offers to sprint with you on your open-source project.

  • MacBook Pro with some GitHub stickers, running Emacs, snappy fedora, 20s male.
  • Dell Inspiron, Windows and PyCharm, mid-40s female.

It's easy for us to dismiss people based on these stereotypes. Don't let the representativeness bias draw your conclusions prematurely. [22:30]

Intuition vs. Formula

Paul Meehl compared expert judgment against simple statistical rules for predicting the grades of freshmen at the end of the year, violations of parole, success in pilot training, etc.
"...60% of studies show significantly better accuracy for the algorithms." Many others have conducted similar studies and they all say the same thing. This is particularly true in "domains that entail a significant degree of uncertainty and unpredictability"! Economics, football games, future prices of wine...
Wines are affected by weather, and predicting the price of a vintage has real financial value. Orley Ashenfelter did it with three factors: average temperature over the growing season, amount of rain at harvest time, and total rainfall in the previous winter. The correlation of his predictions to actual prices is above 0.9.
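
Here is a sketch of fitting a three-factor linear model in the spirit of Ashenfelter's formula with ordinary least squares. The vintages, prices, and resulting coefficients are fabricated, not his actual data.

    # Sketch of a three-factor linear model in the spirit of Ashenfelter's wine formula.
    # The vintage data are fabricated; his real dataset and coefficients differ.
    import numpy as np

    # Columns: growing-season avg temp (C), harvest rain (mm), previous-winter rain (mm)
    factors = np.array([
        [17.1,  60, 600],
        [16.0, 160, 450],
        [17.6,  80, 820],
        [16.4, 120, 520],
        [17.3,  50, 700],
    ])
    log_prices = np.array([4.2, 3.1, 4.8, 3.5, 4.5])  # fabricated log relative prices

    X = np.column_stack([np.ones(len(factors)), factors])   # add an intercept column
    coefs, *_ = np.linalg.lstsq(X, log_prices, rcond=None)  # ordinary least squares
    predictions = X @ coefs
    print(np.corrcoef(predictions, log_prices)[0, 1])       # in-sample correlation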

So why are the experts so bad?

Meehl suspects experts try to get clever. Complexity reduces validity. Experts also feel they can override a formula, the "broken-leg" rule (the rare case where you really do know something the formula doesn't). We are too easily influenced by System 1!

"The Robust Beauty of Improper Linear Models in Decision Making" - Robyn Dawson

Simpler algorithms can often be 'good enough.' Think of the Apgar score: standardized, and it has saved lives.
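
Dawes's point is that you often don't even need fitted coefficients: standardize each predictor and add them up with equal weights. A minimal sketch, using hypothetical hiring predictors:

    # Minimal sketch of an "improper" (unit-weight) linear model: standardize each
    # predictor and sum with equal weights instead of fitted coefficients.
    # The candidate numbers are hypothetical.
    import statistics

    def unit_weight_scores(rows):
        """rows: list of dicts of predictor -> raw value; returns equal-weight scores."""
        names = rows[0].keys()
        norms = {n: (statistics.mean(r[n] for r in rows),
                     statistics.pstdev(r[n] for r in rows)) for n in names}
        return [sum((row[n] - norms[n][0]) / norms[n][1] for n in names) for row in rows]

    candidates = [
        {"work_sample": 82, "interview": 7, "experience": 4},
        {"work_sample": 65, "interview": 9, "experience": 12},
        {"work_sample": 90, "interview": 6, "experience": 2},
    ]
    print(unit_weight_scores(candidates))  # rank candidates by the summed z-scores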

Long feedback loops make forecasting difficult.

It's easy to be under the impression that you are great at making forecasts based on short-term feedback loops. "I knew that this feature was going to fail." "I know our users would love this." Long term is harder.

The Aversion to Machines

We have a deep-seated mistrust of these algorithms. We see them as uncaring, sterile, and inflexible. Is it possible this is costing lives?

Intuition is still relevant.

The "RPD" (recognition primed decision) arises in domains where skill and experience come in. System 1 forms a tentative plan via associative memory (basically pattern recognition) and system 2 engages in a mental simulation. Chess players do this. (p. 237) It's relevant, just less so than we usually think. Confidence is NOT a reliable guide.

How predictive skill is built

  • A regular, predictable environment.
  • Practice over time.
[33:00]

Optimism: The Engine of Capitalism

I've saved the best for last. This is, says Kahneman, the most significant of the cognitive biases. Optimism is, to some degree, good: optimists live longer and say they are happier.
What I mean by optimism, really, is confidence. Optimistic entrepreneurs often believe they are being prudent, even when they are not.
And, really, by confidence, I mean overconfidence.
You believe that you are superior to most others in most desirable traits, and you're willing to put money on it. This has been demonstrated by a stack of research. Think of the corner building that has had 10 restaurants come and go; each new owner believes they can do better.

Confidence causes you to neglect the skills of others.

I can write that library better than this! I can build a better product! How hard can it be?

Confidence causes you to neglect the role of luck.

Confidence causes you to neglect base rates.

35% of businesses survive more than 5 years. 81% of entrepreneurs said their chances were 70% or higher; 33% said they could not fail. Restaurants in East Nashville.

Confidence causes us to focus on what we know, ignoring what we don't.

Back to WYSIATI.
"People tend to be optimistic about any activity in which they do moderately well." Many of us have seen this with any kind of skill: martial arts, beginner developers. They are blind to their own blindness.

Confidence builds credibility.

This one is scary to me. People listen to those who make forecasts with confidence. Be bold publicly, but privately doubt yourself (Don Moore).
However, if you project confidence and you're wrong, you may lose credibility.
"Even if [CFOs] knew how little they know, they would be penalized for admitting it."

3 Kinds of Overconfidence

  • Overestimation
  • Overplacement
  • Overprecision
Research from Don Moore (an expert on managerial decision making). Overestimation: people overestimate the work they will get done, overestimate their speed, and act as if they have more control over circumstances than they do. Overplacement: overestimating your own skills relative to others when making a prediction. Overprecision: claiming more precision than you actually have (accurate but not precise, versus precise but not accurate).

The Pre-Mortem

Kahneman argues this bias cannot be completely vanquished. However, assume the outcome was a disaster and write a history of that disaster. This overcomes groupthink, unleashes imagination, and legitimizes doubts.

Wrap-up

"Without data, you are just another person with an opinion.

Hedge Your Bets

How important is precision vs accuracy?

Are we forecasting with a 90% confidence interval? Or much less?
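
One way to hedge is to actually check your calibration: record the 90% intervals you give and see how often the real outcome lands inside them. A minimal sketch, with hypothetical estimate/actual pairs:

    # Sketch of a calibration check: how often do actuals fall inside the
    # intervals we claimed were 90% confident? The pairs below are hypothetical.

    # Each tuple: (low estimate, high estimate, actual), e.g. task duration in days.
    forecasts = [
        (2, 5, 7),
        (1, 3, 2),
        (4, 10, 9),
        (3, 6, 12),
        (5, 8, 6),
    ]

    hits = sum(low <= actual <= high for low, high, actual in forecasts)
    coverage = hits / len(forecasts)
    print(f"Stated 90% intervals captured the actual value {coverage:.0%} of the time")
    # Coverage well below 90% means the intervals are overprecise.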

You, as a professional, are responsible for being aware of the minefield that is your mind.

As knowledge workers, it's vitally important that we have a construct in which we can discuss these issues and talk about how they influence our decisions. As partners with the business, we are responsible for calling out its blind spots.

THE END

Brian Dailey / dailytechnology.net

@byeliad

Photo Credits