Say that ten times, as fast as you can, regardless of where you are standing right now.  Not only will it be tough, someone will call security to have you removed.  Imagine the story you’ll have to share afterwards.  That one was on me.  You’re welcome.

Okay.  So, there’s actually a point to the Dr. Seuss-ish headline.

Back in college, I came within 5 millimeters of switching my major from CS/IS to Math/Statistics.  That’s the distance between the checkboxes on the old paper forms we used for enrollments.  It was probably for the best that I didn’t, but I was deeply in love with statistics and data modeling.  For me, it felt like the scene where Neo stares at the falling numbers and tilts his head, signaling he finally saw the patterns that represented the virtual world around him.  Okay, not quite that dramatic, and I can’t pass for Keanu on my best day.

If you read the “causation” page on Wikipedia, you’ll get a taste for what pulled me closer to statistical analysis.

However, two very important aspects of statistics that every human should learn and apply every single day are:

  1. Margin of error
  2. Correlation does not imply causation

Why?

Because you’re being brainwashed every minute of every day by the media, the news and marketing.  If you understand these two concepts, you’ll almost immediately see what I’m talking about.

Case 1

News reports that a major survey was completed and it shows 52% of participants are <fill in the blank>, while 48% are not <fill in the blank>.

At the bottom of the screen, in the tiniest print, it shows “Margin” or “Margin of Error” or “MOE”, etc. as 8%.  That basically means the results could be “off” by 8 percentage points in either direction.  So the 52% could really be anywhere from 44% to 60%, and the 48% anywhere from 40% to 56%; the two ranges overlap almost entirely, which makes the 4-point gap a statistical tie.  That alone isn’t enough to invalidate the results, but without the standard deviation, and more detail about the sampling methods, it boils down to an arbitrary story.
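If you want to poke at the arithmetic yourself, here is a rough sketch in Python.  It assumes a simple random sample and a 95% confidence level (z = 1.96), neither of which the news report in this example actually tells us, and the sample size of 150 is a made-up number chosen because it produces roughly an 8-point margin.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p from a simple
    random sample of size n. z=1.96 assumes 95% confidence."""
    return z * math.sqrt(p * (1 - p) / n)

def interval(p, moe):
    """Range the true value could plausibly fall in."""
    return (p - moe, p + moe)

def overlaps(a, b):
    """True when two (low, high) ranges share any ground."""
    return a[0] <= b[1] and b[0] <= a[1]

# The survey from Case 1: 52% vs 48%, margin of error 8 points.
yes_range = interval(0.52, 0.08)
no_range = interval(0.48, 0.08)
print("ranges overlap:", overlaps(yes_range, no_range))

# Hypothetical sample size: about 150 people gives roughly an 8-point margin.
print("margin for n=150:", round(margin_of_error(0.5, 150), 3))
```

Notice how small the sample has to be to get a margin that wide.  Quadrupling the sample size only cuts the margin in half, which is one reason tight margins are expensive.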

Even more common these days are statistics without any mention of the margin of error.  So we have no idea how accurate the numbers are.  (NOTE: Some of you have raised the 800 lb gorilla in the room, which is that we don’t even know if a study was really performed.  The entire claim could be false.  Which leads into my “dolphin safe” mention, further down.)

Case 2

More red cars get pulled over by police for speeding than cars of other colors.

Questions that have to be validated before this claim can be believed:

  1. Do more aggressive lead-foot drivers prefer red cars?
  2. Do more red cars exist in the areas where the sampling was performed?  Is it tougher to find cars of other colors in those areas?
  3. Do police write more tickets for red cars, but issue warnings for others instead?
  4. Is the proportion of red cars ticketed higher than that of red trucks, motorcycles, carts, buses and so on?
  5. Are other demographics involved?  Such as age, sex, gender, nationality, personality type, economic status.  Would a deeper study reveal it’s really about boys between 18 and 22 who prefer faster, red cars and happen to be on the road more often?

Item #1 is the most-often cited example.  However, item #4 is valid as well.  What if more red cars are ticketed, but green trucks are ticketed even more than red cars?  Does that still mean “red” is the key differentiator, or is it part of an aggregate condition?
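To make the green truck question concrete, here is a small Python sketch.  Every number in it is invented purely for illustration; nothing here comes from a real traffic study.  It compares raw ticket counts against ticket rates (tickets divided by how many of that vehicle are on the road).

```python
# Hypothetical counts, made up for illustration only.
vehicles = {
    ("red", "car"):     {"on_road": 20000, "tickets": 400},
    ("white", "car"):   {"on_road": 30000, "tickets": 300},
    ("green", "truck"): {"on_road": 2000,  "tickets": 100},
}

def ticket_rate(row):
    """Tickets per vehicle on the road."""
    return row["tickets"] / row["on_road"]

# Who gets the most tickets in raw numbers?
most_tickets = max(vehicles, key=lambda k: vehicles[k]["tickets"])

# Who gets ticketed most often relative to how many are out there?
highest_rate = max(vehicles, key=lambda k: ticket_rate(vehicles[k]))

print("most tickets:", most_tickets)
print("highest rate:", highest_rate)
for kind, row in vehicles.items():
    print(kind, f"{ticket_rate(row):.1%}")
```

With these made-up numbers, red cars collect the most tickets simply because there are a lot of them, while green trucks are ticketed at a higher rate.  The headline “more red cars get pulled over” would be true and misleading at the same time.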

You can obviously disprove or exclude any or all of these possible criteria from the causal analysis.  But the bigger question then is: did anyone do that?
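Here is roughly what “doing that” might look like for item #5, sketched in Python.  Again, the numbers are hypothetical and rigged on purpose: within each age group, red cars and other cars are ticketed at exactly the same rate, but young drivers both prefer red and get ticketed more.

```python
# (age_group, color) -> (drivers, tickets); hypothetical numbers.
data = {
    ("18-22", "red"):   (1000, 100),
    ("18-22", "other"): (1000, 100),
    ("23+",   "red"):   (200,  4),
    ("23+",   "other"): (3000, 60),
}

def rate(color, age_group=None):
    """Ticket rate for a color, optionally limited to one age group."""
    drivers = tickets = 0
    for (group, c), (d, t) in data.items():
        if c == color and (age_group is None or group == age_group):
            drivers += d
            tickets += t
    return tickets / drivers

# Lumped together, red looks more than twice as "dangerous".
print("overall:", f"red {rate('red'):.1%}", f"other {rate('other'):.1%}")

# Split by age, the color difference disappears.
for group in ("18-22", "23+"):
    print(group, f"red {rate('red', group):.1%}", f"other {rate('other', group):.1%}")
```

In this rigged example the color of the car explains nothing once age is accounted for.  Real data is rarely that tidy, but a study that never splits the numbers this way can’t tell the difference.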

The point I’m trying to make is to be careful whenever you see or hear claims that something leads to something else.

I often joke about this dilemma as being the ‘dolphin safe’ scenario.  Back in the 1980s, there was a huge campaign to force tuna and salmon product vendors to guarantee their suppliers were adopting methods to prevent the accidental netting and killing of dolphins.  In short order, the vendors (no names, but you know who they are) began stamping “Dolphin safe!” labels on all their fish products.

The public quieted down and moved on to the next social outrage du jour.  But without having a big enough boat to venture out hundreds of miles into fishing areas, there was little the average person could do to challenge the validity of those labels.  I can recall many times when I’d jokingly ask someone eating tuna, “How do you know?”  And their response would often be, “Because it says it on the can!  They wouldn’t lie about that.”

Sure they wouldn’t.

One of my teachers once said something like “Statistics are being twisted into becoming the parsley on the plate of dogma.”  Just think about reading the labels before you eat it.
