Mitigate the risk of temporary data science development hacks

John Weathington

Updated December 13, 2014 at 6:00 AM

I'm in the doghouse again. I'm the one responsible in my household for hooking up anything electronic. Therefore, when the call came to wire up all the various devices connected to our family room TV, I responded with dignity and valor. My approach resembled my agile-esque philosophy on life: first make it work, then make it pretty, and finally make it better.

Within an hour or two, I was done with phase one. Everything was working; however, there were wires everywhere. I assured my concerned wife that I would soon make everything look clean and neat. That was several years ago. My wife has had it, and I don't blame her.

This happens in the world of data science as well. We're all guilty of writing a hack with the best of intentions of fixing it later, and then forgetting all about it until it shows up at the most inauspicious time (like during a client demo). When building data science solutions, make sure your temporary hacks don't become permanent.

Good intentions and bad practices

Temporary code is a necessary evil. Unfortunately, it's part and parcel of an extremely effective best practice. Code for data science can get extremely complex, which leads to confusion and frustration when things aren't going right.

To simplify your approach, isolate errant code, and most importantly protect your sanity, your first order of business must be to get the code working as quickly as possible, by any means necessary. More often than not, that includes one or more temporary hacks. For instance, you might hard-code the value of an incoming parameter so you can troubleshoot what's happening with the other parameter. Of course, you see the setup here. If you accidentally deploy this code into production without removing your hack, any calls to this function will return misleading results.
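To make the hard-coded parameter scenario concrete, here is a minimal sketch in Java. The class name ChurnScorer, its parameters, and the scoring formula are all hypothetical, invented purely for illustration; the point is only how innocent the hack looks once it is buried inside a function.

```java
// Hypothetical scoring class, invented for illustration only.
final class ChurnScorer {

    // Intended behavior: the score should depend on BOTH inputs.
    static double score(double monthlySpend, double supportTickets) {
        // TEMPORARY HACK: pin monthlySpend so that only supportTickets
        // varies while troubleshooting. Easy to forget, and it compiles fine.
        monthlySpend = 100.0;

        // Made-up formula; any real model would differ.
        return monthlySpend + 10.0 * supportTickets;
    }
}
```

With the hack in place, a caller passing a monthly spend of 0 and a caller passing 500 get the same score for the same ticket count, and nothing in the compiler output hints that anything is wrong.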

To make matters worse, it's very difficult to catch these mistakes. To your computer, your hack looks like perfectly working code, so you won't get any complaints at compile-time. Plus, most programmers aren't conspicuous enough about their hacks. Hacks are usually buried somewhere deep in the code with a small, unnoticeable comment -- successfully escaping functional testing and even code reviews.

When profiling the risks of data science development, most analytic managers focus on probability and impact, but overlook one very important third dimension: detectability. Temporary hacks shoot the moon with risk: highly probable, highly impactful, and extremely difficult to detect. Something must be done to mitigate this risk.

Clean up your mess

Dealing with temporary hacks involves a couple of preventive measures. First, you can prevent the appearance of temporary hacks by following good unit testing practices. Instead of writing a hack, just write a good test that doesn't interfere with your production code. So, instead of hard-coding that parameter's value in your production code, call the function from within your unit test's code, with the hard-coded parameter value. Since the code for your unit tests isn't deployed with your production code, you eliminate any risk that these hacks will show up in the wrong place. This takes planning and discipline, so your success with this approach is greatly dependent on your team's overall approach to test-driven development (TDD).
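As a rough sketch of that idea, the hard-coded value can live in the test rather than in the function. The names ChurnScorer and ChurnScorerTest and the formula are hypothetical, and I've used a plain main method instead of a test framework to keep the example self-contained; in a real project this would more likely be a test class in your framework of choice, kept in a test-only source folder.

```java
// Hypothetical production class: no hard-coded values inside.
final class ChurnScorer {
    static double score(double monthlySpend, double supportTickets) {
        // Made-up formula, for illustration only.
        return monthlySpend + 10.0 * supportTickets;
    }
}

// Hypothetical test class: this is where the pinned value belongs.
final class ChurnScorerTest {

    static void scoreRespondsToSupportTickets() {
        double fixedSpend = 100.0; // hard-coded in the test, not in production code

        double low = ChurnScorer.score(fixedSpend, 1.0);
        double high = ChurnScorer.score(fixedSpend, 3.0);

        // Two extra tickets should add exactly 20 under the made-up formula.
        if (high - low != 20.0) {
            throw new AssertionError("expected a difference of 20.0, got " + (high - low));
        }
    }

    public static void main(String[] args) {
        scoreRespondsToSupportTickets();
        System.out.println("all checks passed");
    }
}
```

The troubleshooting goal is the same as with the hack (hold one parameter still and watch the other), but the production function never learns about the pinned value.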

Even if your team hasn't fully embraced the TDD paradigm, there's still hope with another preventive design pattern called a marker interface. A marker is like a comment on steroids. It's a simple but effective design pattern that structurally tags any class that contains a hack. In an object-oriented language like Java, you would create a simple marker interface (i.e., with no method declarations) with a conspicuous name like TEMPORARY_HACK. Then in your production code, just before you hard-code that parameter's value, you would write a small comment to mark your place (which you probably would have done otherwise) and then immediately tell your class to implement the marker interface. Finally, make sure your build process flags temporary hacks (i.e., all classes that implement the TEMPORARY_HACK interface) as errors (not warnings), so those insidious hacks never make it to your production code.
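Here is one way the marker idea might look in Java. Only the TEMPORARY_HACK name comes from the description above; CleanScorer, HackedScorer, and HackGate are hypothetical names, and the gate is a stand-in for whatever your build tool actually does. I'm also assuming the build can hand the gate a list of classes to inspect; a real build step would need its own way of discovering them.

```java
import java.util.ArrayList;
import java.util.List;

// Marker interface: no method declarations, just a loud name.
interface TEMPORARY_HACK {}

// Hypothetical class with no hacks in it.
final class CleanScorer {
    static double score(double monthlySpend, double supportTickets) {
        return monthlySpend + 10.0 * supportTickets;
    }
}

// Hypothetical class carrying a hack, so it implements the marker.
final class HackedScorer implements TEMPORARY_HACK {
    static double score(double monthlySpend, double supportTickets) {
        // TEMPORARY HACK: pin monthlySpend while debugging supportTickets.
        monthlySpend = 100.0;
        return monthlySpend + 10.0 * supportTickets;
    }
}

// Hypothetical build gate; the class list is supplied by the caller.
final class HackGate {

    // Returns the simple names of all classes that implement the marker.
    static List<String> findHacks(Class<?>... candidates) {
        List<String> offenders = new ArrayList<>();
        for (Class<?> candidate : candidates) {
            if (TEMPORARY_HACK.class.isAssignableFrom(candidate)) {
                offenders.add(candidate.getSimpleName());
            }
        }
        return offenders;
    }

    // Treats any marked class as an error, not a warning.
    static void failBuildIfHacked(Class<?>... candidates) {
        List<String> offenders = findHacks(candidates);
        if (!offenders.isEmpty()) {
            throw new IllegalStateException("Temporary hacks present: " + offenders);
        }
    }
}
```

Calling HackGate.failBuildIfHacked(CleanScorer.class, HackedScorer.class) throws, which is the behavior you want from a release build. Because the tag is part of the type rather than a comment, it survives reformatting and is hard to overlook.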

Once your temporary hack is removed, you can safely remove the TEMPORARY_HACK marker. Worst-case, you forget to remove it, and your build process erroneously stops the build. That is much better than erroneously allowing temporary hacks into your production code. Once you understand the concept, you can easily apply this technique to non-object-oriented languages as well.

Summary

It's never a good scenario when temporary code becomes permanent code. Even with the best of intentions, we're all guilty of diving deep into code surgery only to leave sponges and sutures in the production image.

I've shown you a couple of ways to prevent these hacks from ever making their way into the wrong place: coded unit tests and marker interfaces. Both involve good practices and discipline, so review your code but, more importantly, review your habits. Your dog may be your best friend, but it's pretty cold outside in his house.
