Wake Me When It’s Over

One of the more challenging aspects of working at TAF is the requirement to alter one’s working hours on occasion to come in at times when some people may just be heading off to bed. For us, this occurs whenever we do a system release, typically on a Thursday every month or so. On those days, team members show up anywhere from 4:00 to 5:30. The first arrival is to cut over the system and monitor start-up. Others come in to test changes in place. The whole point of this is to do as much as possible to verify system functionality before production begins at 6:30.

If problems are detected, we won’t hesitate to revert to the previous build; the cardinal sin at TAF is doing anything that interferes with production! Of course, if we do back out a release, we have to verify that we’ve put everything back correctly. This is why we start at 4:00 in the morning.

Testing on the floor is an unfortunate necessity. While we do have automated testing (though not nearly enough of it, nor of good enough quality), there’s often no alternative to getting one’s hands dirty. Our systems interact with too great a variety of programmable logic controllers (PLCs) and other manufacturing tools for us to be able to keep a fully equipped lab. To make things even more interesting, the programming of PLCs, which are really computers in their own right, is handled by different groups in the plant, who are free to make changes at their own discretion. Occasionally, these changes affect interactions with our own system. Hilarity ensues…

Coming in at 4:00 teaches some valuable lessons. The most important one is to avoid engaging in activities that require a great deal of brain power. We make the cutover and fail-back processes as simple as possible so that a sleep-deprived chimpanzee could do them on command. We have to be able to back out quickly, so restoring a copy of a database to undo changes isn’t an option. This means that a change may actually be phased in over the course of a couple of releases, or there may be an interim release with only the changes necessary to stage the important new feature.
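To make the idea of a staging release a little more concrete, here is a minimal sketch. It is not our actual system: the `parts` table, the `torque_spec` column and the function names are all invented for illustration, and SQLite stands in for whatever database you happen to use. The point is only that an additive change (a new nullable column) lets the previous build keep running against the staged schema, so failing back never requires a database restore.

```python
import sqlite3


def create_original_schema(conn):
    # Hypothetical table as the previous build expects it.
    conn.execute(
        "CREATE TABLE parts (id INTEGER PRIMARY KEY, serial TEXT NOT NULL)"
    )


def staging_release(conn):
    # Interim release: additive only. The new column is nullable,
    # and the previous build never references it.
    conn.execute("ALTER TABLE parts ADD COLUMN torque_spec REAL")


def old_build_insert(conn, serial):
    # The previous build names its columns explicitly, so the extra
    # column does not affect it.
    conn.execute("INSERT INTO parts (serial) VALUES (?)", (serial,))


def new_build_insert(conn, serial, torque_spec):
    # The later feature release starts using the staged column.
    conn.execute(
        "INSERT INTO parts (serial, torque_spec) VALUES (?, ?)",
        (serial, torque_spec),
    )


def demo():
    conn = sqlite3.connect(":memory:")
    create_original_schema(conn)
    old_build_insert(conn, "A-001")   # before any release
    staging_release(conn)             # interim release stages the schema
    new_build_insert(conn, "A-002", 12.5)
    old_build_insert(conn, "A-003")   # fail back: old build, no restore
    rows = conn.execute(
        "SELECT serial, torque_spec FROM parts ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

Calling `demo()` shows rows written by the old build sitting alongside the row written by the new one, with the old build's rows simply leaving the new column empty. Destructive changes, such as dropping or renaming a column, would wait for a later release once nobody expects to fail back past them.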

I’ve also learned not to do any coding that I actually plan on using. It’s too easy for silly mistakes to creep in, and we don’t have a good enough testing environment to risk it. Instead, I find it’s a good time to run lint and code analysis checks to identify areas for attack when I feel I can trust myself again. Based on the results of these checks, I might engage in “speculative refactoring” just to see how I might change things and what benefits could be gained.

You might be wondering why we do builds on a Thursday. It’s a compromise. We’ve found that the effects of early mornings can linger for several days, so doing it as late in the week as possible minimizes lost productivity. However, physical changes in the plant generally occur on weekends, which means that the systems really should be known to be stable before those changes take place. This argues for builds being done as early as possible in the week, but not Mondays. Mondays tend to get quite interesting as the operators discover equipment changes or PLCs that have been left powered off. This leaves Tuesday, Wednesday and Thursday, and Thursday is the latest of the three.

And now you know why you didn’t hear from me on Thursday and Friday last week!

