In the previous post we introduced the concept of treating data collection and management as a manufacturing process. We touched on the notion that all data is potentially important, and should be treated as such. Finally, we posited that the first requisite of successful data governance is an attitude that focuses on attention to detail. This is all well and good, but face it, a positive attitude and five bucks won’t buy you a non-fat, no-whip, double-shot, mocha macchiato. What you need is some concrete help. That’s what we’ll discuss here.

Previously we introduced two schools of process improvement: Six Sigma and Lean Manufacturing. We’ll add a third here, namely Theory of Constraints (TOC). With apologies for gross oversimplification, the three schools can be summarized as follows:

• Six Sigma focuses on process improvement through defect reduction and process uniformity

• Lean Manufacturing focuses on process improvement through elimination of waste

• TOC focuses on process improvement through maximization of throughput (or, more appropriately to our discussion, minimization of cycle time)

A thorough treatment of the above is the study of a lifetime. Some hair-splitting purists insist Lean and TOC are intrinsically at odds, but I’m a lumper, not a splitter. While the three schools differ somewhat in approach and emphasis, all offer important lessons for pipeline data governance. For us, the combination of Lean Six Sigma and TOC is complementary.

In pipeline data governance, our two most damaging wastes are data defects and long cycle times. Our typical as-is state is not unlike that of Lucy and Ethel in the candy factory. Unfortunately, data is not candy, so we can’t just eat it. Most of us have experienced situations where data collected in the field takes months or years to make its way to maps or alignment sheets. Data management practitioners become overwhelmed, data errors creep in, and timely data distribution lags. The poor guy in the field ends up viewing GIS as a data roach motel; data checks in, but it never checks out. Fortunately, three simple process management lessons can help us tackle this gnarly problem.

The first lesson hails from Six Sigma, and was touched on previously. If you don’t establish measurements for your data processes, then you really can’t know very much about them. Two forms of measurement are critical: 1) those that track cycle time, and 2) those that monitor defects. Measures of cycle time define the yardstick for overall process efficiency. For us, the most critical measures of cycle time are those that monitor how long it takes for data captured in the field to make it back out to the field (e.g., in updated maps or alignment sheets). Identification and characterization of defects is of paramount importance, because until you understand the root cause of a particular type of defect, you can’t correct the process that produces it.
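To make the measurement idea concrete, here’s a minimal sketch of a field-to-field cycle time tracker. All feature names, dates, and the defect count are hypothetical; a real implementation would pull these from your data management system.

```python
from datetime import date

# Hypothetical records: when each feature was captured in the field, and when
# it reappeared on a published map or alignment sheet.
records = [
    {"feature": "casing_vent_01", "captured": date(2011, 1, 10), "published": date(2011, 4, 22)},
    {"feature": "valve_17",       "captured": date(2011, 2, 3),  "published": date(2011, 3, 15)},
    {"feature": "casing_vent_02", "captured": date(2011, 2, 20), "published": date(2011, 6, 1)},
]

# Cycle time: days from field capture to publication, averaged over the batch.
cycle_times = [(r["published"] - r["captured"]).days for r in records]
avg_cycle_time = sum(cycle_times) / len(cycle_times)

# Defect rate: defects flagged during QA for this batch (hypothetical count).
defect_count = 2
defect_rate = defect_count / len(records)

print(f"Average field-to-field cycle time: {avg_cycle_time:.0f} days")
print(f"Defect rate: {defect_rate:.0%}")
```

Even something this simple, run consistently over every batch, gives you the yardstick: trend the averages over time and you can see whether your process changes are actually shortening the loop.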

The second lesson combines concepts from Lean and Six Sigma. From Six Sigma, all data management processes should incorporate fail-safe steps (or poka-yokes) designed to detect data defects and stop defective data from entering the database. According to Lean, these fail-safe steps should be automated; when a fail-safe is triggered, processing should automatically be halted and human intervention initiated. In Lean-speak, this is “autonomation.” Let’s assume we’re collecting casing vent locations. Every casing vent collected should be in close proximity to both the pipeline centerline and a casing, and casings should, in general, be in close proximity to a road or rail crossing. These types of “spatial context” fail-safes are simple to automate, and if any are triggered, something is likely wrong with our hypothetical casing vent.
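Here’s a minimal sketch of such a spatial-context fail-safe. The coordinates, the single-segment stand-in for the centerline, and the 15-unit tolerance are all hypothetical; in practice you’d query the GIS for the nearest centerline and casing features.

```python
import math

def distance(p, q):
    """Planar distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def point_to_segment_distance(p, a, b):
    """Distance from point p to segment a-b (a stand-in for distance
    to the pipeline centerline)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return distance(p, a)
    # Project p onto the segment, clamped to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return distance(p, (ax + t * dx, ay + t * dy))

def casing_vent_fail_safe(vent, centerline_seg, casing_pt, tolerance=15.0):
    """Return the list of triggered fail-safes; an empty list means the
    vent passes and may proceed into the database."""
    triggered = []
    if point_to_segment_distance(vent, *centerline_seg) > tolerance:
        triggered.append("vent too far from pipeline centerline")
    if distance(vent, casing_pt) > tolerance:
        triggered.append("vent too far from nearest casing")
    return triggered

# A vent 5 units off the centerline and near its casing passes...
print(casing_vent_fail_safe((50, 5), ((0, 0), (100, 0)), (48, 0)))   # []
# ...but one 40 units out trips both checks and halts processing
# for human review (autonomation).
print(casing_vent_fail_safe((50, 40), ((0, 0), (100, 0)), (48, 0)))
```

The design point is that the function returns *reasons*, not just a pass/fail flag, so the human who intervenes knows exactly which contextual expectation was violated.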

The third lesson comes from TOC, and guides prioritization of process improvement. A close look at most as-is data management processes reveals a target-rich environment for process improvement. The trick is prioritizing opportunities. Following the precepts of TOC, process constraints should be addressed in priority order; the constraint that most impacts cycle time should be tackled first. Naturally, if you have appropriate process measurements, critical constraints tend to reveal themselves.
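If you’ve been measuring per-step durations, the TOC prioritization is almost mechanical. The step names and day counts below are hypothetical, just to show the shape of the exercise:

```python
# Hypothetical breakdown of a field-to-map process into steps, with the
# average days each step contributes to total cycle time.
steps = {
    "field collection": 3,
    "data transfer to office": 2,
    "QA / defect correction": 21,
    "database load": 4,
    "map / alignment sheet production": 35,
}

# TOC: rank steps by cycle-time impact, worst first; the top entry is the
# constraint to attack before anything else.
priorities = sorted(steps.items(), key=lambda kv: kv[1], reverse=True)
name, days = priorities[0]
print(f"Attack first: {name} ({days} days of cycle time)")
```

Once the top constraint is relieved, you re-measure and re-rank; a new constraint will surface, and the cycle repeats.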

Most of this lends itself best to ongoing data collection and processing. However, much is applicable to the validation, verification, and correction of existing historical data. At Eagle we affectionately refer to this process as “Forensic Data Analysis,” or FDA. As a result of the NTSB’s recent urgent pipeline safety recommendations, many of you are now embroiled in forensic data analysis. Our FDA Team (FDAT) is ready to help.

Finally, it sure would be nice if a cool software package were available to deal with all of this… wait for it… Eagle to the rescue!!! We’ve created a new product called gisgap that implements the process management methods we’ve been discussing. Stay tuned for more on gisgap in the next post.