Narasi Sridhar

# How can we make decisions based on probability analyses?

Making decisions based on probabilities is both instinctive and difficult to quantify. Which fork in the road do I take? One looks bright and inviting, and the other looks dark and forbidding. But are they going to stay that way?

According to the U.S. Pipeline and Hazardous Materials Safety Administration, the number of serious incidents on crude oil and refined liquids pipelines in the U.S. has fluctuated between 0 and 2 per 100,000 miles each year since 2005, and a trend line shows decreasing frequency. Similar rates are reported for natural gas transmission lines. A frequency of 2 per 100,000 miles per year appears small. How do we use this frequency measure to make decisions? Is every mile of a pipeline identical? Such low average frequencies are obtained by dividing the number of incidents by a large denominator: more than 100,000 miles of transmission pipelines have been operating for decades in the U.S. But not every mile of a pipeline is identical. Some segments are very old and run through urban areas. Others are in rural areas but are being encroached upon by new developments. Soil conditions, terrain, earth forces, rainfall, temperatures, vegetation, nearby bodies of water, and so on differ from one location to another along a pipeline. Often, pipeline companies do not have much data on these local factors. They collect a lot of cathodic protection data, but using it to predict where and when a failure will occur is nearly impossible.
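To make the averaging argument concrete, here is a minimal Python sketch. The segment classes, mileages, and per-mile rates are hypothetical numbers chosen only so that the fleet-wide average works out to 2 per 100,000 miles per year; they are not taken from any incident database.

```python
# Hypothetical segment classes: (label, miles, incidents per mile per year).
# The rates are invented for illustration; only the fleet average of
# 2 per 100,000 miles per year echoes the figure quoted in the text.
segments = [
    ("typical rural", 99_000, 0.5 / 99_000),
    ("old urban or encroached", 1_000, 1.5 / 1_000),
]


def fleet_average_rate(segments):
    """Expected incidents per mile per year, averaged over all miles."""
    total_miles = sum(miles for _, miles, _ in segments)
    expected_incidents = sum(miles * rate for _, miles, rate in segments)
    return expected_incidents / total_miles


average = fleet_average_rate(segments)
print(f"fleet average: {average * 100_000:.2f} per 100,000 miles per year")
for label, miles, rate in segments:
    print(
        f"{label}: {rate * 100_000:.2f} per 100,000 miles per year, "
        f"{rate / average:.2f}x the fleet average"
    )
```

Under these assumed numbers, the small high-risk class has a rate 75 times the fleet average, yet the average alone gives no hint of it. The large denominator hides exactly the segments a decision maker most needs to find.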

Another kind of averaging occurs over time. If defect growth is assumed to occur steadily over the entire time period, then the average growth rate is small, and the estimated probability of a defect is conditioned on that growth rate. But corrosion defects do not always grow continuously. There are daily, seasonal, and episodic variations. This is similar to the question of why hundred-year floods can occur two years in a row. Partly this is the nature of probability, but partly it is because the calculation of a hundred-year return period assumes that the mechanisms that produced the historical data will continue to operate in the future. As climate change or other local phenomena occur, the causal factors leading to flooding may change.
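The "nature of probability" part of the flood question can be worked out directly. The sketch below assumes that each year is independent and has the same annual exceedance probability of 1/100, which is precisely the stationarity assumption questioned above. The function names are illustrative, and the shortened 25-year return period is a hypothetical value used only to show the sensitivity.

```python
def prob_at_least_one(p, years):
    """P(at least one flood in `years`), independent years with annual probability p."""
    return 1.0 - (1.0 - p) ** years


def prob_consecutive_pair(p, years):
    """P(floods in two consecutive years at least once within `years`)."""
    if years < 1:
        return 0.0
    # Track the probability that no back-to-back pair has happened yet,
    # split by whether the most recent year was dry or flooded.
    no_pair_last_dry, no_pair_last_flood = 1.0 - p, p
    for _ in range(years - 1):
        no_pair_last_dry, no_pair_last_flood = (
            (no_pair_last_dry + no_pair_last_flood) * (1.0 - p),
            no_pair_last_dry * p,
        )
    return 1.0 - (no_pair_last_dry + no_pair_last_flood)


p_hundred_year = 1 / 100
print(prob_at_least_one(p_hundred_year, 100))
print(prob_consecutive_pair(p_hundred_year, 100))

# Hypothetical: if changed conditions shorten the true return period to
# 25 years, back-to-back floods become far more likely.
print(prob_consecutive_pair(1 / 25, 100))
```

With a stationary 1-in-100 annual probability, at least one such flood in a century has a probability of about 63%, and a back-to-back pair somewhere in the century has a probability of roughly 1%. That is rare but not impossible, and it becomes much more likely if the underlying annual probability has drifted upward.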

Bayesian network (BN) models that incorporate the various causative factors leading to pipeline failures have been assembled. The BN models predict the combined probabilities of different factors leading to a failure, also called marginal probabilities. An advantage of BN models is that inspection and excavation data can be used to update the probabilities of the causative factors using Bayes' theorem. The probabilities predicted by BN models can be high. To experts used to seeing the low frequencies found in pipeline incident statistics, the high marginal probabilities predicted by BN models appear implausible. How do we bridge the gap between the experts' expectations and the BN model predictions?
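The two operations described above, marginalizing over a causative factor and updating it with inspection evidence, can be sketched for the smallest possible network: one cause node (active corrosion) and one failure node. This is not the BN model referred to in the text. Every probability below is a hypothetical placeholder, and the function names are illustrative.

```python
def marginal_failure(p_cause, p_fail_given_cause, p_fail_given_no_cause):
    """P(failure), summing over both states of the cause (law of total probability)."""
    return p_cause * p_fail_given_cause + (1.0 - p_cause) * p_fail_given_no_cause


def bayes_update(prior, p_evidence_given_cause, p_evidence_given_no_cause):
    """P(cause | evidence) from Bayes' theorem."""
    numerator = prior * p_evidence_given_cause
    evidence = numerator + (1.0 - prior) * p_evidence_given_no_cause
    return numerator / evidence


# Hypothetical inputs for a single pipeline segment.
prior_corrosion = 0.05            # belief in active corrosion before inspection
p_fail_if_corroding = 0.2         # P(failure | active corrosion)
p_fail_if_not_corroding = 0.001   # P(failure | no active corrosion)

before = marginal_failure(prior_corrosion, p_fail_if_corroding, p_fail_if_not_corroding)

# An inspection reports a metal-loss indication; the tool is assumed to flag
# 90% of corroding segments and 10% of sound ones (hypothetical rates).
posterior_corrosion = bayes_update(prior_corrosion, 0.9, 0.1)
after = marginal_failure(posterior_corrosion, p_fail_if_corroding, p_fail_if_not_corroding)

print(f"P(failure) before inspection: {before:.4f}")
print(f"P(corrosion | indication):    {posterior_corrosion:.4f}")
print(f"P(failure) after inspection:  {after:.4f}")
```

With these assumed inputs, the marginal failure probability for this one segment rises from about 0.011 to about 0.065 once the indication is taken into account. Both values are orders of magnitude above a fleet-wide frequency of 2 per 100,000 miles per year because they are conditioned on what is known about this segment rather than averaged over every mile.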

It is important to realize that there is no such thing as an "absolute" probability. All probabilistic evaluations are marginal (relative) probabilities conditioned on our knowledge of the system, which is used to derive the causal linkages and the probabilities of the various causative factors. Therefore, decisions should be made based on the relative ranking of probabilities that are themselves quantified using the known physics of the phenomena.