Identification of Successful Practices in Hydraulic Fracturing Using Intelligent Data Mining Tools; Application to the Codell Formation in the DJ Basin


In a detailed data mining study, about 150 wells completed in the Codell formation, DJ Basin, have been analyzed to identify successful practices in hydraulic fracturing. The Codell formation is a low permeability sandstone within the Wattenberg field in the DJ Basin of Colorado. Since 1997 over 1500 Codell wells have been restimulated. As part of a Gas Research Institute (GRI) restimulation project, 150 wells were studied to optimize candidate selection and identify successful practices. Hydraulic fracturing is an economic way of increasing gas well productivity, and it is routinely performed on many gas wells in fields that contain hundreds of wells. While hydraulically fracturing gas wells over many years, companies usually record the relevant data on methods and materials in a database. These databases usually include general information such as the date of the job, the service company performing the job, fluid type and amount, proppant type and amount, and pump rate. Sometimes more detailed information is available, such as breakers, additives, amount of nitrogen, and ISIP, to name a few. These data are usually of little use in complex 3-D hydraulic fracture simulators, which require additional and more detailed information. On the other hand, the collected data contain valuable information that can be processed using virtual intelligence tools. The process covered in this paper takes the above-mentioned data and couples them with general information from each well (such as latitude, longitude and elevation), any information available from log analysis, and production data. The conclusion of the analysis is a set of successful practices that have been implemented in a particular field and recommendations on how to proceed with further hydraulic fracture jobs. In this paper the results of applying this process to about 150 Codell wells during the GRI-sponsored project are presented.
This process provides an important step toward constructing a comprehensive set of methods and processes for data mining, knowledge discovery, and data-knowledge fusion from data sets in the oil and gas industry.

MOHAGHEGH, POPA, GASKARI, AMERI & WOLHART, SPE 77597

Introduction

Patina Oil and Gas has been very active in the DJ Basin in recent years. It has been one of the most active operators in the United States in identifying and restimulating tight gas sand wells. Patina has over 3,400 producing wells in the basin, and has restimulated over 230 Niobrara/Codell completions so far. Furthermore, it is estimated that the results it is achieving in terms of incremental recoveries are up to 60% better than those of other operators. Studies and analyses such as the one presented in this paper have the potential to help operators like Patina Oil & Gas increase their chance of success even further. They also have the potential to help other operators increase their chances of success in the DJ Basin or any other location throughout North America. This study is probably one of the most comprehensive analyses of its kind ever performed on a set of wells in the United States. In this technical paper the authors’ intention is to introduce this new methodology in its entirety and present as many of the results as the page limitations of this paper allow. Please note that due to the comprehensive nature of this methodology many of the topics cannot be discussed in much detail. It is our intention to introduce these topics in much more detail in a series of upcoming technical papers.

Methodology

The process of “Successful Practices Identification” using state-of-the-art data mining, knowledge discovery and data-knowledge fusion techniques includes the following five steps. In order to comprehensively cover the theoretical background of each step involved in this process, a separate paper may be needed for each.
Some of the ideas have been introduced in the past and are referred to in the references. Details on other topics will be the subject of future papers. In this article the goal is to provide a view of the methodology as a whole; therefore, the authors will simply introduce and give a brief explanation of each of the topics to clarify their role in the process.

Step One: Data Quality Control

The process starts with a thorough quality control of the data set. During this process the outliers and their nature (are they really valid data elements, or are they the result of human error either in measurement or in recording the values?) as well as missing data are identified, and, using advanced intelligent techniques, the data set is repaired. It is important to note that the repair of the data set at this stage of the analysis is aimed at rescuing the remaining data elements in a particular record (a data record here refers to a row of the data matrix) that includes many features (features are the columns of the data matrix). The goal is not to “magically” find the missing piece of data or substitute the outlier with the correct value. The goal is merely to put the best possible value in place of the missing data or the outlier, so that the analysis can continue without losing the information content that exists in the rest of the features in a particular data record. Moreover, a new methodology has been developed to identify and eliminate erroneous data records from the data set. These techniques, the verification of their accuracy and how they are implemented are subjects of a future paper.

Step Two: Fuzzy Combinatorial Analysis

The second step of the process is a complete “Fuzzy Combinatorial Analysis” (FCA) that examines each feature in the data set in order to identify its influence on the process outcome.
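The repair idea described in Step One can be illustrated with a small sketch. The paper does not disclose its "advanced intelligent techniques", so the code below swaps in a much simpler, conventional stand-in: it flags missing values and interquartile-range outliers in one feature column and replaces them with the median of the remaining values, so that the rest of each record stays usable. The function name `repair_column`, the fence factor `k` and the sample pump rates are assumptions made for illustration only.

```python
import math
import statistics


def repair_column(values, k=1.5):
    """Fill missing entries and IQR outliers in one feature column.

    Stand-in for the paper's undisclosed repair technique (assumption):
    anything missing or outside [Q1 - k*IQR, Q3 + k*IQR] is replaced by
    the median of the values that fall inside the fences.
    Returns (repaired_values, flagged_indices).
    """
    present = [v for v in values if v is not None and not math.isnan(v)]
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    inliers = [v for v in present if low <= v <= high]
    fill = statistics.median(inliers)

    repaired, flagged = [], []
    for i, v in enumerate(values):
        bad = v is None or math.isnan(v) or not (low <= v <= high)
        if bad:
            flagged.append(i)       # remember which cells were repaired
            repaired.append(fill)   # best available value, not the "true" one
        else:
            repaired.append(v)
    return repaired, flagged


# Invented pump rates for seven hypothetical records (not Codell data).
pump_rate = [20.0, 22.0, None, 21.0, 19.0, 250.0, 23.0]
repaired, flagged = repair_column(pump_rate)
```

Keeping the list of flagged indices alongside the repaired column makes it possible to audit later how much of the analysis rests on substituted values.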
The process outcome is a feature in the data set (usually a production indicator such as cumulative production, 5 year cum., 10 year cum., best 12 months of production, etc.) that is designated to identify the success of the practices in a field. For example, if 5 year cumulative gas production is selected as the process outcome, then a high 5 year cum. would indicate that good practices were performed on that particular well. During the “Fuzzy Combinatorial Analysis” each feature’s influence on the process outcome is examined both individually and in combination with other features. This is because the influence of a particular feature (say a fracturing fluid) on the process outcome may be altered once it is combined with the effects of other features (say specific additives) that are present in the process. Therefore it is important to perform the analysis in a combinatorial fashion (hence the name combinatorial analysis) in order to reveal the true influence of the features present in the data set on the process outcome.

A note of caution is in order here. Many commercial, off-the-shelf neural network software applications claim to identify the influence of features on the output once a neural network model is built for a data set, and many practitioners in our industry have been using these results as the true influence of parameters on the output. These products simply use the summation of the weights connected to a particular input neuron to achieve this. The authors believe that this is a gross simplification of a complex problem that does not provide an accurate account of the influence of each feature, and therefore it should not be used as such. The result is simply an artifact of the modeling process: if the architecture or the learning algorithm of the model changes, the apparent influence of the features may change as well.
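The reason for examining features in combination can be shown with a minimal sketch. This is not the authors' Fuzzy Combinatorial Analysis, whose details are not given here; it is a crisp (non-fuzzy) stand-in that scores every feature subset up to a chosen size by the spread of the mean outcome across the groups that subset defines. The function names, the feature names and the four records are all hypothetical, with numbers invented so that neither feature matters alone while the pair matters strongly.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean


def group_mean_spread(records, features, outcome):
    """Max minus min of the mean outcome over groups keyed by `features`."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[f] for f in features)
        groups[key].append(rec[outcome])
    means = [mean(vals) for vals in groups.values()]
    return max(means) - min(means)


def combinatorial_influence(records, features, outcome, max_size=2):
    """Score every feature combination of size 1..max_size."""
    scores = {}
    for size in range(1, max_size + 1):
        for combo in combinations(features, size):
            scores[combo] = group_mean_spread(records, combo, outcome)
    return scores


# Synthetic records (assumption, not field data): the outcome is high only
# for particular fluid/additive pairings.
records = [
    {"fluid": "A", "additive": "X", "cum_gas": 100},
    {"fluid": "A", "additive": "Y", "cum_gas": 40},
    {"fluid": "B", "additive": "X", "cum_gas": 40},
    {"fluid": "B", "additive": "Y", "cum_gas": 100},
]
scores = combinatorial_influence(records, ["fluid", "additive"], "cum_gas")
```

In this constructed case the single-feature scores are zero and the pair score is large, which is the situation the text warns about: a one-feature-at-a-time ranking would discard both features.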
Step Three: Intelligent Production Data Analysis

The third step in the “Successful Practices Identification” methodology is a process called “Intelligent Production Data Analysis” (IPDA). The word “intelligent” here refers to the use of intelligent systems techniques in the production data analysis process. During the IPDA process, production data are used to identify a series of “Production Indicators” that represent the state of production from a particular field in time and space. The time and space representation is aimed at capturing the depletion and pressure decline in the field as new wells are drilled and put into production at different rates. The dynamic nature of this analysis (simultaneous analysis of the data in four dimensions x, y, z, and t) allows the user to identify the sweet spots as well as the bad (unproductive) spots in a field. Such analysis would prove quite valuable for field development strategies that include infill drilling programs and candidate selection for stimulation, restimulation and workovers.

Step Four: Neural Model Building

The next step (Step 4) in the process calls for building a predictive neural model based on the available data. This step has been covered in detail in several prior papers and will not be repeated here.

Step Five: Successful Practices Analysis

Once a representative neural model is successfully trained, calibrated and verified, the process of “Successful Practices Identification” using state-of-the-art data mining, knowledge discovery and data-knowledge fusion techniques is concluded with a three-stage analysis. The three-stage analysis combines the neural model with Monte Carlo simulation, genetic algorithm search and optimization routines, and fuzzy set theory to identify the successful practices on a single-well basis, on a group-of-wells basis, and on a field-wide basis.
During the single well analysis, each well is thoroughly analyzed in order to identify the sensitivity of that particular well to different operational conditions.
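A single-well Monte Carlo sensitivity run of this kind might look like the sketch below. It assumes a trained model is available as a callable; because the neural model itself is not reproduced here, `stand_in_model` is a hypothetical linear placeholder with made-up coefficients, and the parameter names and ranges are likewise invented. The well-specific inputs are held fixed, the operational parameters are sampled uniformly within their ranges, and each parameter's sensitivity is summarized by its Pearson correlation with the predicted outcome. The genetic algorithm and fuzzy set stages are not covered.

```python
import math
import random
from statistics import mean


def stand_in_model(well, ops):
    """Hypothetical placeholder for the trained neural model (assumption)."""
    return well["base"] + 2.0 * ops["proppant_klb"] + 0.5 * ops["pump_rate_bpm"]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def monte_carlo_sensitivity(model, well, ranges, n=2000, seed=0):
    """Sample operational parameters for one well and score their influence.

    Returns (outputs, sensitivity) where sensitivity maps each parameter
    name to its correlation with the model output across the samples.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same draws
    samples = {name: [] for name in ranges}
    outputs = []
    for _ in range(n):
        ops = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        for name, value in ops.items():
            samples[name].append(value)
        outputs.append(model(well, ops))
    sensitivity = {name: pearson(samples[name], outputs) for name in ranges}
    return outputs, sensitivity


# Invented well and operating ranges (not taken from the Codell data set).
well = {"base": 50.0}
ranges = {"proppant_klb": (100.0, 300.0), "pump_rate_bpm": (10.0, 20.0)}
outputs, sensitivity = monte_carlo_sensitivity(stand_in_model, well, ranges)
```

Correlation is only one possible summary; the percentiles of `outputs` could be reported in the same way to show how wide the range of predicted outcomes is for that well.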