Well Log Analysis Projects

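This repository contains two projects: a well log analysis of the Salt Creek field (outlier removal, PCA, K-means clustering, and decision tree regression) and an oil production forecast for the Volve field (a persistence baseline compared against an LSTM model).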

Project 1: Salt Creek Field Well Log Analysis

Overview

Dataset: 'Salt-Creek.DAT' containing 403 rows and 10 columns of well log data
Variables: GR, log10(LLD), log10(MSFL), DT, RHOB, NPHI, PEF, POR, Kg, ln(Kg)
Objective: Analyze well logs using statistical analysis, classification, and regression

Methodology

Data Preprocessing
- Removed outliers using the Interquartile Range (IQR) method
- Reduced data from 403 to 286 rows
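
The IQR filter described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the 1.5 multiplier is the conventional choice and is assumed here (the README does not state it), the function name `remove_iqr_outliers` is hypothetical, and the data is synthetic rather than the Salt Creek logs.

```python
import numpy as np
import pandas as pd


def remove_iqr_outliers(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Keep rows where every column lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    iqr = q3 - q1
    # Comparisons align the per-column bounds with the DataFrame columns.
    within = (df >= q1 - k * iqr) & (df <= q3 + k * iqr)
    return df[within.all(axis=1)]


# Synthetic stand-in data (assumption): two log-like columns and one planted outlier.
rng = np.random.default_rng(0)
logs = pd.DataFrame({
    "GR": rng.normal(60, 10, 200),
    "RHOB": rng.normal(2.5, 0.1, 200),
})
logs.loc[0, "GR"] = 1000.0  # planted outlier
cleaned = remove_iqr_outliers(logs)
print(len(logs), "->", len(cleaned))
```

A row is dropped as soon as any one of its columns falls outside the bounds, which is why filtering ten columns at once can remove a sizeable share of the rows.
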
Exploratory Data Analysis
- Visualized correlations using Seaborn's pairplot
- Observed Gaussian or log-normal distributions for all variables
Dimensionality Reduction
- Applied Principal Component Analysis (PCA)
- Selected 4 principal components explaining ~90% of variance
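
One way to pick the number of components from a variance target is sketched below. It assumes the inputs are standardized before PCA (the README does not say whether they were), uses synthetic data, and the helper name `components_for_variance` is hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def components_for_variance(X, target=0.90):
    """Return the fewest PCs whose cumulative explained variance reaches target."""
    X_std = StandardScaler().fit_transform(X)  # assumption: standardize first
    pca = PCA().fit(X_std)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # First index where the cumulative ratio is >= target, converted to a count.
    n = min(int(np.searchsorted(cumulative, target)) + 1, len(cumulative))
    scores = pca.transform(X_std)[:, :n]
    return n, scores, cumulative


# Synthetic stand-in: 10 correlated columns driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(300, 10))
n, scores, cumulative = components_for_variance(X)
print(n, scores.shape)
```
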
Classification (Unsupervised Clustering)
- Used K-means clustering with 6 clusters
- Visualized clusters along first two principal components
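
A minimal sketch of the clustering step, assuming K-means is run on the principal component scores (the README does not state which feature space was clustered). The scores here are random stand-ins, and `n_init` and `random_state` are illustrative settings.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the scores of 4 principal components over 286 rows.
rng = np.random.default_rng(0)
scores = rng.normal(size=(286, 4))

kmeans = KMeans(n_clusters=6, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

# Columns 0 and 1 are the first two principal components; coloring a scatter
# plot of pc1 vs. pc2 by `labels` gives the cluster view described above.
pc1, pc2 = scores[:, 0], scores[:, 1]
print(np.bincount(labels, minlength=6))
```
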
Regression
- Implemented a Decision Tree Regressor to predict permeability
- Used 70% training, 30% testing split
- Evaluated using MAE, MSE, RMSE, and R-squared metrics
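
The regression workflow above can be sketched as follows. The features and target are synthetic (the README does not list which inputs were used to predict permeability), and the tree uses default hyperparameters as an assumption.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in: 4 features and a noisy linear target playing the role of permeability.
rng = np.random.default_rng(0)
X = rng.normal(size=(286, 4))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=286)

# 70% training, 30% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)

model = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)

mae = mean_absolute_error(y_test, pred)
mse = mean_squared_error(y_test, pred)
rmse = float(np.sqrt(mse))  # RMSE is the square root of MSE
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```
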

Results

Produced a K-means clustering of the well log data and a Decision Tree Regressor that predicts permeability
Suggested future improvements: hyperparameter tuning, ensemble methods, and refined clustering

Project 2: Volve Field Time Series Analysis

Overview

Dataset: Volve Field time series data from 16 wells (2008-2016)
Focus: Daily Oil Production Rate (m^3/day)
Selected wells: P-F-12 and P-F-14 (the wells with the most days of non-zero production)

Methodology

Data Exploration
- Visualized time series plots for P-F-12 and P-F-14
- Removed "shut-in" time (zero production) for analysis
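
Dropping shut-in days amounts to a boolean filter on the rate series. The sketch below uses a synthetic six-day series, and the name `oil_rate` is hypothetical.

```python
import pandas as pd

# Synthetic daily production rates (m^3/day); zeros mark shut-in days.
dates = pd.date_range("2010-01-01", periods=6, freq="D")
rates = pd.Series(
    [120.0, 0.0, 115.0, 0.0, 0.0, 110.0], index=dates, name="oil_rate"
)

producing = rates[rates > 0]  # keep only days with non-zero production
print(producing)
```

After this filter the date index is no longer evenly spaced, so consecutive rows are consecutive producing days rather than consecutive calendar days.
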
Persistence Forecast
- Applied to P-F-12 well
- Limitations: 1-step horizon, struggles with large changes
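
A persistence forecast predicts each value as the previous observation. The sketch below, on a short synthetic series with a hypothetical helper name, shows why a single large change hurts it: the drop to 60 is missed on the day it happens and then wrongly carried forward to the next day.

```python
import numpy as np


def persistence_rmse(series, n_test):
    """Walk-forward persistence over the last n_test points: y_hat(t) = y(t-1)."""
    series = np.asarray(series, dtype=float)
    actual = series[-n_test:]
    predicted = series[-n_test - 1:-1]  # each prediction is the prior observation
    rmse = float(np.sqrt(np.mean((actual - predicted) ** 2)))
    return predicted, rmse


# Synthetic rates with one abrupt drop.
rates = [100.0, 98.0, 97.0, 60.0, 95.0, 94.0]
predicted, rmse = persistence_rmse(rates, n_test=3)
print(predicted, round(rmse, 2))
```
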
LSTM (Long Short-Term Memory) Model
- Transformed time series into supervised learning problem
- Made data stationary through differencing
- Scaled data to the range [-1, 1]
- Trained for 300 epochs with 4 LSTM units (neurons)
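
The data preparation steps above can be sketched with NumPy alone. The order used here (difference, then frame as supervised pairs, then scale) and the single lag are assumptions, all helper names are hypothetical, and the series is synthetic. The final reshape reflects the 3-D input layout (samples, timesteps, features) that Keras LSTM layers expect.

```python
import numpy as np


def difference(series, lag=1):
    """Remove trend by subtracting the value `lag` steps earlier."""
    series = np.asarray(series, dtype=float)
    return series[lag:] - series[:-lag]


def to_supervised(values, n_lags=1):
    """Inputs are the previous n_lags values; the target is the current value."""
    values = np.asarray(values, dtype=float)
    X = np.column_stack(
        [values[i:len(values) - n_lags + i] for i in range(n_lags)]
    )
    y = values[n_lags:]
    return X, y


def scale(values, lo, hi):
    """Map [lo, hi] linearly onto [-1, 1] (assumes hi > lo)."""
    return 2.0 * (np.asarray(values, dtype=float) - lo) / (hi - lo) - 1.0


def unscale(values, lo, hi):
    """Invert scale()."""
    return (np.asarray(values, dtype=float) + 1.0) * (hi - lo) / 2.0 + lo


rates = [100.0, 98.0, 97.0, 93.0, 95.0, 94.0]  # synthetic
diff = difference(rates)
lo, hi = diff.min(), diff.max()  # in practice, fit these on training data only
X, y = to_supervised(scale(diff, lo, hi), n_lags=1)
X_lstm = X.reshape(X.shape[0], 1, X.shape[1])  # (samples, timesteps, features)
print(X_lstm.shape, y.shape)
```

Predictions made in this space have to be unscaled and then added back to the previous observed rate to return to m^3/day.
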
Model Validation
- Applied LSTM model to P-F-14 well
- Attempted to increase forecast horizon
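
One common way to extend a one-step model to a longer horizon is recursive forecasting, where each prediction is fed back in as the next input. Whether this project used that approach is not stated, so the sketch below is only illustrative: `recursive_forecast` is a hypothetical helper, and `decline` is a stand-in for a fitted one-step model.

```python
from typing import Callable

import numpy as np


def recursive_forecast(
    last_value: float, step: Callable[[float], float], horizon: int
) -> np.ndarray:
    """Apply a one-step model repeatedly, feeding each output back as input."""
    out = []
    current = last_value
    for _ in range(horizon):
        current = step(current)
        out.append(current)
    return np.array(out)


def decline(x: float) -> float:
    """Stand-in one-step model (assumption): a 1% decline per step."""
    return 0.99 * x


forecast = recursive_forecast(100.0, decline, horizon=3)
print(forecast)
```

Because later steps depend on earlier predictions instead of observations, any one-step error is carried into every subsequent step.
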

Results

The LSTM outperformed the persistence forecast (RMSE of 12.7 vs. 27)
Validation on P-F-14 produced an unrealistic sudden increase in forecast production
Extending the forecast horizon remains an area for improvement

Future Work

Refine LSTM model to improve long-term forecasting
Explore other machine learning techniques for well log analysis
Investigate methods to increase forecast horizon reliability

Technologies Used

Python
Pandas
Seaborn
Scikit-learn
Keras (for LSTM implementation)

This README provides an overview of two well log analysis projects. For detailed information, please refer to the individual project reports and code files within this repository.

Oil Well Data Analysis

May 2021
