Global sensitivity analysis
Global sensitivity analysis is a Monte Carlo based method to rank the importance of parameters in a given modelling problem. As opposed to local senstivity analysis, it does not require the construction of the Jacobian, making it a flexible tool to evaluate complex problems.
Global sensitivty analysis is available in mainly uncertainty quantificaiton packages, as well as some flow and transport programs (e.g. iTOUGH2). GSA is also very popular in catchment modelling and civil engineering/risk analysis problems.
Some GSA work in hydrogeophysics (mainly by Berkeley Lab):
coupled hydrological-thermal-geophysical inversion (Tran et al 2017) > Nicely show how to simplify (i.e. reduce the number of parameters) for a very complex, highly coupled problem
making sense of global senstivity analysis (Wainwright et al 2014) > Very good review article
Sensitivity analysis of environmental models (Pianosi et al 2014) > A systematic review, includes GLUE and RSA
hydrogeology of a nuclear site in the Paris Basin (Deman et al 2016) > A different GSA method was used instead here to look at the low probability breakthrough events
Global Sensitivity and Data-Worth Analyses in iTOUGH2 User’s Guide (Wainwright et al 2016) > An useful manual if you want to learn about the details of setting up a probllem.
In this tutorial, we will see how to link the RESiPy API and SALib for senstivity analysis. Two key elements of SA are (i) forward modelling (Monte Carlo runs) and (ii) specifying the parameter ranges. This notebook will showcase of the use of the Method of Morris, which is known for its relatively small computational cost. This tutorial is modified from the one posted on https://github.com/SALib/SATut to demonstrate its coupling with RESiPy
Morris sensitivity method
The Morris one-at-a-time (OAT) method (Morris, 1991) can be considered as an extension of the local sensitivity method. Each parameter range is scaled to the unit interval [0, 1] and partitioned into \((p−1)\) equally-sized intervals. The reference value of each parameter is selected randomly from the set \({0, 1/(p−1), 2/(p−1), …, 1−Δ}\). The fixed increment \(Δ=p/{2(p−1)}\) is added to each parameter in random order to compute the elementary effect (\(EE\)) of \(x_i\)
We compute three statistics: the mean \(EE\), standard deviation (STD) of \(EE\), and mean of absolute \(EE\). * mean EE (:math:`mu`) represents the average effect of each parameter over the parameter space, the mean EE can be regarded as a global sensitivity measure. * mean |EE| (:math:`mu*`) is used to identify the non-influential factors, * STD of EE (:math:`sigma`) is used to identify nonlinear and/or interaction effects. (The standard error of mean (SEM) of EE, defined as \(SEM=STD/r^{0.5}\), is used to calculate the confidence interval of mean EE (Morris, 1991))
This cell is copied from (Wainwright et al 2014)
Importing libraries
import warnings
warnings.filterwarnings('ignore')
import os
import sys
sys.path.append((os.path.relpath('../src'))) # add here the relative path of the API folder
import numpy as np # numpy for electrode generation
import pandas as pd
from IPython.utils import io # suppress R2 outputs during MC runs
from resipy import Project
API path = /media/jkl/data/phd/tmp/pyr2/src/resipy
ResIPy version = 2.1.0
The SALib package
SALib is a free open-source Python library
If you use Python, you can install it by running the command
pip install SALib
Documentation is available online and you can also view the code on Github.
The library includes: * Sobol Sensitivity Analysis (Sobol 2001, Saltelli 2002, Saltelli et al. 2010) * Method of Morris, including groups and optimal trajectories (Morris 1991, Campolongo et al. 2007) * Fourier Amplitude Sensitivity Test (FAST) (Cukier et al. 1973, Saltelli et al. 1999) * Delta Moment-Independent Measure (Borgonovo 2007, Plischke et al. 2013) * Derivative-based Global Sensitivity Measure (DGSM) (Sobol and Kucherenko 2009) * Fractional Factorial Sensitivity Analysis (Saltelli et al. 2008)
# import the packages
from SALib.sample import morris as ms
from SALib.analyze import morris as ma
from SALib.plotting import morris as mp
Create ERT forward problem with ResIPy
In the code below, created a Project
forward problem to be analyzed
k = Project(typ='R2')
elec = np.zeros((24,3))
elec[:,0] = np.arange(0, 24*0.5, 0.5) # with 0.5 m spacing and 24 electrodes
k.setElec(elec)
#print(k.elec)
# defining electrode array
x = np.zeros((24, 3))
x[:,0] = np.arange(0, 24*0.5, 0.5)
k.setElec(elec)
# creating mesh
k.createMesh(res0=20)
# add region
k.addRegion(np.array([[2,-0.3],[2,-2],[3,-2],[3,-0.3],[2,-0.3]]), 50)
k.addRegion(np.array([[5,-2],[5,-3.5],[8,-3.5],[8,-2],[5,-2]]), 500)
# define sequence
k.createSequence([('dpdp1', 1, 10)])
# forward modelling
k.forward(noise=0.025)
# read results
fwd_dir = os.path.relpath('../src/resipy/invdir/fwd')
obs_data = np.loadtxt(os.path.join(fwd_dir, 'R2_forward.dat'),skiprows =1)
obs_data = obs_data[:,6]
# plot
k.showMesh()
Working directory is: /media/jkl/data/phd/tmp/pyr2/src/resipy/invdir
clearing the dirname
Creating triangular mesh...Reading mesh.msh
Gmsh version == 3.x
reading node coordinates...
Determining element type...Triangle
Reading connection matrix...
ignoring 0 elements in the mesh file, as they are not required for R2/R3t
Finished reading .msh file
done
165 quadrupoles generated.
Writing .in file and mesh.dat... done!
Writing protocol.dat... done!
Running forward model...
>> R 2 R e s i s t i v i t y I n v e r s i o n v4.0 <<
>> D a t e : 04 - 05 - 2020
>> My beautiful survey
>> F o r w a r d S o l u t i o n S e l e c t e d <<
>> Determining storage needed for finite element conductance matrix
>> Generating index array for finite element conductance matrix
>> Reading start resistivity from resistivity.dat
Measurements read: 165 Measurements rejected: 0
>> Total Memory required is: 0.066 Gb
165/165 reciprocal measurements NOT found.
0 measurements error > 20 %
Forward modelling done.

Option to view resistivity fields with pyvista
import pyvista as pv
mesh = pv.read(os.path.join(fwd_dir, 'forward_model.vtk'))
mesh.cell_arrays
mesh.plot(scalars='Resistivity(log10)',notebook=True)
mesh

Header | Data Arrays | ||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
import pyvista as pv
plotter = pv.Plotter()
mesh = pv.read(os.path.join(fwd_dir, 'forward_model.vtk'))
plotter.add_mesh(mesh,cmap="viridis" ,stitle="Resistivity(log10)") #scalars='Resistivity(log10)'
#plotter.show_axes()
_ = plotter.show_bounds(grid='front', location='outer', all_edges=True, xlabel="X [m]",ylabel="Y [m]")
plotter.add_text('pyvista example')
#plotter.update_scalar_bar_range([-2000,2000], name="Resistivity(log10)")
#plotter.add_mesh(mesh, cmap="bone", opacity="linear", stitle="Linear Opacity")
plotter.view_xy()
plotter.show()
#plotter.show(screenshot='test.png')

Define a problem file
In the code below, a problem file is used to define the parameters and their ranges we wish to explore, which corresponds to the following table:
Parameter |
Range |
Description |
---|---|---|
rho0 [ohm m] |
10 1 |
background |
rho1 [ohm m] |
10 2 |
inclusion A |
rho2 [ohm m] |
10 3 |
inclusion B |
morris_problem = {
'num_vars': 3,
# These are their names
'names': ['rho1', 'rho2', 'rho3'], # can add z1 z2 etc.
# Plausible ranges over which we'll move the variables
'bounds': [[0.5,3.5], # log10 of rho (ohm m)
[0.5,3.5],
[0.5,3.5]#,
# [-3,-1],
# [-7,-4],
],
# I don't want to group any of these variables together
'groups': None
}
Generate a Sample
We then generate a sample using the morris.sample()
procedure from
the SALib package.
number_of_trajectories = 20
sample = ms.sample(morris_problem, number_of_trajectories, num_levels=10)
len(sample)
print(sample[79,:])
[1.83333333 3.5 1.83333333]
Run the sample through the monte carlo procedure in R2
Great! You have defined your problem and have created a series of input files for forward runs. Now you need to run R2 for each of them to obtain their ERT responses.
For this example, each sample takes a few seconds to run on a PC.
#%%capture
simu_ensemble = np.zeros((len(obs_data),len(sample)))
for ii in range(0, len(sample)):
with io.capture_output() as captured: # suppress inline output from ResIPy
# creating mesh
k.createMesh(res0=10**sample[ii,0]) # need to use more effective method, no need to create mesh every time
# add region
k.addRegion(np.array([[2,-0.3],[2,-2],[3,-2],[3,-0.3],[2,-0.3]]), 10**sample[ii,1])
k.addRegion(np.array([[5,-2],[5,-3.5],[8,-3.5],[8,-2],[5,-2]]), 10**sample[ii,2])
# forward modelling
k.forward(noise=0.025, iplot = False)
out_data = np.loadtxt(os.path.join(fwd_dir, 'R2_forward.dat'),skiprows =1)
simu_ensemble[:,ii] = out_data[:,6]
print("Running sample",ii+1)
Running sample 1
Running sample 2
Running sample 3
Running sample 4
Running sample 5
Running sample 6
Running sample 7
Running sample 8
Running sample 9
Running sample 10
Running sample 11
Running sample 12
Running sample 13
Running sample 14
Running sample 15
Running sample 16
Running sample 17
Running sample 18
Running sample 19
Running sample 20
Running sample 21
Running sample 22
Running sample 23
Running sample 24
Running sample 25
Running sample 26
Running sample 27
Running sample 28
Running sample 29
Running sample 30
Running sample 31
Running sample 32
Running sample 33
Running sample 34
Running sample 35
Running sample 36
Running sample 37
Running sample 38
Running sample 39
Running sample 40
Running sample 41
Running sample 42
Running sample 43
Running sample 44
Running sample 45
Running sample 46
Running sample 47
Running sample 48
Running sample 49
Running sample 50
Running sample 51
Running sample 52
Running sample 53
Running sample 54
Running sample 55
Running sample 56
Running sample 57
Running sample 58
Running sample 59
Running sample 60
Running sample 61
Running sample 62
Running sample 63
Running sample 64
Running sample 65
Running sample 66
Running sample 67
Running sample 68
Running sample 69
Running sample 70
Running sample 71
Running sample 72
Running sample 73
Running sample 74
Running sample 75
Running sample 76
Running sample 77
Running sample 78
Running sample 79
Running sample 80
Factor Prioritisation
We’ll run a sensitivity analysis of the power module to see which is the most influential parameter.
The results parameters are called mu, sigma and mu_star.
Mu is the mean effect caused by the input parameter being moved over its range.
Sigma is the standard deviation of the mean effect.
Mu_star is the mean absolute effect.
The higher the mean absolute effect for a parameter, the more sensitive/important it is*
# Define an objective function: here I use the error weighted rmse
def obj_fun(sim,obs,noise):
y = np.divide(sim-obs,noise) # weighted data misfit
y = np.sqrt(np.inner(y,y))
return y
output = np.zeros((1,len(sample)))
for ii in range(0, len(sample)):
output[0,ii] = obj_fun(simu_ensemble[:,ii],obs_data,0.025*obs_data) # assume 2.5% noise in the data
# Store the results for plotting of the analysis
Si = ma.analyze(morris_problem, sample, output, print_to_console=False)
print("{:20s} {:>7s} {:>7s} {:>7s}".format("Name", "mean(EE)", "mean(|EE|)", "std(EE)"))
for name, s1, st, mean in zip(morris_problem['names'],
Si['mu'],
Si['mu_star'],
Si['sigma']):
print("{:20s} {:=7.3f} {:=7.3f} {:=7.3f}".format(name, s1, st, mean))
Name mean(EE) mean(|EE|) std(EE)
rho1 59023.790 59023.790 46473.532
rho2 2972.826 2979.171 4804.446
rho3 1930.959 1933.686 3186.307
# make a plot
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
ax.scatter(Si['mu_star'],Si['sigma'])
#ax.plot(Si['mu_star'],2*Si['sigma']/np.sqrt(number_of_trajectories),'--',alpha=0.5)
#ax.plot(np.array([0,Si['mu_star'][0]]),2*np.array([0,Si['sigma'][0]/np.sqrt(number_of_trajectories)]),'--',alpha=0.5)
plt.title('Distribution of Elementary effects')
plt.xlabel('mean(|EE|)')
plt.ylabel('std($EE$)')
for i, txt in enumerate(Si['names']):
ax.annotate(txt, (Si['mu_star'][i], Si['sigma'][i]))
# higher mean |EE|, more important factor
# line within the dashed envelope means nonlinear or interaction effects dominant

Summary
Sensitivity analysis helps you:
Think through your assumptions
Quantify uncertainty
Focus on the most influential uncertainties first
Learn Python
Similar packages to `SALib <>`__ for other languages/programmes:
`sensitivity
package for R <https://cran.r-project.org/web/packages/sensitivity/index.html>`__ (Michael used it for GSA in his leak detection paper)
# run this so that a navigation sidebar will bee generated when exporting this notebook as HTML
#%%javascript
#$('<div id="toc"></div>').css({position: 'fixed', top: '120px', left: 0}).appendTo(document.body);
#$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js');
help(k.addRegion)
Help on method addRegion in module resipy.R2: addRegion(xz, res0=100, phase0=1, blocky=False, fixed=False, ax=None, iplot=False) method of resipy.R2.R2 instance Add region according to a polyline defined by xz and assign it the starting resistivity res0. Parameters ---------- xz : array Array with two columns for the x and y coordinates. res0 : float, optional Resistivity values of the defined area. phase0 : float, optional Read only if you choose the cR2 option. Phase value of the defined area in mrad blocky : bool, optional If True the boundary of the region will be blocky if inversion is block inversion. fixed : bool, optional If True, the inversion will keep the starting resistivity of this region. ax : matplotlib.axes.Axes If not None, the region will be plotted against this axes. iplot : bool, optional If True , the updated mesh with the region will be plotted.
Download python script: nb_Morris.py
Download Jupyter notebook: nb_Morris.ipynb
View the notebook in the Jupyter nbviewer