Nuit Blanche: Monday Morning Algorithm part 5: The 1-D Experimental Probabilistic Hypersurface

Sunday, December 02, 2007

Monday Morning Algorithm part 5: The 1-D Experimental Probabilistic Hypersurface

The Experimental Probabilistic Hypersurface is a tool initially designed by Bernard Beauzamy. Let me first try to say why it is useful first. Imagine you have a function of several variables, say 100, and it takes 2 hours to evaluate that function. At the end of his/her 8 hours day, the engineer will have processed at best 4 function evaluations. It will take a long time of doing any type of sensitivity studies.

In order to have a faster turn around, one could foresee a scheme where one would do a linear interpolation between points (of dimension 100) that we already know. This would provide an idea of what the function looks like in a larger region. Unfortunately, this interpolation scheme in dimension 100 would by itself take too much time. Irrespective to this time constraint, there is nothing that says this function should be linear or even continuous.

In 2004, Bernard Beauzamy made this original construction whereby the function to be evaluated is a multidimensional probabilistic distribution. For every set of (100) parameters that have been already evaluated, the result of the EPH for this function is a dirac. For all other sets of parameters that have not been evaluated, the result of the EPH for this function is a distribution whose shape will be as peaky (resp. flat) as one is close (resp. far) from the sets of parameters for which the function has already been evaluated. Olga Zeydina, a Ph.D candidate, is writing her thesis on the subject. Most information on this construction can be found in the document section of the Robust Mathematical Modeling program at SCM. Of particular interest are: the initial practical implementation [ Report 1 ] and a newer implementation [Report 2].

Today's algorithm will be featured in two files. One that implements the EPH, the other using the EPH function and produce graphs from this document. Listed in this entry, I will list the EPH function file. The second file can be found in the MMA repository.

function [t,pxx]=EPH(xi,theta,x,t1,tau,xe)

% Algorithm developed by Bernard Beauzamy

% and Olga Zeydina

% This implementation was written by

% Igor Carron

% Inputs

% xi: a parameter set where the function

% has been evaluated

% theta: the result of the

% function evaluation for parameter set xi

% x: bounding value for parameter set xi

% t1: bounding value for result of

% function evaluation

% tau: is the increment in the interval

% defined by the bounding value of t1

% xe: Query parameter set. We want to

% know the value of the function at

% this point.

% Outputs

% t: interval where the distribution is

% defined

% pxx: probability distribution sought

% sampled over t.

N =size(xi,2)

x1_min= x(1,1)

x1_max= x(1,2)

t_min= t1(1,1);

t_max= t1(1,2);

nu=size([t_min:tau:t_max],2)-1

t=[t_min:tau:t_max];

dmax(1)=max(abs(x(1,1)-xi(1)),abs(x(1,2)-xi(1)));

lambd(1)=log(nu)/dmax(1);

for i=2:N

dmax(i)=max(abs(x(1,1)-xi(i)),abs(x(1,2)-xi(i)));

if dmax(i)>dmax(i-1)

lambd(i)=log(nu)/dmax(i);

else

lambd(i)=lambd(i-1);

end

nxe=size(xe,2);

options = optimset('TolFun',1e-3);

for jxx=1:nxe

for i=1:N

d(i)=abs(xe(jxx)-xi(i));

end

for i=1:N

a =pi/tau.^2*exp(1-2*lambd(i)*d(i));

lamn=lambd(i)*d(i);

at2=sum(exp(-a.*t.^2));

at3=sum(t.^2.*exp(-a.*t.^2));

diffe=abs(at2-1/tau*(pi/a).^(1/2))/(1/tau*(pi/a).^(1/2));

if abs(diffe)<1e-10

areal(i) = a;

else

areal(i) = fzero(@(x) fina(x,t,lamn),a,options);

end

b = 1/2-lambd(i)*d(i);

breal(i) = log(1/(sum(exp(-areal(i).*t.^2))));

aa(i)=areal(i);

bb(i)=breal(i);

end

for i=1:N

pro(i,:)=d;

end

prd=cumprod(d,2);

for i=1:N

pro(i,i)=1.0;

end

xpro=cumprod(pro,2);

pioverdi=xpro(:,N);

d11s=sum(pioverdi);

for j=1:nu+1

for i=1:N

pij1(i,j)=exp(-aa(i)*(t(j)-theta(i)).^2 + bb(i));

end

pxx(jxx,j)=1/d11s*(pioverdi'*pij1(:,j));

end

To run this function, one needs to run the MMA5.m file. This is a simple 1-D implementation. One can obviously foresee its application to 2-D or n-D functions. But more important, one could use the resident expert knowledge of the engineer and use the algorithm of last week [1] to produce a new metric in the parameter space and then use the EPH. If you are using it for some serious work, you may want to join us in the Robust Mathematical Modeling program. This probabilistic approach to function evaluation can also be used as a storage mechanism. Another issue that most engineers face is that after so many function evaluations, it generally is difficult to summarize all the findings. This approach solves this problem. One can even implement a way for the engineer to add parameters as he/she goes.

[1] Improving Embeddings by Flexible Exploitation of Side Information

2 comments:

Anonymous said...: Do you know anything about how the Experimental Probabilistic Hypersurface compares to Gaussian process statistical emulators (e.g., BACCO), for the purpose of efficiently representing complex model output and their uncertainty? Advantages or disadvantages?; Monday, December 3, 2007 at 5:01:00 PM CST
Igor said...: The short answer is no. The longer answer is: I have tried to find out what other types of modeling tools such as this one were on the "market", see the comment section of this entry:

http://www.stat.columbia.edu/~cook/movabletype/archives/2007/06/overview_of_mis.html

Igor.; Tuesday, December 4, 2007 at 8:40:00 AM CST