Wednesday, December 15, 2010

On the Difficulty of Price Modeling

I was recently looking for a clean example of a service or item that clearly illustrates the difficulty of pricing it. I just found one on Dan Ariely's blog: Locksmiths. Here is the video:

Do you have other examples ?

Tuesday, December 7, 2010

Human Behavior Modeling Failures

This blog entry entitled Millenium bridge, endogeneity and risk management features two examples of faulty modeling, in bridges and in value-at-risk (VaR) models, rooted in a failure to take human behavior into account. One wonders if people were dealing with unknown unknowns when the initial modeling was performed. In a different direction, when one deals with models and human behavior there is always the possibility for subgroups to game the system and make the intended modeling worthless. Here is an example related to ranking academics and researchers.

Monday, December 6, 2010

RMM Example #3: Nuclear Proliferation and Terrorism Risk Assessment.

In the U.S., the nuclear fuel cycle is termed 'once-through', as nuclear fuel passes through a nuclear power plant only once before being discarded. Whether the country should be reprocessing some of these materials has been debated since the 1950s. The problem is extremely complex and has many stakeholders at the table. One subset of the discussion is the fact that if the U.S. were ever to perform some reprocessing in its civilian fuel cycle, it would provide grounds for other countries to do the same. For technical reasons, reprocessing is considered a proliferation concern: as soon as you allow your nuclear fuel to be reprocessed, you also open the door to the ability to extract material for "other purposes". Most countries are signatories of the Non-Proliferation Treaty (NPT) and, as you all know, the IAEA is in charge of verifying compliance (all signatory countries are subject to yearly visits by IAEA staff). And so, a major technical effort in any type of research and development at the U.S. Department of Energy (and in other countries that comply with the NPT) revolves around bringing technical solutions to some proliferation issues (other proliferation issues being political in nature). As part of the recent DOE Nuclear Energy Enabling Technologies Program Workshop, there was an interesting subsection of the meeting dedicated to Proliferation and Terrorism Risk Assessment. I am not a specialist of this particular area, so I will just feature the material presented there for illustration purposes, as it all represents issues commonly found in difficult-to-solve RMM examples. From the presentation introducing the assessment, here were the Goals and Objectives of this sub-meeting:
Solicit views of a broad cross‐section of stakeholders on the following questions:
  • Would you favor an expanded R&D effort on proliferation and terrorism risk assessment? Why or why not?
  • In what ways have current methodologies been useful, how might R&D make them more effective?
  • If an expanded R&D program was initiated, what are promising areas for R&D, areas less worthwhile, and what mix of topics would best balance an expanded R&D portfolio?
  • If an expanded R&D program was initiated, what cautions and recommendations should DOE‐NE consider as the program is planned and implemented?
Panel presentations to stimulate the discussion will address:
  • Existing state‐of‐the‐art tools and methodologies for proliferation and terrorism risk assessment.
  • The potential impact of improved tools and methodologies as well as factors that should be carefully considered in their use and any further development efforts.
  • Identification of the challenges, areas for improvement, and gaps associated with broader utilization and acceptance of proliferation and terrorism risk assessment tools and methodologies.
  • Identification of promising opportunities for R&D.
Broad discussion/input is essential; active participation of all session attendees will provide important perspectives on proliferation and terrorism risk assessment R&D and ultimately strengthen capabilities for supporting NE’s development of new reactor and fuel cycle technologies/concepts while minimizing proliferation and terrorism risks.

Why talk about this subject on RMM ? Well, I was struck by the type of questions being asked of technical people, as they looked like typical questions one would ask in the context of an RMM example.

In Robert Bari's presentation one can read:

"Conveying Results: In particular, what we know about what we do not know" sounded a little too much like the categories discussed in The Modeler's Known Unknowns and Unknown Knowns.

In Proliferation Resistance and Proliferation Risk Analysis: Thoughts on a Path Forward by William S. Charlton, one can read:

I wonder about the type of modeling that goes into estimating uncertainties. Finally, in Bill Burchill's slides, one can read on the proliferation pathways the following:

Saturday, December 4, 2010

Solution to the "Selling from Novosibirsk" business model riddle

Bernard Beauzamy (the owner of SCM SA) had set up a 500-euro prize for whoever could solve the business model riddle featured on Tuesday. Bernard tells me that nobody won the prize; here is the answer:
....The answers we received fall into two categories:

Those who want to send 10 000 left-hand gloves, and then 10 000 right-hand gloves, and declare them of value zero.

This answer does not make sense ! Do you think that the customs will be stupid enough not to observe that they see only left-hand gloves, and only later right-hand gloves ? They would confiscate both, and the sender would go to jail for attempting to cheat the customs. Many years of jail !

Those who want to create a branch of the company in the destination country, and claim that they would evade the customs this way.

This answer does not make sense either ! It only reduces the profits, since one has to pay all the people in the local structure. And it changes nothing about the fact that customs tax the final selling price. If you have 10 intermediaries, you have to pay a salary to all ten, and the customer pays the same price, so the producer gets less money. In all circumstances, customs or not, the fewer intermediaries you have, the better off you are.

The answer, as we expected, was not found by anyone, because all our readers, by education or taste, want to build mathematical models, and this is a situation where no model is possible. It defies imagination and logic, and contradicts all existing economic models.

First of all, the solution looks impossible. If we sell each pair at its maximum price, that is 200 Rubles, the customs takes 160, we keep 40, and this is exactly equal to the production cost, so we have no benefit at all. It is even worse if we sell at a lower price.

The solution is this : we have to impose fabrication defects to 5 000 pairs (both left and right gloves). After that, we export 5 000 pairs, of which the left one is normal and the right one is defective (for example a big scar across one finger). These gloves are declared as "fabrication rejects", for a very small price, for instance 20 Rubles a pair. Note that selling and exporting "fabrication rejects" is quite ordinary and legal, and is common practice.

Then, next month, we do the converse : we export 5 000 pairs, of which the left one is defective and the right one is normal. We put all gloves together, and we get 5 000 pairs of normal gloves, which we sell at the maximum price. The total cost is 400 000 Rubles (fabrication), plus 160 000 Rubles (customs). The sales bring 1 000 000 Rubles, so we have a benefit of 440 000 Rubles. We can of course sell the defective gloves, just to have some receipts for the customs.
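The arithmetic of both the naive strategy and the trick can be checked in a few lines; subtracting both the fabrication cost and the customs duty from the 1 000 000-ruble sales leaves 440 000 rubles. The 20-ruble declared price for the "fabrication rejects" is the example value given above:

```python
# Sketch of the riddle's arithmetic, using the numbers from the post.
pairs = 10_000
production_cost = 40 * pairs            # 400,000 rubles

# Naive strategy: sell at the 200-ruble ceiling, customs takes 4/5.
naive_revenue = 200 * pairs
naive_customs = naive_revenue * 4 // 5
naive_profit = naive_revenue - naive_customs - production_cost
print(naive_profit)                     # 0: exactly the production cost, no benefit

# Mismatched-defect strategy: export all 10,000 pairs as "rejects" declared
# at 20 rubles/pair over two shipments, reassemble 5,000 good pairs,
# sell those at 200 rubles.
declared = 20 * pairs                   # declared value over both shipments
customs = declared * 4 // 5             # 160,000 rubles
sales = 200 * 5_000                     # 1,000,000 rubles
profit = sales - customs - production_cost
print(profit)                           # 440,000 rubles, before any receipts from the rejects
```

Any revenue from selling the 5 000 defective pairs would come on top of this.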

The ideal solution, though it would be a remarkable industrial achievement, is to program the fabrication machine so that it puts defects on one pair out of two.

We observe that the solution is perfectly legal. Fabrication defects exist and are sold worldwide, for a low price. Each pair is declared at its correct value.

We said earlier that mathematical modeling is impossible. In fact, this example shows that all Nobel prizes given to economists since 1969 (year of creation of the prize) should be withdrawn, because they have no value at all.

Precisely, we see here that the notion of price is not mathematically well-defined. We cannot talk about the price of a glove, not even of a pair of gloves. We see that the price is not a continuous function, nor an increasing function, nor an additive function. The price of two objects together is not the sum of their individual prices. Still, the economists will build nice models, defective in all parts, and no reassembly can bring them any value !

Remember this : this is a true story, and the man who invented the solution did not know what a mathematical model was and did not have any degree at all…

[P.S: added Dec 4 (3:40PM CST): It's a riddle. One could make the point that this type of business model is not robust. Countries all over the world revise their laws in order to plug holes such as the one presented here. The ever-growing sophistication and complexity of the tax systems in most countries reflects this adaptive behavior. If this scheme were robust, it would be common business practice by now. It may have worked in some countries in the past, however. The most important takeaway from this riddle is that the definition of the price of an item is indeed a difficult problem, for which modeling is going to be tricky for a long time]

Friday, December 3, 2010

The RMM LinkedIn Group

As some of you may know, the Robust Mathematical Modeling blog has its own group on LinkedIn. Rodrigo Carvalho is our latest member, which brings our count to 21.

The 500 euros prize for the solution to the business model riddle ends today in about 7 hours. Good luck!

Tuesday, November 30, 2010

A Prize for Modeling a Business with Constraints: Selling from Novosibirsk

Sometimes, the model that is given to you has to be tweaked radically in order to explain the situation at hand. With this thought in mind, Bernard Beauzamy (the owner of SCM SA) has set up a 500-euro prize for whoever can find a way to make the following business model work (see below). Solutions should be sent before Friday, December 3rd, 2010, 5 pm (Paris local time, that's GMT+1) to
Selling from Novosibirsk
A factory in Novosibirsk produces gloves. Each pair costs 40 rubles to produce, including everything : raw material, salary of workers, machines, transportation, and so on.
They produce 10,000 pairs and they want to sell them in a country where customs duties are 4/5 of the selling price. They cannot sell at a price higher than 200 rubles per pair, because of local competition and local buying power (at a higher price, nobody would buy). How do they manage to make a profit, and how much do they gain ?
One cannot cheat with the customs and corruption is forbidden. The price declared for the sale must be the true price. The exchange rate between currencies is assumed to be fixed and is not to be taken into account : everything is stated in the seller's currency (here the ruble). The solution should work repeatedly, any number of times, without violating any law, between any countries with normal customs.
Prize offered : 500 Euros, for the best answer received before Friday, December 3rd, 2010, 5 pm (Paris local time). Send answers to Answers may be written in English, French, Russian.

[Check the solution here]

Saturday, November 27, 2010

Optimal according to what ?

Redefining optimal is a blog entry by some of the folks at the Department of Systems Biology at Harvard Medical School. It is very nicely written and includes some nice comments. The entry specifically points to a paper by Fernández Slezak D, Suárez C, Cecchi GA, Marshall G, & Stolovitzky G (2010) entitled When the optimal is not the best: parameter estimation in complex biological models (PloS one, 5 (10) PMID: 21049094). The abstract and conclusions read:


BACKGROUND: The vast computational resources that became available during the past decade enabled the development and simulation of increasingly complex mathematical models of cancer growth. These models typically involve many free parameters whose determination is a substantial obstacle to model development. Direct measurement of biochemical parameters in vivo is often difficult and sometimes impracticable, while fitting them under data-poor conditions may result in biologically implausible values.

RESULTS: We discuss different methodological approaches to estimate parameters in complex biological models. We make use of the high computational power of the Blue Gene technology to perform an extensive study of the parameter space in a model of avascular tumor growth. We explicitly show that the landscape of the cost function used to optimize the model to the data has a very rugged surface in parameter space. This cost function has many local minima with unrealistic solutions, including the global minimum corresponding to the best fit.

CONCLUSIONS: The case studied in this paper shows one example in which model parameters that optimally fit the data are not necessarily the best ones from a biological point of view. To avoid force-fitting a model to a dataset, we propose that the best model parameters should be found by choosing, among suboptimal parameters, those that match criteria other than the ones used to fit the model. We also conclude that the model, data and optimization approach form a new complex system and point to the need of a theory that addresses this problem more generally.

Evidently, the post would have some relevance to compressive sensing if the model were to be linear, which it is not in this case.

Friday, November 19, 2010

The Modeler's Known Unknowns and Unknown Knowns

In Satyajit Das's Blog entitled "Fear & Loathing in Financial Products" one can read the following entry entitled WMD or what are derivatives 
....During the Iraqi conflict, Donald Rumsfeld, the US Defense Secretary, inadvertently stated a framework for understanding the modern world (12 February 2002 Department of Defense News Briefing). The framework perfectly fits the derivatives business. There were “known knowns” – these were things that you knew you knew. There were “known unknowns” – these were things that you knew you did not know. Then, there were “unknown knowns” – things that you did not know you knew. Finally, there were “unknown unknowns” – things that you did not know you did not know...
Then Satyajit goes on to clarify this term a little further:
....In most businesses, the nature of the product is a known known. We do not spend a lot of time debating the use of or our need for a pair of shoes. We also understand our choices – lace up or slip-on, black or brown. I speak, of course, of men’s shoes here. Women’s shoes, well, they are closer to derivatives. Derivatives are more complex. You may not know that you need the product until you saw it – an unknown known. You probably haven’t got the faintest idea of what a double knockout currency option with rebate is or does – a known unknown. What should you pay for this particular item? Definitely, unknown unknown. Derivatives are similar to a Manolo Blahnik or Jimmy Choo pair of women’s shoes....
There is also a word for the last one: Unk-Unk 
n. especially in engineering, something, such as a problem, that has not been and could not have been imagined or anticipated; an unknown unknown.
but I am not sure it fits with the previous definition. Looking up wikipedia, we have:
In epistemology and decision theory, the term unknown unknown refers to circumstances or outcomes that were not conceived of by an observer at a given point in time. The meaning of the term becomes more clear when it is contrasted with the known unknown, which refers to circumstances or outcomes that are known to be possible, but it is unknown whether or not they will be realized. The term is used in project planning and decision analysis to explain that any model of the future can only be informed by information that is currently available to the observer and, as such, faces substantial limitations and unknown risk.

How are these notions applicable to Robust Mathematical Modeling ?  John Cook reminded me recently of the Titanic effect presented initially in Jerry Weinberg's  Secrets of Consulting: A Guide to Giving and Getting Advice Successfully 
The thought that disaster is impossible often leads to an unthinkable disaster.
When modeling a complex reality, we always strive to make the problem simple first, and then expect to build a more complex and realistic idealization of that problem. But in the end, even the most complex model is still an idealization of sorts. So every time I read about a disaster grounded in some engineering mistake, I always wonder which part was the known unknown, which the unknown known, and which the unknown unknown.

One of those moments that left me thinking happened back in February 2003, during the last flight of the Space Shuttle Columbia. As you know, the spacecraft disintegrated over Texas. It did so because a piece of foam had hit one of its wings fifteen days earlier, at launch. That explanation relied on footage of the launch showing a piece of foam slowly falling from the external tank onto the edge of the Orbiter's wing. Fifteen days later, when the Orbiter came back to land in Florida, it had a large hole that let air at Mach 17 enter the inside of the wing, damaging it and thereby destroying the spacecraft. The remains of our experiment showed the temperature had reached well over 600°C for a long period of time.

I was in the room next to the MCC during STS-107, as we were flying an instrument on top of the Orbiter. We watched the take-off, and we listened to all the conversations on the com loop between MCC and the astronauts during the fifteen days it flew (we asked some of the astronauts to manipulate our instrument at one point). At no time was there a hint of a possible mishap. We learned afterwards that even engineers in the room next door who had doubts had requested imagery from spy sats (but management canceled that request). What was most revealing after the tragedy, I specifically recall, was that nobody around me could conceive that the foam could have done that much damage. None of us thought the speed differential between the foam and the spacecraft at launch could be that large. According to a simple computation based on the video of the launch, a half-pound piece of foam hit the leading edge of the Orbiter's wing at an estimated 924 fps, or about 1,015 km/h. It's always the square of the velocity that kills you.
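As a sanity check on that last remark, here is the kinetic energy implied by the figures quoted above (unit conversions only; the half-pound mass and 924 fps speed are the estimates given in the post):

```python
# Back-of-the-envelope check of "it's the square of the velocity that
# kills you": kinetic energy of the foam strike.
lb_to_kg = 0.4536
fps_to_ms = 0.3048

mass = 0.5 * lb_to_kg          # ~0.23 kg of foam
speed = 924 * fps_to_ms        # ~282 m/s

energy_j = 0.5 * mass * speed**2
print(round(energy_j))          # ~9,000 J delivered to the leading edge
```

Halve the speed and the energy drops by a factor of four, which is why the small, slow debris of earlier flights gave so little warning.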

More photos of the test performed in San Antonio on July 7, 2003 to recreate what happened can be found here.

As one can see from the small movie above, all previous tests had been performed with small pieces of foam. Further, attention had always been focused on the tiles underneath the Orbiter, never on the leading edge of the wings. The video of the test performed in San Antonio several months later had everyone gasping in horror when the piece of foam opened a large hole in the leading edge of the wing. The most aggravating part of the story is that Columbia flew for fifteen days with very little attention paid to this issue, while the Shuttle program had already seen take-offs with several near-dangerous hits in the past.

On an unrelated issue, Ed Tufte also pointed out very clearly how the communication style using PowerPoint was a point of failure in the decision-making process of determining whether the foam strike was a problem or not.

Of note, the communication to management did not clearly delineate that there was absolutely no experience with such a large chunk of foam (experiments had been performed with 3-inch-cube "bullets" vs. an actual impactor with a volume of more than 1,920 cubic inches).

What were the known unknowns, the unknown knowns and the unknown unknowns in this instance ? First, let me reframe all the categories of knowns/unknowns for a modeler of reality or the engineers. With the help of Wikipedia, let me summarize them as follows. To a modeler:
  • the known known refers to circumstances or outcomes that are known to be possible, and it is known that they will be realized with some probability (based on adequate experiments/data)
  • the known unknown refers to circumstances or outcomes that are known to be possible, but it is unknown whether or not they will be realized (no data or very low probability/extreme events).
  • the unknown unknown refers to circumstances or outcomes that were not conceived of by a modeler at a given point in time.
  • the unknown known refers to circumstances or outcomes a modeler intentionally refuses to acknowledge that he/she knows
Looking back at the Columbia mishap, how can we categorize the different mistakes that were made ?
  • the foam hitting the RCC was an unknown known to the modelers. People had done their homework and knew that:
    • falling pieces were hitting the orbiter at every launch
    • some launches had bad tile destruction because of falling pieces
    • The engineers went through a series of tests whose results they eventually put in a database. Most past foam hits fell in the known knowns, as the pieces of foam clearly fit the dimensions used in the database. They knew foam could fall on the RCC instead of the tiles, yet did not perform tests, or did not feel tests were necessary. At issue is really the assumption that the RCC was tougher than the tiles. It is actually more brittle, but then a lot of things are brittle when hit with a large chunk of something at 1,000 km/h.
  •  The speed of the foam was also a known known to the modelers. It could be computed right after the launch and was within the range listed in the database mentioned above.
  • A known unknown to the modeler was the impact effect of a very large piece of foam on the leading edge of the wing. This is reflected in the fragment sizes used in the database. There was simply no data.
  • An unknown unknown to the manager and the rest of the world was the eventual demise of the whole spacecraft fifteen days later due to this impact. To the modeler, I believe the demise of the spacecraft was rather an unknown known.
Unknown unknowns are clearly outside of most modeling for a multitude of reasons. Robust mathematical modeling ought to provide some warning about known unknowns and most importantly provide a framework for not allowing unknown knowns to go unnoticed by either the engineers or their management.

Wednesday, November 10, 2010

RMM Example #2: Spacecraft Thermal Management

When designing spacecraft, one of the major issues, aside from designing the primary instruments, is devising the thermal management (i.e., managing how the power produced by the spacecraft is removed so that it does not overheat). The thermal management of spacecraft requires solving several distinct modeling issues. Because spacecraft generally live in low Earth orbit or geostationary orbit, the only way to remove power generated on board is through radiation out of the radiators. The radiator is the lowest-temperature point of the spacecraft: if the spacecraft is well conditioned, all other parts will run at higher temperatures no matter what.

The main issue of thermal modeling for spacecraft design is really making sure that all the other points of the spacecraft stay within the temperature bounds they are designed for: the thermal rating of a DC/DC converter is widely different from that of a simple CMOS chip or the lens of a camera. Hence computing the radiator temperature is of paramount importance, and it can be done very quickly with a one-node analysis. Yes, you read this right: at the beginning, there is no need for Finite Element computations in spacecraft thermal analysis, except maybe for very specific components under very specific conditions. The most important computation is this one spacecraft-node analysis. In terms of modeling, it doesn't get any simpler, and it is robust. The issues tend to crop up when one gets into the detailed power consumption and thermal energy flow within the spacecraft, as more detailed constraints are added. To summarize the issues, let me follow the list of issues that makes up the definition of problems needing Robust Mathematical Modeling as a guideline:
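A minimal sketch of that one-node computation: all dissipated power leaves by radiation to deep space, so the radiator temperature follows from a single balance. The 1 kW dissipation, 2 m² area and 0.85 emissivity below are invented for illustration, not taken from any real design:

```python
# One-node spacecraft thermal analysis: Q = eps * sigma * A * T^4,
# solved for the radiator equilibrium temperature T.
SIGMA = 5.670e-8        # Stefan-Boltzmann constant, W/m^2/K^4

def radiator_temperature(power_w, area_m2, emissivity):
    """Equilibrium temperature of a radiator rejecting power_w to deep space."""
    return (power_w / (emissivity * SIGMA * area_m2)) ** 0.25

t = radiator_temperature(power_w=1000.0, area_m2=2.0, emissivity=0.85)
print(round(t))          # ~319 K: the coldest point; everything else runs hotter
```

One line of algebra, and it already bounds the whole design from below, which is what makes the one-node model so robust.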

1. The laws describing the phenomena are not completely known ;

In fact, in this case the laws are known but there are large uncertainties at many different levels:
  • each element of the spacecraft has a thermal conductance, but since one is dealing with heterogeneous elements like a CMOS chip or a slab of aluminum, the designer is constrained to a lumped analysis involving delicate weighting.
  • the thermal contact resistances / conductances of the electronics are generally unknown in terms of performance, especially in vacuum. Most information on electronics is given for when convection is available (for ground use). Even when the environment is known, electronic components are very hard to evaluate. See this very interesting thread on LinkedIn.
  • the thermal contact conductance of two pieces of metal connected to each other through nuts and bolts is by no means a trivial subject. The contact conductance certainly depends on how much torque was put on the washers/nuts/bolts and on the level of vacuum.
  • the space environment produces different heating and cooling conditions that are inherently different based on the positioning of the spacecraft, its orbit, etc...
  • in order to regulate temperature efficiently, cloths and paints cover the spacecraft for the duration of its life. There are uncertainties with regard to how these decay over time, and most computations include Beginning Of Life (BOL) and End Of Life (EOL) estimates.
  • A further element of confusion is a mathematical one. Since most of the thermal power is managed through conduction, radiation transport (a nonlinear term in T^4) is generally modeled as a linear node. When the temperature gets too high, the conductance of that node is varied with temperature to follow the nonlinear T^4 term.
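That last linearization can be sketched in a few lines: the radiative exchange eps*sigma*A*(T1^4 - T2^4) is replaced by a linear "conductance" that is re-evaluated as temperatures change. The values below are illustrative:

```python
# Radiation written as a linear node: Q = G * (T1 - T2), with G built from
# the exact factorization T1^4 - T2^4 = (T1^2 + T2^2) * (T1 + T2) * (T1 - T2),
# so G must be updated whenever the temperatures move.
SIGMA = 5.670e-8        # Stefan-Boltzmann constant, W/m^2/K^4

def radiative_conductance(eps_a, t1, t2):
    # eps_a = emissivity * area product; G in W/K at the current temperatures
    return eps_a * SIGMA * (t1**2 + t2**2) * (t1 + t2)

t1, t2 = 320.0, 280.0
g = radiative_conductance(eps_a=1.5, t1=t1, t2=t2)
q_linear = g * (t1 - t2)
q_exact = 1.5 * SIGMA * (t1**4 - t2**4)
print(abs(q_linear - q_exact) < 1e-6)   # True: exact at the evaluation point
```

The linear form is only exact at the temperatures it was evaluated for, which is why solvers iterate: solve, update G, solve again.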

2. The data are missing or corrupted ;

Spacecraft are generally designed with a clear emphasis on reducing weight at the subsystem or bus level. A GEO satellite maker would rather put one more revenue-generating transponder on its spacecraft than add instrumentation to provide data to the ground. Experimental data are rare in spacecraft design because real conditions are rarely fully instrumented. Tests are performed at every iteration of the spacecraft design, but they are not total reproductions of the actual thermal environment the future spacecraft will sustain. For instance, Sun lamps only produce a subset of the wavelengths given by the Sun, so it is difficult to find the thermo-optical properties of some paints or the efficiency of some solar cells. And while vacuum tests get rid of the convection issue, they can do little to evaluate the performance of systems that rely on convection inside themselves, such as loop heat pipes.

3. The objectives are multiple and contradictory.

Four words: Hot Case, Cold Case. The worst thermal environments are generally sought in order to provide acceptable bounds during the lifetime of the spacecraft. It is no small secret that these two objectives are contradictory. One of the two cases, generally the cold case, also generates additional mass to remedy it. Adding mass for a subsystem is not considered optimal, as every kilogram in Low Earth Orbit costs about $10,000; the number is obviously higher for Geostationary Orbit. Another surprising element is that sometimes the colder case is not an obvious one, so the solver really has to go through many different iterations to define what that case is.
The objective of having the lightest spacecraft possible also flies in the face of thermal "equilibrium": the less thermal mass a spacecraft has, the less capable it is of handling environmental swings. Cubesats, for instance, fall in this extreme category of spacecraft, for which thermal fluctuations can be very large (and can bring about a possible thermal "event" such as an electronics board cracking, etc.).

As the design progresses, the modeling is iterated upon through testing (from components to whole spacecraft), but as Isidoro Martinez points out, this really is just the beginning of a long process of fitting models to the few experimental data gathered through the lengthy Thermal Balance and Thermal Vacuum tests (TB/TV tests). From here:

Spacecraft thermal testing

Measurement is the ultimate validation of real behaviour of a physical system. But tests are expensive, not only on the financial budget but on time demanded and other precious resources such as qualified personnel. As a trade-off, mathematical models are developed to provide multi-parametric behaviour, with the hope that, if a few predictions are checked against physical tests, the model is validated as reliable to predict the many other situations not actually tested.

Final testing of a large spacecraft for acceptance is at the present limit of technology, since very large vacuum chambers, with a powerful collimated solar-like beam, and walls kept at cryogenic temperatures, must be provided (and the spacecraft able to be deployed, and rotated in all directions, while measuring). 

Typical temperature discrepancy between the most advanced numerical simulation and the most expensive experimental tests may be some 2 K for most delicate components in integrated spacecraft (much lower when components are separately tested).

Picture of the large vacuum chamber at NASA JSC where the Apollo LEM was tested.

An item of considerable interest to the thermal designer is to substantially reduce the time it takes to fit the few experimental results to the models. This part is by no means trivial, given the nonlinearities of radiative heat transfer. The time required to fit models to the experiments is critical, as tests are expensive and lengthy. From here is the timeline for the TB/TV test of the Rosetta spacecraft, which took thirty days:
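A toy version of that model-correlation step can be sketched as follows: fit the two free parameters of a two-node model (a conduction G between equipment and radiator, and the radiating emissivity-area product eps_a) to a handful of "test" temperature points, as one would against TB/TV data. All numbers are invented, and real thermal model correlation tools are far more sophisticated than this brute-force search:

```python
# Fit (G, eps_a) of a two-node steady-state model to synthetic test data
# by least squares over a coarse parameter grid.
SIGMA = 5.670e-8        # Stefan-Boltzmann constant, W/m^2/K^4

def predict(power_w, g, eps_a):
    t_rad = (power_w / (eps_a * SIGMA)) ** 0.25   # radiator radiative balance
    t_eqp = t_rad + power_w / g                   # conduction drop to equipment
    return t_eqp, t_rad

# synthetic "measurements" generated with G = 25 W/K, eps_a = 1.7 m^2
measurements = [(p, *predict(p, 25.0, 1.7)) for p in (400.0, 700.0, 1000.0)]

# brute-force least squares over a coarse grid of candidate parameters
best = min(
    ((g, e) for g in [20 + 0.5 * i for i in range(21)]
            for e in [1.5 + 0.02 * j for j in range(21)]),
    key=lambda ge: sum(
        (predict(p, *ge)[0] - t1) ** 2 + (predict(p, *ge)[1] - t2) ** 2
        for p, t1, t2 in measurements),
)
print(best[0], round(best[1], 2))   # recovers 25.0 and 1.7 on this noise-free toy
```

With noisy data and hundreds of nodes the cost surface becomes rugged, which is exactly why the fitting step eats so much schedule.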

Additional information can be found in:

Wednesday, November 3, 2010

RMM Example #1: Ground Motion Models for Probabilistic Seismic Hazard Analysis

Following up on my bleg, Nicolas Kuehn kindly responded to my query with the following e-mail:

Dear Igor,

I am writing to you because I saw your blogpost with a call for help/bleg. I am not sure if this is what you had in mind, but I might have something to contribute.

I am a seismologist working on ground motion models for probabilistic seismic hazard analysis (PSHA). In PSHA, we try to estimate the exceedance rate of a destructive ground motion level for a particular site (e.g. a nuclear power plant). It basically comes down to the following:
  1. What is the probability that an earthquake with magnitude M occurs at a distance R from the site in a particular time period?
  2. Given M and R, what is the probability that a certain ground motion level will be exceeded?
  3. Integrate over all magnitude and distance combinations.

I am working on part 2. Here, we want to estimate the conditional probability of the ground motion parameter Y given magnitude and distance. Y can be something like peak ground acceleration. Usually, a lognormal distribution is assumed, and we get something like this:

log Y = f(M,R)+epsilon

The parameters of f are estimated from large strong motion datasets, which consist of recordings of ground motion from large earthquakes at seismic stations.
Now comes the part you are probably interested in. Estimating f is not easy. There is physical knowledge about the relationships between Y, M, and R, but there are still unexplained effects. For example, physical models assume that an earthquake is a point source, but large earthquakes can have fault dimensions of tens or hundreds of kilometers. This has to be taken into account. There are also other variables (site effects, rupture directivity effects, and others) which influence ground motion, and exact physical relations are not known for all of them.
Another problem is missing data. Site effects are usually quantified by the shear wave velocity profile under the station, but this is not always available. Similarly, there is missing data for other variables as well.

There is also the problem that the data are not independent. One normally has many recordings of one earthquake at different stations, which are not independent. Similarly, one station can record several earthquakes.

As I said, I am not sure if this is what you had in mind as problems, but if you are interested, I can provide you with more details....
For illustration purposes, here is a graph showing the seismicity of Germany (a fact I was largely unaware of).

Back to Nicolas' problem, which is not centered on just Germany. I then responded with the following:

Thanks for the e-mail. I like your problem. ... I am curious about:
  • why the lognormality of Y was assumed;
  • whether there is a predetermined shape for f;
  • how this shape was found;
  • how you account for missing data.

Any other detail is also interesting as well.
Nicolas then responded with:

Hi Igor,

I'm glad that you like the problem. I am also interested to hear about problems from other fields, and how these fields cope. ... Below, I have provided some more details.

- Lognormality: There are two reasons why this is assumed. One is data. I have attached a normal probability plot of ground-motion data from a strong motion dataset (about 3000 datapoints). The plot is taken from some lecture, and you can see that the points follow a straight line, but there are some deviations in the tails.
The second one is more theoretical: the Fourier spectrum of seismic ground motion can be written as F(f) = E(f)xP(f)xS(f), where f is frequency, E(f) is the source component, P(f) is the path part, and S(f) is the site part (in the time domain, the signal is a convolution of these parts). This is a multiplication of random variables, which leads to a lognormal distribution.
There is one problem with the assumption of lognormality which is widely recognized, but no satisfying solution has been proposed: It assigns a nonzero probability for very high, physically impossible ground motions (the tails). This becomes especially important for very low exceedance rates, where these tails become important. Critical facilities such as nuclear power plants or waste deposits must be designed to withstand ground motions with very low exceedance rates.
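Nicolas' theoretical argument is easy to check numerically: the logarithm of a product of independent positive factors is a sum of independent terms, so by the central limit theorem the product is approximately lognormal. A quick sketch (the factor distribution here is arbitrary):

```python
import math
import random
import statistics

random.seed(0)

def product_of_factors(n_factors=30):
    """Multiply independent positive random factors, mimicking a spectrum
    built as a product of source, path, and site sub-components."""
    prod = 1.0
    for _ in range(n_factors):
        prod *= random.uniform(0.5, 2.0)
    return prod

# If the product is lognormal, its logarithm is normal: check the
# one-standard-deviation coverage, which is about 0.68 for a normal law.
logs = [math.log(product_of_factors()) for _ in range(20000)]
mean = statistics.fmean(logs)
sd = statistics.pstdev(logs)
inside_1sd = sum(1 for v in logs if abs(v - mean) <= sd) / len(logs)
print(round(inside_1sd, 2))  # close to 0.68
```

Note that the approximation is worst in the tails, which is exactly where the physically-impossible-ground-motion problem Nicolas describes bites.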

* shape for f: This is also based on the Fourier spectrum. The source part is proportional to e^M, the path part is proportional to 1/R e^-R, so the most simple model becomes:
f(M,R) = a +bM-cLog(R)-dR
This is based on a model that treats an earthquake as a point. For extended faults, where each part of the fault can emit seismic waves, there is interference and so on. Two observations can be made:
1. Magnitude saturation: the larger the magnitude, the smaller the difference in ground motion. This is usually modeled either by an M^2 term in f or by a piecewise linear function of the magnitude.
2. There is an interaction term between M and R (the larger the magnitude, the slower the decrease of Y with distance). This is modeled either as (c+c1M)Log(R) or as cLog(R+de^(gM)).

Site effects are usually modeled as discrete variables (the near-surface underground is classified as ROCK, STIFF SOIL, or SOFT SOIL), each with an individual coefficient. People classify the site conditions in different ways, though. In newer studies, one also finds the use of Vs30, the average shear wave velocity in the upper 30 m, as a predictor variable.

Then there is the style-of-faulting, which measures how the fault ruptures (horizontally or vertically). It is a discrete, three-valued variable.

This leads to this form, which forms the basis of most ground-motion models:
f = a_1+a_2M+(a_3+a_4M) Log (R)+a_5R+a_6 SS+a_7 SA+a_8 FN+a_9 FR,
where SS is 1 if the site class is STIFF SOIL and 0 otherwise, SA is 1 if the site class is SOFT SOIL, FN is 1 if the style-of-faulting is normal, and FR is 1 if the style-of-faulting is reverse.
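To make the functional form above concrete, here is a sketch that simulates data from it and recovers the coefficients by ordinary least squares. All numerical values are invented for illustration; real models are fit to strong-motion datasets, typically with mixed-effects terms to handle the dependence between records that Nicolas mentions.

```python
import numpy as np

rng = np.random.default_rng(42)

def design_row(m, log_r, r, ss, sa, fn, fr):
    """Regressors of f = a1 + a2*M + (a3 + a4*M)*Log(R) + a5*R
                        + a6*SS + a7*SA + a8*FN + a9*FR."""
    return [1.0, m, log_r, m * log_r, r, ss, sa, fn, fr]

# "True" coefficients a1..a9, invented for the simulation.
a_true = np.array([-1.0, 1.1, -1.6, 0.12, -0.004, 0.25, 0.4, -0.05, 0.08])
sigma = 0.5  # aleatory standard deviation of log Y

n = 2000
M = rng.uniform(4.5, 7.5, n)
R = rng.uniform(5.0, 200.0, n)
site = rng.integers(0, 3, n)   # 0 = ROCK, 1 = STIFF SOIL, 2 = SOFT SOIL
fault = rng.integers(0, 3, n)  # 0 = strike-slip, 1 = normal, 2 = reverse

X = np.array([design_row(m, np.log(r), r,
                         1.0 * (s == 1), 1.0 * (s == 2),
                         1.0 * (fl == 1), 1.0 * (fl == 2))
              for m, r, s, fl in zip(M, R, site, fault)])
log_y = X @ a_true + rng.normal(0.0, sigma, n)

# Ordinary least squares recovers the coefficients from the synthetic data.
a_hat, *_ = np.linalg.lstsq(X, log_y, rcond=None)
print(np.round(a_hat, 2))
```

With clean synthetic data this is routine; the difficulties Nicolas describes (missing covariates, correlated records, model-form uncertainty) are exactly what breaks this textbook setup.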

Newer models take into account more variables and effects (fault dip, nonlinear site amplification, whether the earthquake ruptures the surface or not).

* missing data: This is treated in different ways.
Sometimes it is possible to estimate one variable from a related one. E.g., there exist different magnitude measures (moment magnitude, local magnitude, body wave magnitude), and there exist conversion rules between them. There are also different distance measures, which can be converted.
If a station has no shear wave velocity profile, one can look at the geology. All of this introduces uncertainty, though.

What is also sometimes done is to first determine the coefficients of magnitude and distance (for which information is usually complete), and later determine the remaining coefficients using the data that are available.

I have tried to determine the coefficients of a model using Bayesian inference via OpenBUGS, where I treated the missing data as parameters for which a posterior distribution was determined.
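Nicolas used OpenBUGS for this; as a language-agnostic sketch of the same idea, here is a minimal random-walk Metropolis sampler on a toy regression in which the missing covariate values are treated as extra parameters and sampled jointly with the coefficients. Everything here (model, priors, data) is a synthetic stand-in for the ground-motion setting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression y = b0 + b1*x + eps; pretend some x values were never observed.
n = 60
x = rng.normal(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)
missing = np.zeros(n, dtype=bool)
missing[:10] = True  # the first 10 covariate values are treated as unknown

def log_post(b0, b1, x_miss):
    """Log posterior: flat priors on b0, b1; standard normal prior on the
    missing x values (matching how x was generated); known noise sd 0.5."""
    x_full = x.copy()
    x_full[missing] = x_miss
    resid = y - b0 - b1 * x_full
    return -np.sum(resid**2) / (2 * 0.5**2) - np.sum(x_miss**2) / 2

# Random-walk Metropolis over (b0, b1, x_miss) jointly: the missing data
# become parameters with their own posterior, as in the BUGS approach.
theta = np.zeros(2 + int(missing.sum()))
lp = log_post(theta[0], theta[1], theta[2:])
draws = []
for _ in range(20000):
    proposal = theta + rng.normal(0.0, 0.05, theta.size)
    lp_prop = log_post(proposal[0], proposal[1], proposal[2:])
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = proposal, lp_prop
    draws.append(theta[:2].copy())

b0_hat, b1_hat = np.mean(draws[10000:], axis=0)  # posterior means after burn-in
print(round(float(b0_hat), 1), round(float(b1_hat), 1))
```

The appeal of the approach is that the uncertainty about the missing values propagates automatically into the posterior of the coefficients, instead of being hidden by a one-shot imputation.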


I am leaving the problem as is for the time being. In many respects, the issues mentioned in Nicolas' problem are reminiscent of a whole slew of problems found in many different science and engineering fields and subfields. One more thing: Nicolas also has a presentation and a poster describing part of his implementation in A Bayesian Ground Motion Model for Estimating the Covariance Structure of Ground Motion Intensity Parameters and A Hierarchical Global Ground Motion Model to Take Into Account Regional Differences.
Here is a video of the presentation:

This is a very nice example where methodologies for Robust Mathematical Modeling are indeed needed; we can see here that:
  • The data are missing or corrupted;
  • The laws describing the phenomena are not completely known.

Thanks Nicolas !

Friday, October 29, 2010

Robust Optimization and the Donoho-Tanner Phase Transition

Of interest to this blog on Robust modeling here is an excerpt from Nuit Blanche:

Similarly, Sergey points me to this arxiv preprint which made a passing reference to CS: Theory and Applications of Robust Optimization by Dimitris Bertsimas, David B. Brown, Constantine Caramanis. The abstract reads:
In this paper we survey the primary research, both theoretical and applied, in the area of Robust Optimization (RO). Our focus is on the computational attractiveness of RO approaches, as well as the modeling power and broad applicability of the methodology. In addition to surveying prominent theoretical results of RO, we also present some recent results linking RO to adaptable models for multi-stage decision-making problems. Finally, we highlight applications of RO across a wide spectrum of domains, including finance, statistics, learning, and various areas of engineering.
Reading the paper led me to this other paper I had mentioned back in April:
which makes a statement about Robust Linear Regression, which in our world translates into multiplicative noise. More Rosetta Stone moments... In the meantime, you might also be interested in the NIPS 2010 Workshop entitled Robust Statistical Learning (robustml):

Many of these approaches are based on the fact that data are used to learn or fit models. In effect, most of the literature is focused on linear modeling. Quite a few interesting results have come out of these areas including what I have called the Donoho-Tanner phase transition. I will come back to this subject in another blog entry.

Credit: NASA.

Thursday, October 28, 2010

Call for Help / Bleg: Seeking Technical Areas Where Modeling Is Difficult.

While the recent presentations at SCM were enlightening with regard to known problems that are hard to model, I wonder if any of the readers have specific knowledge of a subject area where modeling is difficult. Please contact me and we can probably run a Q&A on this blog. If you want to remain anonymous, because you feel uncertain about discussing the uncertainties of the modeling in your area, I can also anonymize the Q&A.

The number of readers of this blog is currently at about 80, but I expect it to grow as this issue of robust modeling keeps rearing its ugly head in many different fields of science and engineering. Let us recall the areas where robust mathematical modeling might be beneficial:

  • The data are missing or corrupted;
  • The laws describing the phenomena are not completely known;
  • The objectives are multiple and contradictory;
  • The computational chain has too many variables.

In the meantime, I'll feature some of the problems I have seen that never had an easy modeling answer.

P.S.: A bleg is a beg on a blog :-)

Credit photo: ESA, NASA

Tuesday, October 26, 2010

SCM Talks: Electricity Production Management and the MOX Computational Chain

In a different direction from certain communities that are wondering whether outreach to applied communities is a good thing, Bernard Beauzamy, a mathematician by trade and owner of SCM, hosted a small workshop last week on the limits of modeling ("Les limites de la modélisation" in French). The workshop featured a set of speakers who are specialists in their fields, presenting their domain expertise in light of how mathematical modeling helped or did not help answer their complex issues. We are not talking about just some optimization function with several goals, but rather a deeper questioning of how the modeling of reality and reality itself clash with each other. While the presentations were in French, many of the slides require little translation for English speakers. Here is the list of talks with a link to the presentations:

9 h – 10 h: Dr. Riadh Zorgati, EdF R&D: Energy management; modeling attempts: successes and failures.
11 h – 12 h: Dr. France Wallet, Health and Environmental Risk Assessment, EdF Health Service: Modeling in health and environment: pitfalls and limits.

14 h – 15 h: Mr. Giovanni Bruna, Deputy Director, Reactor Safety Division, Institut de Radioprotection et de Sûreté Nucléaire: Simulation versus experimentation: who is right? The MOX fuel experience.
16 h – 17 h: Mr. Xavier Roederer, Inspector, Mission Contrôle Audit Inspection, Agence Nationale de l'Habitat: Can one forecast without modeling?

I could attend only two of the talks: the first and the third. In the first talk, Riadh Zorgati talked about modeling as applied in the context of electricity production. He did a great job of presenting the different timescales and the attendant need for algorithm simplification when it comes to planning/scheduling electricity production in France. Every power plant and hydraulic resource owned by EDF (the main utility in France) has different operating procedures and capabilities with respect to how it can deliver power to the grid. Since electricity production and consumption must be in continuous equilibrium, one aspect of the scheduling involves computing the country's needs for the next day given various inputs from the day before. As it turns out, the modeling could be made very detailed, but that would lead to a prohibitive computational time for the next day's planning (more than a day's worth). The modeling is therefore simplified to a certain extent, by resorting to greedy algorithms if I recall correctly, to enable quicker predictions. The presentation has much more in it, but it was interesting to see that a set of good algorithms is clearly a money maker for the utility.
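As a toy illustration of the kind of greedy simplification involved (my own sketch, not EDF's actual algorithm, with a hypothetical fleet), a merit-order dispatch loads plants by increasing marginal cost until demand is met:

```python
def greedy_dispatch(demand_mw, plants):
    """Meet a demand level by loading plants in order of increasing
    marginal cost (merit order). `plants` is a list of tuples
    (name, capacity_mw, cost_per_mwh)."""
    schedule = {}
    remaining = demand_mw
    for name, capacity, _cost in sorted(plants, key=lambda p: p[2]):
        load = min(capacity, remaining)
        if load > 0:
            schedule[name] = load
            remaining -= load
    if remaining > 0:
        raise ValueError("demand exceeds total capacity")
    return schedule

# Hypothetical fleet: (name, capacity in MW, marginal cost per MWh).
fleet = [("hydro", 800, 5.0), ("nuclear", 2000, 12.0), ("gas", 1500, 60.0)]
print(greedy_dispatch(3000, fleet))
# hydro and nuclear run fully, gas supplies the remaining 200 MW
```

The attraction of such heuristics is speed: the schedule comes out in a single pass, whereas a detailed unit-commitment model (ramping constraints, reservoir management, outages) would require solving a large optimization problem, which is precisely the trade-off Riadh described.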

The third presentation was by Giovanni Bruna, who talked about the problem of extracting meaningful information out of a set of experiments and computations in the case of plutonium use in nuclear reactors. He spent the better part of the presentation going through a nuclear engineering 101 class that gave a good introduction to the subject of plutonium use in nuclear reactors. Plutonium is a by-product of the consumption of uranium in a nuclear reactor. In fact, after an 18-month cycle, more than 30 percent of the power of an original uranium rod is produced by the plutonium created in that time period. After some time in the core, the rod is retrieved so that it can be reprocessed, which raises the issue of how plutonium can be reused in a material called MOX (at least in France; in the U.S., a policy of no reprocessing is the law of the land). It turns out that plutonium differs from uranium because of its high epithermal cross section, which yields a harder spectrum than the one found with uranium. The conundrum faced by the safety folks resides in figuring out how the current measurements and the attendant extrapolation to power levels can be performed with confidence when replacing uranium with plutonium. The methods used with uranium have more than 40 years of history behind them; with plutonium, not so much. It turns out to be a difficult endeavor that can only be managed through constant back-and-forth between well-designed experiments, revisions of the calculation processes, and a heavy use of margins. This example is also fascinating because this type of exercise reveals all the assumptions built into the computational chain, from the cold subcritical assembly Monte Carlo runs all the way to the expected power levels found in actual nuclear reactor cores. It is a computational chain because the data from the experiment do not say anything directly about the actual variable of interest (here the power level).
As opposed to Riadh's talk, the focus here is to make sure that the mathematical modeling is robust to changes in assumptions on the physics of the system.

Thanks Bernard for hosting the workshop, it was enlightening.

Thursday, September 16, 2010

Learning Functions in High Dimensions

Ever since Bernard Beauzamy asked me the question of the sampling needed to determine a function, the question has been stuck in the back of my mind. Bernard has devised the Experimental Probabilistic Hypersurface (EPH), but some other, more theoretical developments have popped up in the past two years. Here is a list of the different papers and links to Nuit Blanche (a blog mostly focused on Compressed Sensing) that try to provide an answer to the subject. Without further ado, here is the list:

Learning Functions of Few Arbitrary Linear Parameters in High Dimensions by Massimo Fornasier, Karin Schnass, Jan Vybíral. The abstract reads:
Let us assume that $f$ is a continuous function defined on the unit ball of $\mathbb R^d$, of the form $f(x) = g (A x)$, where $A$ is a $k \times d$ matrix and $g$ is a function of $k$ variables for $k \ll d$. We are given a budget $m \in \mathbb N$ of possible point evaluations $f(x_i)$, $i=1,...,m$, of $f$, which we are allowed to query in order to construct a uniform approximating function. Under certain smoothness and variation assumptions on the function $g$, and an {\it arbitrary} choice of the matrix $A$, we present in this paper 1. a sampling choice of the points $\{x_i\}$ drawn at random for each function approximation; 2. an algorithm for computing the approximating function, whose complexity is at most polynomial in the dimension $d$ and in the number $m$ of points. Due to the arbitrariness of $A$, the choice of the sampling points will be according to suitable random distributions and our results hold with overwhelming probability. Our approach uses tools taken from the {\it compressed sensing} framework, recent Chernoff bounds for sums of positive-semidefinite matrices, and classical stability bounds for invariant subspaces of singular value decompositions.
From this entry

From this entry:

Following up on some elements shown in the third Paris Lecture of Ron DeVore, here is the more substantive paper: Approximation of functions of few variables in high dimensions by Ron DeVore, Guergana Petrova, Przemyslaw Wojtaszczyk. The abstract reads:

Let f be a continuous function defined on Ω := [0, 1]^N which depends on only l coordinate variables, f(x_1, . . . , x_N) = g(x_{i_1}, . . . , x_{i_l}). We assume that we are given m and are allowed to ask for the values of f at m points in Ω. If g is in Lip1 and the coordinates i_1, . . . , i_l are known to us, then by asking for the values of f at m = L^l uniformly spaced points, we could recover f to the accuracy |g|_{Lip1} L^{-1} in the norm of C(Ω). This paper studies whether we can obtain similar results when the coordinates i_1, . . . , i_l are not known to us. A prototypical result of this paper is that by asking for C(l) L^l (log_2 N) adaptively chosen point values of f, we can recover f in the uniform norm to accuracy |g|_{Lip1} L^{-1} when g ∈ Lip1. Similar results are proven for more general smoothness conditions on g. Results are also proven under the assumption that f can be approximated to some tolerance ε (which is not known) by functions of l variables.
I note that the authors make a connection to the junta problem, as discussed by Dick Lipton recently and mentioned here.
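As a toy illustration of the query model in these papers (my own sketch, not the authors' algorithm), a handful of point evaluations suffices to expose which of the N coordinates a function secretly depends on:

```python
import random

random.seed(1)

N = 50  # ambient dimension; f secretly depends on coordinates 3, 17, 41

def f(x):
    """A function of N variables that depends on only three of them."""
    return x[3] * x[17] + abs(x[41])

def find_active_coordinates(func, n_dims, n_probes=5, h=1e-4):
    """Query func at random points and at coordinate-wise perturbations;
    flag a coordinate as active if any probe changes the value."""
    found = set()
    for _ in range(n_probes):
        x = [random.uniform(0.0, 1.0) for _ in range(n_dims)]
        fx = func(x)
        for j in range(n_dims):
            x[j] += h
            if abs(func(x) - fx) > 1e-12:
                found.add(j)
            x[j] -= h
    return found

print(sorted(find_active_coordinates(f, N)))  # → [3, 17, 41]
```

This naive probe uses n_probes · (N + 1) queries; the results above show how adaptivity and smoothness assumptions bring the query count down and make the recovery quantitatively robust.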

From this entry:

Albert Cohen just released the course notes of Ron DeVore's 4th lecture in Paris, entitled Capturing Functions in Infinite Dimensions (the third lecture is here, and the first and second are here). The abstract reads:
The following are notes on stochastic and parametric PDEs of the short course in Paris. Lecture 4: Capturing Functions in Infinite Dimensions. Finally, we want to give an example where the problem is to recover a function of infinitely many variables. We will first show how such problems occur in the context of stochastic partial differential equations.
From here:

The course notes of the third lecture given by Ron DeVore in Paris have been released on Albert Cohen's page. They feature "Capturing Functions in High Dimensions", seem to aim at giving bounds for nonlinear compressed sensing, and should have an impact on manifold signal processing. Interesting. The first two lectures were mentioned here. The beginning of the lecture starts with:

1.1 Classifying High Dimensional Functions:
Our last two lectures will study the problem of approximating (or capturing through queries) a function f defined on Ω ⊂ R^N with N very large. The usual way of classifying functions is by smoothness. The more derivatives a function has, the nicer it is and the more efficiently it can be numerically approximated. However, as we move into high space dimension, this type of classification will suffer from the so-called curse of dimensionality, which we shall now quantify.
From here:
Albert Cohen just released Ron Devore's lecture notes he is presenting in Paris entitled: Foundations of Compressed Sensing.
From here:

* Ronald DeVore: Recovering sparsity in high dimensions

The abstract reads:
We assume that we are in $\mathbb{R}^N$ with $N$ large. The first problem we consider is that there is a function $f$ defined on $\Omega := [0,1]^N$ which is a function of just $k$ of the coordinate variables: $f(x_1,\dots,x_N) = g(x_{j_1},\dots,x_{j_k})$, where $j_1,\dots,j_k$ are not known to us. We want to approximate $f$ from some of its point values. We first assume that we are allowed to choose a collection of points in $\Omega$ and ask for the values of $f$ at these points. We are interested in what error we can achieve in terms of the number of points when we assume some smoothness of $g$ in the form of Lipschitz or higher smoothness conditions.
We shall consider two settings: adaptive and non-adaptive. In the adaptive setting, we are allowed to ask for a value of $f$ and then, on the basis of the answer we get, we can ask for another value. In the non-adaptive setting, we must prescribe the $m$ points in advance. A second problem we shall consider is when $f$ is not necessarily only a function of $k$ variables but can be approximated to some tolerance $\epsilon$ by such a function. We seek again sets of points where the knowledge of the values of $f$ at such points will allow us to approximate $f$ well. Our main consideration is to derive results which are not severely affected by the size of $N$, i.e. are not victims of the curse of dimensionality. We shall see that this is possible.