Redefining optimal is a blog entry by some of the folks at the Department of Systems Biology at Harvard Medical School. It is nicely written and has drawn some thoughtful comments. The entry specifically points to a paper by Fernández Slezak D, Suárez C, Cecchi GA, Marshall G, & Stolovitzky G (2010) entitled "When the optimal is not the best: parameter estimation in complex biological models" (PLoS ONE, 5(10), PMID: 21049094). The abstract and conclusions read:

Abstract

BACKGROUND: The vast computational resources that became available during the past decade enabled the development and simulation of increasingly complex mathematical models of cancer growth. These models typically involve many free parameters whose determination is a substantial obstacle to model development. Direct measurement of biochemical parameters in vivo is often difficult and sometimes impracticable, while fitting them under data-poor conditions may result in biologically implausible values.

RESULTS: We discuss different methodological approaches to estimate parameters in complex biological models. We make use of the high computational power of the Blue Gene technology to perform an extensive study of the parameter space in a model of avascular tumor growth. We explicitly show that the landscape of the cost function used to optimize the model to the data has a very rugged surface in parameter space. This cost function has many local minima with unrealistic solutions, including the global minimum corresponding to the best fit.

CONCLUSIONS: The case studied in this paper shows one example in which model parameters that optimally fit the data are not necessarily the best ones from a biological point of view. To avoid force-fitting a model to a dataset, we propose that the best model parameters should be found by choosing, among suboptimal parameters, those that match criteria other than the ones used to fit the model. We also conclude that the model, data and optimization approach form a new complex system and point to the need of a theory that addresses this problem more generally.

Evidently, the post would have some relevance to compressive sensing if the model were linear, which it is not in this case.

## 4 comments:

Deconvolution of the point spread function of a computed tomography algorithm similarly leads to a solution of underdetermined equations that is far from any optimization algorithm. See:

Dhawan, A.P., R. Gordon & R.M. Rangayyan (1984). Nevoscopy: three-dimensional computed tomography for nevi and melanomas in situ by transillumination. IEEE Trans. Med. Imaging MI-3(2), 54-61.

Dhawan, A.P., R.M. Rangayyan & R. Gordon (1984). Wiener filtering for deconvolution of geometric artifacts in limited-view image reconstruction. Proc. SPIE 515, 168-172.

Dhawan, A.P., R.M. Rangayyan & R. Gordon (1985). Image restoration by Wiener deconvolution in limited-view computed tomography. Applied Optics 24(23), 4013-4020.

Rangayyan, R.M., A.P. Dhawan & R. Gordon (1985). Algorithms for limited-view computed tomography: an annotated bibliography and a challenge. Applied Optics 24(23), 4000-4012.

Yours, -Dick Gordon gordonr@cc.umanitoba.ca

They are defining optimal as the least squares fit to the training data, adding that "conventional wisdom would indicate that the best parameter set is the one that minimizes the cost function, i.e. the best fit to the experimental data." Sounds like the concept of overfitting is not so widely known in computational biology.

I agree with Yaroslav. They use simple LSM without any regularizers or outlier rejection. No wonder they got nonsensical optima. They should have tried L1 IRLS at least (my favorite :)
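For readers who have not met the L1 IRLS mentioned above, here is a minimal sketch of the idea on a toy linear fit. It is not the tumor growth model from the paper, which is nonlinear. The function name `irls_l1`, the synthetic data, and the `eps` floor on the residuals are all assumptions made for illustration.

```python
import numpy as np

def irls_l1(A, b, n_iter=50, eps=1e-8):
    """Approximate argmin_x ||A x - b||_1 by iteratively reweighted least squares."""
    # Start from the ordinary least-squares solution.
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(n_iter):
        r = A @ x - b
        # Weights 1/|r_i| turn the squared loss into an absolute-value loss;
        # eps (an assumed floor) avoids division by zero.
        w = 1.0 / np.maximum(np.abs(r), eps)
        sw = np.sqrt(w)
        # Solve the weighted least-squares problem for the current weights.
        x = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)[0]
    return x

# Hypothetical data: a line y = 2 t + 1 with small noise and a few gross outliers.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
A = np.column_stack([t, np.ones_like(t)])
b = 2.0 * t + 1.0 + 0.01 * rng.standard_normal(50)
b[::10] += 5.0  # every tenth point is an outlier

x_ls = np.linalg.lstsq(A, b, rcond=None)[0]  # plain least squares
x_l1 = irls_l1(A, b)                         # L1 fit via IRLS

print("least squares:", x_ls)
print("L1 IRLS:      ", x_l1)
```

On data like this, the plain least-squares fit is pulled toward the outliers while the L1 fit stays close to the underlying line, which is the robustness the comment is pointing at.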

Yaroslav and Sergei,

If you read the comments (Yaroslav, I know you have), it looks like this community is slowly being made aware of the regularization issues, but I think one of the issues the first commenter rightly points to is the need for them to agree that parsimony is a good thing or a recognizable fact. Then, they can go on to customize and improve the whole arsenal of tools used so far in Machine Learning, statistics, CS. Compressive sensing, for instance, is a nice framework in this context not because of the potential parsimony but rather because it is robust. This community is really looking at a vexing problem: they spend an enormous amount of resources on the description of the circuitry, and so after all these efforts, modeling can only be a supplemental task to this endeavor. In other words, parsimony may not be of obvious interest to them considering the heavy investment in the description task.

Igor.
