Highly influential “dual-process” accounts of human cognition postulate the coexistence of a slow accurate system with a fast error-prone system. But why would there be just 2 systems rather than, say, one or 93?
Here, we argue that a dual-process architecture might reflect a rational tradeoff between the cognitive flexibility afforded by multiple systems and the time and effort required to choose between them. We investigate how the optimal set and number of cognitive systems depend on the structure of the environment.
We find that the optimal number of systems depends on the variability of the environment and the difficulty of deciding which system should be used when. Furthermore, we find that there is a plausible range of conditions under which it is optimal to be equipped with a fast system that performs no deliberation (“System 1”) and a slow system that achieves a higher expected accuracy through deliberation (“System 2”).
Our findings thereby suggest a rational reinterpretation of dual-process theories.
…We study this problem in 4 different domains where the dual systems framework has been applied to explain human decision-making: binary choice, planning, strategic interaction, and multi-alternative, multi-attribute risky choice. We investigate how the optimal cognitive architecture for each domain depends on the variability of the environment and the cost of choosing between multiple cognitive systems, which we call metareasoning cost.
We estimate the distribution of television advertising elasticities and the distribution of the advertising return on investment (ROI) for a large number of products in many categories…We construct a data set by merging market (DMA) level TV advertising data with retail sales and price data at the brand level…Our identification strategy is based on the institutions of the ad buying process.
Our results reveal substantially smaller advertising elasticities compared to the results documented in the literature, as well as a sizable percentage of statistically insignificant or negative estimates. The results are robust to functional form assumptions and are not driven by insufficient statistical power or measurement error.
The ROI analysis shows negative ROIs at the margin for more than 80% of brands, implying over-investment in advertising by most firms. Further, the overall ROI of the observed advertising schedule is only positive for one third of all brands.
[Keywords: advertising, return on investment, empirical generalizations, agency issues, consumer packaged goods, media markets]
…We find that the mean and median of the distribution of estimated long-run own-advertising elasticities are 0.023 and 0.014, respectively, and two-thirds of the elasticity estimates are not statistically different from zero. These magnitudes are considerably smaller than the results in the extant literature. The results are robust to controls for own and competitor prices and feature and display advertising, and the advertising effect distributions are similar whether a carryover parameter is assumed or estimated. The estimates are also robust if we allow for a flexible functional form for the advertising effect, and they do not appear to be driven by measurement error. As we are not able to include all sensitivity checks in the paper, we created an interactive web application that allows the reader to explore all model specifications. The web application is available.
…First, the advertising elasticity estimates in the baseline specification are small. The median elasticity is 0.0140, and the mean is 0.0233. These averages are substantially smaller than the average elasticities reported in extant meta-analyses of published case studies (Assmus, Farley, and Lehmann (1984b), Sethuraman, Tellis, and Briesch (2011)). Second, two-thirds of the estimates are not statistically distinguishable from zero. We show in Figure 2 that the most precise estimates are those closest to the mean and the least precise estimates are in the extremes.
…6.1 Average ROI of Advertising in a Given Week:
In the first policy experiment, we measure the ROI of the observed advertising levels (in all DMAs) in a given week t relative to not advertising in week t. For each brand, we compute the corresponding ROI for all weeks with positive advertising, and then average the ROIs across all weeks to compute the average ROI of weekly advertising. This metric reveals whether, on the margin, firms choose the (approximately) correct advertising level or could increase profits by either increasing or decreasing advertising.
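As a minimal sketch, the metric compares margin-adjusted incremental sales against the week's ad spend, averaged over all advertising weeks. All numbers below are illustrative placeholders, not the paper's data.

```python
# Hypothetical sketch of the weekly-ROI metric described above.
# All sales and spend figures are made up for illustration.

def weekly_roi(incremental_sales, margin, ad_cost):
    """ROI of one week's advertising vs. not advertising that week:
    (margin * incremental sales - ad cost) / ad cost."""
    return (margin * incremental_sales - ad_cost) / ad_cost

# Average over all weeks with positive advertising for one brand:
weeks = [  # (incremental sales in $, ad spend in $) -- illustrative
    (1200.0, 1000.0),
    (800.0, 1000.0),
    (500.0, 1000.0),
]
rois = [weekly_roi(ds, 0.30, cost) for ds, cost in weeks]
avg_roi = sum(rois) / len(rois)
print(f"average weekly ROI at 30% margin: {avg_roi:.1%}")  # -75.0%
```

Even sales lifts larger than the ad spend can yield a deeply negative ROI once only the 30% margin on those sales counts as profit, which is the mechanism behind the negative medians reported below.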
We provide key summary statistics in the top panel of Table III, and we show the distribution of the predicted ROIs in Figure 3(a). The average ROI of weekly advertising is negative for most brands over the whole range of assumed manufacturer margins. At a 30% margin, the median ROI is −88.15%, and only 12% of brands have a positive ROI. Further, for only 3% of brands is the ROI positive and statistically different from zero, whereas for 68% of brands the ROI is negative and statistically different from zero.
These results provide strong evidence for over-investment in advertising at the margin. [In Appendix C.3, we assess how much larger the TV advertising effects would need to be for the observed level of weekly advertising to be profitable. For the median brand with a positive estimated ad elasticity, the advertising effect would have to be 5.33× larger for the observed level of weekly advertising to yield a positive ROI (assuming a 30% margin).]
6.2 Overall ROI of the Observed Advertising Schedule: In the second policy experiment, we investigate if firms are better off when advertising at the observed levels versus not advertising at all. Hence, we calculate the ROI of the observed advertising schedule relative to a counterfactual baseline with zero advertising in all periods.
We present the results in the bottom panel of Table III and in Figure 3(b). At a 30% margin, the median ROI is −57.34%, and 34% of brands have a positive return from the observed advertising schedule versus not advertising at all. Whereas 12% of brands have only positive and 30% have only negative values in their confidence intervals, there is more uncertainty about the sign of the ROI for the remaining 58% of brands. This evidence leaves open the possibility that advertising may be valuable for a substantial number of brands, especially if they reduce advertising on the margin.
…Our results have important positive and normative implications. Why do firms spend billions of dollars on TV advertising each year if the return is negative? There are several possible explanations. First, agency issues, in particular career concerns, may lead managers (or consultants) to overstate the effectiveness of advertising if they expect to lose their jobs if their advertising campaigns are revealed to be unprofitable. Second, an incorrect prior (i.e., conventional wisdom that advertising is typically effective) may lead a decision maker to rationally shrink the estimated advertising effect from their data to an incorrect, inflated prior mean. These proposed explanations are not mutually exclusive. In particular, agency issues may be exacerbated if the general effectiveness of advertising or a specific advertising effect estimate is overstated. [Another explanation is that many brands have objectives for advertising other than stimulating sales. This is a nonstandard objective in economic analysis, but nonetheless, we cannot rule it out.] While we cannot conclusively point to these explanations as the source of the documented over-investment in advertising, our discussions with managers and industry insiders suggest that these may be contributing factors.
We investigate how people make choices when they are unsure about the value of the options they face and have to decide whether to choose now or wait and acquire more information first.
In an experiment, we find that participants deviate from optimal information acquisition in a systematic manner. They acquire too much information (when they should collect only a little) or not enough (when they should collect a lot). We show that this pattern can be explained as naturally emerging from Fechner cognitive errors. Over time, participants tend to learn to approximate the optimal strategy when information is relatively costly.
…We design a controlled situation where individuals have to choose between 2 alternatives with uncertain payoffs. Before making a choice, they have the opportunity to wait and collect additional (costly) pieces of information which help them form a better idea of the alternatives’ likely payoffs. The design of the experiment allows us to precisely identify the optimal sequential sampling strategy and to assess whether participants are able to approximate it.
We find that participants deviate in systematic ways from the optimal strategy. They tend to hesitate too long and oversample information when it is relatively costly, and therefore when the optimal strategy is to collect only a little information. Conversely, they tend to undersample information when it is relatively cheap, and therefore when the optimal strategy is to collect a lot of information. We show that this pattern of oversampling and undersampling can be explained as the result of Fechner cognitive errors, which introduce stochasticity into decisions about whether or not to stop. Cognitive errors create a risk of stopping at any time by mistake. When the optimal level of information to acquire is high, DMs should continue to sample information for a long time; errors are therefore likely to make them stop too early, producing undersampling. When the optimal level of evidence to acquire is low, DMs should stop sampling early; in that case, cognitive errors are more likely to make them fail to stop early enough, producing oversampling. The deviations we observe lead participants to lose between 10% and 25% of their potential payoff. However, participants learn to get closer to the optimal strategy over time, as long as information is relatively costly.
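A toy simulation (our own illustration, not the paper's model) shows how a small per-step execution error produces exactly this asymmetry: when the intended stopping time is short, errors mostly delay stopping (oversampling); when it is long, errors mostly truncate sampling (undersampling).

```python
import random

def simulate_stop(t_opt, eps, rng, max_t=10_000):
    """Intended policy: continue while t < t_opt, stop once t >= t_opt.
    With probability eps the intended action is flipped (a Fechner-style
    execution error), so the DM may stop early or fail to stop."""
    t = 1
    while t < max_t:
        intend_stop = t >= t_opt
        stop = (not intend_stop) if rng.random() < eps else intend_stop
        if stop:
            return t
        t += 1
    return t

rng = random.Random(0)
eps = 0.1
mean = lambda t_opt: sum(simulate_stop(t_opt, eps, rng)
                         for _ in range(20_000)) / 20_000

print(mean(1))   # > 1  : oversampling when little information is optimal
print(mean(50))  # << 50: undersampling when much information is optimal
```

The same error rate pushes the realized stopping time in opposite directions depending on the optimal horizon, matching the over/undersampling pattern described above.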
This review uses the empirical analysis of portfolio choice to illustrate econometric issues that arise in decision problems. Subjective expected utility (SEU) can provide normative guidance to an investor making a portfolio choice. The investor, however, may have doubts on the specification of the distribution and may seek a decision theory that is less sensitive to the specification. I consider three such theories: maxmin expected utility, variational preferences (including multiplier and divergence preferences and the associated constraint preferences), and smooth ambiguity preferences. I use a simple two-period model to illustrate their application. Normative empirical work on portfolio choice is mainly in the SEU framework, and bringing in ideas from robust decision theory may be fruitful.
Speed-accuracy trade-off (SAT) is the tendency for decision speed to covary with decision accuracy. SAT is an inescapable property of aimed movements, present in a wide range of species, from insects to primates. An aspect that remains unresolved is whether SAT extends to plants’ movement.
Here, we tested this possibility by examining the swaying in circles of the tips of shoots exhibited by climbing plants (Pisum sativum L.) as they approach to grasp a potential support. In particular, by means of 3-dimensional kinematic analysis, we investigated whether climbing plants scale movement velocity as a function of the difficulty of coiling a support.
Results showed that plants are able to process the properties of the support before contact and, similarly to animal species, strategically modulate movement velocity according to task difficulty.
…To date, a notable absentee from the Fitts’s law literature is the “green kingdom.” At first glance, plants seem relatively immobile, stuck to the ground in rigid structures and, unlike animals, unable to escape stressful environments. But, although markedly different from those of animals, movement pervades all aspects of plant behavior (Darwin & Darwin 1880). As observed by Darwin 1875, the tendrils of climbing plants undergo subtle movements around their axes of elongation. This elliptical movement, known as circumnutation, allows plants to explore their immediate surroundings in search, for instance, of a physical support to enhance light acquisition (Larson 2000). Also, Darwin (1875; see also Trewavas 2017) observed that the tendrils tend to assume the shape of whatever surface they come into contact with, even before contact is made. Implicitly, this might signify that they “see” the support and plan the movement accordingly. In this view, climbing plants might be able to plan the course of an action ahead of time and program the tendrils’ choreography according to the “to-be-grasped” object.
Support for this contention comes from both theoretical and empirical studies suggesting that plant movement is not a simple product of cause–effect mechanisms but rather seems to be driven by processes that are anticipatory in nature (e.g., Calvo & Friston 2017; Guerra et al 2019). For instance, a recent study shows that a climbing plant (Pisum sativum L.) is not only able to perceive a potential support, but also scales the kinematics of the tendrils’ aperture according to its size well before they touch the stimulus (Guerra et al 2019). This has been taken as a demonstration that plants plan the movement purposefully and in ways that are flexible and anticipatory.
With this in mind, one of the empirical predictions stemming from Fitts’s law is well-suited to model the 3-dimensional circumnutation of plants. Precisely, we refer to the evidence that movement time scales as a function of the target’s size: when the distance is constant, thinner targets are reached more slowly than thicker ones (see Murata & Iwase 2001). We test this prediction in Pisum sativum L. by assessing the change in velocity of the tendrils during their approach to grasp a thinner or a thicker support.
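The standard prediction being tested can be sketched numerically; the constants a and b and the support widths below are illustrative, not fitted values.

```python
import math

def index_of_difficulty(distance, width):
    """Fitts's index of difficulty, ID = log2(2D / W)."""
    return math.log2(2 * distance / width)

def movement_time(distance, width, a=0.1, b=0.1):
    """Fitts's law: MT = a + b * ID (a, b are illustrative constants)."""
    return a + b * index_of_difficulty(distance, width)

# Same distance, thinner vs. thicker support (widths in mm, made up):
print(movement_time(100, 6))   # thin support: higher ID, longer MT
print(movement_time(100, 30))  # thick support: lower ID, shorter MT
```

Under Fitts's law, halving the target width raises ID by one bit and lengthens movement time; as the Results below show, the plants exhibited the opposite pattern.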
…Results…The analysis of movement time confirms this evidence, showing that movement time was shorter for the thinner than for the thicker stimulus (β < 0) with a probability of 79.3%. This evidence suggests that plants are able to process the properties of the support and are endowed with a form of perception underwriting goal-directed and anticipatory behavior (Guerra et al 2019). However, in contrast with the previous human and animal literature (e.g., Beggs & Howarth 1972; Fitts 1954; Heitz & Schall 2012), our results indicate a pattern opposite to what Fitts’s law predicts. Recall that according to Fitts’s law, the velocity of the movement is inversely proportional to the index of difficulty, ID = log2(2D/W). In other words, our results seem to suggest that plants exhibit more difficulty grasping a thicker than a thinner support. These findings are in line with previous reports showing a lower success rate of attachment for thick supports (Peñalosa 1982), and a preference for plants to climb supports with a smaller diameter (Darwin 1875; Putz 1984; Putz & Holbrook 1992 [The Biology of Vines]). Furthermore, by using the curvature of tendrils during the twining phase, Goriely & Neukirch 2006 demonstrate that for thinner supports, the contact angle (i.e., the angle between the tip of the tendril and the tangent of the support) is near zero. With thicker supports, by contrast, the contact angle tends to increase, as tendrils must curl into the support’s surface to maintain an efficient grip. When the support is too thick, the contact angle increases to the extent that the tendril curls back on itself, losing grip. Interestingly, field studies in rainforests showed that the presence of climbing plants tends to decrease in areas in which there is a prevalence of thicker supports (Carrasco-Urra & Gianoli 2009).
A possible explanation for this phenomenon may reside in the fact that, for plants, reaching to grasp thick supports is a more energy-consuming process than grasping thinner ones. Indeed, grasping a thick support implies that plants have to increase the tendril length in order to coil the support efficiently (Rowe et al 2006), and to strengthen the tensional forces to resist gravity (Gianoli 2015).
We analyze the Gambler’s problem, a simple reinforcement learning problem in which the gambler has the chance to double or lose their bets until the target is reached. This is an early example introduced in the reinforcement learning textbook by Sutton and Barto (2018), where they mention an interesting pattern in the optimal value function, with high-frequency components and repeating non-smooth points, but leave it without further investigation. We provide the exact formula for the optimal value function for both the discrete and the continuous cases. Simple as it might seem, the value function is pathological: fractal, self-similar, with a derivative taking values of either zero or infinity, and not expressible in terms of elementary functions. It is in fact one of the generalized Cantor functions, and it holds a complexity that has been uncharted thus far. Our analyses could provide insights into improving value function approximation, gradient-based algorithms, and Q-learning, in real applications and implementations.
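The value function in question can be reproduced numerically by plain value iteration; this is a standard sketch of the textbook setup (win probability p_h = 0.4, goal 100, as in Sutton & Barto), whereas the paper itself derives the exact closed form.

```python
# Value iteration for the Gambler's problem: from capital s, bet
# a <= min(s, GOAL - s); win with probability P_H. The converged V
# approximates the fractal, Cantor-function-like optimal value function.
GOAL, P_H = 100, 0.4

V = [0.0] * (GOAL + 1)
V[GOAL] = 1.0
while True:
    delta = 0.0
    for s in range(1, GOAL):
        best = max(
            P_H * V[s + a] + (1 - P_H) * V[s - a]
            for a in range(1, min(s, GOAL - s) + 1)
        )
        delta = max(delta, abs(best - V[s]))
        V[s] = best
    if delta < 1e-12:
        break

print(round(V[50], 4))  # probability of reaching the goal from capital 50
```

Plotting V over 1..99 reveals the repeating non-smooth points the textbook remarks on; at s = 50, bold play (betting everything) reaches the goal with probability exactly p_h.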
A school may improve its students’ job outcomes if it issues only coarse grades. Google can reduce congestion on roads by giving drivers noisy information about the state of traffic. A social planner might raise everyone’s welfare by providing only partial information about the solvency of banks. All of this can happen even when everyone is fully rational and understands the data-generating process. Each of these examples raises the question of what information is (socially or privately) optimal to reveal. In this article, I review the literature that answers such questions.
We provide generalizable and robust results on the causal sales effect of TV advertising based on the distribution of advertising elasticities for a large number of products (brands) in many categories. Such generalizable results provide a prior distribution that can improve the advertising decisions made by firms and the analysis and recommendations of anti-trust and public policy makers. A single case study cannot provide generalizable results, and hence the marketing literature provides several meta-analyses based on published case studies of advertising effects. However, publication bias results if the research or review process systematically rejects estimates of small, statistically insignificant, or “unexpected” advertising elasticities. Consequently, if there is publication bias, the results of a meta-analysis will not reflect the true population distribution of advertising effects.
To provide generalizable results, we base our analysis on a large number of products and clearly lay out the research protocol used to select the products. We characterize the distribution of all estimates, irrespective of sign, size, or statistical significance. To ensure generalizability, we document the robustness of the estimates. First, we examine the sensitivity of the results to the approach and assumptions made when constructing the data used in estimation from the raw sources. Second, as we aim to provide causal estimates, we document if the estimated effects are sensitive to the identification strategies that we use to claim causality based on observational data. Our results reveal substantially smaller effects of own-advertising compared to the results documented in the extant literature, as well as a sizable percentage of statistically insignificant or negative estimates. If we only select products with statistically-significant and positive estimates, the mean or median of the advertising effect distribution increases by a factor of about five.
The results are robust to various identifying assumptions, and are consistent with both publication bias and bias due to non-robust identification strategies to obtain causal estimates in the literature.
“Meta-learning of Sequential Strategies”, Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg (2019-05-08):
In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal predictors and reinforcement learners which behave as if they had a probabilistic model that allowed them to efficiently exploit task structure. Furthermore, we recast memory-based meta-learning within a Bayesian framework, showing that the meta-learned strategies are near-optimal because they amortize Bayes-filtered data, where the adaptation is implemented in the memory dynamics as a state-machine of sufficient statistics. Essentially, memory-based meta-learning translates the hard problem of probabilistic sequential inference into a regression problem.
Implicit in the drug-approval process is a host of decisions—target patient population, control group, primary endpoint, sample size, follow-up period, etc.—all of which determine the trade-off between Type I and Type II error. We explore the application of Bayesian decision analysis (BDA) to minimize the expected cost of drug approval, where the relative costs of the two types of errors are calibrated using U.S. Burden of Disease Study 2010 data. The results for conventional fixed-sample randomized clinical-trial designs suggest that for terminal illnesses with no existing therapies such as pancreatic cancer, the standard threshold of 2.5% is substantially more conservative than the BDA-optimal threshold of 23.9% to 27.8%. For relatively less deadly conditions such as prostate cancer, 2.5% is more risk-tolerant or aggressive than the BDA-optimal threshold of 1.2% to 1.5%. We compute BDA-optimal sizes for 25 of the most lethal diseases and show how a BDA-informed approval process can incorporate all stakeholders’ views in a systematic, transparent, internally consistent, and repeatable manner.
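The core BDA logic can be sketched as a grid search for the significance threshold that minimizes expected cost; the costs, prior, and effect size below are hypothetical placeholders, not the paper's calibrated Burden-of-Disease values.

```python
# Sketch: choose the Type I error threshold alpha minimizing expected
# cost, trading off approving an ineffective drug (Type I) against
# rejecting an effective one (Type II). All inputs are illustrative.
from statistics import NormalDist

N = NormalDist()

def expected_cost(alpha, effect_z, cost_I, cost_II, p_effective):
    """effect_z: standardized effect size the trial detects if real."""
    z_crit = N.inv_cdf(1 - alpha)
    power = 1 - N.cdf(z_crit - effect_z)
    return ((1 - p_effective) * alpha * cost_I
            + p_effective * (1 - power) * cost_II)

def bda_alpha(effect_z, cost_I, cost_II, p_effective, grid=10_000):
    alphas = [(i + 1) / (grid + 1) for i in range(grid)]
    return min(alphas, key=lambda a: expected_cost(
        a, effect_z, cost_I, cost_II, p_effective))

# Deadly disease: withholding an effective drug (Type II) is costly,
# so the optimal alpha rises well above the conventional 2.5%:
a_deadly = bda_alpha(effect_z=2.0, cost_I=1.0, cost_II=5.0, p_effective=0.5)
# Milder condition: Type I error dominates, pushing alpha below 2.5%:
a_mild = bda_alpha(effect_z=2.0, cost_I=5.0, cost_II=0.5, p_effective=0.5)
print(a_deadly, a_mild)
```

The two calls reproduce the qualitative result: the cost-minimizing threshold sits above 2.5% when Type II errors dominate and below it when Type I errors dominate.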
Delphi is a procedure that produces forecasts on technological and social developments. This article traces the history of Delphi’s development to the early 1950s, when a group of logicians and mathematicians working at the RAND Corporation carried out experiments to assess the predictive capacities of groups of experts. While Delphi now has a rather stable methodological shape, this was not so in its early years. The vision that Delphi’s creators had for their brainchild changed considerably. While they had initially seen it as a technique, a few years later they reconfigured it as a scientific method. After some more years, however, they conceived of Delphi as a tool. This turbulent youth of Delphi can be explained by parallel changes in the fields that were deemed relevant audiences for the technique, operations research and the policy sciences. While changing the shape of Delphi led to some success, it had severe, yet unrecognized, methodological consequences. The core assumption of Delphi, that the convergence of expert opinions observed over the iterative stages of the procedure can be interpreted as consensus, appears not to be justified for the third shape of Delphi as a tool, which continues to be the most prominent one.
In spite of its familiar phenomenology, the mechanistic basis for mental effort remains poorly understood. Although most researchers agree that mental effort is aversive and stems from limitations in our capacity to exercise cognitive control, it is unclear what gives rise to those limitations and why they result in an experience of control as costly. The presence of these control costs also raises further questions regarding how best to allocate mental effort to minimize those costs and maximize the attendant benefits. This review explores recent advances in computational modeling and empirical research aimed at addressing these questions at the level of psychological process and neural mechanism, examining both the limitations to mental effort exertion and how we manage those limited cognitive resources. We conclude by identifying remaining challenges for theoretical accounts of mental effort as well as possible applications of the available findings to understanding the causes of and potential solutions for apparent failures to exert the mental effort required of us.
What would you do if you were invited to play a game where you were given $25 and allowed to place bets for 30 minutes on a coin that you were told was biased to come up heads 60% of the time? This is exactly what we did, gathering 61 young, quantitatively trained men and women to play this game. The results, in a nutshell, were that the majority of these 61 players did not place their bets very well, displaying a broad panoply of behavioral and cognitive biases. About 30% of the subjects actually went bust, losing their full $25 stake. We also discuss optimal betting strategies, valuation of the opportunity to play the game and its similarities to investing in the stock market. The main implication of our study is that people need to be better educated and trained in how to approach decision making under uncertainty. If these quantitatively trained players, playing the simplest game we can think of involving uncertainty and favourable odds, did not play well, what hope is there for the rest of us when it comes to playing the biggest and most important game of all: investing our savings? In the words of Ed Thorp, who gave us helpful feedback on our research: “This is a great experiment for many reasons. It ought to become part of the basic education of anyone interested in finance or gambling.”
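A natural benchmark strategy for this game is the Kelly criterion (a standard result for favourable-odds betting; the paper's own analysis of optimal play may differ in detail).

```python
import math

def kelly_fraction(p, b=1.0):
    """Kelly-optimal fraction of bankroll to bet at odds b (even money
    when b = 1) with win probability p: f* = (p * (b + 1) - 1) / b."""
    return (p * (b + 1) - 1) / b

p = 0.60                    # the 60/40 coin from the experiment
f = kelly_fraction(p)       # bet 20% of bankroll each flip
growth = p * math.log(1 + f) + (1 - p) * math.log(1 - f)
print(f, growth)            # expected log-growth per bet is positive
```

Betting a constant 20% of the bankroll maximizes long-run growth here; the busts the authors report required strategies far more aggressive than this, such as repeatedly betting the whole stake.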
25 large field experiments with major U.S. retailers and brokerages, most reaching millions of customers and collectively representing $2.8 million (in 2015 dollars) in digital advertising expenditure, reveal that measuring the returns to advertising is difficult. The median confidence interval on return on investment is over 100 percentage points wide. Detailed sales data show that relative to the per capita cost of the advertising, individual-level sales are very volatile; a coefficient of variation of 10 is common. Hence, informative advertising experiments can easily require more than 10 million person-weeks, making experiments costly and potentially infeasible for many firms. Despite these unfavorable economics, randomized control trials represent progress by injecting new, unbiased information into the market. The inference challenges revealed in the field experiments also show that selection bias, due to the targeted nature of advertising, is a crippling concern for widely employed observational methods.
Training Deep Neural Networks is complicated by the fact that the distribution of each layer’s inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization. It also acts as a regularizer, in some cases eliminating the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.9% top-5 validation error (and 4.8% test error), exceeding the accuracy of human raters.
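The core transform can be sketched for a single feature: per mini-batch, normalize to zero mean and unit variance, then apply a learned scale gamma and shift beta. This is a minimal illustration of the training-time step only; the full method also tracks running statistics for inference, omitted here.

```python
# Minimal sketch of the batch-normalization transform for one feature.
def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    m = sum(x) / len(x)                             # mini-batch mean
    var = sum((v - m) ** 2 for v in x) / len(x)     # mini-batch variance
    # Normalize, then restore representational power via gamma, beta:
    return [gamma * (v - m) / (var + eps) ** 0.5 + beta for v in x]

y = batch_norm([1.0, 2.0, 3.0, 4.0])
print([round(v, 3) for v in y])   # ~zero mean, ~unit variance
```

Because each layer now sees inputs with a stable distribution regardless of how earlier parameters shift, much larger learning rates become usable, which is the mechanism behind the reported speedups.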
Over the past 10+ years, online companies large and small have adopted widespread A/B testing as a robust data-based method for evaluating potential product improvements. In online experimentation, it is straightforward to measure the short-term effect, i.e., the impact observed during the experiment. However, the short-term effect is not always predictive of the long-term effect, i.e., the final impact once the product has fully launched and users have changed their behavior in response. Thus, the challenge is how to determine the long-term user impact while still being able to make decisions in a timely manner.
We tackle that challenge in this paper by first developing experiment methodology for quantifying long-term user learning. We then apply this methodology to ads shown on Google search, more specifically, to determine and quantify the drivers of ads blindness and sightedness, the phenomenon of users changing their inherent propensity to click on or interact with ads.
We use these results to create a model that uses metrics measurable in the short-term to predict the long-term. We learn that user satisfaction is paramount: ads blindness and sightedness are driven by the quality of previously viewed or clicked ads, as measured by both ad relevance and landing page quality. Focusing on user satisfaction not only ensures happier users but also makes business sense, as our results illustrate. We describe two major applications of our findings: a conceptual change to our search ads auction that further increased the importance of ads quality, and a 50% reduction of the ad load on Google’s mobile search interface.
The results presented in this paper are generalizable in two major ways. First, the methodology may be used to quantify user learning effects and to evaluate online experiments in contexts other than ads. Second, the ads blindness/sightedness results indicate that a focus on user satisfaction could help to reduce the ad load on the internet at large with long-term neutral, or even positive, business impact.
Classical theories of the firm assume access to reliable signals to measure the causal impact of choice variables on profit. For advertising expenditure we show, using 25 online field experiments (representing $2.8 million in 2013 dollars) with major U.S. retailers and brokerages, that this assumption typically does not hold. Statistical evidence from the randomized trials is very weak because individual-level sales are incredibly volatile relative to the per capita cost of a campaign—a “small” impact on a noisy dependent variable can generate positive returns. A concise statistical argument shows that the required sample size for an experiment to generate sufficiently informative confidence intervals is typically in excess of ten million person-weeks. This also implies that heterogeneity bias (or model misspecification) unaccounted for by observational methods only needs to explain a tiny fraction of the variation in sales to severely bias estimates. The weak informational feedback means most firms cannot even approach profit maximization.
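The statistical argument can be reproduced with a standard two-sample power calculation; the dollar figures below are illustrative stand-ins consistent with the coefficient of variation around 10 reported in these experiments.

```python
# Back-of-the-envelope version of the "ten million person-weeks" claim:
# with individual sales this volatile and per-capita effects this small,
# the required sample size per experimental arm explodes.
from statistics import NormalDist

N = NormalDist()

def n_per_arm(effect, sd, alpha=0.05, power=0.8):
    """Two-sample test of means: n = 2 * ((z_{1-a/2} + z_pow) * sd / effect)^2."""
    z = N.inv_cdf(1 - alpha / 2) + N.inv_cdf(power)
    return 2 * (z * sd / effect) ** 2

# Illustrative numbers: weekly sales with sd = $100 (CV ~ 10 at a $10
# mean); an ad costing $0.10/person needs only a few cents of lift to
# be profitable, so the detectable effect must be tiny:
print(f"{n_per_arm(effect=0.05, sd=100.0):,.0f} persons per arm")
```

Detecting a five-cent lift against $100 of noise requires tens of millions of person-weeks per arm, which is why even well-run experiments yield confidence intervals spanning wildly different ROIs.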
Randomized experiments are the “gold standard” for estimating causal effects, yet often in practice, chance imbalances exist in covariate distributions between treatment groups. If covariate data are available before units are exposed to treatments, these chance imbalances can be mitigated by first checking covariate balance before the physical experiment takes place. Provided a precise definition of imbalance has been specified in advance, unbalanced randomizations can be discarded, followed by a rerandomization, and this process can continue until a randomization yielding balance according to the definition is achieved. By improving covariate balance, rerandomization provides more precise and trustworthy estimates of treatment effects.
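The accept/reject loop described above can be sketched as follows, using a simple difference-in-means imbalance criterion for a single covariate; the rerandomization literature typically uses a Mahalanobis-distance criterion across many covariates.

```python
import random

def rerandomize(covariates, threshold, rng, max_tries=100_000):
    """Draw random half/half assignments until the treatment-control
    difference in covariate means falls below a pre-specified threshold."""
    n = len(covariates)
    for _ in range(max_tries):
        treated_idx = set(rng.sample(range(n), n // 2))
        treated = [covariates[i] for i in treated_idx]
        control = [covariates[i] for i in range(n) if i not in treated_idx]
        imbalance = abs(sum(treated) / len(treated)
                        - sum(control) / len(control))
        if imbalance <= threshold:          # balanced: accept and stop
            return treated_idx, imbalance
    raise RuntimeError("no acceptable randomization found")

rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(100)]   # one baseline covariate
assign, imb = rerandomize(x, threshold=0.05, rng=rng)
print(len(assign), imb)                     # 50 units treated, imb <= 0.05
```

Crucially, the imbalance criterion is fixed before looking at outcomes, so discarding unbalanced draws tightens covariate balance without biasing the treatment-effect estimate (though inference should account for the restricted randomization distribution).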
We measure the causal effects of online advertising on sales, using a randomized experiment performed in cooperation between Yahoo! and a major retailer. After identifying over one million customers matched in the databases of the retailer and Yahoo!, we randomly assign them to treatment and control groups. We analyze individual-level data on ad exposure and weekly purchases at this retailer, both online and in stores. We find statistically-significant and economically substantial impacts of the advertising on sales. The treatment effect persists for weeks after the end of an advertising campaign, and the total effect on revenues is estimated to be more than seven times the retailer’s expenditure on advertising during the study. Additional results explore differences in the number of advertising impressions delivered to each individual, online and offline sales, and the effects of advertising on those who click the ads versus those who merely view them. Power calculations show that, due to the high variance of sales, our large number of observations brings us just to the frontier of being able to measure economically substantial effects of advertising. We also demonstrate that without an experiment, using industry-standard methods based on endogenous cross-sectional variation in advertising exposure, we would have obtained a wildly inaccurate estimate of advertising effectiveness.
Measuring the causal effects of online advertising (adfx) on user behavior is important to the health of the WWW publishing industry. In this paper, using three controlled experiments, we show that observational data frequently lead to incorrect estimates of adfx. The reason, which we label “activity bias”, comes from the surprising amount of time-based correlation between the myriad activities that users undertake online.
In Experiment 1, users who are exposed to an ad on a given day are much more likely to engage in brand-relevant search queries as compared to their recent history for reasons that had nothing to do with the advertisement. In Experiment 2, we show that activity bias occurs for page views across diverse websites. In Experiment 3, we track account sign-ups at a competitor’s (of the advertiser) website and find that many more people sign up on the day they saw an advertisement than on other days, but that the true “competitive effect” was minimal.
In all three experiments, exposure to a campaign signals doing “more of everything” in a given period of time, making it difficult to find a suitable “matched control” using prior behavior. In such cases, the “match” is fundamentally different from the exposed group, and we show how and why observational methods lead to a massive overestimate of adfx in such circumstances.
The biopharmaceutical industry is facing unprecedented challenges to its fundamental business model and currently cannot sustain sufficient innovation to replace its products and revenues lost due to patent expirations.
The number of truly innovative new medicines approved by regulatory agencies such as the US Food and Drug Administration has declined substantially despite continued increases in R&D spending, raising the current cost of each new molecular entity (NME) to approximately US$2.43 billion (US$1.8 billion in 2010 dollars).
Declining R&D productivity is arguably the most important challenge the industry faces and thus improving R&D productivity is its most important priority.
A detailed analysis of the key elements that determine overall R&D productivity and the cost to successfully develop an NME reveals exactly where (and to what degree) R&D productivity can (and must) be improved.
Reducing late-stage (Phase II and III) attrition rates and cycle times during drug development are among the key requirements for improving R&D productivity.
To achieve the necessary increase in R&D productivity, R&D investments, both financial and intellectual, must be focused on the ‘sweet spot’ of drug discovery and early clinical development, from target selection to clinical proof-of-concept.
The transformation from a traditional biopharmaceutical FIPCo (fully integrated pharmaceutical company) to a FIPNet (fully integrated pharmaceutical network) should allow a given R&D organization to ‘play bigger than its size’ and to more affordably fund the necessary number and quality of pipeline assets.
The pharmaceutical industry is under growing pressure from a range of environmental issues, including major losses of revenue owing to patent expirations, increasingly cost-constrained healthcare systems and more demanding regulatory requirements. In our view, the key to tackling the challenges such issues pose to both the future viability of the pharmaceutical industry and advances in healthcare is to substantially increase the number and quality of innovative, cost-effective new medicines, without incurring unsustainable R&D costs. However, it is widely acknowledged that trends in industry R&D productivity have been moving in the opposite direction for a number of years.
Here, we present a detailed analysis based on comprehensive, recent, industry-wide data to identify the relative contributions of each of the steps in the drug discovery and development process to overall R&D productivity. We then propose specific strategies that could have the most substantial impact in improving R&D productivity.
Applications in counterterrorism and corporate competition have led to the development of new methods for the analysis of decision making when there are intelligent opponents and uncertain outcomes.
This field represents a combination of statistical risk analysis and game theory, and is sometimes called adversarial risk analysis.
In this article, we describe several formulations of adversarial risk problems, and provide a framework that extends traditional risk analysis tools, such as influence diagrams and probabilistic reasoning, to adversarial problems.
We also discuss the research challenges that arise when dealing with these models, illustrate the ideas with examples from business, and point out relevance to national defense. [keywords: auctions, decision theory, game theory, influence diagrams]
In economics and other sciences, “statistical-significance” is by custom, habit, and education a necessary and sufficient condition for proving an empirical result (Ziliak and McCloskey, 2008; McCloskey and Ziliak, 1996). The canonical routine is to calculate what’s called a t-statistic and then to compare its estimated value against a theoretically expected value of it, which is found in “Student’s” t table. A result yielding a t-value greater than or equal to about 2.0 is said to be “statistically-significant at the 95 percent level.” Alternatively, a regression coefficient is said to be “statistically-significantly different from the null, p < 0.05.” Canonically speaking, if a coefficient clears the 95 percent hurdle, it warrants additional scientific attention. If not, not. The first presentation of “Student’s” test of statistical-significance came a century ago, in “The Probable Error of a Mean” (1908b), published by an anonymous “Student.” The author’s commercial employer required that his identity be shielded from competitors, but we have known for some decades that the article was written by William Sealy Gosset (1876–1937), whose entire career was spent at Guinness’s brewery in Dublin, where Gosset was a master brewer and experimental scientist (E. S. Pearson, 1937). Perhaps surprisingly, the ingenious “Student” did not give a hoot for a single finding of “statistical”-significance, even at the 95 percent level of statistical-significance as established by his own tables. Beginning in 1904, “Student”, who was a businessman as well as a scientist, took an economic approach to the logic of uncertainty, arguing finally that statistical-significance is “nearly valueless” in itself.
We present a theory of decision by sampling (DbS) in which, in contrast with traditional models, there are no underlying psychoeconomic scales.
Instead, we assume that an attribute’s subjective value is constructed from a series of binary, ordinal comparisons to a sample of attribute values drawn from memory and is its rank within the sample. We assume that the sample reflects both the immediate distribution of attribute values from the current decision’s context and also the background, real-world distribution of attribute values.
DbS accounts for concave utility functions; losses looming larger than gains; hyperbolic temporal discounting; and the overestimation of small probabilities and the underestimation of large probabilities.
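The core mechanism is easy to state computationally. In this toy sketch (my illustration, not the authors' model code), an attribute's subjective value is simply its relative rank within a sample of comparison values drawn from memory:

```python
# Toy Decision-by-Sampling sketch: subjective value is relative rank
# within a memory sample, built from binary ordinal comparisons,
# with no underlying psychoeconomic scale.
def subjective_value(x, sample):
    """Fraction of sampled attribute values that x beats -- i.e.,
    its relative rank within the sample."""
    return sum(1 for s in sample if x > s) / len(sample)

# With a skewed 'memory sample' of gains (small amounts common, large
# ones rare), equal multiplications of money yield shrinking gains in
# rank, mimicking a concave utility function.
memory = [1, 2, 2, 3, 5, 5, 8, 10, 20, 100]
print(subjective_value(4, memory))   # 0.4 -- beats 4 of 10 sampled values
print(subjective_value(50, memory))  # 0.9 -- tenfold more money, far less than 10x the rank
```

The skew of the assumed memory sample does the work here: because small gains are common and large ones rare in the sample, rank (and hence subjective value) is concave in money.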
Decision analysis produces measures of value such as expected net present values or expected utilities and ranks alternatives by these value estimates. Other optimization-based processes operate in a similar manner. With uncertainty and limited resources, an analysis is never perfect, so these value estimates are subject to error. We show that if we take these value estimates at face value and select accordingly, we should expect the value of the chosen alternative to be less than its estimate, even if the value estimates are unbiased. Thus, when comparing actual outcomes to value estimates, we should expect to be disappointed on average, not because of any inherent bias in the estimates themselves, but because of the optimization-based selection process. We call this phenomenon the optimizer’s curse and argue that it is not well understood or appreciated in the decision analysis and management science communities. This curse may be a factor in creating skepticism in decision makers who review the results of an analysis. In this paper, we study the optimizer’s curse and show that the resulting expected disappointment may be substantial. We then propose the use of Bayesian methods to adjust value estimates. These Bayesian methods can be viewed as disciplined skepticism and provide a method for avoiding this postdecision disappointment.
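The curse is easy to reproduce by simulation. In this illustration (mine, not the paper's), every alternative has true value zero and an unbiased noisy estimate, yet the selected alternative still disappoints on average:

```python
# Simulating the optimizer's curse: unbiased estimates, biased selection.
import random

def expected_disappointment(n_alternatives, noise_sd, trials, seed=0):
    rng = random.Random(seed)
    gap = 0.0
    for _ in range(trials):
        # Unbiased estimates of alternatives whose true value is all 0.
        estimates = [rng.gauss(0.0, noise_sd) for _ in range(n_alternatives)]
        best = max(estimates)   # we choose the highest-estimate alternative
        gap += best - 0.0       # estimate minus realized (true) value
    return gap / trials

# With 10 alternatives, the chosen one's estimate overshoots its true
# value by roughly 1.5 noise standard deviations on average.
print(expected_disappointment(n_alternatives=10, noise_sd=1.0, trials=20000))
```

No single estimate is biased; the bias enters through taking the max, which is exactly why the authors' Bayesian shrinkage of the estimates ("disciplined skepticism") removes the expected disappointment.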
[By Edward O. Thorp] The central problem for gamblers is to find positive expectation bets. But the gambler also needs to know how to manage his money, i.e., how much to bet. In the stock market (more inclusively, the securities markets) the problem is similar but more complex. The gambler, who is now an “investor”, looks for “excess risk adjusted return”.
In both these settings, we explore the use of the Kelly criterion, which is to maximize the expected value of the logarithm of wealth (“maximize expected logarithmic utility”). The criterion is known to economists and financial theorists by names such as the “geometric mean maximizing portfolio strategy”, maximizing logarithmic utility, the growth-optimal strategy, the capital growth criterion, etc.
The author initiated the practical application of the Kelly criterion by using it for card counting in blackjack. We will present some useful formulas and methods to answer various natural questions about it that arise in blackjack and other gambling games. Then we illustrate its recent use in a successful casino sports betting system. Finally, we discuss its application to the securities markets where it has helped the author to make a 30-year total of 80 billion dollars’ worth of “bets”.
[Keywords: Kelly criterion, betting, long run investing, portfolio allocation, logarithmic utility, capital growth]
Optimal growth: Kelly criterion formulas for practitioners
The probability of reaching a fixed goal on or before n trials
The probability of ever being reduced to a fraction x of this initial bankroll
The probability of being at or above a specified value at the end of a specified number of trials
Continuous approximation of expected time to reach a goal
Comparing fixed fraction strategies: the probability that one strategy leads another after n trials
The long run: when will the Kelly strategy “dominate”?
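The quantity underlying all of these formulas is the expected log growth rate. A minimal sketch using the standard single-bet Kelly formula (the textbook case, not a reconstruction of Thorp's blackjack computations):

```python
# Kelly criterion for a simple bet: win probability p, net odds b.
# The log-wealth-maximizing fraction of bankroll is f* = (b*p - q) / b.
from math import log

def kelly_fraction(p, b):
    q = 1.0 - p
    return (b * p - q) / b

def expected_log_growth(f, p, b):
    """Expected log of wealth per bet when staking fraction f."""
    q = 1.0 - p
    return p * log(1 + f * b) + q * log(1 - f)

f_star = kelly_fraction(p=0.52, b=1.0)   # even-money bet with a 52% edge
print(f_star)  # ~0.04: bet 4% of bankroll
# f* maximizes expected log growth: nearby fractions do strictly worse.
g = expected_log_growth(f_star, 0.52, 1.0)
assert g >= expected_log_growth(0.02, 0.52, 1.0)
assert g >= expected_log_growth(0.08, 0.52, 1.0)
```

Note that over-betting is punished asymmetrically: at twice the Kelly fraction (0.08 here) the expected log growth is already negative, which is why the "long run dominance" results above depend on not exceeding f*.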
In Good and Real, a tour-de-force of metaphysical naturalism, computer scientist Gary Drescher examines a series of provocative paradoxes about consciousness, choice, ethics, quantum mechanics, and other topics, in an effort to reconcile a purely mechanical view of the universe with key aspects of our subjective impressions of our own existence.
Many scientists suspect that the universe can ultimately be described by a simple (perhaps even deterministic) formalism; all that is real unfolds mechanically according to that formalism. But how, then, is it possible for us to be conscious, or to make genuine choices? And how can there be an ethical dimension to such choices? Drescher sketches computational models of consciousness, choice, and subjunctive reasoning—what would happen if this or that were to occur?—to show how such phenomena are compatible with a mechanical, even deterministic universe.
Analyses of Newcomb’s Problem (a paradox about choice) and the Prisoner’s Dilemma (a paradox about self-interest vs altruism, arguably reducible to Newcomb’s Problem) help bring the problems and proposed solutions into focus. Regarding quantum mechanics, Drescher builds on Everett’s relative-state formulation—but presenting a simplified formalism, accessible to laypersons—to argue that, contrary to some popular impressions, quantum mechanics is compatible with an objective, deterministic physical reality, and that there is no special connection between quantum phenomena and consciousness.
In each of several disparate but intertwined topics ranging from physics to ethics, Drescher argues that a missing technical linchpin can make the quest for objectivity seem impossible, until the elusive technical fix is at hand:
Chapter 2 explores how inanimate, mechanical matter could be conscious, just by virtue of being organized to perform the right kind of computation.
Chapter 3 explains why conscious beings would experience an apparent inexorable forward flow of time, even in a universe whose physical principles are time-symmetric and have no such flow, with everything sitting statically in spacetime.
Chapter 4, following [Hugh] Everett, looks closely at the paradoxes of quantum mechanics, showing how some theorists came to conclude—mistakenly, I argue—that consciousness is part of the story of quantum phenomena, or vice versa. Chapter 4 also shows how quantum phenomena are consistent with determinism (even though so-called hidden-variable theories of quantum determinism are provably wrong).
Chapter 5 examines in detail how it can be that we make genuine choices in a mechanical, deterministic universe.
Chapter 6 analyzes Newcomb’s Problem, a startling paradox that elicits some counterintuitive conclusions about choice and causality.
Chapter 7 considers how our choices can have a moral component—that is, how even a mechanical, deterministic universe can provide a basis for distinguishing right from wrong.
Chapter 8 wraps up the presentation and touches briefly on some concluding metaphysical questions.
It is well known that, for estimating a linear treatment effect with constant variance, the optimal design divides the units equally between the 2 extremes of the design space. If the dose-response relation may be nonlinear, however, intermediate measurements may be useful in order to estimate the effects of partial treatments.
We consider the decision of whether to gather data at an intermediate design point: do the gains from learning about nonlinearity outweigh the loss in efficiency in estimating the linear effect?
Under reasonable assumptions about nonlinearity, we find that, unless sample size is very large, the design with no interior measurements is best, because with moderate total sample sizes, any nonlinearity in the dose-response will be difficult to detect.
We discuss in the context of a simplified version of the problem that motivated this work—a study of pest-control treatments intended to reduce asthma symptoms in children.
[Keywords: asthma, Bayesian inference, dose-response experimental design, pest control, statistical-significance.]
Receiver Operating Characteristic (ROC) curves are popular ways of summarising the performance of two-class classification rules.
In fact, however, they are extremely inconvenient. If the relative severity of the two different kinds of misclassification is known, then an awkward projection operation is required to deduce the overall loss. At the other extreme, when the relative severity is unknown, the area under an ROC curve is often used as an index of performance. However, this essentially assumes that nothing whatsoever is known about the relative severity—a situation which is very rare in real problems.
We present an alternative plot which is more revealing than an ROC plot, and we describe a comparative index which allows one to take advantage of anything that may be known about the relative severity of the two kinds of misclassification.
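Hand and Till's complaint can be made concrete. The sketch below is my illustration of the underlying point, not their proposed plot: once the relative severity c of the two error types is known, what matters is the cost-weighted loss at the best operating point, and which classifier wins can flip with c, a fact a single AUC number hides:

```python
# Cost-weighted comparison of two hypothetical ROC curves (illustrative
# operating points, not real classifiers).
def min_expected_loss(roc_points, c, prevalence=0.5):
    """roc_points: (false_positive_rate, true_positive_rate) pairs.
    c: cost of a false negative relative to a false positive (cost 1)."""
    return min(prevalence * c * (1 - tpr) + (1 - prevalence) * fpr
               for fpr, tpr in roc_points)

# Classifier A is strong at high sensitivity; B is strong at low FPR.
a = [(0.0, 0.0), (0.3, 0.95), (1.0, 1.0)]
b = [(0.0, 0.0), (0.05, 0.5), (1.0, 1.0)]
print(min_expected_loss(a, c=1.0) < min_expected_loss(b, c=1.0))  # True: A wins
print(min_expected_loss(b, c=0.2) < min_expected_loss(a, c=0.2))  # True: B wins
```

With equal error costs A is preferable, but when false positives are five times as severe B is preferable: the ranking depends on c, which is exactly the information AUC throws away.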
[From Selected Papers of Hirotugu Akaike, pg 199–213; Originally published in Proceedings of the Second International Symposium on Information Theory, B.N. Petrov and F. Caski, eds., Akademiai Kiado, Budapest, 1973, 267–281]
In this paper it is shown that the classical maximum likelihood principle can be considered to be a method of asymptotic realization of an optimum estimate with respect to a very general information theoretic criterion. This observation shows an extension of the principle to provide answers to many practical problems of statistical model fitting.
[Keywords: autoregressive model, final prediction error, maximum likelihood principle, statistical model identification, statistical decision function]
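The criterion this paper introduced is now universally known as the AIC, AIC = 2k − 2 ln L for a model with k parameters and maximized likelihood L. A toy comparison (the log-likelihoods below are made up for illustration):

```python
# Akaike Information Criterion: AIC = 2k - 2*ln(L); lower is better.
# It trades goodness of fit against model complexity.
def aic(k, log_likelihood):
    return 2 * k - 2 * log_likelihood

# Hypothetical fits: the richer model fits better (higher log-likelihood),
# but the 2k penalty can still favor the simpler one.
candidates = {"2-param model": aic(2, -100.0),   # AIC = 204.0
              "5-param model": aic(5, -98.5)}    # AIC = 207.0
best = min(candidates, key=candidates.get)
print(best)  # 2-param model
```

The extra fit of the 5-parameter model (1.5 log-likelihood units) does not cover its extra penalty (3 units), so AIC selects the simpler model.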
Frank Plumpton Ramsey was born in February 1903, and he died in January 1930—just before his 27th birthday. In his short life he produced an extraordinary amount of profound and original work in economics, mathematics and logic as well as in philosophy: work which in all these fields is still, over sixty years on, extremely influential.
This chapter discusses practical issues that arise because weighty decisions often depend on forecasts and opinions communicated from one person or set of individuals to another.
The standard wisdom has been that numerical communication is better than linguistic, and therefore, especially in important contexts, it is to be preferred. A good deal of evidence suggests that this advice is not uniformly correct and is inconsistent with strongly held preferences. A theoretical understanding of the preceding questions is an important step toward the development of means for improving communication, judgment, and decision making under uncertainty. The theoretical issues concern how individuals interpret imprecise linguistic terms, what factors affect their interpretations, and how they combine those terms with other information for the purpose of taking action. The chapter reviews the relevant literature in order to develop a theory of how linguistic information about imprecise continuous quantities is processed in the service of decision making, judgment, and communication.
It presents the current view, which has evolved inductively, substantiates it where the data allow, and suggests where additional research is needed. It also summarizes the research on meanings of qualitative probability expressions and compares judgments and decisions made on the basis of vague and precise probabilities.
The observed level of milk yield of a dairy cow or the litter size of a sow is only partially the result of a permanent characteristic of the animal; temporary effects are also involved. Thus, we face a problem concerning the proper definition and measurement of the traits in order to give the best possible prediction of the future revenues from an animal considered for replacement. A trait model describing the underlying effects is built into a model combining a Bayesian approach with a hierarchic Markov process in order to be able to calculate optimal replacement policies under various conditions.
An organization’s promotion decision between 2 workers is modelled as a problem of boundedly-rational learning about ability. The decision-maker can bias noisy rank-order contests sequentially, thereby changing the information they convey.
The optimal final-period bias favours the “leader”, reinforcing his likely ability advantage. When optimally biased rank-order information is a sufficient statistic for cardinal information, the leader is favoured in every period. In other environments, bias in early periods may (1) favour the early loser, (2) be optimal even when the workers are equally rated, and (3) reduce the favoured worker’s promotion chances.
L. J. Savage and I. J. Good have each demonstrated that the expected utility of free information [Value of Information] is never negative for a decision maker who updates her degrees of belief by conditionalization on propositions learned for certain. In this paper Good’s argument is generalized to show the same result for a decision maker who updates her degrees of belief on the basis of uncertain information by Richard Jeffrey’s probability kinematics. The Savage/Good result is shown to be a special case of the more general result.
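A minimal numeric instance of the Savage/Good result (my example, not the paper's): in a two-state, two-act decision problem, expected utility with free information is never below expected utility without it:

```python
# Value of Information in a toy decision problem: nonnegative, as per
# the Savage/Good theorem for conditionalizing decision makers.
def best_eu(probs, utilities):
    """Max over acts of expected utility; utilities[act][state]."""
    return max(sum(p * u for p, u in zip(probs, row)) for row in utilities)

prior = (0.5, 0.5)            # two states, equally likely
utilities = [(10, 0),         # act A: good in state 1
             (0, 8)]          # act B: good in state 2
eu_without = best_eu(prior, utilities)          # commit to one act now
# A perfectly informative free signal reveals the state; act optimally in each:
eu_with = prior[0] * 10 + prior[1] * 8
print(eu_without, eu_with)  # 5.0 9.0 -- the information is worth 4.0
```

The paper's contribution is that the inequality `eu_with >= eu_without` survives when the signal is not certain and belief change proceeds by Jeffrey's probability kinematics rather than strict conditionalization.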
Can the vague meanings of probability terms such as doubtful, probable, or likely be expressed as membership functions over the [0, 1] probability interval? A function for a given term would assign a membership value of 0 to probabilities not at all in the vague concept represented by the term, a membership value of 1 to probabilities definitely in the concept, and intermediate membership values to probabilities represented by the term to some degree.
A modified pair-comparison procedure was used in 2 experiments to empirically establish and assess membership functions for several probability terms. Subjects performed 2 tasks in both experiments: They judged (1) to what degree one probability rather than another was better described by a given probability term, and (2) to what degree one term rather than another better described a specified probability. Probabilities were displayed as relative areas on spinners.
Task 1 data were analyzed from the perspective of conjoint-measurement theory, and membership function values were obtained for each term according to various scaling models. The conjoint-measurement axioms were well satisfied and goodness-of-fit measures for the scaling procedures were high. Individual differences were large but stable. Furthermore, the derived membership function values satisfactorily predicted the judgments independently obtained in task 2.
The results support the claim that the scaled values represented the vague meanings of the terms to the individual subjects in the present experimental context. Methodological implications are discussed, as are substantive issues raised by the data regarding the vague meanings of probability terms.
Assessed membership functions over the [0,1] probability interval for several vague meanings of probability terms (eg., doubtful, probable, likely), using a modified pair-comparison procedure in 2 experiments with 20 and 8 graduate business students, respectively. Subjects performed 2 tasks in both experiments: They judged (A) to what degree one probability rather than another was better described by a given probability term and (B) to what degree one term rather than another better described a specified probability. Probabilities were displayed as relative areas on spinners. Task A data were analyzed from the perspective of conjoint-measurement theory, and membership function values were obtained for each term according to various scaling models. Findings show that the conjoint-measurement axioms were well satisfied and goodness-of-fit measures for the scaling procedures were high. Individual differences were large but stable, and the derived membership function values satisfactorily predicted the judgments independently obtained in Task B. Results indicated that the scaled values represented the vague meanings of the terms to the individual Ss in the present experimental context.
Two methods for estimating dollar standard deviations were investigated in a simulated environment. 19 graduate students with management experience managed a simulated pharmaceutical firm for 4 quarters. Ss were given information describing the performance of sales representatives on 3 job components. Estimates derived using the method developed by F. L. Schmidt et al. (1979) (see record 1981-02231-001) were relatively accurate with objective sales data that could be directly translated to dollars, but resulted in overestimates of means and standard deviations when data were less directly translatable to dollars and involved variable costs. An additional problem with the Schmidt et al. procedure involved the presence of outliers, possibly caused by differing interpretations of instructions. The Cascio-Ramos estimate of performance in dollars (CREPID) technique, proposed by W. F. Cascio (1982), yielded smaller dollar standard deviations, but Ss could reliably discriminate among job components in terms of importance and could accurately evaluate employee performance on those components. Problems with the CREPID method included the underlying scale used to obtain performance ratings and a dependency on job component intercorrelations.
Used decision theoretic equations to estimate the impact of the Programmer Aptitude Test (PAT) on productivity if used to select new computer programmers for 1 yr in the federal government and the national economy. A newly developed technique was used to estimate the standard deviation of the dollar value of employee job performance, which in the past has been the most difficult and expensive item of required information. For the federal government and the US economy separately, results are presented for different selection ratios and for different assumed values for the validity of previously used selection procedures. The impact of the PAT on programmer productivity was substantial for all combinations of assumptions. Results support the conclusion that hundreds of millions of dollars in increased productivity could be realized by increasing the validity of selection decisions in this occupation. Similarities between computer programmers and other occupations are discussed. It is concluded that the impact of valid selection procedures on work-force productivity is considerably greater than most personnel psychologists have believed.
Aspects of scientific method are discussed: In particular, its representation as a motivated iteration in which, in succession, practice confronts theory, and theory, practice. Rapid progress requires sufficient flexibility to profit from such confrontations, and the ability to devise parsimonious but effective models, to worry selectively about model inadequacies and to employ mathematics skillfully but appropriately. The development of statistical methods at Rothamsted Experimental Station by Sir Ronald Fisher is used to illustrate these themes.
…Since all models are wrong the scientist must be alert to what is importantly wrong. It is inappropriate to be concerned about mice when there are tigers abroad… In applying mathematics to subjects such as physics or statistics we make tentative assumptions about the real world which we know are false but which we believe may be useful nonetheless. The physicist knows that particles have mass and yet certain results, approximating what really happens, may be derived from the assumption that they do not. Equally, the statistician knows, for example, that in nature there never was a normal distribution, there never was a straight line, yet with normal and linear assumptions, known to be false, he can often derive results which match, to a useful approximation, those found in the real world.
It follows that, although rigorous derivation of logical consequences is of great importance to statistics, such derivations are necessarily encapsulated in the knowledge that premise, and hence consequence, do not describe natural truth. It follows that we cannot know that any statistical technique we develop is useful unless we use it. Major advances in science and in the science of statistics in particular, usually occur, therefore, as the result of the theory-practice iteration.
When Values Conflict: Essays on Environmental Analysis, Discourse, and Decision is a collection of essays each of which addresses the issue of value conflicts in environmental disputes. These authors discuss the need to integrate such “fragile” values as beauty and naturalness with “hard” values such as economic efficiency in the decision making process. When Values Conflict: Essays on Environmental Analysis, Discourse, and Decision will be of interest to those who seek to include environmentalist values in public policy debates. This work is comprised of seven essays.
In the first chapter, Robert Socolow discusses obstacles to the integration of environmental values into natural resource policy. Technical studies often fail to resolve conflicts, because such conflicts rest on the parties’ very different goals and values. Nonetheless, agreement on the technical analysis may serve as a platform from which to more clearly articulate value differences.
Irene Thomson draws on the case of the Tocks Island Dam controversy to explore environmental decision making processes. She describes the impact the various parties’ interests and values have on their analyses, and argues that the fragmentation of responsibility among institutional actors contributes to the production of inadequate analyses.
Tribe’s essay suggests that a natural environment has intrinsic value, a value that cannot be reduced to human interests. This recognition may serve as the first step in developing an environmental ethic.
Charles Frankel explores the idea that nature has rights. He first explores the meaning of nature, by contrast to the supernatural, technological and cultural. He suggests that appeals to nature’s rights serve as an appeal for “institutional protection against being carried away by temporary enthusiasms.”
In Chapter Five, Harvey Brooks describes three main functions which analysis serves in the environmental decision-making process: grounding conclusions in neutral, generally accepted principles, separating means from ends, and legitimating the final policy decision. If environmental values such as beauty, naturalness and uniqueness are to be incorporated into systems analysis, this must be done in a way that preserves the basic function of analysis.
Henry Rowen discusses the use of policy analysis as an aid to making environmental decisions. He describes the characteristics of a good analysis, and argues that good analysis can help clarify the issues, and assist in “the design and invention of objectives and alternatives.” Rowen concludes by suggesting ways of improving the field of policy analysis.
Robert Dorfman provides the Afterword for this collection. This essay distinguishes between value and price, and explores the import of this distinction for cost-benefit analysis. The author concludes that there can be no “formula for measuring a project’s contribution to humane values.” Environmental decisions will always require the use of human judgement and wisdom.
When Values Conflict: Essays on Environmental Analysis, Discourse, and Decision offers a series of thoughtful essays on the nature and weight of environmentalist values. The essays range from a philosophic investigation of natural value to a more concrete evaluation of the elements of good policy analysis.
This is a study of what happens to technical analyses in the real world of politics. The Tocks Island Dam project proposed construction of a dam on the Delaware River at Tocks Island, five miles north of the Delaware Water Gap. Planned and developed in the early 1960’s, it was initially considered a model of water resource planning. But it soon became the target of an extended controversy involving a tangle of interconnected concerns—floods and droughts, energy, growth, congestion, recreation, and the uprooting of people and communities. Numerous participants—economists, scientists, planners, technologists, bureaucrats and environmentalists—measured, modeled and studied the Tocks Island proposal. The results were a weighty legacy of technical and economic analyses—and a decade of political stalemate regarding the fate of the dam. These analyses, to a substantial degree, masked the value conflicts at stake in the controversy; they concealed the real political and human issues of who would win and who would lose if the Tocks Island project were undertaken. And, the studies were infected by rigid categories of thought and divisions of bureaucratic responsibilities. This collection of original essays tells the story of the Tocks Island controversy, with a fresh perspective on the environmental issues at stake. Its contributors consider the political decision-making process throughout the controversy and show how economic and technological analyses affected those decisions. Viewed as a whole, the essays show that systematic analysis and an explicit concern for human values need not be mutually exclusive pursuits.
The Kelly (also called the Bernoulli–Latané or capital growth) criterion is to maximize the expected value E log X of the logarithm of the random variable X, representing wealth. The chapter presents a treatment of the Kelly criterion and Breiman’s results.
Breiman’s results can be extended to cover many if not most of the more complicated situations which arise in real-world portfolios. Specifically, the number and distribution of investments can vary with the time period, the random variables need not be finite or even discrete, and a certain amount of dependence can be introduced between the investment universes for different time periods. The chapter also discusses a few relationships between the max expected log approach and Markowitz’s mean-variance approach.
It highlights a few misconceptions concerning the Kelly criterion, the most notable being that decisions maximizing the expected log of wealth do not necessarily maximize the expected utility of terminal wealth, even for arbitrarily large time horizons.
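As a concrete illustration of the criterion (not from the chapter itself), here is a minimal Python sketch for the classic binary-bet case, where maximizing E log X yields the closed-form fraction f* = p − (1 − p)/b for a bet paying b-to-1 with win probability p; the numbers below are illustrative:

```python
import math

def kelly_fraction(p, b):
    """Fraction of wealth to bet on a binary gamble paying b-to-1
    with win probability p, maximizing E[log(wealth)]."""
    return p - (1 - p) / b

def expected_log_growth(f, p, b):
    """Expected log growth rate per bet when wagering fraction f."""
    return p * math.log(1 + f * b) + (1 - p) * math.log(1 - f)

# Illustrative numbers: a 60% chance to win an even-money bet.
p, b = 0.6, 1.0
f_star = kelly_fraction(p, b)  # 0.6 - 0.4/1.0 = 0.2
```

Betting more or less than f_star lowers the expected log growth rate, which is the sense in which the Kelly fraction is optimal for long-run capital growth.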
Cross-modality matching of hypothetical increments of money against loudness recovers the previously proposed exponent of the utility function for money to within a few percent. Similar cross-modality matching experiments for decrements give a disutility exponent of 0.59, larger than the utility exponent for increments. This disutility exponent was checked by an additional cross-modality matching experiment against the disutility of drinking various concentrations of a bitter solution. The parameter estimated in this fashion was 0.63.
Three experiments were conducted in which monetary increments and decrements were matched to either the loudness of a tone or the bitterness of various concentrations of sucrose octaacetate. An additional experiment involving ratio estimates of monetary loss is also reported. Results confirm that the utility function for both monetary increments and decrements is a power function with exponents less than one. The data further suggest that the exponent of the disutility function is larger than that of the utility function, i.e., the rate of change of ‘unhappiness’ caused by monetary losses is greater than the comparable rate of ‘happiness’ produced by monetary gains.
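To make the shape of these results concrete, here is a hypothetical sketch of power-law utility and disutility functions. The loss exponent 0.59 is the value reported above; the gain exponent 0.45 is a placeholder for illustration only, not a figure from the paper:

```python
def utility(gain, exponent=0.45):
    """Power-law utility of a monetary gain.
    The exponent 0.45 is a placeholder, not the paper's estimate."""
    return gain ** exponent

def disutility(loss, exponent=0.59):
    """Power-law disutility of a monetary loss.
    The exponent 0.59 is the value reported in the abstract."""
    return loss ** exponent

# With a larger exponent, 'unhappiness' grows faster across a
# tenfold increase in losses than 'happiness' does for gains:
loss_ratio = disutility(1000) / disutility(100)  # 10**0.59
gain_ratio = utility(1000) / utility(100)        # 10**0.45
```

Both exponents are below one, so both curves are concave, but the steeper loss curve reproduces the paper's qualitative finding that losses change subjective value faster than equal-sized gains.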
A simple cost function approach is proposed for designing an optimal clinical trial when a total of n patients with a disease are to be treated with one of two medical treatments.
The cost function is constructed with but one cost, the consequences of treating a patient with the superior or inferior of the two treatments. Fixed sample size and sequential trials are considered. Minimax, maximin, and Bayesian approaches are used for determining the optimal size of a fixed sample trial and the optimal position of the boundaries of a sequential trial.
Comparisons of the different approaches are made as well as comparisons of the results for the fixed and sequential plans.
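The flavor of the fixed-sample analysis can be sketched with a simple hypothetical model in the spirit of the paper: 2n of the N patients enter the trial (n per arm), the remaining N − 2n receive whichever treatment had the better sample mean, and the quantity minimized is the expected number of patients given the inferior treatment. The normal-outcome assumption and all parameter values below are illustrative, not the paper's exact formulation:

```python
from math import sqrt
from statistics import NormalDist

def expected_inferior_treatments(n, N, delta, sigma):
    """Expected number of patients receiving the inferior treatment:
    n from the inferior trial arm, plus the N - 2n remaining patients
    if the trial picks the wrong winner. delta is the true mean
    difference between treatments, sigma the outcome std deviation."""
    p_wrong = NormalDist().cdf(-delta * sqrt(n) / (sigma * sqrt(2)))
    return n + (N - 2 * n) * p_wrong

def optimal_trial_size(N, delta, sigma):
    """Per-arm sample size minimizing the expected cost above."""
    return min(range(1, N // 2 + 1),
               key=lambda n: expected_inferior_treatments(n, N, delta, sigma))

# Illustrative numbers: 1000 patients, effect size 0.5, unit variance.
n_star = optimal_trial_size(1000, 0.5, 1.0)
```

The tradeoff is visible directly: a tiny trial leaves a large chance of treating the bulk of patients with the inferior therapy, while a maximal trial knowingly gives half of all patients the inferior one.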
An analytical development of flight performance optimization according to the method of gradients or ‘method of steepest descent’ is presented. Construction of a minimizing sequence of flight paths by a stepwise process of descent along the local gradient direction is described as a computational scheme. Numerical application of the technique is illustrated in a simple example of orbital transfer via solar sail propulsion. Successive approximations to minimum-time planar flight paths from Earth’s orbit to the orbit of Mars are presented for cases corresponding to free and fixed boundary conditions on terminal velocity components.
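A toy finite-dimensional analogue of the stepwise descent scheme can be sketched as follows (the paper itself works with function-space gradients over entire flight paths, so this is only the skeleton of the idea, with an illustrative test function):

```python
def steepest_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10000):
    """Minimize a function by repeatedly stepping along the
    negative of its local gradient, stopping when steps shrink
    below tol."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        x_new = [xi - step * gi for xi, gi in zip(x, g)]
        if sum((a - b) ** 2 for a, b in zip(x, x_new)) < tol ** 2:
            return x_new
        x = x_new
    return x

# Illustrative example: minimize f(x, y) = (x - 1)^2 + 2*(y + 3)^2,
# whose gradient is [2*(x - 1), 4*(y + 3)] and minimum is (1, -3).
grad_f = lambda v: [2 * (v[0] - 1), 4 * (v[1] + 3)]
x_min = steepest_descent(grad_f, [0.0, 0.0])
```

Each iterate plays the role of one flight path in the paper's minimizing sequence, with the gradient step producing the next, improved approximation.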
This book is a non-mathematical introduction to the logical analysis of practical business problems in which a decision must be reached under uncertainty. The analysis which it recommends is based on the modern theory of utility and what has come to be known as the “personal” definition of probability; the author believes, in other words, that when the consequences of various possible courses of action depend on some unpredictable event, the practical way of choosing the “best” act is to assign values to consequences and probabilities to events and then to select the act with the highest expected value. In the author’s experience, thoughtful businessmen intuitively apply exactly this kind of analysis in problems which are simple enough to allow of purely intuitive analysis; and he believes that they will readily accept its formalization once the essential logic of this formalization is presented in a way which can be comprehended by an intelligent layman. Excellent books on the pure mathematical theory of decision under uncertainty already exist; the present text is an endeavor to show how formal analysis of practical decision problems can be made to pay its way.
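The recommended procedure, assign values to consequences and probabilities to events, then select the act with the highest expected value, can be sketched in a few lines of Python; the acts and numbers below are entirely hypothetical:

```python
def best_act(acts):
    """Pick the act with the highest expected value.
    acts maps an act name to a list of (probability, value) pairs."""
    def expected_value(outcomes):
        return sum(p * v for p, v in outcomes)
    return max(acts, key=lambda a: expected_value(acts[a]))

# Hypothetical decision: launch a product under uncertain demand.
acts = {
    "launch":       [(0.6, 100_000), (0.4, -50_000)],  # EV = 40,000
    "don't launch": [(1.0, 0)],                        # EV = 0
}
choice = best_act(acts)
```

This is the whole of the formal apparatus the book asks the businessman to accept; the hard part, as the author concedes later, is assessing the probabilities and values themselves.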
From the point of view taken in this book, there is no real difference between a “statistical” decision problem in which a part of the available evidence happens to come from a ‘sample’ and a problem in which all the evidence is of a less formal nature. Both kinds of problems are analyzed by use of the same basic principles; and one of the resulting advantages is that it becomes possible to avoid having to assert that nothing useful can be said about a sample which contains an unknown amount of bias while at the same time having to admit that in most practical situations it is totally impossible to draw a sample which does not contain an unknown amount of bias. In the same way and for the same reason there is no real difference between a decision problem in which the long-run-average demand for some commodity is known with certainty and one in which it is not; and not the least of the advantages which result from recognizing this fact is that it becomes possible to analyze a problem of inventory control without having to pretend that a finite amount of experience can ever give anyone perfect knowledge of long-run-average demand. The author is quite ready to admit that in some situations it may be difficult for the businessman to assess the numerical probabilities and utilities which are required for the kind of analysis recommended in this book, but he is confident that the businessman who really tries to make a reasoned analysis of a difficult decision problem will find it far easier to do this than to make a direct determination of, say, the correct risk premium to add to the pure cost of capital or of the correct level at which to conduct a test of statistical significance.
In sum, the author believes that the modern theories of utility and personal probability have at last made it possible to develop a really complete theory to guide the making of managerial decisions—a theory into which the traditional disciplines of statistics and economics under certainty and the collection of miscellaneous techniques taught under the name of operations research will all enter as constituent parts. He hopes, therefore, that the present book will be of interest and value not only to students and practitioners of inventory control, quality control, marketing research, and other specific business functions but also to students of business and businessmen who are interested in the basic principles of managerial economics and to students of economics who are interested in the theory of the firm. Even the teacher of a course in mathematical decision theory who wishes to include applications as well as complete-class and existence theory may find the book useful as a source of examples of the practical decision problems which do arise in the real world.
[Egon Pearson describes Student, or Gosset, as a statistician: Student corresponded widely with young statisticians and mathematicians, encouraging them and exerting an influence out of proportion to his publication record. Student’s preferred statistical tools were remarkably simple, focused on correlations and standard deviations, but wielded effectively in the analysis and efficient design of experiments (particularly agricultural experiments), and he was an early decision theorist, focused on practical problems connected to his job at the Guinness brewery; this detachment from academia partly explains why he did not publish methods or results immediately or often. The need to handle the brewery’s small samples led to his work on small-sample approximations rather than, like Pearson et al. in the Galton biometric tradition, collecting large datasets and relying on asymptotic methods, and Student carried out one of the first Monte Carlo simulations.]