Foundations of Factor Analysis, Second Edition by Stanley A. Mulaik

International Statistical Review (2010), 78, 2, 316–328 doi:10.1111/j.1751-5823.2010.00118.x

Short Book Reviews
Editor: Simo Puntanen

Linear Model Methodology
André I. Khuri
Chapman & Hall/CRC, 2010, xix + 542 pages, £ 63.99 / US$ 99.95, hardcover
ISBN: 978-1-58488-481-1

Table of contents

1. Linear models: some historical perspectives
2. Basic elements of linear algebra
3. Basic concepts in matrix algebra
4. The multivariate normal distribution
5. Quadratic forms in normal variables
6. Full rank linear models
7. Less-than-full-rank linear models
8. Balanced linear models
9. The adequacy of Satterthwaite’s approximation
10. Unbalanced fixed-effects models
11. Unbalanced random and mixed models
12. Additional topics in linear models
13. Generalized linear models

Readership: All readers interested in regression presented with a mix of theory and practice.

The material on which this book is based has been taught in a couple of courses at the University of Florida for about 20 years and the author’s skills and experience in doing this are superbly represented in this fine text. The presentation itself leans more toward the theoretical aspects, but there are numerous exercises that reinforce both the theoretical and the practical aspects of regression. (However, no solutions are provided.) “Chapters 11 and 12 can be particularly helpful to graduate students looking for dissertation topics.” (Preface) This is an excellent, reliable, and comprehensive text.

Norman R. Draper
Department of Statistics, University of Wisconsin – Madison
1300 University Avenue, Madison, WI 53706-1532, USA

Knowledge Discovery for Counterterrorism and Law Enforcement
David Skillicorn
Chapman & Hall/CRC, 2008, xx + 330 pages, £ 49.99 / US$ 79.95, hardcover
ISBN: 978-1-4200-7399-7

Table of contents

1. Introduction
2. Data
3. High-level principles
4. Looking for risk – prediction and anomaly detection
5. Looking for similarity – clustering
6. Looking inside groups – relationship discovery
7. Discovery from public textual data
8. Discovery in private communication
9. Discovering mental and emotional state
10. The bottom line

Readership: Anyone first venturing into knowledge discovery for counterterrorism.

© 2010 The Authors. Journal compilation © 2010 International Statistical Institute. Published by Blackwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA.

This is a discursive book, outlining all sorts of methods, which might be used in counterterrorism, and speculating on how they might be employed. There are very few real applied examples and these are only described in brief. I suppose this should not surprise us for this area of application, but it detracts from the text’s interest. There is a whole 30-page chapter on cluster analysis, which has just one artificial example of three people and their height and age. Even if no counterterrorism examples can be used, surely something more stimulating could have been found. The standard result on the large number of false positives that arise in searching information for terrorists properly appears, though the calculation itself is not given (and strangely enough Bayes’ Theorem does not appear at all). The author’s comment on the result is worth quoting in full: “. . . the ACM committee assume that 99.999% accuracies are unattainably remote, despite the fact that defect rates well below this are commonplace in many industrial settings, not by some kind of magic but by working at the process to reduce defects.” As a statistician, this kind of positive thinking leaves me very skeptical.

Antony Unwin
Universität Augsburg, Institut für Mathematik
D-86135 Augsburg, Germany
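The false-positive result mentioned in the review of Knowledge Discovery for Counterterrorism and Law Enforcement can be illustrated with a short Bayes' Theorem calculation. The sketch below is not taken from the book or the review: the function name and the prior of one target per million records are hypothetical, and only the 99.999% accuracy figure comes from the quoted passage.

```python
def posterior_given_flag(prior, sensitivity, specificity):
    """Bayes' Theorem: P(target | flagged) for a screening rule.

    prior        -- assumed fraction of records that are true targets
    sensitivity  -- P(flagged | target)
    specificity  -- P(not flagged | not target)
    """
    true_positive = prior * sensitivity
    false_positive = (1.0 - prior) * (1.0 - specificity)
    return true_positive / (true_positive + false_positive)


# Hypothetical base rate: one true target per million records.
# The 99.999% accuracy is the figure quoted in the review.
posterior = posterior_given_flag(prior=1e-6,
                                 sensitivity=0.99999,
                                 specificity=0.99999)
print(f"P(target | flagged) = {posterior:.3f}")
```

Under these assumed inputs roughly one flagged record in eleven is a true target, so false positives still outnumber true positives by about ten to one even at 99.999% accuracy. A rarer target would make the ratio worse.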

Statistical Methods for Categorical Data Analysis, Second Edition
Daniel A. Powers, Yu Xie
Emerald Group, 2008, xvii + 317 pages, £ 39.99 / US$ 69.95, hardcover
ISBN: 978-0-1237-2562-2

Table of contents

1. Introduction
2. Review of linear regression models
3. Models for binary data
4. Loglinear models for contingency tables
5. Multilevel models for binary data
6. Statistical models for event occurrence
7. Models for ordinal dependent variables
8. Models for nominal dependent variables
A. The matrix approach to regression
B. Maximum likelihood estimation

Readership: Social science researchers.

There are quite a few books on analyzing categorical data. This one has the expressed aim of integrating the transformational approach familiar to statisticians with the latent variable approach “often taken by economists.” It covers a fairly wide range of models in a reasonably successful manner. Though it has a certain amount of mathematics, this is not covered in any great depth. Linear regression is explained in matrix form in an appendix and the Bayes factor is described as “complicated and beyond the scope of this book.” In keeping with the other books in this area, there are disappointingly few graphics, and mosaic plots do not get a mention. In contrast with some of the other books, there are not many motivating examples and when the examples included are analyzed the results are not discussed in any detail. If they had been (or if graphics had been used) the authors might have noticed the two errors in Table 6.8 on page 191 that I spotted. Although there are no exercises, there is a supporting website, which includes code for the examples using a variety of software packages. The book is now in its second edition, so it has already achieved a certain amount of recognition. With better examples it could get more.

Antony Unwin
Universität Augsburg, Institut für Mathematik
D-86135 Augsburg, Germany

From Finite Sample to Asymptotic Methods in Statistics
Pranab K. Sen, Julio M. Singer, Antonio C. Pedroso de Lima
Cambridge University Press, 2009, xii + 386 pages, £ 45.00 / US$ 70.00, hardcover
ISBN: 978-0-521-87722-0

Table of contents

1. Motivation and basic tools
2. Estimation theory
3. Hypothesis testing
4. Elements of statistical decision theory
5. Stochastic processes: an overview
6. Stochastic convergence and probability inequalities
7. Asymptotic distributions
8. Asymptotic behavior of estimators and tests
9. Categorical data models
10. Regression models
11. Weak convergence and Gaussian processes

Readership: Advanced undergraduate or beginning graduate students in statistics, biostatistics, or applied statistics, academic researchers in statistically oriented fields.

The authors point out in the preface that “. . . , our intent is to provide a broad view of finite-sample statistical methods, to examine their merits and caveats, and to judge how far asymptotic results eliminate some of the detected impasses, providing the basis for sound application of approximate statistical inference in large samples.” The book succeeds admirably in its aim of providing an overview of finite-sample (exact or small) methods, appraising their scope and integrating them to asymptotic (approximate or large-sample) inference. The treatment of the material is application-oriented and yet mathematically rigorous.

In Chapter 1 the authors motivate their approach through a set of illustrative examples ranging from very simple to more complex applications. Also a summary of some basic tools and results (on matrix algebra, real analysis, probability distributions, order statistics, and quantiles) needed in the text is provided. Chapters 2 and 3 lay out the two building blocks of statistical inference, estimation and testing, and in these chapters the authors address the important issues relating to likelihood, sufficiency and invariance, among others. The chapter titles shown above indicate the range of topics covered in the text.

The book is very well written and clear. The overall standard of explanation is very good; new ideas are accompanied by several worked examples, although one might also have wished for numerical examples with some indication of how the theory performs in practice. There are also a large number of suitable exercises for the reader. In my view, this text can be warmly recommended for lecture courses in asymptotic statistics and courses in statistical inference.

Erkki P. Liski
Department of Mathematics and Statistics
FI-33014 University of Tampere, Finland

Steps Towards a Unified Basis for Scientific Models and Methods
Inge S. Helland
World Scientific, 2010, xviii + 257 pages, £ 56.00 / US$ 75.00, hardcover
ISBN: 978-981-4280-85-3

Table of contents

1. The basic elements
2. Statistical theory and practice
3. Statistical inference under symmetry
4. The transition from statistics to quantum theory
5. Quantum mechanics from a statistical basis
6. Further development of quantum mechanics
7. Decisions in statistics
8. Multivariate data analysis and statistics
9. Quantum mechanics and the diversity of concepts
10. Epilogue
A.1. Mathematical aspects of basic statistics
A.2. Transformation groups and group transformations
A.3. Technical aspects of quantum mechanics
A.4. Some aspects of partial least squares regression

Readership: Those interested in the broader aspects of statistical theory and concepts and especially those with a concern with links to quantum theory.

This wide-ranging book aims to address and link broad conceptual issues, in particular in statistical theory and quantum theory. The introductory chapter discusses complementarity in a wide sense and introduces the notion of conceptually defined variables, c-variables. These link with counterfactual variables and latent variables in the sense used in statistical theory, but are intended to be broader. There follows a remarkably clear and compact summary of the theory of statistical inference, limited mainly by a concentration on transformation models. Remarks on a range of more applied issues make an interesting commentary on the more mathematical parts. Chapters 5 and 6 deal with quantum mechanics, starting with a summary account of the conventional approach and then leading to a development from a new set of axioms claimed to have a clearer intuitive content, an aspect which the author considers important. The final chapters cover a wide range of topics, mostly statistical. The writing is lucid. Whether a useful synthesis has been achieved is unclear to this reviewer.

David Cox
Nuffield College, New Road
Oxford, OX1 1NF, UK

A First Course in Probability and Statistics
B. L. S. Prakasa Rao
World Scientific, 2008, xii + 317 pages, £ 26.00 / US$ 48.00, softcover (also available as hardcover)
ISBN: 978-981-283-654-0

Table of contents

1. Why statistics?
2. Probability on discrete sample spaces
3. Discrete probability distributions
4. Continuous probability distributions
5. Multivariate probability distributions
6. Functions of random vectors
7. Approximations to some probability distributions
8. Estimation
9. Interval estimation and testing of hypotheses
10. Linear regression and correlation
Appendix A. References
Appendix B. Answers to selected exercises
Appendix C. Tables

Readership: Undergraduate courses in statistics and probability, mathematics students who are studying probability.

This book assumes that the reader has completed a course on calculus and has a thorough knowledge and understanding of this. The approach is very mathematical, with many proofs included. Although the text is advertised for those doing Social Science and Business Administration, such readers may find the title misleading: it is certainly suitable for those studying mathematics and statistics but may be difficult for the other subject disciplines.

The book is very comprehensive in its coverage of the topics included and contains a wealth of exercises at the end of each chapter. Solutions to only a selected few of these exercises can be found in the appendix and there are no solutions for Chapters 8–10. For a first course it would have been useful to include more solutions, for instance to question 10.1, which asks the reader to determine the regression line that best fits five given points.

This is a book that is intended for those of a mathematical mind, who have a background in calculus, and a good grasp of mathematics in general.

Susan Starkings
Centre for Learning Support and Development, London South Bank University
103 Borough Road, London, SE1 0AA, UK

Statistics for Engineers: An Introduction
S. J. Morrison
Wiley, 2009, xiv + 177 pages, € 46.00 / £ 39.95 / US$ 70.00, hardcover
ISBN: 978-0-470-74556-4

Table of contents

1. Nature of variability
2. Basic statistical methods
3. Production
4. Engineering design
5. Research and development
6. Background
7. Quality management
8. Conclusion
Appendix A: Guidelines
Appendix B: Recommended books
Appendix C: Periodicals
Appendix D: Supplementary bibliography
Appendix E: Statistical tables

Readership: Students on or considering courses in engineering.

This book is written by an engineer for an engineering readership and contains practical advice and guidance on the statistical results obtained in a variety of situations. A broad range of statistical methods that is relevant to engineering with the minimum of mathematics and maximum of explanation is the essence of this text.

The book is very comprehensive in its coverage of the engineering topics and fully explains the techniques used here. The text focuses on the statistical methods that engineers need, how they work and how to use them safely. Also included is a wealth of relevant references at the end of each chapter as well as those in the appendices. There are no exercises for the reader to attempt, as it is assumed that the lecturer will provide these, but the book is extremely useful for its explanations.

This is a book that is intended for engineering students and is to be recommended to have in any college or university that has students studying this subject discipline.

Susan Starkings
Centre for Learning Support and Development, London South Bank University
103 Borough Road, London, SE1 0AA, UK

Exploring the Origin, Extent, and Future of Life: Philosophical, Ethical and Theological Perspectives
Constance M. Bertka (Editor)
Cambridge University Press, 2009, xii + 324 pages, £ 65.00 / US$ 120.00, hardcover
ISBN: 978-0-521-86363-6

Table of contents

1. Astrobiology in societal context (Constance M. Bertka)
Part I. Origin of Life
2. Emergence and the experimental pursuit of the origin of life (Robert M. Hazen)
3. From Aristotle to Darwin, to Freeman Dyson: changing definitions of life viewed in historical context (James E. Strick)
4. Philosophical aspects of the origin-of-life problem: the emergence of life and the nature of science (Iris Fry)
5. The origin of terrestrial life: a Christian perspective (Ernan McMullin)
6. The alpha and the omega: reflections on the origin and future of life from the perspective of Christian theology and ethics (Celia Deane-Drummond)
Part II. Extent of Life
7. A biologist’s guide to the Solar System (Lynn J. Rothschild)
8. The quest for habitable worlds and life beyond the Solar System (Carl Pilcher, Jack J. Lissauer)
9. A historical perspective on the extent and search for life (Steven J. Dick)
10. The search for extraterrestrial life: epistemology, ethics, and worldviews (Mark Lupisella)
11. The implications of discovering extraterrestrial life: different searches, different issues (Margaret S. Race)
12. God, evolution, and astrobiology (Cynthia S.W. Crysdale)
Part III. Future of Life
13. Planetary ecosynthesis on Mars: restoration ecology and environmental ethics (Christopher P. McKay)
14. The trouble with intrinsic value: an ethical primer for astrobiology (Kelly C. Smith)
15. God’s preferential option for life: a Christian perspective on astrobiology (Richard O. Randolph)
16. Comparing stories about the origin, extent, and future of life: an Asian religious perspective (Francisca Cho)

Readership: Readers interested in Astrobiology.

I was intrigued with the title to see how statistics was to be used in a book of this title and indeed why it was sent to the ISI for a review. The book is divided into three parts, namely (i) Origin of life, (ii) Extent of life, and (iii) Future of life. The text contains very little statistics; however, those that are present show an interesting use of statistics. I would not suggest that this is a statistics book in any way, shape or form, just that it has some, albeit very little, statistics contained within its pages. So unless one is interested in the area of astrobiology, religion, and ethics or has a philosophical interest then this is not one for you. Having said that, I found the book to be very interesting and refreshingly different from the usual academic books that come my way.

The book was completed with support from the National Aeronautics and Space Administration and the John Templeton Foundation. It is a valuable text for graduate students and researchers with an interest in astrobiology.

Susan Starkings
Centre for Learning Support and Development, London South Bank University
103 Borough Road, London, SE1 0AA, UK

Philosophical Transactions of The Royal Society A, 367 (1906)
Theme Issue ‘Statistical Challenges of High-dimensional Data’
David L. Banks, Peter J. Bickel, Iain M. Johnstone, D. Michael Titterington (Editors)
Royal Society Publishing, 2009, 236 pages, £ 58.00, softcover
ISBN: 978-0-85403-779-7

Table of contents

Introduction (I.M. Johnstone, D.M. Titterington)
Selective inference in complex research (Y. Benjamini, R. Heller, D. Yekutieli)
Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing (D. Donoho, J. Tanner)
On landmark selection and sampling in high-dimensional data analysis (M.-A. Belabbas, P.J. Wolfe)
An overview of recent developments in genomics and associated statistical methods (P.J. Bickel, J.B. Brown, H. Huang, Q. Li)
Cherry-picking for complex data: robust structure discovery (D.L. Banks, L. House, K. Killourhy)
Statistical inference for exploratory data analysis and model diagnostics (A. Buja, D. Cook, H. Hofmann, M. Lawrence, E.-K. Lee, D.F. Swayne, H. Wickham)
Sufficient dimension reduction and prediction in regression (K.P. Adragni, R.D. Cook)
Identifying graph clusters using variational inference and links to covariance parametrization (D. Barber)
Classification of sparse high-dimensional vectors (Yu. I. Ingster, C. Pouet, A.B. Tsybakov)
Feature selection by higher criticism thresholding achieves the optimal phase diagram (D. Donoho, J. Jin)

Readership: A very good book for those who are interested in knowing what is meant by high-dimensional problems, where they are coming from, and how statisticians and computer and information scientists are solving them.

The book under review is a collection of 11 excellent articles reprinted from the Philosophical Transactions of the Royal Society, vol. 367, pages 4235 through 4470. The front cover reminds us that Philosophical Transactions is the world’s longest running science journal; from the back cover we learn that the Society was founded in 1660. Another great name is associated with these papers: they were prepared as part of the program Statistical Theory and Methods for Complex High-Dimensional Data at the Isaac Newton Institute for Mathematical Sciences in Cambridge, UK.

For anyone working or wishing to work in this area or just learn what is going on in the most happening part of our subject, this is a wonderful book, providing introduction and overview of what is now known as well as new emerging ideas, methods, theorems, and conjectures. Topics covered include variable/feature selection, regression, classification, visual exploration and novel visual confirmatory analysis, multiple tests, robust structure hunting, graph clusters, model selection, and sufficient dimension reduction. Among all these exciting theoretical and practical developments, perhaps the most wonderful are the conjectures on and partial verification of phase transitions in high-dimensional multiple testing by Donoho and Tanner (pp. 4273–4294), and the partial verification of optimality of Tukey’s Higher Criticism under phase transition by Donoho and Jin (pp. 4449–4470).

A brief review of the papers follows. In their introduction, which also serves as a survey of high-dimensional problems, Johnstone and Titterington explain that high dimension means a high-dimensional parameter, usually but not always accompanied by a relatively small replication. Such problems defy the old requirement that the number of sampling units should exceed the number of parameters, the more so the better. These problems arise in molecular biology, image processing, communication, and other diverse areas. These problems are usually solved by a sparsity assumption that only a small number of many parameters are nonzero. But one does not know which ones are nonzero, so the problem remains even with the sparsity assumption. It turns out that the signals, that is, the nonzero parameters, should be sufficiently large in magnitude to be detected. One could describe this as the first stage of high-dimensional statistics. This stage still continues but a second stage has begun too. People working in this area have begun to worry about what happens when the sparsity assumption fails at least partly, that is, there are fewer signals than what is provided for by assumed levels of sparsity and the signals may be both rare and weak. Then the methods for high dimensions developed in the first stage break down. Donoho and Tanner, in what is perhaps the most stunning article, show the level of sparsity at which breakdown begins to show up is surprisingly stable over different examples and domains and relate this to combinatorial geometry of polytopes. This is very important work still in its infancy; there will be many beautiful as well as useful results.

Since many of these problems originate in molecular biology, in fact in Genomics specifically, Bickel et al. provide a very useful survey of old and very new problems in different subareas of Genomics and the solutions being offered within Classical Statistics, Machine Learning and Bayesian Analysis.

Buja et al. show how visual display, generally considered part of exploratory analysis, can also be used for confirmatory analysis like testing of hypotheses; this seems a very novel idea. Banks et al. discuss robust structure discovery, which is important since robustness of high-dimensional analysis is rarely studied. Adragni and Cook discuss (sufficient) dimension reduction, which is somewhat like principal components, but is applied to inverse regression and aims at being nonparametric. The idea is due to K.C. Li but developed a lot by these authors. Benjamini et al. provide a lovely introduction to the famous Benjamini–Hochberg multiple test and two very useful new techniques to cope with selection bias and problems of multiple tests, for example to discover genes influencing some disease or resistance to it, conducted independently at different sites.

This is only a sparse picture of a complex high-dimensional landscape that unfolds in these eleven articles. In the introductory survey of the first article, among other things there is mention of a “brief encounter with Bayesian Statistics.” If the followers of Reverend Bayes are invited to contribute to another volume, I am sure the brief encounter would explode into another set of new concepts like “shotguns,” “horseshoes,” and other quite new and successful principles for variable selection in very complex problems that do not violate the very strict scientific standards laid down by Sir Isaac.

Jayanta K. Ghosh
Department of Statistics, Purdue University
West Lafayette, IN 47909, USA
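For readers unfamiliar with the Benjamini–Hochberg multiple test named in the review above, the following is a minimal sketch of the standard step-up procedure for controlling the false discovery rate. It is not taken from the reviewed article: the function name, the level q = 0.05, and the p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: sort the m p-values, find the largest rank k with
    p_(k) <= k * q / m, and reject the k hypotheses with the smallest
    p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k = rank  # keep the largest rank that passes its threshold
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Hypothetical p-values from four independent tests.
print(benjamini_hochberg([0.039, 0.001, 0.216, 0.008]))
```

Because the rule looks for the largest passing rank, a p-value that misses its own threshold can still be rejected when a larger p-value passes a later one. This is what distinguishes the step-up procedure from a simple per-test cutoff.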

R Through Excel: A Spreadsheet Interface for Statistics, Data Analysis, and Graphics
Richard M. Heiberger, Erich Neuwirth
Springer, 2009, xxiv + 342 pages, € 54.95 / £ 49.99 / US$ 64.95, softcover
ISBN: 978-1-4419-0051-7

Table of contents

1. Getting started
2. Using RExcel and R Commander
3. Getting data into R
4. Normal and t distributions
5. Normal and t workbook
6. t-Tests
7. One-way ANOVA
8. Simple linear regression
9. What is least squares?
10. Multiple regression – two X-variables
11. Polynomial regression
12. Multiple regression – three or more X-variables
13. Contingency tables and the chi-square test
A. Installation of RExcel
B. Nuisances – installation, startup, or execution

Readership: Students, researchers, and others who wish to use R but avoid the command line.

This book is essentially a manual for the RExcel software. RExcel is an add-in to Excel which allows access to the statistical functionality of R including user-contributed packages via the Excel interface. Although the book contains 342 pages, there is limited text. Most commonly a page consists of one or more screenshots showing how to use RExcel. The whole book is reproduced in color, on glossy paper.

Readers are guided through the menu system (which is based on R Commander) to see how to carry out common statistical procedures. The level of statistical understanding required is roughly that of an introductory applied statistics subject at university level (t-tests, one-way ANOVA, multiple regression, contingency tables). A number of workbooks demonstrating various statistical calculations and procedures come with RExcel. These are intended to support statistics courses. A workbook titled Demo Files for the book R through Excel is described in the book. It covers topics such as data formats, normal and t distributions, and linear regression.

Two appendices deal with installation and possible problems.

I found very few errors. Unfortunately, the help system changed with R 2.10, so the help window shown on p. 36 no longer applies. All the author names and dates in the bibliography are duplicated. The index is rather limited, with no entry for “workbook,” for example, when I wanted to look that up.

For anyone wishing to learn RExcel this book would be a useful purchase.

David J. Scott
Department of Statistics, The University of Auckland
Private Bag 92019, Auckland 1142, New Zealand

Machine Learning: An Algorithmic Perspective
Stephen Marsland
Chapman & Hall/CRC, 2009, xvi + 390 pages, £ 38.69 / US$ 62.96, hardcover
ISBN: 978-1-4200-6718-7

Table of contents

1. Introduction
2. Linear discriminants
3. The multi-layer perceptron
4. Radial basis functions and splines
5. Support vector machines
6. Learning with trees
7. Decision by committee: ensemble learning
8. Probability and learning
9. Unsupervised learning
10. Dimensionality reduction
11. Optimization and search
12. Evolutionary learning
13. Reinforcement learning
14. Markov Chain Monte Carlo (MCMC) methods
15. Graphical models
16. Python

Readership: Undergraduate computer science and engineering students.

This book is intended to be a practical first introduction to the ideas and methods of machine learning, for those without substantial mathematical machinery to hand. It thus places emphasis on algorithms, rather than the mathematics behind them, and is liberally illustrated with many programming examples, using Python. It includes a basic primer on Python and has an accompanying website.

It has excellent breadth, and is comprehensive in terms of the topics it covers, both in terms of methods (e.g., including neural networks, support vector machines, ensemble methods, tree classifiers, reinforcement learning, stochastic methods, tracking, belief networks, etc.) and in terms of concepts and theory (e.g., dimensionality issues, optimization, etc.).

There is a “further reading” section at the end of each chapter, which is useful, but the references have not been collected together in a list at the end of the book, which can be a disadvantage – something to consider for the second edition.

Overall, I think the author has succeeded in his aim: the book provides an accessible introduction to machine learning. It would be excellent as a first exposure to the subject, and would put the various ideas in context, before moving on to a more elaborate and deep treatment, such as that in Hastie, Tibshirani, and Friedman’s The Elements of Statistical Learning.

This book also includes the first occurrence I have seen in print of a reference to a zettabyte of data (10^21 bytes) – a reference to “all the world’s computers” being estimated to contain almost a zettabyte by 2010.

David J. Hand: [email protected] Department, Imperial College

London SW7 2AZ, UK

Introduction to Social Statistics: The Logic of Statistical Reasoning
Thomas Dietz, Linda Kalof
Wiley-Blackwell, 2009, xxxviii + 569 pages, € 32.20 / £ 27.99 / US$ 94.95, hardcover
ISBN: 978-1-4051-6902-8

Table of contents

1. An introduction to quantitative analysis
2. Some basic concepts
3. Displaying data one variable at a time
4. Describing data
5. Plotting relationships and conditional distributions
6. Causation and models of causal effects
7. Probability
8. Sampling distributions and inference
9. Using sampling distributions: confidence intervals
10. Using sampling distributions: hypothesis tests
11. The subtle logic of analysis of variance
12. Goodness of fit and models of frequency tables
13. Bivariate regression and correlation
14. Basics of multiple regression
Appendix A. Summary of variables in examples
Appendix B. Mathematics review
Appendix C. Statistical tables

Readership: Undergraduate students in the social sciences.

The book opens with the statement that statistics is hard. While I am not sure that that is the best way to encourage readership and sales, it is certainly a message I endorse, especially for the intended readership of the book. People who are studying statistics as a necessary sideline to their real interests often feel discouraged by the effort needed to master it, and reflect this back on themselves (as in “I am unable to do it”), with the implication that they are inadequate in some way. Instead it is healthier to take the attitude of this book – that the subject matter is intrinsically hard, and one should expect to have to work at it.

Unfortunately, to illustrate the hardness of statistics, the authors quote Persi Diaconis (misspelling his name) saying “Our brains are just not wired to do probability problems very well.” Diaconis was, as he said, speaking of probability (he was discussing the Monty Hall problem), and probability and statistics are very different kinds of beast. One is a branch of mathematics, describing the empirical consequences of given models, and the other is the technology of inferring likely underlying structures from empirical data.

The authors then go on to say “it was not until the 1600s, when Galileo correctly analyzed chances in games based on dice, that people began to understand the probabilities that underpin random processes.” It is true that Galileo did investigate such matters, in the early part of the seventeenth century, but his paper was not published until 1718 and people normally date the start of a formal understanding of probability to the correspondence between Pascal and Fermat in 1654, again about a gambling problem.

The book criticizes the cookbook approach to statistics. This is a criticism with which I agree entirely. However, I think the authors have not moved as far from this approach as they could. One illustration is provided by Chapter 4, which essentially lists various simple descriptive statistics, with very little comparative evaluation. In fact the appropriate statistic to use (e.g., mean or median) depends on the question one is trying to answer, and that fact can be used to develop statistical ideas and tools from a clearly noncookbook orientation.

I liked the description of quantitative data analysis as a craft, and also the notion of learning statistics by apprenticeship. Elsewhere this has been described as a strategy for learning statistical thinking, as opposed to the statistical machinery taught on most courses.

The book contains a wealth of effective and helpful real examples, and includes exercises after each substantive chapter. It has been beautifully produced. I think the pace and presentation are exactly right for the intended audience.

So it has some good things about it. But (in my opinion!) it also has some bad things. The authors appear to regard the law of large numbers and the central limit theorem as the same thing (e.g., p. 266 “the Law of Large Numbers, which is also known as the Central Limit Theorem”). The words “nonparametric” and “distribution-free” do not appear in the index, and I did not spot them in the body of the book. Surely this is a major omission, as nonparametric methods are widely used in the social sciences. The book refers to the “controversy” over the relationship between levels of measurement and choice of statistical technique. But I would suggest that this controversy has evaporated with the recognition of the distinction between pragmatic and representational aspects of measurement. Of all the sections of the book, I found this discussion the weakest – which is a little disappointing since measurements of both types figure large in the social sciences. The book adopts an entirely frequentist perspective, with just one page mentioning the alternative Bayesian view – but I am afraid I found that description unconvincing since the description of subjective probability was rather confused. Given the huge progress in practical application of Bayesian methods, I think this topic deserved better. Put together, such criticisms mean the book has a rather old-fashioned feel, in terms of statistical methodology.

There is some discussion of missing data, but I would have liked more on this and indeed on data quality in general. I recognize that this is something of a hobby horse of mine but, after all, the first thing students discover when they step out of the classroom and have to apply their hard-won statistical expertise in practice is that the data facing them are not as clean as the data on which they have been practicing. The real world is a messy place, and so are real data.

Overall, as will probably be obvious, I found the book rather frustrating. While there is a lot I like about it, I found the lack of rigor – to the extent of quite often saying things that I would argue are wrong – grating. It might not matter to a student for whom this is the only statistics course they ever study, and who never uses statistics in later life, but presumably the authors hope that many of the readers will go on to use the ideas and tools.

David J. Hand: [email protected] Department, Imperial College

London SW7 2AZ, UK

Foundations of Factor Analysis, Second Edition
Stanley A. Mulaik
Chapman & Hall/CRC, 2009, xxiv + 524 pages, £ 39.99 / US$ 79.95, hardcover
ISBN: 978-1-4200-9961-4

Table of contents

1. Introduction
2. Mathematical foundations for factor analysis
3. Composite variables and linear transformations
4. Multiple and partial correlations
5. Multivariate normal distribution
6. Fundamental equations of factor analysis
7. Methods of factor extraction
8. Common-factor analysis
9. Other models of factor analysis
10. Factor rotation
11. Orthogonal analytic rotation
12. Oblique analytic rotation
13. Factor scores and factor indeterminacy
14. Factorial invariance
15. Confirmatory-factor analysis

Readership: Researchers and graduate students interested in fundamental issues of factor analysis.

The first sentence of the Preface says it all: “This is a book for those who want or need to get to the bottom of things.” The first edition of this book appeared almost 40 years ago. It began with a somewhat more mysterious sentence: “When I was nine years old, I dismantled the family alarm clock.”

I must say that I am very happy that the author has taken up the challenge to update and revise this precious book into a second edition. It will be an important source for decades to come. Although many topics in the first edition remain relevant, there are good grounds for the new edition, mostly due to the development of factor analysis, which has (again) taken huge steps since 1972. The history of factor analysis is very long and rather complicated. It is therefore quite natural that the usual books do not necessarily help in understanding where all the equations and different procedures actually came from. But this is not a usual book.

As its title suggests, it digs deep into the foundations of factor analysis, shedding light on dozens of questions concerning models, estimation, interpretation, generally applied rules, various algorithms and backgrounds of things, even philosophical and historical notes, etc. All this, together with some jokes here and there, makes reading the book like following a series of enjoyable lectures.

Throughout, the topics are explained clearly, and mathematics is taught as it is needed to understand the derivation of an equation or some procedure. Although there are numerous equations and formulas, there is also a great deal of text explaining them and offering more insight. Overall, the book is worth having nearby if you find yourself facing serious questions like “why?” or “who?” or “how?” related to factor analysis.

Kimmo Vehkalahti: [email protected] Department of Social Research

FI-00014 University of Helsinki, Finland
