Statistical Pattern Recognition, Second Edition

An introduction to the book "Statistical Pattern Recognition, Second Edition" by Andrew R. Webb, published by Wiley & Sons in 2002. The book is available in PDF format, in English. "Statistical Pattern Recognition, Second Edition" is listed under the category "Uncategorized".

Statistical pattern recognition is a very active area of study and research, which has seen many advances in recent years. New and emerging applications - such as data mining, web searching, multimedia data retrieval, face recognition, and cursive handwriting recognition - require robust and efficient pattern recognition techniques. Statistical decision making and estimation are regarded as fundamental to the study of pattern recognition. Statistical Pattern Recognition, Second Edition has been fully updated with new methods, applications and references. It provides a comprehensive introduction to this vibrant area - with material drawn from engineering, statistics, computer science and the social sciences - and covers many application areas, such as database design, artificial neural networks, and decision support systems.

* Provides a self-contained introduction to statistical pattern recognition.
* Each technique described is illustrated by real examples.
* Covers Bayesian methods, neural networks, support vector machines, and unsupervised classification.
* Each section concludes with a description of the applications that have been addressed and with further developments of the theory.
* Includes background material on dissimilarity, parameter estimation, data, linear algebra and probability.
* Features a variety of exercises, from 'open-book' questions to more lengthy projects.

The book is aimed primarily at senior undergraduate and graduate students studying statistical pattern recognition, pattern processing, neural networks, and data mining, in both statistics and engineering departments. It is also an excellent source of reference for technical professionals working in advanced information development environments.
For further information on the techniques and applications discussed in this book please visit [www.statistical-pattern-recognition.net](http://www.statistical-pattern-recognition.net/)

Content:

* Chapter 1 Introduction to Statistical Pattern Recognition (pages 1–31)
* Chapter 2 Density Estimation – Parametric (pages 33–80)
* Chapter 3 Density Estimation – Nonparametric (pages 81–122)
* Chapter 4 Linear Discriminant Analysis (pages 123–168)
* Chapter 5 Nonlinear Discriminant Analysis – Kernel Methods (pages 169–202)
* Chapter 6 Nonlinear Discriminant Analysis – Projection Methods (pages 203–224)
* Chapter 7 Tree-Based Methods (pages 225–249)
* Chapter 8 Performance (pages 251–303)
* Chapter 9 Feature Selection and Extraction (pages 305–360)
* Chapter 10 Clustering (pages 361–407)
* Chapter 11 Additional Topics (pages 409–418)

Preface. Notation.

1 Introduction to statistical pattern recognition. 1.1 Statistical pattern recognition. 1.1.1 Introduction. 1.1.2 The basic model. 1.2 Stages in a pattern recognition problem. 1.3 Issues. 1.4 Supervised versus unsupervised. 1.5 Approaches to statistical pattern recognition. 1.5.1 Elementary decision theory. 1.5.2 Discriminant functions. 1.6 Multiple regression. 1.7 Outline of book. 1.8 Notes and references. Exercises.

2 Density estimation - parametric. 2.1 Introduction. 2.2 Normal-based models. 2.2.1 Linear and quadratic discriminant functions. 2.2.2 Regularised discriminant analysis. 2.2.3 Example application study. 2.2.4 Further developments. 2.2.5 Summary. 2.3 Normal mixture models. 2.3.1 Maximum likelihood estimation via EM. 2.3.2 Mixture models for discrimination. 2.3.3 How many components? 2.3.4 Example application study. 2.3.5 Further developments. 2.3.6 Summary. 2.4 Bayesian estimates. 2.4.1 Bayesian learning methods. 2.4.2 Markov chain Monte Carlo. 2.4.3 Bayesian approaches to discrimination. 2.4.4 Example application study. 2.4.5 Further developments. 2.4.6 Summary. 2.5 Application studies. 2.6 Summary and discussion.
2.7 Recommendations. 2.8 Notes and references. Exercises.

3 Density estimation - nonparametric. 3.1 Introduction. 3.2 Histogram method. 3.2.1 Data-adaptive histograms. 3.2.2 Independence assumption. 3.2.3 Lancaster models. 3.2.4 Maximum weight dependence trees. 3.2.5 Bayesian networks. 3.2.6 Example application study. 3.2.7 Further developments. 3.2.8 Summary. 3.3 k-nearest-neighbour method. 3.3.1 k-nearest-neighbour decision rule. 3.3.2 Properties of the nearest-neighbour rule. 3.3.3 Algorithms. 3.3.4 Editing techniques. 3.3.5 Choice of distance metric. 3.3.6 Example application study. 3.3.7 Further developments. 3.3.8 Summary. 3.4 Expansion by basis functions. 3.5 Kernel methods. 3.5.1 Choice of smoothing parameter. 3.5.2 Choice of kernel. 3.5.3 Example application study. 3.5.4 Further developments. 3.5.5 Summary. 3.6 Application studies. 3.7 Summary and discussion. 3.8 Recommendations. 3.9 Notes and references. Exercises.

4 Linear discriminant analysis. 4.1 Introduction. 4.2 Two-class algorithms. 4.2.1 General ideas. 4.2.2 Perceptron criterion. 4.2.3 Fisher's criterion. 4.2.4 Least mean squared error procedures. 4.2.5 Support vector machines. 4.2.6 Example application study. 4.2.7 Further developments. 4.2.8 Summary. 4.3 Multiclass algorithms. 4.3.1 General ideas. 4.3.2 Error-correction procedure. 4.3.3 Fisher's criterion - linear discriminant analysis. 4.3.4 Least mean squared error procedures. 4.3.5 Optimal scaling. 4.3.6 Regularisation. 4.3.7 Multiclass support vector machines. 4.3.8 Example application study. 4.3.9 Further developments. 4.3.10 Summary. 4.4 Logistic discrimination. 4.4.1 Two-group case. 4.4.2 Maximum likelihood estimation. 4.4.3 Multiclass logistic discrimination. 4.4.4 Example application study. 4.4.5 Further developments. 4.4.6 Summary. 4.5 Application studies. 4.6 Summary and discussion. 4.7 Recommendations. 4.8 Notes and references. Exercises.

5 Nonlinear discriminant analysis - kernel methods. 5.1 Introduction. 5.2 Optimisation criteria.
5.2.1 Least squares error measure. 5.2.2 Maximum likelihood. 5.2.3 Entropy. 5.3 Radial basis functions. 5.3.1 Introduction. 5.3.2 Motivation. 5.3.3 Specifying the model. 5.3.4 Radial basis function properties. 5.3.5 Simple radial basis function. 5.3.6 Example application study. 5.3.7 Further developments. 5.3.8 Summary. 5.4 Nonlinear support vector machines. 5.4.1 Types of kernel. 5.4.2 Model selection. 5.4.3 Support vector machines for regression. 5.4.4 Example application study. 5.4.5 Further developments. 5.4.6 Summary. 5.5 Application studies. 5.6 Summary and discussion. 5.7 Recommendations. 5.8 Notes and references. Exercises.

6 Nonlinear discriminant analysis - projection methods. 6.1 Introduction. 6.2 The multilayer perceptron. 6.2.1 Introduction. 6.2.2 Specifying the multilayer perceptron structure. 6.2.3 Determining the multilayer perceptron weights. 6.2.4 Properties. 6.2.5 Example application study. 6.2.6 Further developments. 6.2.7 Summary. 6.3 Projection pursuit. 6.3.1 Introduction. 6.3.2 Projection pursuit for discrimination. 6.3.3 Example application study. 6.3.4 Further developments. 6.3.5 Summary. 6.4 Application studies. 6.5 Summary and discussion. 6.6 Recommendations. 6.7 Notes and references. Exercises.

7 Tree-based methods. 7.1 Introduction. 7.2 Classification trees. 7.2.1 Introduction. 7.2.2 Classifier tree construction. 7.2.3 Other issues. 7.2.4 Example application study. 7.2.5 Further developments. 7.2.6 Summary. 7.3 Multivariate adaptive regression splines. 7.3.1 Introduction. 7.3.2 Recursive partitioning model. 7.3.3 Example application study. 7.3.4 Further developments. 7.3.5 Summary. 7.4 Application studies. 7.5 Summary and discussion. 7.6 Recommendations. 7.7 Notes and references. Exercises.

8 Performance. 8.1 Introduction. 8.2 Performance assessment. 8.2.1 Discriminability. 8.2.2 Reliability. 8.2.3 ROC curves for two-class rules. 8.2.4 Example application study. 8.2.5 Further developments. 8.2.6 Summary.
8.3 Comparing classifier performance. 8.3.1 Which technique is best? 8.3.2 Statistical tests. 8.3.3 Comparing rules when misclassification costs are uncertain. 8.3.4 Example application study. 8.3.5 Further developments. 8.3.6 Summary. 8.4 Combining classifiers. 8.4.1 Introduction. 8.4.2 Motivation. 8.4.3 Characteristics of a combination scheme. 8.4.4 Data fusion. 8.4.5 Classifier combination methods. 8.4.6 Example application study. 8.4.7 Further developments. 8.4.8 Summary. 8.5 Application studies. 8.6 Summary and discussion. 8.7 Recommendations. 8.8 Notes and references. Exercises.

9 Feature selection and extraction. 9.1 Introduction. 9.2 Feature selection. 9.2.1 Feature selection criteria. 9.2.2 Search algorithms for feature selection. 9.2.3 Suboptimal search algorithms. 9.2.4 Example application study. 9.2.5 Further developments. 9.2.6 Summary. 9.3 Linear feature extraction. 9.3.1 Principal components analysis. 9.3.2 Karhunen-Loeve transformation. 9.3.3 Factor analysis. 9.3.4 Example application study. 9.3.5 Further developments. 9.3.6 Summary. 9.4 Multidimensional scaling. 9.4.1 Classical scaling. 9.4.2 Metric multidimensional scaling. 9.4.3 Ordinal scaling. 9.4.4 Algorithms. 9.4.5 Multidimensional scaling for feature extraction. 9.4.6 Example application study. 9.4.7 Further developments. 9.4.8 Summary. 9.5 Application studies. 9.6 Summary and discussion. 9.7 Recommendations. 9.8 Notes and references. Exercises.

10 Clustering. 10.1 Introduction. 10.2 Hierarchical methods. 10.2.1 Single-link method. 10.2.2 Complete-link method. 10.2.3 Sum-of-squares method. 10.2.4 General agglomerative algorithm. 10.2.5 Properties of a hierarchical classification. 10.2.6 Example application study. 10.2.7 Summary. 10.3 Quick partitions. 10.4 Mixture models. 10.4.1 Model description. 10.4.2 Example application study. 10.5 ... Methods. 10.5.1 Clustering criteria. 10.5.2 Clustering algorithms. 10.5.3 Vector quantisation. 10.5.4 Example application study. 10.5.5 Further developments. 10.5.6 Summary. 10.6 Cluster validity. 10.6.1 Introduction. 10.6.2 Distortion measures. 10.6.3 Choosing the number of clusters. 10.6.4 Identifying genuine clusters. 10.7 Application studies. 10.8 Summary and discussion. 10.9 Recommendations. 10.10 Notes and references. Exercises.

11 Additional topics. 11.1 Model selection. 11.1.1 Separate training and test sets. 11.1.2 Cross-validation. 11.1.3 The Bayesian viewpoint. 11.1.4 Akaike's information criterion. 11.2 Learning with unreliable classification. 11.3 Missing data. 11.4 Outlier detection and robust procedures. 11.5 Mixed continuous and discrete variables. 11.6 Structural risk minimisation and the Vapnik-Chervonenkis dimension. 11.6.1 Bounds on the expected risk. 11.6.2 The Vapnik-Chervonenkis dimension.

A Measures of dissimilarity. A.1 Measures of dissimilarity. A.1.1 Numeric variables. A.1.2 Nominal and ordinal variables. A.1.3 Binary variables. A.1.4 Summary. A.2 Distances between distributions. A.2.1 Methods based on prototype vectors. A.2.2 Methods based on probabilistic distance. A.2.3 Probabilistic dependence. A.3 Discussion.

B Parameter estimation. B.1 Parameter estimation. B.1.1 Properties of estimators. B.1.2 Maximum likelihood. B.1.3 Problems with maximum likelihood. B.1.4 Bayesian estimates.

C Linear algebra. C.1 Basic properties and definitions. C.2 Notes and references.

D Data. D.1 Introduction. D.2 Formulating the problem. D.3 Data collection. D.4 Initial examination of data. D.5 Data sets. D.6 Notes and references.

E Probability theory. E.1 Definitions and terminology. E.2 Normal distribution. E.3 Probability distributions.

References. Index.

The book written by Andrew Webb is certainly the most comprehensive book related to machine learning. I have not been able to find any machine learning topic which is not treated in this book.
In my view, this book is aimed more at a scientific audience, for the simple reason that the presentation gives more weight to equations than to application examples. It does not explain how to program machine learning algorithms, but rather which algorithms exist and what their mathematical background is. Every technique is presented first in text, and only then is the mathematical development shown. It is therefore convenient both for readers who prefer textual descriptions and for those who prefer equations. The book is very well structured. Every chapter starts with a textual introduction to the issue at hand and then describes several techniques for solving it. At the end, specific application examples are given. A large part is then devoted to summary, discussion, recommendations (not always), notes and references, and finally exercises. Topics are covered in a non-standard order for people used to practical data mining books. After an introduction, density estimation techniques are explained, followed by linear and nonlinear discriminant analysis. It goes on with decision trees, performance and feature selection, and finishes with clustering and some additional topics. Although this book is written from a statistical point of view, it is certainly one of the most comprehensive resources for machine learning and data mining. This book describes basic pattern recognition procedures, together with practical applications of the techniques on real-world problems.
