Principles and Theory for Data Mining and Machine Learning
by Clarke, Bertrand; Fokoue, Ernest; Zhang, Hao Helen
How Marketplace Works:
- This item is offered by an independent seller and not shipped from our warehouse
- Item details like edition and cover design may differ from our description; see seller's comments before ordering.
- Sellers must confirm and ship within two business days; otherwise, the order will be cancelled and refunded.
- Marketplace purchases cannot be returned to eCampus.com. Contact the seller directly for inquiries; if no response within two days, contact customer service.
- Additional shipping costs apply to Marketplace purchases. Review shipping costs at checkout.
Table of Contents
| Preface | p. v |
| Variability, Information, and Prediction | p. 1 |
| The Curse of Dimensionality | p. 3 |
| The Two Extremes | p. 4 |
| Perspectives on the Curse | p. 5 |
| Sparsity | p. 6 |
| Exploding Numbers of Models | p. 8 |
| Multicollinearity and Concurvity | p. 9 |
| The Effect of Noise | p. 10 |
| Coping with the Curse | p. 11 |
| Selecting Design Points | p. 11 |
| Local Dimension | p. 12 |
| Parsimony | p. 17 |
| Two Techniques | p. 18 |
| The Bootstrap | p. 18 |
| Cross-Validation | p. 27 |
| Optimization and Search | p. 32 |
| Univariate Search | p. 32 |
| Multivariate Search | p. 33 |
| General Searches | p. 34 |
| Constraint Satisfaction and Combinatorial Search | p. 35 |
| Notes | p. 38 |
| Hammersley Points | p. 38 |
| Edgeworth Expansions for the Mean | p. 39 |
| Bootstrap Asymptotics for the Studentized Mean | p. 41 |
| Exercises | p. 43 |
| Local Smoothers | p. 53 |
| Early Smoothers | p. 55 |
| Transition to Classical Smoothers | p. 59 |
| Global Versus Local Approximations | p. 60 |
| LOESS | p. 64 |
| Kernel Smoothers | p. 67 |
| Statistical Function Approximation | p. 68 |
| The Concept of Kernel Methods and the Discrete Case | p. 73 |
| Kernels and Stochastic Designs: Density Estimation | p. 78 |
| Stochastic Designs: Asymptotics for Kernel Smoothers | p. 81 |
| Convergence Theorems and Rates for Kernel Smoothers | p. 86 |
| Kernel and Bandwidth Selection | p. 90 |
| Linear Smoothers | p. 95 |
| Nearest Neighbors | p. 96 |
| Applications of Kernel Regression | p. 100 |
| A Simulated Example | p. 100 |
| Ethanol Data | p. 102 |
| Exercises | p. 107 |
| Spline Smoothing | p. 117 |
| Interpolating Splines | p. 117 |
| Natural Cubic Splines | p. 123 |
| Smoothing Splines for Regression | p. 126 |
| Model Selection for Spline Smoothing | p. 129 |
| Spline Smoothing Meets Kernel Smoothing | p. 130 |
| Asymptotic Bias, Variance, and MISE for Spline Smoothers | p. 131 |
| Ethanol Data Example - Continued | p. 133 |
| Splines Redux: Hilbert Space Formulation | p. 136 |
| Reproducing Kernels | p. 138 |
| Constructing an RKHS | p. 141 |
| Direct Sum Construction for Splines | p. 146 |
| Explicit Forms | p. 149 |
| Nonparametrics in Data Mining and Machine Learning | p. 152 |
| Simulated Comparisons | p. 154 |
| What Happens with Dependent Noise Models? | p. 157 |
| Higher Dimensions and the Curse of Dimensionality | p. 159 |
| Notes | p. 163 |
| Sobolev Spaces: Definition | p. 163 |
| Exercises | p. 164 |
| New Wave Nonparametrics | p. 171 |
| Additive Models | p. 172 |
| The Backfitting Algorithm | p. 173 |
| Concurvity and Inference | p. 177 |
| Nonparametric Optimality | p. 180 |
| Generalized Additive Models | p. 181 |
| Projection Pursuit Regression | p. 184 |
| Neural Networks | p. 189 |
| Backpropagation and Inference | p. 192 |
| Barron's Result and the Curse | p. 197 |
| Approximation Properties | p. 198 |
| Barron's Theorem: Formal Statement | p. 200 |
| Recursive Partitioning Regression | p. 202 |
| Growing Trees | p. 204 |
| Pruning and Selection | p. 207 |
| Regression | p. 208 |
| Bayesian Additive Regression Trees: BART | p. 210 |
| MARS | p. 210 |
| Sliced Inverse Regression | p. 215 |
| ACE and AVAS | p. 218 |
| Notes | p. 220 |
| Proof of Barron's Theorem | p. 220 |
| Exercises | p. 224 |
| Supervised Learning: Partition Methods | p. 231 |
| Multiclass Learning | p. 233 |
| Discriminant Analysis | p. 235 |
| Distance-Based Discriminant Analysis | p. 236 |
| Bayes Rules | p. 241 |
| Probability-Based Discriminant Analysis | p. 245 |
| Tree-Based Classifiers | p. 249 |
| Splitting Rules | p. 249 |
| Logic Trees | p. 253 |
| Random Forests | p. 254 |
| Support Vector Machines | p. 262 |
| Margins and Distances | p. 262 |
| Binary Classification and Risk | p. 265 |
| Prediction Bounds for Function Classes | p. 268 |
| Constructing SVM Classifiers | p. 271 |
| SVM Classification for Nonlinearly Separable Populations | p. 279 |
| SVMs in the General Nonlinear Case | p. 282 |
| Some Kernels Used in SVM Classification | p. 288 |
| Kernel Choice, SVMs and Model Selection | p. 289 |
| Support Vector Regression | p. 290 |
| Multiclass Support Vector Machines | p. 293 |
| Neural Networks | p. 294 |
| Notes | p. 296 |
| Hoeffding's Inequality | p. 296 |
| VC Dimension | p. 297 |
| Exercises | p. 300 |
| Alternative Nonparametrics | p. 307 |
| Ensemble Methods | p. 308 |
| Bayes Model Averaging | p. 310 |
| Bagging | p. 312 |
| Stacking | p. 316 |
| Boosting | p. 318 |
| Other Averaging Methods | p. 326 |
| Oracle Inequalities | p. 328 |
| Bayes Nonparametrics | p. 334 |
| Dirichlet Process Priors | p. 334 |
| Polya Tree Priors | p. 336 |
| Gaussian Process Priors | p. 338 |
| The Relevance Vector Machine | p. 344 |
| RVM Regression: Formal Description | p. 345 |
| RVM Classification | p. 349 |
| Hidden Markov Models - Sequential Classification | p. 352 |
| Notes | p. 354 |
| Proof of Yang's Oracle Inequality | p. 354 |
| Proof of Lecue's Oracle Inequality | p. 357 |
| Exercises | p. 359 |
| Computational Comparisons | p. 365 |
| Computational Results: Classification | p. 366 |
| Comparison on Fisher's Iris Data | p. 366 |
| Comparison on Ripley's Data | p. 369 |
| Computational Results: Regression | p. 376 |
| Vapnik's sinc Function | p. 377 |
| Friedman's Function | p. 389 |
| Conclusions | p. 392 |
| Systematic Simulation Study | p. 397 |
| No Free Lunch | p. 400 |
| Exercises | p. 402 |
| Unsupervised Learning: Clustering | p. 405 |
| Centroid-Based Clustering | p. 408 |
| K-Means Clustering | p. 409 |
| Variants | p. 412 |
| Hierarchical Clustering | p. 413 |
| Agglomerative Hierarchical Clustering | p. 414 |
| Divisive Hierarchical Clustering | p. 422 |
| Theory for Hierarchical Clustering | p. 426 |
| Partitional Clustering | p. 430 |
| Model-Based Clustering | p. 432 |
| Graph-Theoretic Clustering | p. 447 |
| Spectral Clustering | p. 452 |
| Bayesian Clustering | p. 458 |
| Probabilistic Clustering | p. 458 |
| Hypothesis Testing | p. 461 |
| Computed Examples | p. 463 |
| Ripley's Data | p. 465 |
| Iris Data | p. 475 |
| Cluster Validation | p. 480 |
| Notes | p. 484 |
| Derivatives of Functions of a Matrix | p. 484 |
| Kruskal's Algorithm: Proof | p. 484 |
| Prim's Algorithm: Proof | p. 485 |
| Exercises | p. 485 |
| Learning in High Dimensions | p. 493 |
| Principal Components | p. 495 |
| Main Theorem | p. 496 |
| Key Properties | p. 498 |
| Extensions | p. 500 |
| Factor Analysis | p. 502 |
| Finding Λ and Ψ | p. 504 |
| Finding K | p. 506 |
| Estimating Factor Scores | p. 507 |
| Projection Pursuit | p. 508 |
| Independent Components Analysis | p. 511 |
| Main Definitions | p. 511 |
| Key Results | p. 513 |
| Computational Approach | p. 515 |
| Nonlinear PCs and ICA | p. 516 |
| Nonlinear PCs | p. 517 |
| Nonlinear ICA | p. 518 |
| Geometric Summarization | p. 518 |
| Measuring Distances to an Algebraic Shape | p. 519 |
| Principal Curves and Surfaces | p. 520 |
| Supervised Dimension Reduction: Partial Least Squares | p. 523 |
| Simple PLS | p. 523 |
| PLS Procedures | p. 524 |
| Properties of PLS | p. 526 |
| Supervised Dimension Reduction: Sufficient Dimensions in Regression | p. 527 |
| Visualization I: Basic Plots | p. 531 |
| Elementary Visualization | p. 534 |
| Projections | p. 541 |
| Time Dependence | p. 543 |
| Visualization II: Transformations | p. 546 |
| Chernoff Faces | p. 546 |
| Multidimensional Scaling | p. 547 |
| Self-Organizing Maps | p. 553 |
| Exercises | p. 560 |
| Variable Selection | p. 569 |
| Concepts from Linear Regression | p. 570 |
| Subset Selection | p. 572 |
| Variable Ranking | p. 575 |
| Overview | p. 577 |
| Traditional Criteria | p. 578 |
| Akaike Information Criterion (AIC) | p. 580 |
| Bayesian Information Criterion (BIC) | p. 583 |
| Choices of Information Criteria | p. 585 |
| Cross Validation | p. 587 |
| Shrinkage Methods | p. 599 |
| Shrinkage Methods for Linear Models | p. 601 |
| Grouping in Variable Selection | p. 615 |
| Least Angle Regression | p. 617 |
| Shrinkage Methods for Model Classes | p. 620 |
| Cautionary Notes | p. 631 |
| Bayes Variable Selection | p. 632 |
| Prior Specification | p. 635 |
| Posterior Calculation and Exploration | p. 643 |
| Evaluating Evidence | p. 647 |
| Connections Between Bayesian and Frequentist Methods | p. 650 |
| Computational Comparisons | p. 653 |
| The n>p Case | p. 653 |
| When p>n | p. 665 |
| Notes | p. 667 |
| Code for Generating Data in Section 10.5 | p. 667 |
| Exercises | p. 671 |
| Multiple Testing | p. 679 |
| Analyzing the Hypothesis Testing Problem | p. 681 |
| A Paradigmatic Setting | p. 681 |
| Counts for Multiple Tests | p. 684 |
| Measures of Error in Multiple Testing | p. 685 |
| Aspects of Error Control | p. 687 |
| Controlling the Familywise Error Rate | p. 690 |
| One-Step Adjustments | p. 690 |
| Stepwise p-Value Adjustments | p. 693 |
| PCER and PFER | p. 695 |
| Null Domination | p. 696 |
| Two Procedures | p. 697 |
| Controlling the Type I Error Rate | p. 702 |
| Adjusted p-Values for PFER/PCER | p. 706 |
| Controlling the False Discovery Rate | p. 707 |
| FDR and Other Measures of Error | p. 709 |
| The Benjamini-Hochberg Procedure | p. 710 |
| A BH Theorem for a Dependent Setting | p. 711 |
| Variations on BH | p. 713 |
| Controlling the Positive False Discovery Rate | p. 719 |
| Bayesian Interpretations | p. 719 |
| Aspects of Implementation | p. 723 |
| Bayesian Multiple Testing | p. 727 |
| Fully Bayes: Hierarchical | p. 728 |
| Fully Bayes: Decision Theory | p. 731 |
| Notes | p. 736 |
| Proof of the Benjamini-Hochberg Theorem | p. 736 |
| Proof of the Benjamini-Yekutieli Theorem | p. 739 |
| References | p. 743 |
| Index | p. 773 |
| Table of Contents provided by Ingram. All Rights Reserved. |
An electronic version of this book is available through VitalSource.
This book is viewable on PC, Mac, iPhone, iPad, iPod Touch, and most smartphones.
By purchasing, you will be able to view this book online, as well as download it, for the chosen number of days.
Digital License
You are licensing a digital product for a set duration. Durations are set forth in the product description, with "Lifetime" typically meaning five (5) years of online access and permanent download to a supported device. All licenses are non-transferable.
A downloadable version of this book is available through the eCampus Reader or compatible Adobe readers.
Applications are available on iOS, Android, PC, Mac, and Windows Mobile platforms.
Please view the compatibility matrix prior to purchase.