Applied Data Mining

About the book "Applied Data Mining" by Xu, Guandong; Zong, Yu; Yang, Zhenglu, published by CRC Press (an imprint of Taylor & Francis) in 2013. The book is available in PDF format, in English. "Applied Data Mining" is listed under the category "Uncategorized".

Data mining has witnessed substantial advances in recent decades. New research questions and practical challenges have arisen from emerging areas and applications within the various fields closely related to human daily life, e.g. social media and social networking. This book aims to bridge the gap between traditional data mining and the latest advances in newly emerging information services. It explores the extension of well-studied algorithms and approaches into these new research arenas.

Contents:

Part I Fundamentals

1. Introduction
  1.1. Background
    1.1.1. Data Mining-Definitions and Concepts
    1.1.2. Data Mining Process
    1.1.3. Data Mining Algorithms
  1.2. Organization of the Book
    1.2.1. Part 1: Fundamentals
    1.2.2. Part 2: Advanced Data Mining
    1.2.3. Part 3: Emerging Applications
  1.3. The Audience of the Book

2. Mathematical Foundations
  2.1. Organization of Data
    2.1.1. Boolean Model
    2.1.2. Vector Space Model
    2.1.3. Graph Model
    2.1.4. Other Data Structures
  2.2. Data Distribution
    2.2.1. Univariate Distribution
    2.2.2. Multivariate Distribution
  2.3. Distance Measures
    2.3.1. Jaccard Distance
    2.3.2. Euclidean Distance
    2.3.3. Minkowski Distance
    2.3.4. Chebyshev Distance
    2.3.5. Mahalanobis Distance
  2.4. Similarity Measures
    2.4.1. Cosine Similarity
    2.4.2. Adjusted Cosine Similarity
    2.4.3. Kullback-Leibler Divergence
    2.4.4. Model-based Measures
  2.5. Dimensionality Reduction
    2.5.1. Principal Component Analysis
    2.5.2. Independent Component Analysis
    2.5.3. Non-negative Matrix Factorization
    2.5.4. Singular Value Decomposition
  2.6. Chapter Summary

3. Data Preparation
  3.1. Attribute Selection
    3.1.1. Feature Selection
    3.1.2. Discretizing Numeric Attributes
  3.2. Data Cleaning and Integrity
    3.2.1. Missing Values
    3.2.2. Detecting Anomalies
    3.2.3. Applications
  3.3. Multiple Model Integration
    3.3.1. Data Federation
    3.3.2. Bagging and Boosting
  3.4. Chapter Summary

4. Clustering Analysis
  4.1. Clustering Analysis
  4.2. Types of Data in Clustering Analysis
    4.2.1. Data Matrix
    4.2.2. The Proximity Matrix
  4.3. Traditional Clustering Algorithms
    4.3.1. Partitional Methods
    4.3.2. Hierarchical Methods
    4.3.3. Density-based Methods
    4.3.4. Grid-based Methods
    4.3.5. Model-based Methods
  4.4. High-dimensional Clustering Algorithm
    4.4.1. Bottom-up Approaches
    4.4.2. Top-down Approaches
    4.4.3. Other Methods
  4.5. Constraint-based Clustering Algorithm
    4.5.1. COP K-means
    4.5.2. MPCK-means
    4.5.3. AFCC
  4.6. Consensus Clustering Algorithm
    4.6.1. Consensus Clustering Framework
    4.6.2. Some Consensus Clustering Methods
  4.7. Chapter Summary

5. Classification
  5.1. Classification Definition and Related Issues
  5.2. Decision Tree and Classification
    5.2.1. Decision Tree
    5.2.2. Decision Tree Classification
    5.2.3. Hunt's Algorithm
  5.3. Bayesian Network and Classification
    5.3.1. Bayesian Network
    5.3.2. Backpropagation and Classification
    5.3.3. Association-based Classification
    5.3.4. Support Vector Machines and Classification
  5.4. Chapter Summary

6. Frequent Pattern Mining
  6.1. Association Rule Mining
    6.1.1. Association Rule Mining Problem
    6.1.2. Basic Algorithms for Association Rule Mining
  6.2. Sequential Pattern Mining
    6.2.1. Sequential Pattern Mining Problem
    6.2.2. Existing Sequential Pattern Mining Algorithms
  6.3. Frequent Subtree Mining
    6.3.1. Frequent Subtree Mining Problem
    6.3.2. Data Structures for Storing Trees
    6.3.3. Maximal and Closed Frequent Subtrees
  6.4. Frequent Subgraph Mining
    6.4.1. Problem Definition
    6.4.2. Graph Representation
    6.4.3. Candidate Generation
    6.4.4. Frequent Subgraph Mining Algorithms
  6.5. Chapter Summary

Part II Advanced Data Mining

7. Advanced Clustering Analysis
  7.1. Introduction
  7.2. Space Smoothing Search Methods in Heuristic Clustering
    7.2.1. Smoothing Search Space and Smoothing Operator
    7.2.2. Clustering Algorithm based on Smoothed Search Space
  7.3. Using Approximate Backbone for Initializations in Clustering
    7.3.1. Definitions and Background of Approximate Backbone
    7.3.2. Heuristic Clustering Algorithm based on Approximate Backbone
  7.4. Improving Clustering Quality in High Dimensional Space
    7.4.1. Overview of High Dimensional Clustering
    7.4.2. Motivation of our Method
    7.4.3. Significant Local Dense Area
    7.4.4. Projective Clustering based on SLDAs
  7.5. Chapter Summary

8. Multi-Label Classification
  8.1. Introduction
  8.2. What is Multi-label Classification
  8.3. Problem Transformation
    8.3.1. Binary Relevance and Label Powerset
    8.3.2. Classifier Chains and Probabilistic Classifier Chains
    8.3.3. Decompose the Label Set
    8.3.4. Transform Original Label Space to Another Space
  8.4. Algorithm Adaptation
    8.4.1. KNN-based Methods
    8.4.2. Learn the Label Dependencies by the Statistical Models
  8.5. Evaluation Metrics and Datasets
    8.5.1. Evaluation Metrics
    8.5.2. Benchmark Datasets and the Statistics
  8.6. Chapter Summary

9. Privacy Preserving in Data Mining
  9.1. The K-Anonymity Method
  9.2. The l-Diversity Method
  9.3. The t-Closeness Method
  9.4. Discussion and Challenges
  9.5. Chapter Summary

Part III Emerging Applications

10. Data Stream
  10.1. General Data Stream Models
  10.2. Sampling Approach
    10.2.1. Random Sampling
    10.2.2. Cluster Sampling
  10.3. Wavelet Method
  10.4. Sketch Method
    10.4.1. Sliding Window-based Sketch
    10.4.2. Count Sketch
    10.4.3. Fast Count Sketch
    10.4.4. Count Min Sketch
    10.4.5. Some Related Issues on Sketches
    10.4.6. Applications of Sketches
    10.4.7. Advantages and Limitations of Sketch Strategies
  10.5. Histogram Method
    10.5.1. Dynamic Construction of Histograms
  10.6. Discussion
  10.7. Chapter Summary

11. Recommendation Systems
  11.1. Collaborative Filtering
    11.1.1. Memory-based Collaborative Recommendation
    11.1.2. Model-based Recommendation
  11.2. PLSA Method
    11.2.1. User Pattern Extraction and Latent Factor Recognition
  11.3. Tensor Model
  11.4. Discussion and Challenges
    11.4.1. Security and Privacy Issues
    11.4.2. Effectiveness Issue
  11.5. Chapter Summary

12. Social Tagging Systems
  12.1. Data Mining and Information Retrieval
  12.2. Recommender Systems
    12.2.1. Recommendation Algorithms
    12.2.2. Tag-Based Recommender Systems
  12.3. Clustering Algorithms in Recommendation
    12.3.1. K-means Algorithm
    12.3.2. Hierarchical Clustering
    12.3.3. Spectral Clustering
    12.3.4. Quality of Clusters and Modularity Method
    12.3.5. K-Nearest-Neighboring
  12.4. Clustering Algorithms in Tag-Based Recommender Systems
  12.5. Chapter Summary

Content:

FUNDAMENTALS
  Introduction
  Mathematical Foundations
  Data Preparation
  Cluster Analysis
  Classification
  Frequent Pattern Mining
ADVANCED DATA MINING
  Advanced Clustering Analysis
  Privacy Preserving in Data Mining
  Data Stream
EMERGING APPLICATIONS
  Web Clustering and Web Community
  Recommender Systems
  Data Mining in Social Tagging Systems
  Social Network Mining

Publisher's description:

"In past decades, data mining has witnessed substantial advances by efforts from various communities. On the other hand, new research questions and practical challenges are continuously presented due to newly emerging topics and applications within the various fields closely related to human daily life, e.g. social media and social networking. This book aims to bridge the gap between the existing research and application progresses in traditional data mining and the latest advances in newly emerging information services. It explores the extension of well-studied algorithms and approaches into these new research arenas." (Provided by publisher)
