Machine Learning Projects for .NET Developers

Machine Learning Projects for .NET Developers, written by Mathias Brandewinder, was published by Apress L. P. in 2015. The book is available in PDF format, in English. It is listed under the category "Uncategorized".

Machine Learning Projects for .NET Developers shows you how to build smarter .NET applications that learn from data, using simple algorithms and techniques that can be applied to a wide range of real-world problems. You’ll code each project in the familiar setting of Visual Studio, while the machine learning logic uses F#, a language ideally suited to machine learning applications in .NET. If you’re new to F#, this book will give you everything you need to get started. If you’re already familiar with F#, this is your chance to put the language into action in an exciting new context.

In a series of fascinating projects, you’ll learn how to:
  Build an optical character recognition (OCR) system from scratch
  Code a spam filter that learns by example
  Use F#’s powerful type providers to interface with external resources (in this case, data analysis tools from the R programming language)
  Transform your data into informative features, and use them to make accurate predictions
  Find patterns in data when you don’t know what you’re looking for
  Predict numerical values using regression models
  Implement an intelligent game that learns how to play from experience

Along the way, you’ll learn fundamental ideas that can be applied in all kinds of real-world contexts and industries, from advertising to finance, medicine, and scientific research. While some machine learning algorithms use fairly advanced mathematics, this book focuses on simple but effective approaches. If you enjoy hacking code and data, this book is for you.

Table of contents:
Contents at a Glance
Contents
About the Author
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: 256 Shades of Gray
  What Is Machine Learning?
  A Classic Machine Learning Problem: Classifying Images
  Our Challenge: Build a Digit Recognizer
  Distance Functions in Machine Learning
  Start with Something Simple
  Our First Model, C# Version
  Dataset Organization
  Reading the Data
  Computing Distance between Images
  Writing a Classifier
  So, How Do We Know It Works?
  Cross-validation
  Evaluating the Quality of Our Model
  Improving Your Model
  Introducing F# for Machine Learning
  Live Scripting and Data Exploration with F# Interactive
  Creating our First F# Script
  Dissecting Our First F# Script
  Creating Pipelines of Functions
  Manipulating Data with Tuples and Pattern Matching
  Training and Evaluating a Classifier Function
  Improving Our Model
  Experimenting with Another Definition of Distance
  Factoring Out the Distance Function
  So, What Have We Learned?
  What to Look for in a Good Distance Function
  Models Don’t Have to Be Complicated
  Why F#?
  Going Further
Chapter 2: Spam or Ham?
  Our Challenge: Build a Spam-Detection Engine
  Getting to Know Our Dataset
  Using Discriminated Unions to Model Labels
  Reading Our Dataset
  Deciding on a Single Word
  Using Words as Clues
  Putting a Number on How Certain We Are
  Bayes’ Theorem
  Dealing with Rare Words
  Combining Multiple Words
  Breaking Text into Tokens
  Naïvely Combining Scores
  Simplified Document Score
  Implementing the Classifier
  Extracting Code into Modules
  Scoring and Classifying a Document
  Introducing Sets and Sequences
  Learning from a Corpus of Documents
  Training Our First Classifier
  Implementing Our First Tokenizer
  Validating Our Design Interactively
  Establishing a Baseline with Cross-validation
  Improving Our Classifier
  Using Every Single Word
  Does Capitalization Matter?
  Less Is More
  Choosing Our Words Carefully
  Creating New Features
  Dealing with Numeric Values
  Understanding Errors
  So What Have We Learned?
Chapter 3: The Joy of Type Providers
  Exploring StackOverflow data
  The StackExchange API
  Using the JSON Type Provider
  Building a Minimal DSL to Query Questions
  All the Data in the World
  The World Bank Type Provider
  The R Type Provider
  Analyzing Data Together with R Data Frames
  Deedle, a .NET Data Frame
  Data of the World, Unite!
  So, What Have We Learned?
  Going Further
Chapter 4: Of Bikes and Men
  Getting to Know the Data
  What’s in the Dataset?
  Inspecting the Data with FSharp.Charting
  Spotting Trends with Moving Averages
  Fitting a Model to the Data
  Defining a Basic Straight-Line Model
  Finding the Lowest-Cost Model
  Finding the Minimum of a Function with Gradient Descent
  Using Gradient Descent to Fit a Curve
  A More General Model Formulation
  Implementing Gradient Descent
  Stochastic Gradient Descent
  Analyzing Model Improvements
  Batch Gradient Descent
  Linear Algebra to the Rescue
  Honey, I Shrunk the Formula!
  Linear Algebra with Math.NET
  Normal Form
  Pedal to the Metal with MKL
  Evolving and Validating Models Rapidly
  Cross-Validation and Over-Fitting, Again
  Simplifying the Creation of Models
  Adding Continuous Features to the Model
  Refining Predictions with More Features
  Handling Categorical Features
  Non-linear Features
  Regularization
  So, What Have We Learned?
  Minimizing Cost with Gradient Descent
  Predicting a Number with Regression
Chapter 5: You Are Not a Unique Snowflake
  Detecting Patterns in Data
  Our Challenge: Understanding Topics on StackOverflow
  Getting to Know Our Data
  Finding Clusters with K-Means Clustering
  Improving Clusters and Centroids
  Implementing K-Means Clustering
  Clustering StackOverflow Tags
  Running the Clustering Analysis
  Analyzing the Results
  Good Clusters, Bad Clusters
  Rescaling Our Dataset to Improve Clusters
  Identifying How Many Clusters to Search For
  What Are Good Clusters?
  Identifying k on the StackOverflow Dataset
  Our Final Clusters
  Detecting How Features Are Related
  Covariance and Correlation
  Correlations Between StackOverflow Tags
  Identifying Better Features with Principal Component Analysis
  Recombining Features with Algebra
  A Small Preview of PCA in Action
  Implementing PCA
  Applying PCA to the StackOverflow Dataset
  Analyzing the Extracted Features
  Making Recommendations
  A Primitive Tag Recommender
  Implementing the Recommender
  Validating the Recommendations
  So What Have We Learned?
Chapter 6: Trees and Forests
  Our Challenge: Sink or Swim on the Titanic
  Getting to Know the Dataset
  Taking a Look at Features
  Building a Decision Stump
  Training the Stump
  Features That Don’t Fit
  How About Numbers?
  What about Missing Data?
  Measuring Information in Data
  Measuring Uncertainty with Entropy
  Information Gain
  Implementing the Best Feature Identification
  Using Entropy to Discretize Numeric Features
  Growing a Tree from Data
  Modeling the Tree
  Constructing the Tree
  A Prettier Tree
  Improving the Tree
  Why Are We Over-Fitting?
  Limiting Over-Confidence with Filters
  From Trees to Forests
  Deeper Cross-Validation with k-folds
  Combining Fragile Trees into Robust Forests
  Implementing the Missing Blocks
  Growing a Forest
  Trying Out the Forest
  So, What Have We Learned?
Chapter 7: A Strange Game
  Building a Simple Game
  Modeling Game Elements
  Modeling the Game Logic
  Running the Game as a Console App
  Rendering the Game
  Building a Primitive Brain
  Modeling the Decision Making Process
  Learning a Winning Strategy from Experience
  Implementing the Brain
  Testing Our Brain
  Can We Learn More Effectively?
  Exploration vs. Exploitation
  Is a Red Door Different from a Blue Door?
  Greed vs. Planning
  A World of Never-Ending Tiles
  Implementing Brain 2.0
  Simplifying the World
  Planning Ahead
  Epsilon Learning
  So, What Have We Learned?
  A Simple Model That Fits Intuition
  An Adaptive Mechanism
Chapter 8: Digits, Revisited
  Optimizing and Scaling Your Algorithm Code
  Tuning Your Code
  What to Search For
  Tuning the Distance
  Using Array.Parallel
  Different Classifiers with Accord.NET
  Logistic Regression
  Simple Logistic Regression with Accord
  One-vs-One, One-vs-All Classification
  Support Vector Machines
  Neural Networks
  Creating and Training a Neural Network with Accord
  Scaling with m-brace.net
  Getting Started with MBrace on Azure with Brisk
  Processing Large Datasets with MBrace
  So What Did We Learn?
Chapter 9: Conclusion
  Mapping Our Journey
  Science!
  F#: Being Productive in a Functional Style
  What’s Next?
Index
