بلیان Blog

Data Science with Julia

An introduction to the book "Data Science with Julia" by Paul D. McNicholas and Peter A. Tait, published by Chapman and Hall/CRC in 2019. The book is offered in 20 pages, in PDF format, in English. "Data Science with Julia" is listed under the category "Uncategorized".

"This book is a great way to both start learning data science through the promising Julia language and to become an efficient data scientist."- Professor Charles Bouveyron, INRIA Chair in Data Science, Université Côte d’Azur, Nice, France Julia, an open-source programming language, was created to be as easy to use as languages such as R and Python while also as fast as C and Fortran. An accessible, intuitive, and highly efficient base language with speed that exceeds R and Python, makes Julia a formidable language for data science. Using well known data science methods that will motivate the reader, __Data Science with Julia__ will get readers up to speed on key features of the Julia language and illustrate its facilities for data science and machine learning work. Features: * Covers the core components of Julia as well as packages relevant to the input, manipulation and representation of data. * Discusses several important topics in data science including supervised and unsupervised learning. * Reviews data visualization using the Gadfly package, which was designed to emulate the very popular ggplot2 package in R. Readers will learn how to make many common plots and how to visualize model results. * Presents how to optimize Julia code for performance. * Will be an ideal source for people who already know R and want to learn how to use Julia (though no previous knowledge of R or any other programming language is required). The advantages of Julia for data science cannot be understated. Besides speed and ease of use, there are already over 1,900 packages available and Julia can interface (either directly or through packages) with libraries written in R, Python, Matlab, C, C++ or Fortran. The book is for senior undergraduates, beginning graduate students, or practicing data scientists who want to learn how to use Julia for data science. 
"This book is a great way to both start learning data science through the promising Julia language and to become an efficient data scientist." Professor Charles Bouveyron INRIA Chair in Data Science Université Côte d’Azur, Nice, France Cover Half Title Title Page Copyright Page Dedication Table of Contents Foreword Preface About the Authors CHAPTER 1: Introduction 1.1 DATA SCIENCE 1.2 BIG DATA 1.3 JULIA 1.4 JULIA AND R PACKAGES 1.5 DATASETS 1.5.1 Overview 1.5.2 Beer Data 1.5.3 Coffee Data 1.5.4 Leptograpsus Crabs Data 1.5.5 Food Preferences Data 1.5.6 x2 Data 1.5.7 Iris Data 1.6 OUTLINE OF THE CONTENTS OF THIS MONOGRAPH CHAPTER 2: Core Julia 2.1 VARIABLE NAMES 2.2 OPERATORS 2.3 TYPES 2.3.1 Numeric 2.3.2 Floats 2.3.3 Strings 2.3.4 Tuples 2.4 DATA STRUCTURES 2.4.1 Arrays 2.4.2 Dictionaries 2.5 CONTROL FLOW 2.5.1 Compound Expressions 2.5.2 Conditional Evaluation 2.5.3 Loops 2.5.3.1 Basics 2.5.3.2 Loop termination 2.5.3.3 Exception handling 2.6 FUNCTIONS CHAPTER 3: Working with Data 3.1 DATAFRAMES 3.2 CATEGORICAL DATA 3.3 INPUT/OUTPUT 3.4 USEFUL DATAFRAME FUNCTIONS 3.5 SPLIT-APPLY-COMBINE STRATEGY 3.6 QUERY.JL CHAPTER 4: Visualizing Data 4.1 GADFLY.JL 4.2 VISUALIZING UNIVARIATE DATA 4.3 DISTRIBUTIONS 4.4 VISUALIZING BIVARIATE DATA 4.5 ERROR BARS 4.6 FACETS 4.7 SAVING PLOTS CHAPTER 5: Supervised Learning 5.1 INTRODUCTION 5.2 CROSS-VALIDATION 5.2.1 Overview 5.2.2 K-Fold Cross-Validation 5.3 K-NEAREST NEIGHBOURS CLASSIFICATION 5.4 CLASSIFICATION AND REGRESSION TREES 5.4.1 Overview 5.4.2 Classification Trees 5.4.3 Regression Trees 5.4.4 Comments 5.5 BOOTSTRAP 5.6 RANDOM FORESTS 5.7 GRADIENT BOOSTING 5.7.1 Overview 5.7.2 Beer Data 5.7.3 Food Data 5.8 COMMENTS CHAPTER 6: Unsupervised Learning 6.1 INTRODUCTION 6.2 PRINCIPAL COMPONENTS ANALYSIS 6.3 PROBABILISTIC PRINCIPAL COMPONENTS ANALYSIS 6.4 EM ALGORITHM FOR PPCA 6.4.1 Background: EM Algorithm 6.4.2 E-step 6.4.3 M-step 6.4.4 Woodbury Identity 6.4.5 Initialization 6.4.6 Stopping Rule 6.4.7 Implementing the EM Algorithm for 
PPCA 6.4.8 Comments 6.5 K-MEANS CLUSTERING 6.6 MIXTURE OF PROBABILISTIC PRINCIPAL COMPONENTS ANALYZERS 6.6.1 Model 6.6.2 Parameter Estimation 6.6.3 Illustrative Example: Coffee Data 6.7 COMMENTS CHAPTER 7: R Interoperability 7.1 ACCESSING R DATASETS 7.2 INTERACTING WITH R 7.3 EXAMPLE: CLUSTERING AND DATA REDUCTION FOR THE COFFEE DATA 7.3.1 Coffee Data 7.3.2 PGMM Analysis 7.3.3 VSCC Analysis 7.4 EXAMPLE: FOOD DATA 7.4.1 Overview 7.4.2 Random Forests APPENDIX A: Julia and R Packages Used Herein APPENDIX B: Variables for Food Data APPENDIX C: Useful Mathematical Results C.1 BRIEF OVERVIEW OF EIGENVALUES C.2 SELECTED LINEAR ALGEBRA RESULTS C.3 MATRIX CALCULUS RESULTS APPENDIX D: Performance Tips D.1 FLOATING POINT NUMBERS D.1.1 Do Not Test for Equality D.1.2 Use Logarithms for Division D.1.3 Subtracting Two Nearly Equal Numbers D.2 JULIA PERFORMANCE D.2.1 General Tips D.2.2 Array Processing D.2.3 Separate Core Computations APPENDIX E: Linear Algebra Functions E.1 VECTOR OPERATIONS E.2 MATRIX OPERATIONS E.3 MATRIX DECOMPOSITIONS References Index There is a dearth of resources for data scientists, statisticians, etc., wishing to learn about Julia. Using well known data science methods, this book will both motivate the reader and assuage any unease. The book will get readers up to speed on key features of the Julia language and illustrate some of its advantages for data science work.