Balian Blog

Data Science From Scratch: First Principles with Python

An introduction to the book "Data Science From Scratch: First Principles with Python" by Joel Grus, published by O'Reilly Media in 2015. The book is available in PDF format, in English. It is filed under "Uncategorized".

"Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they're also a good way to dive into the discipline without actually understanding data science. In this book, you'll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today's messy glut of data holds answers to questions no one's even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python. Learn the basics of linear algebra, statistics, and probability--and understand how and when they're used in data science. Collect, explore, clean, munge, and manipulate data. Dive into the fundamentals of machine learning. Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering. Explore recommender systems, natural language processing, network analysis, MapReduce, and databases."--Provided by publisher

Table of contents:

Preface: Data Science; From Scratch; Conventions Used in This Book; Using Code Examples; Safari® Books Online; How to Contact Us; Acknowledgments
1. Introduction: The Ascendance of Data; What Is Data Science?; Motivating Hypothetical: DataSciencester; Finding Key Connectors; Data Scientists You May Know; Salaries and Experience; Paid Accounts; Topics of Interest; Onward
2. A Crash Course in Python: The Basics; Getting Python; The Zen of Python; Whitespace Formatting; Arithmetic; Functions; Exceptions; Lists; Tuples; Dictionaries; defaultdict; Sets; Control Flow; Truthiness; The Not-So-Basics; Sorting; List Comprehensions; Generators and Iterators; Randomness; Object-Oriented Programming; Functional Tools; enumerate; zip and Argument Unpacking; args and kwargs; Welcome to DataSciencester!; For Further Exploration
3. Visualizing Data: matplotlib; Bar Charts; Line Charts; Scatterplots; For Further Exploration
4. Linear Algebra: Vectors; Matrices; For Further Exploration
5. Statistics: Describing a Single Set of Data; Central Tendencies; Dispersion; Correlation; Simpson’s Paradox; Some Other Correlational Caveats; Correlation and Causation; For Further Exploration
6. Probability: Dependence and Independence; Conditional Probability; Bayes’s Theorem; Random Variables; Continuous Distributions; The Normal Distribution; The Central Limit Theorem; For Further Exploration
7. Hypothesis and Inference: Statistical Hypothesis Testing; Example: Flipping a Coin; Confidence Intervals; Example: Running an A/B Test; Bayesian Inference; For Further Exploration
8. Gradient Descent: The Idea Behind Gradient Descent; Estimating the Gradient; Using the Gradient; Choosing the Right Step Size; Putting It All Together; Stochastic Gradient Descent; For Further Exploration
9. Getting Data: stdin and stdout; Reading Files; The Basics of Text Files; Delimited Files; Scraping the Web; HTML and the Parsing Thereof; Example: O’Reilly Books About Data; Using APIs; JSON (and XML); Using an Unauthenticated API; Finding APIs; Example: Using the Twitter APIs; Getting Credentials; Using Twython; For Further Exploration
10. Working with Data: Exploring Your Data; Exploring One-Dimensional Data; Two Dimensions; Many Dimensions; Cleaning and Munging; Manipulating Data; Rescaling; Dimensionality Reduction; For Further Exploration
11. Machine Learning: Modeling; What Is Machine Learning?; Overfitting and Underfitting; Correctness; The Bias-Variance Trade-off; Feature Extraction and Selection; For Further Exploration
12. k-Nearest Neighbors: The Model; Example: Favorite Languages; The Curse of Dimensionality; For Further Exploration
13. Naive Bayes: A Really Dumb Spam Filter; A More Sophisticated Spam Filter; Implementation; Testing Our Model; For Further Exploration
14. Simple Linear Regression: The Model; Using Gradient Descent; Maximum Likelihood Estimation; For Further Exploration
15. Multiple Regression: The Model; Further Assumptions of the Least Squares Model; Fitting the Model; Interpreting the Model; Goodness of Fit; Digression: The Bootstrap; Standard Errors of Regression Coefficients; Regularization; For Further Exploration
16. Logistic Regression: The Problem; The Logistic Function; Applying the Model; Goodness of Fit; Support Vector Machines; For Further Investigation
17. Decision Trees: What Is a Decision Tree?; Entropy; The Entropy of a Partition; Creating a Decision Tree; Putting It All Together; Random Forests; For Further Exploration
18. Neural Networks: Perceptrons; Feed-Forward Neural Networks; Backpropagation; Example: Defeating a CAPTCHA; For Further Exploration
19. Clustering: The Idea; The Model; Example: Meetups; Choosing k; Example: Clustering Colors; Bottom-up Hierarchical Clustering; For Further Exploration
20. Natural Language Processing: Word Clouds; n-gram Models; Grammars; An Aside: Gibbs Sampling; Topic Modeling; For Further Exploration
21. Network Analysis: Betweenness Centrality; Eigenvector Centrality; Matrix Multiplication; Centrality; Directed Graphs and PageRank; For Further Exploration
22. Recommender Systems: Manual Curation; Recommending What’s Popular; User-Based Collaborative Filtering; Item-Based Collaborative Filtering; For Further Exploration
23. Databases and SQL: CREATE TABLE and INSERT; UPDATE; DELETE; SELECT; GROUP BY; ORDER BY; JOIN; Subqueries; Indexes; Query Optimization; NoSQL; For Further Exploration
24. MapReduce: Example: Word Count; Why MapReduce?; MapReduce More Generally; Example: Analyzing Status Updates; Example: Matrix Multiplication; An Aside: Combiners; For Further Exploration
25. Go Forth and Do Data Science: IPython; Mathematics; Not from Scratch; NumPy; pandas; scikit-learn; Visualization; R; Find Data; Do Data Science; Hacker News; Fire Trucks; T-shirts; And You?
Index

Download the book Data Science From Scratch: First Principles with Python