Balian Blog

Python data analysis : learn how to apply powerful data analysis techniques with popular open source Python modules

An introduction to the book "Python data analysis : learn how to apply powerful data analysis techniques with popular open source Python modules" by Ivan Idris, published by Packt Publishing in 2014. The book is offered in PDF format, in English, and is listed at 5 pages. "Python data analysis : learn how to apply powerful data analysis techniques with popular open source Python modules" is filed under Uncategorized.

**Learn how to apply powerful data analysis techniques with popular open source Python modules**

**About This Book**

* Learn how to find, manipulate, and analyze data using Python
* Perform advanced, high-performance linear algebra and mathematical calculations with clean and efficient Python code
* An easy-to-follow guide with realistic examples that are frequently used in real-world data analysis projects

**Who This Book Is For**

This book is for programmers, scientists, and engineers who have knowledge of the Python language and know the basics of data science. It is for those who wish to learn different data analysis methods using Python and its libraries. This book contains all the basic ingredients you need to become an expert data analyst.

**What You Will Learn**

* Install open source Python modules on various platforms
* Get to know the fundamentals of NumPy, including arrays
* Manipulate data with pandas
* Retrieve, process, store, and visualize data
* Understand signal processing and time-series data analysis
* Work with relational and NoSQL databases
* Discover more about data modeling and machine learning
* Get to grips with interoperability and cloud computing

**In Detail**

Python is a multi-paradigm programming language well suited to both object-oriented application development and functional design patterns. Python has become the language of choice for data scientists for data analysis, visualization, and machine learning. It will give you velocity and promote high productivity.

This book will teach novices about data analysis with Python in the broadest sense possible, covering everything from data retrieval, cleaning, manipulation, visualization, and storage to complex analysis and modeling. It focuses on a plethora of open source Python modules such as NumPy, SciPy, matplotlib, pandas, IPython, Cython, scikit-learn, and NLTK.
In later chapters, the book covers topics such as data visualization, signal processing and time-series analysis, databases, predictive analytics, and machine learning. This book will turn you into an ace data analyst in no time.

**Table of Contents**

* Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
* **Chapter 1: Getting Started with Python Libraries.** Software used in this book; Installing software and setup; On Windows; On Linux; On Mac OS X; Building NumPy, SciPy, matplotlib, and IPython from source; Installing with setuptools; NumPy arrays; Simple application; Using IPython as a shell; Reading manual pages; IPython notebooks; Where to find help and references; Summary
* **Chapter 2: NumPy Arrays.** The NumPy array object; The advantages of NumPy arrays; Creating a multidimensional array; Selecting NumPy array elements; NumPy numerical types; Data type objects; Character codes; The dtype constructors; The dtype attributes; One-dimensional slicing and indexing; Manipulating array shapes; Stacking arrays; Splitting NumPy arrays; NumPy array attributes; Converting arrays; Creating array views and copies; Fancy indexing; Indexing with a list of locations; Indexing NumPy arrays with Booleans; Broadcasting NumPy arrays; Summary
* **Chapter 3: Statistics and Linear Algebra.** NumPy and SciPy modules; Basic descriptive statistics with NumPy; Linear algebra with NumPy; Inverting matrices with NumPy; Solving linear systems with NumPy; Finding eigenvalues and eigenvectors with NumPy; NumPy random numbers; Gambling with the binomial distribution; Sampling the normal distribution; Performing a normality test with SciPy; Creating a NumPy-masked array; Disregarding negative and extreme values; Summary
* **Chapter 4: pandas Primer.** Installing and exploring pandas; pandas DataFrames; pandas Series; Querying data in pandas; Statistics with pandas DataFrames; Data aggregation with pandas DataFrames; Concatenating and appending DataFrames; Joining DataFrames; Handling missing values; Dealing with dates; Pivot tables; Remote data access; Summary
* **Chapter 5: Retrieving, Processing, and Storing Data.** Writing CSV files with NumPy and pandas; Comparing the NumPy .npy binary format and pickling pandas DataFrames; Storing data with PyTables; Reading and writing pandas DataFrames to HDF5 stores; Reading and writing to Excel with pandas; Using REST web services and JSON; Reading and writing JSON with pandas; Parsing RSS and Atom feeds; Parsing HTML with BeautifulSoup; Summary
* **Chapter 6: Data Visualization.** matplotlib subpackages; Basic matplotlib plots; Logarithmic plots; Scatter plots; Legends and annotations; Three-dimensional plots; Plotting in pandas; Lag plots; Autocorrelation plots; Plot.ly; Summary
* **Chapter 7: Signal Processing and Time Series.** statsmodels subpackages; Moving averages; Window functions; Defining cointegration; Autocorrelation; Autoregressive models; ARMA models; Generating periodic signals; Fourier analysis; Spectral analysis; Filtering; Summary
* **Chapter 8: Working with Databases.** Lightweight access with sqlite3; Accessing databases from pandas; SQLAlchemy; Installing and setting up SQLAlchemy; Populating a database with SQLAlchemy; Querying the database with SQLAlchemy; Pony ORM; Dataset – databases for lazy people; PyMongo and MongoDB; Storing data in Redis; Apache Cassandra; Summary
* **Chapter 9: Analyzing Textual Data and Social Media.** Installing NLTK; Filtering out stopwords, names, and numbers; The bag-of-words model; Analyzing word frequencies; Naive Bayes classification; Sentiment analysis; Creating word clouds; Social network analysis; Summary
* **Chapter 10: Predictive Analytics and Machine Learning.** A tour of scikit-learn; Preprocessing; Classification with logistic regression; Classification with support vector machines; Regression with ElasticNetCV; Support vector regression; Clustering with affinity propagation; Mean Shift; Genetic algorithms; Neural networks; Decision trees; Summary
* **Chapter 11: Environments Outside the Python Ecosystem and Cloud Computing.** Exchanging information with MATLAB/Octave; Installing rpy2; Interfacing with R; Sending NumPy arrays to Java; Integrating SWIG and NumPy; Integrating Boost and Python; Using Fortran code through f2py; Setting up Google App Engine; Running programs on PythonAnywhere; Working with Wakari; Summary
* **Chapter 12: Performance Tuning, Profiling, and Concurrency.** Profiling the code; Installing Cython; Calling C code; Creating a process pool with multiprocessing; Speeding up embarrassingly parallel for loops with Joblib; Comparing Bottleneck to NumPy functions; Performing MapReduce with Jug; Installing MPI for Python; IPython Parallel; Summary
* **Appendix A: Key Concepts**
* **Appendix B: Useful Functions.** matplotlib; NumPy; pandas; scikit-learn; SciPy; scipy.fftpack; scipy.signal; scipy.stats
* **Appendix C: Online Resources**
* Index

Dive deeper into data analysis with the flexibility of Python and learn how its extensive range of scientific and mathematical libraries can be used to solve some of the toughest challenges in data analysis. Build your confidence and expertise and develop valuable skills in high demand in a world driven by Big Data with this expert data analysis book. This data science tutorial will help you learn how to effectively retrieve, clean, manipulate, and visualize data and establish a successful data analysis workflow.
Apply the impressive functionality of Python's data mining tools and scientific and numerical libraries to a range of the most important tasks within data analysis and data science, and develop strategies and ideas to take control of your own data analysis projects. Get to grips with statistical analysis using NumPy and SciPy, visualize data with matplotlib, and uncover sophisticated insights through predictive analytics and machine learning with scikit-learn. You will also learn how to use the tools needed to work with databases and find out how Python can be used to analyze textual and social media data, as you work through this essential data science tutorial.
