Cover
Copyright Information
Credits
Foreword
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Chapter 1. Getting Started
Computer science
Artificial intelligence (AI)
Machine Learning (ML)
Statistics
Mathematics
Knowledge domain
Data, information, and knowledge
The nature of data
The data analysis process
Quantitative versus qualitative data analysis
Importance of data visualization
What about big data?
Summary
Chapter 2. Working with Data
Datasource
Data scrubbing
Data formats
Getting started with OpenRefine
Summary
Chapter 3. Data Visualization
Data-Driven Documents (D3)
Getting started with D3.js
Interaction and animation
Summary
Chapter 4. Text Classification
Learning and classification
Bayesian classification
E-mail subject line tester
The algorithm
Classifier accuracy
Summary
Chapter 5. Similarity-based Image Retrieval
Image similarity search
Dynamic time warping (DTW)
Processing the image dataset
Implementing DTW
Analyzing the results
Summary
Chapter 6. Simulation of Stock Prices
Financial time series
Random walk simulation
Monte Carlo methods
Generating random numbers
Implementation in D3.js
Summary
Chapter 7. Predicting Gold Prices
Working with the time series data
Smoothing the time series
The data – historical gold prices
Nonlinear regression
Summary
Chapter 8. Working with Support Vector Machines
Understanding the multivariate dataset
Dimensionality reduction
Getting started with support vector machines
Summary
Chapter 9. Modeling Infectious Disease with Cellular Automata
Introduction to epidemiology
The epidemic models
Modeling with cellular automata
Simulation of the SIRS model in CA with D3.js
Summary
Chapter 10. Working with Social Graphs
Structure of a graph
Social Networks Analysis
Acquiring my Facebook graph
Representing graphs with Gephi
Statistical analysis
Degree distribution
Transforming GDF to JSON
Graph visualization with D3.js
Summary
Chapter 11. Sentiment Analysis of Twitter Data
The anatomy of Twitter data
Using OAuth to access Twitter API
Getting started with Twython
Sentiment classification
Getting started with Natural Language Toolkit (NLTK)
Summary
Chapter 12. Data Processing and Aggregation with MongoDB
Getting started with MongoDB
Data preparation
Group
The aggregation framework
Summary
Chapter 13. Working with MapReduce
MapReduce overview
Programming model
Using MapReduce with MongoDB
Filtering the input collection
Grouping and aggregation
Word cloud visualization of the most common positive words in tweets
Summary
Chapter 14. Online Data Analysis with IPython and Wakari
Getting started with Wakari
Getting started with IPython Notebook
Introduction to image processing with PIL
Getting started with Pandas
Multiprocessing with IPython
Sharing your Notebook
Summary
Appendix A. Setting Up the Infrastructure
Installing and running Python 3
Installing and running NumPy
Installing and running SciPy
Installing and running mlpy
Installing and running OpenRefine
Installing and running MongoDB
Installing and running UMongo
Installing and running Gephi
Index
Last updated: 2021-07-23 15:59:56