About the author: Chris Albon is the chief data scientist at BRCK, a Kenyan startup. He previously founded the AI company New l
Contents:
Preface
1. Vectors, Matrices, and Arrays
1.0 Introduction
1.1 Creating a Vector
1.2 Creating a Matrix
1.3 Creating a Sparse Matrix
1.4 Selecting Elements
1.5 Describing a Matrix
1.6 Applying Operations to Elements
1.7 Finding the Maximum and Minimum Values
1.8 Calculating the Average, Variance, and Standard Deviation
1.9 Reshaping Arrays
1.10 Transposing a Vector or Matrix
1.11 Flattening a Matrix
1.12 Finding the Rank of a Matrix
1.13 Calculating the Determinant
1.14 Getting the Diagonal of a Matrix
1.15 Calculating the Trace of a Matrix
1.16 Finding Eigenvalues and Eigenvectors
1.17 Calculating Dot Products
1.18 Adding and Subtracting Matrices
1.19 Multiplying Matrices
1.20 Inverting a Matrix
1.21 Generating Random Values
2. Loading Data
2.0 Introduction
2.1 Loading a Sample Dataset
2.2 Creating a Simulated Dataset
2.3 Loading a CSV File
2.4 Loading an Excel File
2.5 Loading a JSON File
2.6 Querying a SQL Database
3. Data Wrangling
3.0 Introduction
3.1 Creating a Data Frame
3.2 Describing the Data
3.3 Navigating DataFrames
3.4 Selecting Rows Based on Conditionals
3.5 Replacing Values
3.6 Renaming Columns
3.7 Finding the Minimum, Maximum, Sum, Average, and Count
3.8 Finding Unique Values
3.9 Handling Missing Values
3.10 Deleting a Column
3.11 Deleting a Row
3.12 Dropping Duplicate Rows
3.13 Grouping Rows by Values
3.14 Grouping Rows by Time
3.15 Looping Over a Column
3.16 Applying a Function Over All Elements in a Column
3.17 Applying a Function to Groups
3.18 Concatenating DataFrames
3.19 Merging DataFrames
4. Handling Numerical Data
4.0 Introduction
4.1 Rescaling a Feature
4.2 Standardizing a Feature
4.3 Normalizing Observations
4.4 Generating Polynomial and Interaction Features
4.5 Transforming Features
4.6 Detecting Outliers
4.7 Handling Outliers
4.8 Discretizing Features
4.9 Grouping Observations Using Clustering
4.10 Deleting Observations with Missing Values
4.11 Imputing Missing Values
……
5. Handling Categorical Data
6. Handling Text
7. Handling Dates and Times
8. Handling Images
9. Dimensionality Reduction Using Feature Extraction
10. Dimensionality Reduction Using Feature Selection
11. Model Evaluation
12. Model Selection
13. Linear Regression
14. Trees and Forests
15. K-Nearest Neighbors
16. Logistic Regression
17. Support Vector Machines
18. Naive Bayes
19. Clustering
20. Neural Networks
21. Saving and Loading Trained Models