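| Description: |
People now recognize that data can change the course of an election or reshape a business model, and data science is growing steadily as a profession.
But how do you get started in such a broad and tangled interdisciplinary field? Doing Data Science (reprint edition), by Schutt and O'Neil, tells you everything you need to know.
Insightful throughout, it is compiled from the lecture notes of Columbia University's data science course.
Source: 香港大書城 megBookStore, http://www.megbook.com.hk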
|
| Table of Contents:
|
Preface
1. Introduction: What Is Data Science?
Big Data and Data Science Hype
Getting Past the Hype
Why Now?
Datafication
The Current Landscape with a Little History
Data Science Jobs
A Data Science Profile
Thought Experiment: Meta-Definition
OK, So What Is a Data Scientist, Really?
In Academia
In Industry
2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process
Statistical Thinking in the Age of Big Data
Statistical Inference
Populations and Samples
Populations and Samples of Big Data
Big Data Can Mean Big Assumptions
Modeling
Exploratory Data Analysis
Philosophy of Exploratory Data Analysis
Exercise: EDA
The Data Science Process
A Data Scientist's Role in This Process
Thought Experiment: How Would You Simulate Chaos?
Case Study: RealDirect
How Does RealDirect Make Money?
Exercise: RealDirect Data Strategy
3. Algorithms
Machine Learning Algorithms
Three Basic Algorithms
Linear Regression
k-Nearest Neighbors (k-NN)
k-means
Exercise: Basic Machine Learning Algorithms
Solutions
Summing It All Up
Thought Experiment: Automated Statistician
4. Spam Filters, Naive Bayes, and Wrangling
Thought Experiment: Learning by Example
Why Won't Linear Regression Work for Filtering Spam?
How About k-nearest Neighbors?
Naive Bayes
Bayes Law
A Spam Filter for Individual Words
A Spam Filter That Combines Words: Naive Bayes
Fancy It Up: Laplace Smoothing
Comparing Naive Bayes to k-NN
Sample Code in bash
Scraping the Web: APIs and Other Tools
Jake's Exercise: Naive Bayes for Article Classification
Sample R Code for Dealing with the NYT API
5. Logistic Regression
Thought Experiments
Classifiers
Runtime
You
Interpretability
Scalability
M6D Logistic Regression Case Study
Click Models
The Underlying Math
6. Time Stamps and Financial Modeling
7. Extracting Meaning from Data
8. Recommendation Engines: Building a User-Facing Data Product at Scale
9. Data Visualization and Fraud Detection
10. Social Networks and Data Journalism
11. Causality
12. Epidemiology
13. Lessons Learned from Data Competitions: Data Leakage and Model Evaluation
14. Data Engineering: MapReduce, Pregel, and Hadoop
15. The Students Speak
16. Next-Generation Data Scientists, Hubris, and Ethics
Index
|
|