Stanford Practical Machine Learning: Course Introduction
Last updated: 1 year ago
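This course is mainly about how machine learning is applied in industry. It teaches the important technical details a data scientist runs into at each stage of bringing machine learning into an industrial setting.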
Industrial ML Applications
- Manufacturing: Predictive maintenance, quality control
- Retail: Recommendation, chatbots, demand forecasting
- Healthcare: Alerts from real-time patient data, disease identification
- Finance: Fraud detection, application processing
- Automobile: Breakdown prediction, self-driving
- House price prediction: the goal is to predict the winning buyer's bid price
ML Workflow
- Problem formulation
- A loop:
  - Collect & process data
  - Train & tune models
  - Deploy models
  - Monitor
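The loop above can be sketched in a few lines of Python. This is only a toy illustration, not code from the course: the function names (`collect_and_process`, `train`, `deploy`, `monitor`, `run_workflow`), the synthetic linear data, and the least-squares model are all assumptions chosen to keep the example self-contained.

```python
import random
import statistics


def collect_and_process(n, seed):
    """Collect toy (x, y) pairs from a noisy linear relation (hypothetical data)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def train(xs, ys):
    """Fit y = a * x + b with ordinary least squares."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    return a, my - a * mx


def deploy(model):
    """Wrap the fitted parameters in a prediction function."""
    a, b = model
    return lambda x: a * x + b


def monitor(predict, xs, ys):
    """Mean absolute error of the deployed model on freshly collected data."""
    return statistics.fmean([abs(predict(x) - y) for x, y in zip(xs, ys)])


def run_workflow(rounds=3):
    """Repeat collect -> train -> deploy -> monitor and record the monitored error."""
    history = []
    for r in range(rounds):
        xs, ys = collect_and_process(200, seed=r)
        predict = deploy(train(xs, ys))
        new_xs, new_ys = collect_and_process(200, seed=100 + r)
        history.append(monitor(predict, new_xs, new_ys))
    return history


history = run_workflow()
```

In a real project each step is far heavier, and the monitored error would decide whether another round of data collection and training is needed.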
ML Challenges
- Formulate problem: focus on the most impactful industrial problems
- Data: high-quality data is scarce, privacy issues
- Train models: ML models are increasingly complex, data-hungry, and expensive
- Deploy models: heavy computation is not suitable for real-time inference
- Monitor: data distribution shifts, fairness issues
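As a rough illustration of what monitoring for a data distribution shift might look like, the sketch below compares the mean of one feature in live data against the training data. The function `mean_shift_score`, the synthetic Gaussian data, and the `THRESHOLD` value are all hypothetical; production systems usually rely on more robust statistical tests.

```python
import random
import statistics


def mean_shift_score(reference, live):
    """Absolute difference of means, in units of the reference standard deviation."""
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.stdev(reference)
    return abs(statistics.fmean(live) - ref_mean) / ref_std


rng = random.Random(0)
train_feature = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # data seen at training time
live_same = [rng.gauss(0.0, 1.0) for _ in range(1000)]      # live data, same distribution
live_shifted = [rng.gauss(1.5, 1.0) for _ in range(1000)]   # live data, shifted mean

THRESHOLD = 0.5  # hypothetical alert threshold, chosen only for this example

same_alert = mean_shift_score(train_feature, live_same) > THRESHOLD
shifted_alert = mean_shift_score(train_feature, live_shifted) > THRESHOLD
```

Only the shifted stream should raise an alert here, which is the signal that would send the workflow back to collecting data and retraining.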
Roles
Domain experts: have business insights, know what data is important and where to find it, identify the real impact of an ML model
Data scientists: full-stack across data mining, model training, and deployment
ML experts: customize SOTA ML models
SDEs (software development engineers): develop and maintain data pipelines and model training and serving pipelines
Skill Improvement:
SDEs and domain experts gradually move toward the data scientist role, and data scientists gradually become ML experts.
Figure: How data scientists spent their time (source: Anaconda survey 2020)
Course Topics
- Techniques a data scientist needs but that are often not taught in university ML/stats/programming courses
- Data
  - Collect/preprocess data
  - Covariate/concept/label shifts
  - Data beyond IID
- Train
  - Model validation/combinations/tuning
  - Transfer learning
  - Multi-modality
- Deploy
  - Model deployment
  - Distillation
- Monitor
  - Fairness
  - Explainability