Types of ML Algorithms

Supervised: Train on labeled data to predict labels
- Self-supervised: Labels are generated from the data itself. (E.g. word2vec, BERT)
Semi-supervised: Train on both labeled and unlabeled data, using models to infer labels for the unlabeled data
- E.g. self-training
Unsupervised: Train on unlabeled data
- E.g. clustering, density estimation (GAN)
Reinforcement learning: Use observations from interaction with the environment to choose actions that maximize reward
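
As a rough illustration of the self-training idea mentioned under semi-supervised learning, here is a minimal sketch. It assumes a toy 1-D nearest-centroid classifier with at least two classes; the helper names (`fit_centroids`, `predict_with_margin`, `self_train`), the `min_margin` confidence rule, and the data are all hypothetical choices for this example, not a standard API.

```python
def fit_centroids(xs, ys):
    """Fit a nearest-centroid classifier: one mean value per class label."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_margin(centroids, x):
    """Return (label, margin). The margin is how much closer the nearest
    centroid is than the second nearest; it serves as a crude confidence."""
    dists = sorted((abs(x - c), y) for y, c in centroids.items())
    return dists[0][1], dists[1][0] - dists[0][0]


def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled points and refit."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(xs, ys)
        confident, remaining = [], []
        for x in pool:
            label, margin = predict_with_margin(centroids, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            xs.append(x)
            ys.append(label)  # inferred label is treated as if it were real
        pool = remaining
    return fit_centroids(xs, ys), pool


# Two labeled points, five unlabeled ones (made-up data).
centroids, still_unlabeled = self_train(
    [0.0, 10.0], [0, 1], [1.0, 2.0, 9.0, 8.0, 5.2]
)
```

In this toy run the four points near 0 or 10 receive pseudo-labels, while the ambiguous point 5.2 stays unlabeled because its margin is below `min_margin`.
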
Tips:
- We can design supervised training tasks for unlabeled data:
  - Self-supervised learning: generate labels from data. E.g. word2vec, BERT
  - GAN: generate fake data, so that "real vs. fake" serves as a trivial label for the unlabeled data
- The training task can differ from how the model is evaluated or used.
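
To make "generate labels from data" concrete, the sketch below builds (input, label) pairs from raw text in the style of word2vec's skip-gram task, where each word is used to predict its neighbors. It is a simplified illustration: the function name `skipgram_pairs` and the `window` parameter are assumptions for this example, and details of real word2vec training such as subsampling and negative sampling are omitted.

```python
def skipgram_pairs(tokens, window=1):
    """Turn an unlabeled token sequence into supervised (center, context) pairs.

    The "label" for each center word is a neighboring word within `window`
    positions, so no human annotation is needed.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs("the cat sat".split(), window=1)
print(pairs)
```

This also illustrates the last tip: the model is trained to predict context words, but it is typically used for the word embeddings it learns along the way.
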

Components in Supervised Training

Model
- A parameterized function that maps inputs to labels
  - Model parameters (learned during training) vs. hyperparameters (set before training)
  - E.g. house listing -> sale price
Loss
- A measure of how well the model predicts the outcome
  - E.g. classification / regression / contrastive / triplet / ranking losses
  - E.g. $(\text{predicted\_price} - \text{sale\_price})^2$
Objective
- The goal that the model parameters are optimized for
  - E.g. minimize the sum of losses over examples
Optimization
- The algorithm used to optimize the objective
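
The four components can be sketched together on a toy problem. The example below assumes a 1-D linear model, squared-error loss, the mean loss as the objective, and plain gradient descent as the optimizer. The data is made up (generated from y = 2x + 1), and the learning rate and step count are arbitrary hyperparameter choices for illustration.

```python
def model(params, x):
    """Model: a parameterized function mapping an input to a predicted label."""
    w, b = params  # model parameters, learned during training
    return w * x + b


def loss(pred, target):
    """Loss: squared error for a single example."""
    return (pred - target) ** 2


def objective(params, data):
    """Objective: the mean loss over all training examples (to be minimized)."""
    return sum(loss(model(params, x), y) for x, y in data) / len(data)


def gradient(params, data):
    """Analytic gradient of the objective with respect to (w, b)."""
    w, b = params
    n = len(data)
    dw = sum(2 * (w * x + b - y) * x for x, y in data) / n
    db = sum(2 * (w * x + b - y) for x, y in data) / n
    return dw, db


def gradient_descent(data, lr=0.05, steps=2000):
    """Optimization: step against the gradient.
    `lr` and `steps` are hyperparameters, set before training."""
    params = (0.0, 0.0)
    for _ in range(steps):
        dw, db = gradient(params, data)
        params = (params[0] - lr * dw, params[1] - lr * db)
    return params


# Made-up training data following y = 2x + 1.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
w, b = gradient_descent(data)
print(w, b, objective((w, b), data))
```

Swapping any one component (for example, a different loss or a different optimizer) leaves the others unchanged, which is the point of separating them.
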