We have now established that machine learning is, more or less, a way for computers to learn without being explicitly programmed. In this blog post, the idea is to explore the main types, or categories, of machine learning.
Supervised Machine Learning
The goal is to predict an outcome based on labelled historical patterns in the data. The "supervised" label comes from the way we train the algorithm to perform prediction: given a set of input variables, it learns to estimate the value of an output variable. Depending on the type of variable we are building a prediction model for, the problem is categorized as either a _Regression_ or a _Classification_ problem.
- Regression - problems where the variable to predict is numerical / continuous in nature. An example would be predicting the sale price of a house. Popular regression algorithms include Linear Regression, Decision Trees and Lasso Regression.
In the "Septum Deviation" episode in season 8, an appalled Sheldon calculates the probability that Leonard will survive the surgery he wants to undergo.
- Classification - problems where the variable to predict is categorical, which can be as simple as "yes" or "no." An example would be the spam filter that our mailbox uses. Decision Tree, Random Forest, Support Vector Machine and Logistic Regression are some of the widely used algorithms for classification problems.
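To make the regression idea concrete, here is a minimal sketch of supervised learning: fitting a straight line (price = slope × size + intercept) by ordinary least squares on a tiny, hypothetical labelled dataset of house sizes and sale prices. The data and variable names are illustrative, not from any real source.

```python
# Minimal sketch of supervised regression: closed-form least-squares
# fit of price = slope * size + intercept on toy labelled data.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labelled data: house size (sq ft) -> sale price (thousands)
sizes = [1000, 1500, 2000, 2500]
prices = [200, 300, 400, 500]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 1800 + intercept  # predict the price of an unseen house
```

In practice a library such as scikit-learn would handle this (and classification) for you; the point here is only that "supervised" means learning the input-to-output mapping from labelled examples.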
Unsupervised Machine Learning
The goal here is not to predict a variable; instead, we discover hidden patterns within our data and group, or cluster, them. The technique is often applied to marketing problems to uncover consumer characteristics.
- Clustering - refers to grouping objects into clusters based on similarities or differences between them. K-Means and Mean-Shift are among the most widely used algorithms for this problem, while Principal Component Analysis is often applied beforehand to reduce the dimensionality of the data.
- Association - the goal here is to identify relationships between variables that frequently occur together in a large dataset, as in market-basket analysis.
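As a toy illustration of clustering, here is a bare-bones k-means on one-dimensional points: repeatedly assign each point to its nearest center, then move each center to the mean of its cluster. The points and initial centers are made up for the example.

```python
# Minimal sketch of unsupervised clustering: k-means on 1-D points
# with hypothetical toy data and fixed initial centers.

def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # Update step: move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers = kmeans_1d(points, centers=[0.0, 10.0])
```

Note that no labels are involved anywhere: the two groups emerge purely from the structure of the data, which is exactly what "unsupervised" means.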
In the episode "The Einstein Approximation", Sheldon has an epiphany when he accidentally drops a pile of plates at the Cheesecake Factory: observing the pattern of the broken pieces, he realizes that his approach to solving his physics problem (why electrons behave as if they have no mass when travelling through a graphene sheet) was incorrect.
Classic Optimization
The goal is to identify the best solution for an objective function that we need to maximize or minimize, by changing the values of the decision variables while ensuring the defined constraints are met. The algorithm essentially evaluates multiple scenarios until it finds the best solution. The most common use cases for this type of problem are scheduling and facility location problems.
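The "evaluate scenarios until the best one is found" idea can be sketched with the simplest possible solver, an exhaustive search. The objective (a made-up profit of 3x + 2y) and the constraint (x + y ≤ 10) are hypothetical; real problems use dedicated solvers rather than brute force.

```python
# Minimal sketch of classic optimization by exhaustive search:
# maximize a hypothetical profit 3x + 2y subject to x + y <= 10,
# with integer decision variables x, y in [0, 10].

best_value, best_plan = float("-inf"), None
for x in range(11):                 # decision variable x
    for y in range(11):             # decision variable y
        if x + y <= 10:             # constraint must be satisfied
            value = 3 * x + 2 * y   # objective function to maximize
            if value > best_value:
                best_value, best_plan = value, (x, y)
```

Here the search correctly puts all the budget into x, the variable with the higher payoff; dedicated techniques (linear programming, branch and bound) do the same job without enumerating every scenario.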
In season 2, the guys plan their dinner and movie using a map with multiple restaurant and movie-theater options, subject to constraints set by Sheldon. In the end there is no feasible solution, so the guys ignore Sheldon and go ahead with their plans anyway.
Reinforcement learning
This is an extension of the optimization problem. Here, the AI agent takes stock of its surroundings through trial and error: it takes an action, learns from the experience, and improves its performance. The agent is rewarded for each good action and penalized for every wrong move, so it aims to maximize its cumulative reward by performing good actions. The learning can be of two types:
Positive reinforcement learning: adding a reinforcing stimulus after a specific behavior of the agent, which makes it more likely that the behavior will occur again in the future, e.g., adding a reward after a behavior.
Negative reinforcement learning: strengthening a specific behavior by removing or avoiding a negative outcome, e.g., taking away an unpleasant stimulus when the agent behaves well.
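The reward-maximization loop described above can be sketched as a tiny epsilon-greedy agent facing two hypothetical actions: it mostly exploits the action it currently believes is best, occasionally explores, and updates its value estimates from the rewards and penalties it receives. The environment, action names, and reward values are invented for the example.

```python
# Minimal sketch of reward-driven learning: an epsilon-greedy agent
# learns action values from trial and error in a made-up environment.
import random

random.seed(0)  # fixed seed so the run is reproducible

true_rewards = {"good": 1.0, "bad": -1.0}   # hypothetical environment
estimates = {"good": 0.0, "bad": 0.0}       # the agent's value estimates
counts = {"good": 0, "bad": 0}

for step in range(200):
    # Explore occasionally, otherwise exploit the best-known action
    if random.random() < 0.1:
        action = random.choice(list(estimates))
    else:
        action = max(estimates, key=estimates.get)
    reward = true_rewards[action]            # reward or penalty from the env
    counts[action] += 1
    # Incremental running average of the observed rewards for this action
    estimates[action] += (reward - estimates[action]) / counts[action]
```

After a few trials the agent's estimate for the rewarding action dominates, so exploitation settles on it; this explore/exploit loop is the core mechanic that full reinforcement learning algorithms such as Q-learning build on.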
In season 2, Howard and the guys set up a "state of the art" driving simulator in the apartment's living room, using software from the United States Army. Sheldon attempts to drive but fails horribly. Whenever he crashes the virtual car, Penny hits him in the face with a pillow because the simulator can't replicate an airbag going off (negative reinforcement 😀)
Further reading
https://wiki.seg.org/wiki/Machine_learning