Subgroups of Machine Learning algorithms

There are two common ways to group Machine Learning algorithms. One is based on learning style; the other is based on the function of each algorithm.

1. Grouping based on learning style

Supervised Learning
Classification
Regression
Unsupervised Learning
Clustering
Association
Semi-Supervised Learning
Reinforcement Learning

2. Grouping based on function

Regression Algorithms
Classification Algorithms
Instance-based Algorithms
Regularization Algorithms
Bayesian Algorithms
Clustering Algorithms
Artificial Neural Network Algorithms
Dimensionality Reduction Algorithms
Ensemble Algorithms

3. References

1. Grouping based on learning style
According to learning style, Machine Learning algorithms are usually divided into four groups: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Some groupings leave out semi-supervised learning or reinforcement learning.

Supervised Learning
Supervised learning is the group of algorithms that predict the output (outcome) of a new input based on previously known (input, outcome) pairs. These pairs are also called (data, label) pairs. Supervised learning is the most popular group of Machine Learning algorithms.

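Mathematically, supervised learning is when we have a set of input variables $X = \{x_1, x_2, \dots, x_N\}$ and a corresponding set of labels $Y = \{y_1, y_2, \dots, y_N\}$. The known pairs $(x_i, y_i)$ form the training data. From this training data, we need to build a function $f$ that maps each element of $X$ to an (approximately) corresponding element of $Y$:

$$y_i \approx f(x_i), \quad \forall i = 1, 2, \dots, N$$

The goal is to approximate the function $f$ well enough that, given a new data point $x$, we can compute its corresponding label $y = f(x)$.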
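
The idea of learning a function f from (x_i, y_i) pairs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data points are made up, and the nearest-neighbor rule is one simple choice of f, not a method this article prescribes.

```python
import math

def fit_nearest_neighbor(X, Y):
    """Return a function f where f(x) is the label of the training point closest to x."""
    def f(x):
        distances = [math.dist(x, xi) for xi in X]
        return Y[distances.index(min(distances))]
    return f

# Hypothetical training data: inputs x_i and their labels y_i
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
Y = ["A", "A", "B", "B"]

f = fit_nearest_neighbor(X, Y)

# On the training set, y_i = f(x_i) for every i
train_predictions = [f(xi) for xi in X]

# Label for a new point that was not in the training data
new_label = f((5.5, 4.0))
```

Here `f` plays the role of the approximating function: it reproduces the known labels and assigns a label to an input it has never seen.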

Example 1: In handwritten digit recognition, we have thousands of example images of each digit, written by many different people. We feed these images into an algorithm and tell it which digit each image corresponds to. Once the algorithm has produced a model, that is, a function whose input is an image and whose output is a digit, the model can predict which digit a new, previously unseen image contains.

This example is quite similar to how humans learn as children. We show a child the alphabet and tell them that this is the letter A and this is the letter B. After being taught a few times, the child can identify which letter is A and which is B in a book they have never seen before.

Example 2: Algorithms that detect faces in a photograph were developed long ago. In its early days, Facebook used such an algorithm to point out the faces in a photo and asked users to tag friends, that is, to assign a label to each face. The more (face, person name) pairs are collected, the more accurate later automatic tagging becomes.

Example 3: The algorithm that finds faces in a photo is itself a supervised learning algorithm, with training data consisting of thousands of (image, face) and (image, non-face) pairs. Note that this data only distinguishes faces from non-faces; it does not distinguish the faces of different people.

Supervised learning algorithms are further divided into two main categories:

Classification
A problem is called classification if the labels of the input data are divided into a finite number of groups. For example, Gmail determines whether an email is spam or not; credit firms determine whether a client is able to repay a debt or not. The three examples above fall into this category.

Regression
If the label is not divided into groups but is a specific real value, the problem is called regression. For example: how much will a house with an area of x m², y bedrooms, and a distance of z km from the city center cost?
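
As a rough sketch of regression, the snippet below fits a straight line to house prices using only the area. The numbers are invented for illustration, and ordinary least squares with a single feature is an assumption made here to keep the example short; a real model would also use the number of bedrooms and the distance to the center.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: area in m^2 and price in arbitrary currency units
areas = [50.0, 60.0, 80.0, 100.0]
prices = [100.0, 120.0, 160.0, 200.0]

slope, intercept = fit_line(areas, prices)

# The prediction is a real value, not a class
predicted_price = slope * 70.0 + intercept
```

Because the toy prices are exactly twice the area, the fitted line has a slope of about 2 and the prediction for a 70 m² house is about 140.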

Microsoft recently released an application that predicts gender and age from a face. The gender prediction part can be considered a classification algorithm, while the age prediction part can be considered a regression algorithm. Note that age prediction can also be treated as classification: if age is considered a positive integer no greater than 150, we have 150 different classes.
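
To make the last point concrete, here is a minimal sketch of how a real-valued age could be turned into one of 150 classes. The function name and the regression output are hypothetical; nothing here reflects how the Microsoft application actually works.

```python
NUM_CLASSES = 150  # ages 1..150, as in the text

def age_to_class(predicted_age):
    """Map a real-valued age to an integer class between 1 and 150."""
    return min(max(round(predicted_age), 1), NUM_CLASSES)

# Hypothetical output of a regression model
regression_output = 33.6

# The same prediction viewed as a class label
class_label = age_to_class(regression_output)
```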

Unsupervised Learning
In these algorithms, we do not know the outcome or label, only the input data. Unsupervised learning algorithms rely on the structure of the data to perform a certain task, such as grouping the data (clustering) or reducing the number of dimensions of the data (dimensionality reduction) for more convenient storage and computation.

Mathematically, unsupervised learning is when we only have the input data X without knowing the corresponding labels Y.

These algorithms are called unsupervised learning because, unlike supervised learning, we do not know the correct answer for each input. It is like learning without a teacher to tell us whether a letter is A or B. The name "unsupervised" comes from this sense.

Unsupervised learning problems are further divided into two categories:

Clustering
A problem that divides the entire dataset X into small groups based on the similarity between the data in each group. For example, grouping customers based on their purchasing behavior. This is like giving a child many puzzle pieces of different shapes and colors, e.g. triangles, squares, and circles in blue and red, and then asking the child to sort them into groups. Even though the child is not told which piece corresponds to which shape or color, they can most likely still sort the pieces by color or shape.
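
A minimal sketch of clustering, assuming k-means (listed later in this article) as the method and made-up 2D points as the data. The initial centroids are passed in by hand so that the result is reproducible; in practice they are usually chosen at random.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and updating centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled data: two visibly separate groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centroids, clusters = k_means(points, centroids=[points[0], points[3]])
```

No labels are given anywhere; the two groups emerge only from the distances between the points.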

Association
A problem in which we want to discover a rule from a large amount of given data. For example, male customers who buy clothing often tend to buy watches or belts as well; viewers who watch Spider-Man movies often tend to watch Batman movies too. Based on such rules, a recommendation system can be built to stimulate shopping demand.
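
One common way to quantify such a rule is through its support and confidence. The sketch below computes both for a rule of the form "customers who buy a shirt also buy a belt". The transactions are invented, and these two measures are an assumption of this example rather than something defined in the article.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Among transactions containing the antecedent, the fraction that also contain the consequent."""
    with_antecedent = [t for t in transactions if antecedent <= t]
    with_both = [t for t in with_antecedent if consequent <= t]
    return len(with_both) / len(with_antecedent)

# Hypothetical shopping baskets
transactions = [
    {"shirt", "belt"},
    {"shirt", "watch", "belt"},
    {"shirt"},
    {"shoes"},
    {"shirt", "belt", "shoes"},
]

rule_support = support(transactions, {"shirt", "belt"})
rule_confidence = confidence(transactions, {"shirt"}, {"belt"})
```

With these baskets, 3 of 5 contain both items and 3 of the 4 shirt buyers also bought a belt, so a recommender could reasonably suggest belts to shirt buyers.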

Semi-Supervised Learning
Problems in which we have a large amount of data X but only part of it is labeled are called semi-supervised learning. Problems in this group lie between the two groups described above.

A typical example is when only some of the images or texts are labeled (e.g. photos of people or animals, or scientific and political texts), while most of the other images and documents, collected from the Internet, are unlabeled. In practice, many Machine Learning problems belong to this group, because collecting labeled data takes a lot of time and is expensive. Many types of data even require experts to label them (medical images, for example). In contrast, unlabeled data can be collected from the Internet at low cost.
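
As an illustration of how unlabeled data can help, here is a small self-training (pseudo-labeling) sketch. Self-training is only one of several semi-supervised strategies and is not named in this article; the 1D data, the labels, and the nearest-neighbor rule are all assumptions made for the example.

```python
def self_train(labeled, unlabeled):
    """Repeatedly pseudo-label the unlabeled point closest to any labeled point."""
    labeled = list(labeled)      # list of (x, label) pairs
    remaining = list(unlabeled)  # list of x values without labels
    while remaining:
        # The most "confident" candidate is the one nearest to a labeled point
        x, (_, label) = min(
            ((u, l) for u in remaining for l in labeled),
            key=lambda pair: abs(pair[0] - pair[1][0]),
        )
        labeled.append((x, label))
        remaining.remove(x)
    return dict(labeled)

# Hypothetical data: two labeled points and many cheap unlabeled ones
labeled = [(0.0, "cat"), (10.0, "dog")]
unlabeled = [1.0, 2.0, 3.0, 4.0, 5.2, 9.0]

labels = self_train(labeled, unlabeled)

# Using only the two labeled points, 5.2 is closer to 10.0 than to 0.0
direct_label = "cat" if abs(5.2 - 0.0) < abs(5.2 - 10.0) else "dog"
```

The point 5.2 ends up labeled "cat" by self-training, because the unlabeled points form a chain reaching it from 0.0, whereas the two labeled points alone would assign "dog".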

Reinforcement Learning
Reinforcement learning covers problems in which a system automatically determines its behavior based on the situation in order to achieve the highest benefit (maximizing the performance). Currently, reinforcement learning is mainly applied to game theory, where algorithms need to determine the next move to achieve the highest score.

Example 1: AlphaGo recently became famous for beating human players at Go. Go is considered to have extremely high complexity, with a total number of possible moves of approximately 10^761, compared to 10^120 for chess, while the total number of atoms in the universe is about 10^80! The algorithm therefore has to pick an optimal move out of billions upon billions of options, and of course the approach used by IBM Deep Blue (which beat a human at chess 20 years ago) cannot be applied. Basically, AlphaGo includes algorithms from both supervised learning and reinforcement learning. In the supervised learning part, data from games played by humans is used for training. However, AlphaGo's ultimate goal is not to play like humans but to beat them. So, after learning from human games, AlphaGo played millions of games against itself to find new, more optimal moves. The algorithm in this self-play part is categorized as reinforcement learning. (See also Google DeepMind's AlphaGo: How it works.)

Example 2: Training a computer to play Mario. This is an interesting program that teaches a computer to play the Mario game. The game is simpler than Go because, at any moment, the player only has to press a small number of buttons (move, jump, shoot) or press nothing at all. The game's response is also simpler and repeats on every playthrough (at a given moment, a fixed obstacle appears at a fixed position). The input of the algorithm is the screen at the current time, and its task is to decide, given that input, which buttons to press. Training is based on a score for how far and how fast the character moves in the game: the farther and faster, the higher the reward (this reward is not the game's own score but a score defined by the programmer). Through training, the algorithm finds an optimal way to maximize this score, thereby achieving the ultimate goal of rescuing the princess.
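
The loop of acting, receiving a programmer-defined reward, and improving can be sketched with tabular Q-learning on a toy environment. This is not the Mario trainer itself: the five-position corridor, the reward of 1 at the goal, and the hyperparameters are all assumptions chosen to keep the example small.

```python
import random

N_STATES = 5            # positions 0..4; position 4 is the goal
ACTIONS = (-1, +1)      # move left, move right
ALPHA, GAMMA = 0.5, 0.9 # learning rate and discount factor

def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, max_steps=1000, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Purely random exploration; Q-learning is off-policy, so it can
            # still learn the values of the greedy policy
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted future value
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q

q = train()

# Greedy policy: the best-valued action in each non-goal position
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

After training, the greedy policy should be to move right (+1) from every position, which is the shortest way to the reward, even though the agent was never told this directly.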

2. Grouping based on function
The second way of grouping is based on the function of the algorithms. In this section, I only list the algorithms. Details will be presented in other articles on this blog. I will probably add more algorithms as I write.

Regression Algorithms
1. Linear Regression
2. Logistic Regression
3. Stepwise Regression

Classification Algorithms

1. Linear Classifier
2. Support Vector Machine (SVM)
3. Kernel SVM
4. Sparse Representation-based classification (SRC)

Instance-based Algorithms

1. k-Nearest Neighbor (kNN)
2. Learning Vector Quantization (LVQ)

Regularization Algorithms

1. Ridge Regression
2. Least Absolute Shrinkage and Selection Operator (LASSO)
3. Least-Angle Regression (LARS)

Bayesian Algorithms

1. Naive Bayes
2. Gaussian Naive Bayes

Clustering Algorithms

1. k-Means clustering
2. k-Medians
3. Expectation Maximization (EM)

Artificial Neural Network Algorithms

1. Perceptron
2. Softmax Regression
3. Multi-layer Perceptron
4. Back-Propagation

Dimensionality Reduction Algorithms

1. Principal Component Analysis (PCA)
2. Linear Discriminant Analysis (LDA)

Ensemble Algorithms

1. Boosting