Support Vector Regressor

Support vector regressor (SVR) is a machine learning algorithm for regression tasks. It is a variant of the support vector machine (SVM), a margin-based model that is most often used for classification tasks.

The basic idea of SVR is to find the hyperplane in the feature space that keeps the predicted values of the response variable within a tolerance `\epsilon` of the observed values for as many training points as possible, while keeping the weights w as small as possible. The region within `\epsilon` of the hyperplane acts as the margin: errors inside it are ignored, and only the points on or outside its edge (the support vectors) determine the fit. Ignoring small errors allows the model to make more accurate predictions by reducing the influence of noise in the training data.

Support vector machine for regression uses a kernel function (with weights w) and a maximum error (`\epsilon`) to fit the data so that most points lie inside the gap of width `\epsilon` on either side of the prediction.

$$\min \frac{1}{2} \left\Vert w \right\Vert^2 + C \sum_{i=1}^n \left|\xi_i\right|$$

$$\mathrm{subject\,to}\quad \left|y_i - \left(w^\top x_i + b\right)\right| \le \epsilon + \left|\xi_i\right|$$

Here w is the weight vector, b is the intercept, and `x_i`, `y_i` are the features and observed value of training point i. The slack variable `\xi_i` is minimized to allow but discourage values outside the maximum error region. The parameter C is tuned to fit the general trend of the data (lower C) or to favor inclusion of more data points (higher C).
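As a minimal sketch of the formulation above, the snippet below evaluates the objective for a candidate linear model with NumPy. The function name `svr_objective`, the toy data, and the explicit intercept term `b` are assumptions made for illustration; they are not part of scikit-learn. The slack for each point is taken as the amount by which its absolute residual exceeds `epsilon`.

```python
import numpy as np

def svr_objective(w, b, X, y, C=1.0, epsilon=0.1):
    """Evaluate 0.5*||w||^2 + C*sum(slack) for a linear model (illustrative helper)."""
    residual = np.abs(y - (X @ w + b))
    # Slack is zero inside the epsilon region and grows linearly outside it
    slack = np.maximum(0.0, residual - epsilon)
    return 0.5 * np.dot(w, w) + C * np.sum(slack), slack

# Toy data (hypothetical): three points on the line y = x and one point off it
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 2.0, 4.0])
w = np.array([1.0])
b = 0.0

objective, slack = svr_objective(w, b, X, y, C=1.0, epsilon=0.1)
print(slack)
print(objective)
```

Only the last point falls outside the maximum error region, so it alone contributes slack (0.9), and the objective is 0.5 + 0.9 = 1.4. Raising C makes that single violation more expensive relative to the size of w.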

Advantages: The kernel can be linear, polynomial, radial basis function (RBF), or sigmoid. SVR is effective with high-dimensional data.
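The kernel choices can be compared with a short sketch like the one below. The synthetic noisy sine data and the settings `C=1.0` and `epsilon=0.1` are assumptions chosen only for illustration, and the scores are computed on the training data, so they show how flexible each kernel is rather than how well it generalizes.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D data (assumed for illustration): a noisy sine curve
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(60, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=60)

scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    model = SVR(kernel=kernel, C=1.0, epsilon=0.1)
    model.fit(X, y)
    # score() returns R^2 on the data passed in (here, the training data)
    scores[kernel] = model.score(X, y)

for kernel, r2 in scores.items():
    print(f"{kernel:8s} R^2 = {r2:.3f}")
```

On curved data like this, a nonlinear kernel such as RBF would be expected to follow the trend more closely than the linear kernel; on real data the comparison should be made with held-out samples.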

Disadvantages: The fit time increases at least quadratically with the number of samples. For more than about 10,000 rows of data, use a linear SVR (such as `LinearSVR` in scikit-learn) instead of the default RBF kernel.
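For larger data sets, a linear model can be sketched with scikit-learn's `LinearSVR`. The synthetic data, the coefficient vector `true_w`, and the settings below are assumptions for illustration only. Note that `LinearSVR` defaults to `epsilon=0.0`, so the value is set explicitly here, and the features are standardized first because the linear solver tends to converge more reliably on scaled inputs.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# Synthetic data (assumed): 20,000 rows generated from a linear relationship plus noise
rng = np.random.default_rng(0)
n_samples = 20000
X = rng.normal(size=(n_samples, 5))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 3.0])  # hypothetical coefficients
y = X @ true_w + 0.1 * rng.normal(size=n_samples)

# Scale the features, then fit a linear SVR
model = make_pipeline(
    StandardScaler(),
    LinearSVR(C=1.0, epsilon=0.1, max_iter=10000, random_state=0),
)
model.fit(X, y)

r2 = model.score(X, y)  # R^2 on the training data
print(r2)
```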

Support Vector Regressor (SVR) in Python

Here is an example of how to implement SVR in Python using the scikit-learn library:

from sklearn.svm import SVR
import numpy as np

# Assume that we have a training set of data points with input features X and output values y
X = np.array([[0, 1], [1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([1, 2, 3, 4, 5])

# Create an SVR model with a linear kernel and C=1
svr = SVR(kernel='linear', C=1)

# Fit the model to the training data
svr.fit(X, y)

# Predict the output value of a new data point
x_new = np.array([[1, 1]])
y_pred = svr.predict(x_new)
print(y_pred)  # prints a one-element array with the predicted value

In this example, we have a training set of five data points with input features X and output values y. We create an SVR model with a linear kernel and a regularization parameter C=1. We then fit the model to the training data and use it to predict the output value of a new data point. Additional hyper-parameters such as the kernel function and gamma can be adjusted to improve the performance, as in this variation with an RBF kernel:

from sklearn import svm

# Reuses X, y, and x_new from the example above
s = svm.SVR(kernel='rbf', gamma='scale', C=2)
s.fit(X, y)
yP = s.predict(x_new)
print(yP)
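Rather than adjusting C and gamma by hand, the hyper-parameters can be searched with cross-validation. The sketch below uses scikit-learn's `GridSearchCV`; the synthetic data (`X_demo`, `y_demo`) and the candidate values in `param_grid` are assumptions for illustration, not recommended settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Synthetic 1-D data (assumed for illustration): a noisy sine curve
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 5, size=(80, 1))
y_demo = np.sin(X_demo).ravel() + 0.1 * rng.normal(size=80)

# Candidate hyper-parameter values (hypothetical choices)
param_grid = {
    "C": [0.1, 1, 10],
    "gamma": ["scale", 0.1, 1.0],
    "epsilon": [0.01, 0.1],
}

# 5-fold cross-validation, scored by negative mean squared error
search = GridSearchCV(
    SVR(kernel="rbf"), param_grid, cv=5, scoring="neg_mean_squared_error"
)
search.fit(X_demo, y_demo)

print(search.best_params_)
y_best = search.best_estimator_.predict(np.array([[2.5]]))
print(y_best)
```

After the search, `best_estimator_` is refit on all of the supplied data with the best combination found, so it can be used directly for prediction.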

See also Support Vector Classifier