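The decision function for an SVM classifier is given by:

$$f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b$$

where $\mathbf{w}$ is the normal vector and *b* is the offset term for the decision surface $\mathbf{w} \cdot \mathbf{x} + b = 0$.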

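The corresponding supporting hyperplanes are as follows:

$$\mathbf{w} \cdot \mathbf{x}_i + b \ge +1 - \xi_i \quad \text{for } y_i = +1$$

$$\mathbf{w} \cdot \mathbf{x}_i + b \le -1 + \xi_i \quad \text{for } y_i = -1$$

In either of the above supporting hyperplane constraints, $\xi_i \ge 0$ is known as the slack variable or error term. It measures how far a particular point $\mathbf{x}_i$ lies on the wrong side of its respective hyperplane, and it is zero for points on the correct side.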

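The optimization problem to compute a soft-margin decision surface is expressed as follows:

$$\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \ \frac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{n} \xi_i$$

subject to

$$\mathbf{w} \cdot \mathbf{x}_i + b \ge +1 - \xi_i \quad \text{for } y_i = +1$$

$$\mathbf{w} \cdot \mathbf{x}_i + b \le -1 + \xi_i \quad \text{for } y_i = -1$$

$$\xi_i \ge 0 \quad \text{for } i = 1, \dots, n$$

Here $n$ is the number of training points and $C > 0$ is the penalty parameter that trades margin width against total slack.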
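To make the objective concrete, here is a minimal NumPy sketch that evaluates it for a fixed hyperplane. It assumes the common soft-margin form, half the squared norm of the normal vector plus a penalty `C` times the total slack, and takes each slack as the smallest value that satisfies the class-wise constraint. The function names, the toy data, and the choice `C = 1.0` are illustrative assumptions, not part of the original text.

```python
import numpy as np

def slack_per_class(w, b, X, y):
    """Smallest slack satisfying each point's class-wise constraint."""
    f = X @ w + b                          # decision values w . x_i + b
    xi_pos = np.maximum(0.0, 1.0 - f)      # from w . x_i + b >= +1 - xi_i
    xi_neg = np.maximum(0.0, 1.0 + f)      # from w . x_i + b <= -1 + xi_i
    return np.where(y > 0, xi_pos, xi_neg)

def soft_margin_objective(w, b, X, y, C):
    """Value of 0.5 * ||w||^2 + C * sum(xi) for a fixed (w, b)."""
    xi = slack_per_class(w, b, X, y)
    return 0.5 * float(w @ w) + C * float(xi.sum())

# Hypothetical toy data: two points per class, one of each on the wrong side
# of its supporting hyperplane.
X = np.array([[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.array([1.0, 0.0])
b = 0.0

slacks = slack_per_class(w, b, X, y)
objective = soft_margin_objective(w, b, X, y, C=1.0)
print(slacks, objective)
```

Points that respect their supporting hyperplane get zero slack, so only the violating points contribute to the penalty term.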

**Rewriting Supporting Hyperplane Constraints in Compact Form:**

The two supporting hyperplane constraints above can be rewritten in a single compact form as follows:

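For the supporting hyperplane constraint on all training points labeled +1, $\mathbf{w} \cdot \mathbf{x}_i + b \ge 1 - \xi_i$, we can rewrite the dot product and free term with an explicit factor of $+1$:

$$(+1)(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 - \xi_i$$

Now we can substitute $y_i$ in place of $+1$, since $y_i = +1$ for these points and the inequality will still hold true:

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 - \xi_i$$

Rearranging the terms, the constraint becomes:

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 + \xi_i \ge 0$$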

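For the supporting hyperplane constraint on all training points labeled -1, $\mathbf{w} \cdot \mathbf{x}_i + b \le -1 + \xi_i$, multiplying the LHS and RHS by $-1$ and changing the inequality from $\le$ to $\ge$, we get:

$$(-1)(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 - \xi_i$$

Now we can substitute $y_i$ in place of $-1$, since $y_i = -1$ for these points and the inequality will still hold true:

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 - \xi_i$$

Rearranging the terms, the constraint becomes:

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 + \xi_i \ge 0$$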

The compact constraints for the two supporting hyperplanes have the same form. Hence they can be expressed as a single constraint that covers every point in the training set:

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 + \xi_i \ge 0 \quad \text{for } i = 1, \dots, n$$
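The equivalence between the class-wise constraints and the single compact constraint can also be checked numerically. The sketch below is only an illustration: the helper names and the synthetic data are assumptions, and the values are deliberately chosen as multiples of 0.5 so that all arithmetic is exact and boundary cases are compared without floating-point rounding.

```python
import numpy as np

def classwise_satisfied(w, b, X, y, xi):
    """Evaluate each point against the constraint for its own class."""
    f = X @ w + b
    pos_ok = f >= 1.0 - xi       # constraint for points labeled +1
    neg_ok = f <= -1.0 + xi      # constraint for points labeled -1
    return np.where(y > 0, pos_ok, neg_ok)

def compact_satisfied(w, b, X, y, xi):
    """Evaluate the single compact constraint y_i (w . x_i + b) - 1 + xi_i >= 0."""
    return y * (X @ w + b) - 1.0 + xi >= 0.0

# Hypothetical synthetic data on an exactly representable grid.
rng = np.random.default_rng(0)
X = rng.integers(-3, 4, size=(200, 2)).astype(float)
y = rng.choice(np.array([-1.0, 1.0]), size=200)
xi = rng.integers(0, 5, size=200) * 0.5
w = np.array([1.0, -0.5])
b = 0.5

agree = bool(np.array_equal(classwise_satisfied(w, b, X, y, xi),
                            compact_satisfied(w, b, X, y, xi)))
print(agree)
```

Both functions return one boolean per training point, so comparing the two arrays confirms that the compact form accepts and rejects exactly the same points as the pair of class-wise constraints.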

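**Optimization Problem with Constraints in Compact Form**

$$\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \ \frac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{n} \xi_i$$

subject to

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 + \xi_i \ge 0 \quad \text{and} \quad \xi_i \ge 0 \quad \text{for } i = 1, \dots, n$$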

**Optimization Problem in Lagrange Form**
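A sketch of the standard Lagrangian for this problem is given below. It assumes one multiplier $\alpha_i \ge 0$ for each compact margin constraint and one multiplier $\mu_i \ge 0$ for each non-negativity constraint $\xi_i \ge 0$. The symbols $\alpha_i$, $\mu_i$, $C$, and $n$ follow common textbook convention and are assumptions here, since the original notation is not shown.

$$L(\mathbf{w}, b, \boldsymbol{\xi}, \boldsymbol{\alpha}, \boldsymbol{\mu}) = \frac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{n} \xi_i - \sum_{i=1}^{n} \alpha_i \left[ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 + \xi_i \right] - \sum_{i=1}^{n} \mu_i \xi_i$$

Each constraint of the form $g_i \ge 0$ enters with a minus sign, so a violated constraint increases $L$. The problem then becomes minimizing $L$ over $\mathbf{w}$, $b$, and $\boldsymbol{\xi}$ while maximizing over $\boldsymbol{\alpha} \ge 0$ and $\boldsymbol{\mu} \ge 0$.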