Analogous to the MLP, we perform gradient descent to find suitable weights, using the already well-known delta rule. Here, backpropagation is unnecessary, since we only have to train a single weight layer, which requires less computing time.
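As a rough illustration of this idea, the following minimal sketch (Python with NumPy, using made-up placeholder data; the names `phi`, `t`, `W`, and `eta` are hypothetical and not from the original text) shows a delta-rule gradient descent applied to one trainable weight layer:

```python
import numpy as np

# Hypothetical setup: `phi` holds the fixed hidden-layer activations for
# each training pattern, `t` the corresponding teaching outputs.
rng = np.random.default_rng(0)
phi = rng.normal(size=(100, 8))   # placeholder activations, shape (patterns, hidden units)
t = rng.normal(size=(100, 1))     # placeholder teaching outputs

W = np.zeros((8, 1))              # weights of the single trainable layer
eta = 0.01                        # learning rate

for epoch in range(200):
    y = phi @ W                   # network output of the linear output layer
    delta = t - y                 # error term of the delta rule
    W += eta * phi.T @ delta      # gradient-descent weight update
```

Because only this one layer is adapted, no error signal has to be propagated back through further layers, which is why the update stays this simple.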