The problem is typically defined on a dataset in which some of the datapoints are known to be “similar” and should therefore be closer to each other than to another, arbitrarily chosen datapoint in the dataset.
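As a rough sketch, with notation that is assumed here rather than taken from any particular paper: let \(d\) be the distance being learned, \((x_i, x_j)\) a pair of datapoints known to be similar, and \(x_k\) an arbitrarily chosen datapoint. The requirement can then be written as

\[
d(x_i, x_j) < d(x_i, x_k)
\]

which the learned distance should satisfy for as many such triples as possible.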
Distance metric learning first received significant attention in the machine learning community with the 2002 NIPS paper “Distance metric learning, with application to clustering with side-information”.
Axioms of a distance metric
The four axioms of a distance metric are:
- Non-negativity: \(d(x, y) \geq 0\) – The distance must never be negative; it is always greater than or equal to zero.
- Identity of indiscernibles: \(d(x, y) = 0 \Leftrightarrow x = y\) – The distance is zero if and only if the two elements are the same (i.e. indiscernible from each other).
- Symmetry: \(d(x,y) = d(y,x)\) – The distance must be the same regardless of the order in which the two arguments are given.
- Triangle inequality: \(d(x,z) \leq d(x,y) + d(y,z)\) – For any three elements in the set, the sum of the distances for any two pairs must be greater than or equal to the distance for the remaining pair.
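
The four axioms above can be spot-checked numerically on a finite sample of points. The following is a minimal sketch, not a proof: the helper `check_metric_axioms`, its tolerance `tol`, and the sample points are all hypothetical names and values chosen for illustration. Passing the check only shows that no violation was found on that sample.

```python
import itertools
import math


def check_metric_axioms(d, points, tol=1e-12):
    """Check the four metric axioms for d on a finite sample of points.

    Passing is only evidence on this sample, not a proof for the whole space.
    """
    results = {
        "non_negativity": True,
        "identity": True,
        "symmetry": True,
        "triangle": True,
    }
    # Axioms involving two elements: check every ordered pair.
    for x, y in itertools.product(points, repeat=2):
        dxy = d(x, y)
        if dxy < -tol:
            results["non_negativity"] = False
        # d(x, y) must be zero exactly when x and y are the same element.
        if (abs(dxy) <= tol) != (x == y):
            results["identity"] = False
        if abs(dxy - d(y, x)) > tol:
            results["symmetry"] = False
    # Triangle inequality: check every ordered triple.
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            results["triangle"] = False
    return results


def euclidean(x, y):
    return math.dist(x, y)


def squared_euclidean(x, y):
    # Not a metric: squaring breaks the triangle inequality.
    return math.dist(x, y) ** 2


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
euclid_report = check_metric_axioms(euclidean, points)
squared_report = check_metric_axioms(squared_euclidean, points)
```

On these sample points the Euclidean distance passes all four checks, while the squared Euclidean distance fails only the triangle inequality: the points \((0,0)\), \((1,0)\) and \((2,0)\) give \(4 > 1 + 1\).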