Multi-label classification
1. Formal definitions
- Learning framework
multi-label indicators:
- label cardinality
- label density
- label diversity
- normalized label diversity
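The four indicators above can be sketched in a few lines. This sketch assumes the usual definitions: cardinality is the average number of labels per example, density is cardinality divided by the size of the label space, diversity is the number of distinct label sets in the data, and normalized diversity is diversity divided by the number of examples. The function name `label_indicators` and the toy data are hypothetical.

```python
def label_indicators(label_sets, num_labels):
    """Compute four multi-label indicators for a dataset.

    label_sets: list of sets, one set of relevant labels per example.
    num_labels: size of the label space (assumed known in advance).
    """
    m = len(label_sets)
    # Label cardinality: average number of relevant labels per example.
    cardinality = sum(len(Y) for Y in label_sets) / m
    # Label density: cardinality normalized by the label-space size.
    density = cardinality / num_labels
    # Label diversity: number of distinct label sets seen in the data.
    diversity = len({frozenset(Y) for Y in label_sets})
    # Normalized label diversity: diversity divided by the number of examples.
    normalized_diversity = diversity / m
    return cardinality, density, diversity, normalized_diversity


# Toy dataset (illustrative only): 4 examples over a 3-label space.
toy = [{"sports"}, {"sports", "politics"}, {"politics"}, {"sports"}]
cardinality, density, diversity, normalized_diversity = label_indicators(toy, num_labels=3)
```

On this toy data, the four examples carry 5 labels in total and 3 distinct label sets, so the cardinality is 1.25 and the normalized diversity is 0.75.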
Real-valued function $f : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$,
where f(x, y) can be regarded as the confidence of $y \in \mathcal{Y}$ being a proper label of x. Specifically, given a multi-label example (x, Y), f(·, ·) should yield a larger output on a relevant label $y' \in Y$ and a smaller output on an irrelevant label $y'' \notin Y$.
multi-label classifier h(·): $h(x) = \{ y \mid f(x, y) > t(x),\ y \in \mathcal{Y} \}$,
where $t : \mathcal{X} \to \mathbb{R}$ acts as a thresholding function which dichotomizes the label space into relevant and irrelevant label sets.
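As a minimal sketch of how a thresholding function splits the label space, assume the classifier keeps every label whose score f(x, y) exceeds t(x). The helper name `predict_label_set` and the scores below are hypothetical and stand in for the outputs of a trained model.

```python
def predict_label_set(scores, threshold):
    """Return the labels whose confidence exceeds the threshold.

    scores: dict mapping each label y to its real-valued output f(x, y).
    threshold: the value t(x) for this instance.
    """
    return {y for y, s in scores.items() if s > threshold}


# Hypothetical scores f(x, y) for one instance x.
scores = {"sports": 0.9, "politics": 0.4, "weather": 0.1}
relevant = predict_label_set(scores, threshold=0.5)
# Every label that is not predicted relevant falls in the irrelevant set.
irrelevant = set(scores) - relevant
```

The two sets are disjoint and together cover the whole label space, which is the dichotomy described above.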
- key challenge: label correlations
- First-order strategy
- Second-order strategy
- High-order strategy
- threshold calibration
To decide the proper label set for an unseen instance x (i.e. h(x)), the real-valued output f(x, y) on each label is calibrated against the thresholding function output t(x).
- use a constant function for t(·), or induce t(·) from the training examples
- a linear model for t(·)
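One way to induce a linear t(·) from training examples is sketched below, under stated assumptions: for each training example, pick a target threshold that best separates the scores of relevant and irrelevant labels, then fit t(x) = <w, f(x)> + b to those targets by least squares. The target rule (midpoint candidates), the function names, and the toy scores are illustrative choices, not something the notes specify.

```python
import numpy as np


def ideal_threshold(scores, relevant):
    """Pick a cut point that best separates relevant from irrelevant scores.

    scores: 1-D array of f(x, y) over all labels.
    relevant: boolean array, True where the label belongs to Y.
    """
    s = np.sort(scores)
    # Candidates: midpoints between neighbouring scores, plus one cut
    # below the minimum and one above the maximum.
    candidates = np.concatenate(([s[0] - 1.0], (s[:-1] + s[1:]) / 2, [s[-1] + 1.0]))
    # Errors: relevant labels at or below the cut, plus irrelevant labels
    # at or above it.
    errors = [
        np.sum(relevant & (scores <= a)) + np.sum(~relevant & (scores >= a))
        for a in candidates
    ]
    return candidates[int(np.argmin(errors))]


def fit_linear_threshold(F, Y):
    """Least-squares fit of t(x) = <w, f(x)> + b to the per-example targets."""
    targets = np.array([ideal_threshold(f, y) for f, y in zip(F, Y)])
    A = np.hstack([F, np.ones((F.shape[0], 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return coef[:-1], coef[-1]


# Toy training scores: rows are examples, columns are labels (illustrative).
F = np.array([[0.9, 0.2, 0.1],
              [0.6, 0.7, 0.3],
              [0.2, 0.4, 0.8]])
Y = np.array([[True, False, False],
              [True, True, False],
              [False, False, True]])

w, b = fit_linear_threshold(F, Y)
t = F @ w + b                 # calibrated threshold per example
predicted = F > t[:, None]    # h(x): labels scoring above t(x)
```

A constant t(·) is the special case w = 0. The linear form lets the cut point move with the score vector, which helps when the scale of f(x, ·) varies between instances.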