Semi-supervised truth discovery¶
Introduction¶
Semi-supervised truth discovery: we are given some ground truths together with usual truth discovery input
Aim for this work:
Extend axiomatic framework to semi-supervised case
Come up with desirable axioms
Extend existing algorithms to handle partial ground truths
Investigate truth tracking
Availability of ground truths might make truth tracking a bit more tractable
Preliminaries¶
We consider finite disjoint sets of sources \(\S\), facts \(\F\) and objects \(\O\). A fixed mapping \(\obj: \F \to \O\) gives the object to which each fact relates.
Definition (TD network)
A TD network is a set \(N \subseteq \S \times \F\) with the following properties:
\(N\) is non-empty and finite.
For each \(s \in \S\) and \(o \in \O\) there is at most one \(f \in \obj^{-1}(o)\) such that \((s, f) \in N\).
Let \(\N\) denote the set of TD networks.
Ground truths tell us the true fact for a subset of objects.
Definition (Ground truths)
A set of facts \(F \subseteq \F\) is a ground truth set if for all distinct \(f, g \in F\) we have \(\obj(f) \ne \obj(g)\).
Write \(F_\perp = \bigcup_{f \in F}{\mut(f) \setminus \{f\}}\) for the set of ‘ground falsehoods’, i.e. facts which conflict with ground truths.
Let \(\G\) denote the set of ground truth sets.
Some basic properties about ground truth sets:
(P1): \(\G\) is downwards closed with respect to set inclusion: if \(F \in \G\) and \(F' \subseteq F\) then \(F' \in \G\).
(P2): The \(\cdot_\perp\) operation is monotone increasing: for \(F, F' \in \G\), if \(F' \subseteq F\) then \((F')_\perp \subseteq F_\perp\).
(P2): \(F \cap F_\perp = \emptyset\)
A supervised network adds a set of ground truths to a network.
Definition (Supervised TD network)
A supervised TD network is a pair \(N_*=\tuple{N, F}\) where \(N \in \N\) and \(F \in \G\). Let \(\N_*=\N \times \G\) denote the set of supervised TD networks.
Let \(\orderings(X)\) denote the set of total preorders on a set \(X\).
Definition (Supervised TD operator)
A supervised TD operator (henceforth an operator) is a mapping \(T: \N_* \to \orderings(\S) \times \orderings(\F)\). We will write \(T(N_*) = \tuple{\sle_{N_*}^T, \fle_{N_*}^T}\).
Axioms¶
For each axiom we implicitly assume universal quantification over supervised networks \(N_*=\tuple{N, F}\).
Axioms regarding ranking of grounded facts:
(G+): \(F \subseteq \max{\fle_{N_*}^T}\)
(G-): \(F_\perp \subseteq \min{\fle_{N_*}^T}\)
Agreeing with the ground truth should increase trust, and disagreeing should lower it. There are at least two ways to express this through a change to a network: by adding a ground truth or by adding a claim.
(A+): Suppose \((s, f) \in N\), \(f \not\in F \cup F_\perp\) and \(t \sle_{N_*}^T s\). Write \(N_*' = \tuple{N, F \cup \{f\}}\). Then \(t \slt_{N_*'}^T s\).
(A-): Suppose \((s, f) \in N\), \(f \not\in F \cup F_\perp\), \(s \sle_{N_*}^T t\). Let \(g \in \mut(f) \setminus \{f\}\) and write \(N_*' = \tuple{N, F \cup \{g\}}\). Then \(s \slt_{N_*'}^T t\).
(B+): Suppose \(f \in F\), \(\facts_N(s) \cap \mut(f) = \emptyset\) and \(t \sle_{N_*}^T s\). Write \(N'_*=\tuple{N \cup \{(s, f)\}, F}\). Then \(t \slt_{N'_*}^T s\).
(B-): Suppose \(f \in F_\perp\), \(\facts_N(s) \cap \mut(f) = \emptyset\) and \(s \sle_{N_*}^T t\). Write \(N'_* = \tuple{N \cup \{(s, f)\}, F}\). Then \(s \slt_{N'_*}^T t\).
Proposition 1
(A+) implies the following property:
(A’+): Suppose \((s, f) \in N\), \(f \in F\) and \(s \sle_{N_*}^T t\). Set \(N_*' = \tuple{N, F \setminus \{f\}})\). Then \(s \slt_{N'_*}^T t\)
Proof: Suppose otherwise, i.e. \(t \sle_{N_*'}^T s\). Set \(F' = F \setminus \{f\}\) so that \(N_*'=\tuple{N, F'}\). Note that \(f \not\in F'\) by definition, and \(f \not\in (F')_\perp\) by (P2) and (P3). 1 Therefore \(f \not\in F' \cup (F')_\perp\).
Now applying (A+) to \(N'_*\), we have \(t \slt_{N''_*}^T s\), where \(N''_*=\tuple{N, F' \cup \{f\}}\). But \(N''_*\) is just \(N_*\), so this contradicts the assumption that \(s \sle_{N_*}^T t\).
An analogous proposition might hold for (A-).
- 1
Indeed, since \(F' \subseteq F\) we have \((F')_\perp \subseteq F_\perp\) by (P2); thus \(\F \setminus F_\perp \subseteq \F \setminus (F')_\perp\). Since \(f \in F\), (P3) gives \(f \in \F \setminus F_\perp\) and clearly \(f \not\in (F')_\perp\).