\[\gdef{\S}{\mathcal{S}} % Note: overrides existing command \gdef{\O}{\mathcal{O}} % Note: overrides existing command \gdef{\F}{\mathcal{F}} \gdef{\N}{\mathcal{N}} \gdef{\R}{\mathbb{R}} \gdef{\Nat}{\mathbb{N}} \gdef{\sign}{sign} \gdef{\L}{\mathcal{L}} \gdef{\C}{\mathcal{C}} \gdef{\V}{\mathcal{V}} \gdef{\W}{\mathcal{W}} \gdef{\G}{\mathcal{G}} \gdef{\vle}{\preceq} % Source and fact orderings \gdef{\sle}{\sqsubseteq} \gdef{\slt}{\sqsubset} \gdef{\sge}{\ge} \gdef{\seq}{\simeq} \gdef{\fle}{\preceq} \gdef{\flt}{\prec} \gdef{\fgt}{\succ} \gdef{\fge}{\succeq} \gdef{\feq}{\approx} \gdef{\src}{\mathsf{src}} \gdef{\facts}{\mathsf{facts}} \gdef{\obj}{\mathsf{obj}} \gdef{\mut}{\mathsf{mut}} \gdef{\orderings}{\mathcal{L}} \gdef{\num}{\mathcal{T}_{Num}} \gdef{\rec}{\mathsf{rec}} \gdef{\norm}{\mathsf{norm}} \gdef{\ord}#1{\langle{#1}\rangle} % Shortcuts \gdef{\limn}{\lim_{n \to \infty}} \gdef{\voting}{\emph{Voting}} \gdef{\sums}{\emph{Sums}} \gdef{\usums}{\emph{UnboundedSums}} \gdef{\avlog}{\emph{Average$\cdot$Log}} \gdef{\scvoting}{\emph{SC-Voting}} \gdef{\scoh}{\mathrel{\lhd}} \gdef{\fcoh}{\mathrel{\blacktriangleleft}} \gdef{\tuple}#1{{\langle{#1}\rangle}}\]

Semi-supervised truth discovery

Introduction

  • Semi-supervised truth discovery: we are given some ground truths together with usual truth discovery input

  • Aim for this work:

    • Extend axiomatic framework to semi-supervised case

    • Come up with desirable axioms

    • Extend existing algorithms to handle partial ground truths

    • Investigate truth tracking

      • Availability of ground truths might make truth tracking a bit more tractable

Preliminaries

We consider finite disjoint sets of sources \(\S\), facts \(\F\) and objects \(\O\). A fixed mapping \(\obj: \F \to \O\) gives the object to which each fact relates.

Definition (TD network)

A TD network is a set \(N \subseteq \S \times \F\) with the following properties:

  1. \(N\) is non-empty and finite.

  2. For each \(s \in \S\) and \(o \in \O\) there is at most one \(f \in \obj^{-1}(o)\) such that \((s, f) \in N\).

Let \(\N\) denote the set of TD networks.

Ground truths tell us the true fact for a subset of objects.

Definition (Ground truths)

A set of facts \(F \subseteq \F\) is a ground truth set if for all distinct \(f, g \in F\) we have \(\obj(f) \ne \obj(g)\).

Write \(F_\perp = \bigcup_{f \in F}{\mut(f) \setminus \{f\}}\) for the set of ‘ground falsehoods’, i.e. facts which conflict with ground truths.

Let \(\G\) denote the set of ground truth sets.

Some basic properties about ground truth sets:

  • (P1): \(\G\) is downwards closed with respect to set inclusion: if \(F \in \G\) and \(F' \subseteq F\) then \(F' \in \G\).

  • (P2): The \(\cdot_\perp\) operation is monotone increasing: for \(F, F' \in \G\), if \(F' \subseteq F\) then \((F')_\perp \subseteq F_\perp\).

  • (P2): \(F \cap F_\perp = \emptyset\)

A supervised network adds a set of ground truths to a network.

Definition (Supervised TD network)

A supervised TD network is a pair \(N_*=\tuple{N, F}\) where \(N \in \N\) and \(F \in \G\). Let \(\N_*=\N \times \G\) denote the set of supervised TD networks.

Let \(\orderings(X)\) denote the set of total preorders on a set \(X\).

Definition (Supervised TD operator)

A supervised TD operator (henceforth an operator) is a mapping \(T: \N_* \to \orderings(\S) \times \orderings(\F)\). We will write \(T(N_*) = \tuple{\sle_{N_*}^T, \fle_{N_*}^T}\).

Axioms

For each axiom we implicitly assume universal quantification over supervised networks \(N_*=\tuple{N, F}\).

Axioms regarding ranking of grounded facts:

  • (G+): \(F \subseteq \max{\fle_{N_*}^T}\)

  • (G-): \(F_\perp \subseteq \min{\fle_{N_*}^T}\)

Agreeing with the ground truth should increase trust, and disagreeing should lower it. There are at least two ways to express this through a change to a network: by adding a ground truth or by adding a claim.

  • (A+): Suppose \((s, f) \in N\), \(f \not\in F \cup F_\perp\) and \(t \sle_{N_*}^T s\). Write \(N_*' = \tuple{N, F \cup \{f\}}\). Then \(t \slt_{N_*'}^T s\).

  • (A-): Suppose \((s, f) \in N\), \(f \not\in F \cup F_\perp\), \(s \sle_{N_*}^T t\). Let \(g \in \mut(f) \setminus \{f\}\) and write \(N_*' = \tuple{N, F \cup \{g\}}\). Then \(s \slt_{N_*'}^T t\).

  • (B+): Suppose \(f \in F\), \(\facts_N(s) \cap \mut(f) = \emptyset\) and \(t \sle_{N_*}^T s\). Write \(N'_*=\tuple{N \cup \{(s, f)\}, F}\). Then \(t \slt_{N'_*}^T s\).

  • (B-): Suppose \(f \in F_\perp\), \(\facts_N(s) \cap \mut(f) = \emptyset\) and \(s \sle_{N_*}^T t\). Write \(N'_* = \tuple{N \cup \{(s, f)\}, F}\). Then \(s \slt_{N'_*}^T t\).

Proposition 1

(A+) implies the following property:

  • (A’+): Suppose \((s, f) \in N\), \(f \in F\) and \(s \sle_{N_*}^T t\). Set \(N_*' = \tuple{N, F \setminus \{f\}})\). Then \(s \slt_{N'_*}^T t\)

Proof: Suppose otherwise, i.e. \(t \sle_{N_*'}^T s\). Set \(F' = F \setminus \{f\}\) so that \(N_*'=\tuple{N, F'}\). Note that \(f \not\in F'\) by definition, and \(f \not\in (F')_\perp\) by (P2) and (P3). 1 Therefore \(f \not\in F' \cup (F')_\perp\).

Now applying (A+) to \(N'_*\), we have \(t \slt_{N''_*}^T s\), where \(N''_*=\tuple{N, F' \cup \{f\}}\). But \(N''_*\) is just \(N_*\), so this contradicts the assumption that \(s \sle_{N_*}^T t\).

An analogous proposition might hold for (A-).


1

Indeed, since \(F' \subseteq F\) we have \((F')_\perp \subseteq F_\perp\) by (P2); thus \(\F \setminus F_\perp \subseteq \F \setminus (F')_\perp\). Since \(f \in F\), (P3) gives \(f \in \F \setminus F_\perp\) and clearly \(f \not\in (F')_\perp\).