\[\gdef{\Q}{\mathcal{Q}} \gdef{\S}{\mathcal{S}} \gdef{\D}{\mathcal{D}} \gdef{\X}{\mathcal{X}} \gdef{\P}{\mathcal{P}} \gdef{\R}{\mathbb{R}} \gdef{\N}{\mathbb{N}} \gdef{\tuple}#1{\langle{#1}\rangle} \gdef{\rs}{\upharpoonright} \gdef{\src}{\mathsf{src}} \gdef{\ques}{\mathsf{ques}} \gdef{\val}{\mathsf{val}} \gdef{\phi}{\varphi} \gdef{\Phiprofile}{\Phi_\text{profile}} \]

Question Answering (initial version)

Changelog

Since 23rd March 2020:

Since 22nd March 2020:

  • Added axiom satisfaction and impossibility sub-sections in the \(\Phiprofile\) section

Since 19th March 2020:

  • Defined \(\Phiprofile\) class of operators and obtained characterisation result

  • Defined another example operator ‘GenDifficulty’

  • Simplified IIA axiom (equivalent to the old version)

  • Added Weak IIA axiom

  • Added Directions for Work section

  • Footnote on potential question-value symmetry axiom

Idea

  • Consider a set of questions for which the true answers are known. Suppose a number of sources provide their own (potentially incorrect) answers for a subset of questions. How should we rank sources in terms of performance on the questions?

  • Motivations are two-fold:

    • In many iterative TD algorithms, source reliability scores are determined based on the current (estimated) set of true facts. This is an instance of the above problem. 1

    • In the supervised TD problem, we have a subset of ground truths which can be used to inform the source and fact rankings. This is an extension of the above problem. But surely we should have an idea on how sources should be ranked in the case of complete information before attempting the partial information case.

Problem Formulation

N.B: the formalism and notation is heavily inspired by [KEFQ14].

Consider a set of \(n\) questions \(\Q\), where each \(q \in \Q\) has a domain \(D_q\) and a true answer \(\tau_q \in D_q\), and \(m\) sources \(\S\). Write \(\D = \bigcup_{q \in \Q}{D_q}\) for the union of all domains. We assume that \(|D_q| > 1\) for each \(q\).

A source \(s\) may provide a value \(v\) for a question \(q\). This is represented by a tuple \(\tuple{s,q,v}\). A set of such tuples defines the input to the problem.

Definition (Answer-set)

We say that a set of tuples \(X \subseteq \S \times \Q \times \D\) is an answer-set if

  1. For each \(s\) and each \(q\) there is at most one \(v\) such that \(\tuple{s,q,v} \in X\)

  2. If \(\tuple{s,q,v} \in X\) then \(v \in D_q\)

Let \(\X\) denote the set of all answer-sets.

Remark. It is important to note that a source need not answer all questions. In [JRG17] this is not the case, and so a simpler formulation is possible: the input is a bipartite graph \((\S \cup \Q, E)\) where \((s, q) \in E\) iff \(s\) correctly answers \(q\).

We could do this by considering weighted edges (e.g. weight 1 for correct, -1 for incorrect). However, I like the idea of keeping some structure in the answers provided by sources. In particular, we could equip each domain \(D_q\) with a distance metric \(d_q\) so as to distinguish between incorrect answers which are almost correct, and those which are way off. This would make this problem much more applicable to the iterative TD algorithms discussed earlier.

Notation. Following [KEFQ14], for \(S \subseteq \S\), \(Q \subseteq \Q\) and \(D \subseteq \D\) we define the restriction

\[X \rs S, Q, D = \{\tuple{s, q, v} \in X \mid s \in S, q \in Q, v \in D\}\]

We omit components that are unconstrained, and omit brackets for singleton sets, e.g. \(X \rs s = X \rs \{s\}, \Q, \D\). Note that each restriction is again an answer-set.

We also define

\[\src(X) = \{s \mid \tuple{s, q, v} \in X\} \\ \ques(X) = \{q \mid \tuple{s, q, v} \in X\} \\ \val(X) = \{v \mid \tuple{s, q, v} \in X\}\]

Using this notation, condition (1) in the definition of an answer-set can be expressed as \(|X \rs s, q| \le 1\), 2 and (2) is expressed as \(\val(X \rs q) \subseteq D_q\).

Now, given an answer-set we wish to rank the sources in \(\S\). We define a ranking operator.

Definition (Operator)

An operator is a mapping \(\phi: \X \times \S \to \R\). The total preorder \({\le_X^\phi} \subseteq \S \times \S\) induced by \(\phi\) is defined by

\[s \le_X^\phi t \iff \phi(X, s) \le \phi(X, t)\]

Some Useful Notions

A source may answer a question correctly, incorrect, or decline to provide an answer at all. For \(s \in \S\), \(q \in \Q\), \(X \in \X\), let us write

\[X_q^s = \begin{cases} v, & \val(X \rs s, q) = \{v\} \\ \perp, & \val(X \rs s, q) = \emptyset \end{cases}\]

The set of questions which \(s\) answers, answers correctly, and answers incorrectly are given by

\[\begin{aligned} A_X(s) &= \{ q \in \Q \mid X_q^s \ne \perp \} \\ C_X(s) &= \{ q \in \Q \mid X_q^s = \tau_q \} \\ I_X(s) &= \{ q \in \Q \mid X_q^s \not\in \{\perp, \tau_q\} \} \\ \end{aligned}\]

Each set can be written in various other ways using the restriction notation. For example, \(A_X(s) = \ques(X \rs s)\), \(C_X(s) = \bigcup_{q \in \Q}{\ques(X \rs s, q, \tau_q)}\), \(I_X(s) = A_X(s) \setminus C_X(s)\).

The number of correct/incorrect questions answered by \(s\) in \(X\) will be called the profile of \(s\), denoted \(P_X(s) = \tuple{|C_X(s)|, |I_X(s)|}\). Write \(\P = \{\tuple{x, y} \in (\N_0)^2 \mid x + y \le n\}\) for the set of all possible profiles (recall that \(n =|\Q|\)).

Similar to above, each question has an associated set of sources who answer correctly and incorrectly.

\[\begin{aligned} \hat{A}_X(q) &= \{ s \in \S \mid q \in A_X(s) \} = \src(X \rs q) \\ \hat{C}_X(q) &= \{ s \in \S \mid q \in C_X(s) \} = \src(X \rs q, \tau_q) \\ \hat{I}_X(q) &= \{ s \in \S \mid q \in I_X(s) \} = \src(X \rs q, D_q \setminus \{\tau_q\}) = \hat{A}_X(q) \setminus \hat{C}_X(q) \end{aligned}\]

Example Operators

There are a number of simple operators one can consider.

NumCorrect: \(\phi(X, s) = |C_X(s)|\)

NumIncorrect: \(\phi(X, s) = -|I_X(s)|\)

Balance: \(\phi_\alpha(X, s) = |C_X(s)| - \alpha |I_X(s)|\) for some parameter \(\alpha > 0\).

Proportion: \(\phi(X, s) = \frac{|C_X(S)|}{|A_X(s)|}\) if \(A_X(s) \ne \emptyset\) and \(\phi(X, s) = 0\) otherwise.

PropLog: 3 \(\phi(X, s) = \frac{|C_X(S)|}{|A_X(s)|} \cdot \log(|A_X(s)|)\)

Note that each of the simple operators above only care about the profile of \(s\) in \(X\) (note \(|A_X(s)| = |C_X(s)| + |I_X(s)|\)). Indeed, for each operator there is a function \(f: \P \to \R\) such that \(\phi(X, s) = f(P_X(s))\). For example, PropLog corresponds to \(f(x, y) = \frac{x}{x + y} \cdot \log(x + y)\). The class of such ‘profile-based’ operators is investigated in more detail below.

More nuanced operators include the following.

Difficulty: \(\phi(X, s) = \sum_{q \in C_X(s)}{\frac{1}{|\hat{C}_X(q)|}}\)

Here the difficulty of a question is the reciprocal of the number of sources correctly answering it. This can be generalised by considering an arbitrary ‘difficulty function’ \(\delta: \X \times \Q \to \R\). Intuitively, \(\delta(X, q)\) should be decrease as \(\hat{C}_X(q)\) gets larger and increase as \(\hat{I}_X(q)\) gets larger.

GenDifficulty: \(\phi(X, s) = \sum_{q \in C_X(s)}{\delta(X, q)}\)

We can also assign weights not by difficulty, but by some notion of importance. If \(w: \X \times \Q \to \R_{\ge 0}\) is an importance weighting:

ImportanceWeighted: \(\phi(X, s) = \sum_{q \in C_X(s)}{w(X, q)} - \sum_{q \in I_X(s)}{w(X, q)}\)

Note that when it comes to incorrect answers, there is an important differences between weights set according to difficulty and according to importance. If a source incorrectly answers a highly difficult question, it seems fair to penalise them only slightly. In contrast, answering a highly important question incorrectly should carry a large penalty.

A Maximum Likelihood Approach

The source scores \(\phi(X, s)\) for the operators above are defined via some heuristic which aims to match our intuition about when a source should be deemed better or worse than another one. In most cases the scores themselves have no semantic meaning.

However this is not the case for Proportion: it can be seen that this is the operator which arises from maximum likelihood considerations in a basic probabilistic model.

Enumerate the set of sources as \(\S = \{s_1,\ldots,s_m\}\). The parameter of the model is a vector of true reliability levels \(\theta=\tuple{r_1,\ldots,r_m} \in (0, 1)^m\). Each \(r_i\) represents the probability that an answer provided by \(s_i\) is correct.

We make the following assumptions:

  • All questions have equal difficulty.

  • A sources choose incorrect value for question \(q\) uniformly at random from \(D_q \setminus \{\tau_q\}\).

  • Sources choose their answers independently.

  • There is a fixed answering rate \(\alpha\), which is the probability that a source answers any particular question. That is, a source ignores a question with probability \(1 - \alpha\).

We can now consider a random variable \(\tilde{X}\) in \(\X\). We have, for any source \(s_i\) and question \(q\),

\[\begin{aligned} P(\tilde{X}_q^{s_i} = \tau_q) &= \alpha r_i \\ P(\tilde{X}_q^{s_i} = v) &= \frac{\alpha (1 - r_i)}{|D_q|-1} \quad (v \in D_q \setminus \{\tau_q\}) \\ P(\tilde{X}_q^{s_i} = {\perp}) &= 1 - \alpha \end{aligned}\]

Given a particular input \(X\), maximum likelihood estimation finds the parameter vector \(\theta\) which is most likely to yield \(X\). That is, \(\hat{\theta}\) is a maximum likelihood estimate for \(X\) if

\[\hat{\theta} \in \text{argmax}_{\theta}{P(\tilde{X} = X \mid \theta)}\]

We make a quick definition.

Definition (MLE operator)

\(\phi\) is an MLE operator if \(\tuple{\phi(X, s_1), \ldots \phi(X, s_m)}\) is a maximum likelihood estimate for \(X\), for any answer-set \(X\).

What does \(P(X \mid \theta)\) look like? By the assumption that sources choose answers independently, it is equal to the product over all sources \(s_i\) and questions \(q\) of \(P(\tilde{X}_q^{s_i} = X_q^{s_i} \mid \theta)\). These probabilities only depend on whether \(q \in C_X({s_i})\), \(q \in I_X({s_i})\) or \(q \in U_X({s_i})\), where we write \(U_X({s_i}) = \Q \setminus A_X({s_i})\) for the questions unanswered by \({s_i}\). Writing \(K_q = |D_q| - 1\), we have

\[\begin{aligned} P(X \mid \theta) &= \prod_{i=1}^{m}{\left\{ \left(\prod_{q \in C_X(s_i)}{\alpha r_i}\right) \left(\prod_{q \in I_X(s_i)}{\alpha (1 - r_i) K_q^{-1}}\right) \left(\prod_{q \in U_X(s_i)}{(1 - \alpha)}\right) \right\}} \\ &= \prod_{i=1}^{m}{\left\{ (\alpha r_i)^{|C_X(s_i)|} \left(\alpha (1 - r_i)\right)^{|I_X(s_i)|} \left(\prod_{q \in I_X(s_i)}{K_q^{-1}}\right) (1 - \alpha)^{|U_X(s_i)|} \right\}} \end{aligned}\]

To find a maximum likelihood estimate, we need to find \(\theta\) which maximises this quantity. Note that this is maximised when its logarithm is maximised, since \(\log\) is increasing. We have

\[\begin{aligned} \log P(X \mid \theta) &= \sum_{i=1}^{m}\Big\{ |C_X(s_i)| (\log{\alpha} + \log{r_i}) + |I_X(s_i)| (\log{\alpha} + \log(1 - r_i)) \\ & \quad \quad \quad - \sum_{q \in I_X(s_i)}{\log{K_q}} + |U_X(s_i)| \log(1 - \alpha) \Big\} \end{aligned}\]

Most terms in this sum do not depend on \(\theta\), but only on the \(K_q\), \(X\) and \(\alpha\). Grouping such terms together and denoting their sum by \(\mathcal{E}(X)\), 4 we have

\[\log P(X \mid \theta) = \mathcal{E}(X) + \sum_{i=1}^{m}\left\{ |C_X(s_i)| \log{r_i} + |I_X(s_i)| \log(1 - r_i) \right\}\]

This quantity is maximised when each term in the sum is. The following simple lemma tells us the value of the maximising \(r_i\).

Lemma

Let \(a, b \ge 0\), \(a + b \ne 0\). Then

\[\text{argmax}_{x \in (0, 1)}{(a\log{x} + b\log(1 - x))} = \frac{a}{a + b}\]
Proof.

Write \(f(x) = a\log{x} + b\log(1 - x)\). Note that \(f'(x) = \frac{a}{x} - \frac{b}{1-x}\), so

\[\begin{aligned} f'(x) = 0 & \iff \frac{a}{x} - \frac{b}{1-x} = 0 \\ & \iff a(1-x) - bx = 0 \\ & \iff a - ax - bx = 0 \\ & \iff a - (a+b)x = 0 \\ & \iff x = \frac{a}{a+b} \end{aligned}\]

So \(f\) has a unique stationary point at \(\frac{a}{a+b}\). It is easily checked that this is indeed a maximum by observing that \(f''(\frac{a}{a+b}) < 0\). Moreover \(f(x) \to -\infty\) as \(x \to 0\) and \(x \to 1\), so the argmax does indeed exist, and we are done.

Note that each term in our sum is of the form of the quantity maximised in the lemma, unless \(|C_X(s_i)|=|I_X(s_i)|=0\) in which case the term is 0. It follows that \(P(X \mid \theta)\) is maximised by taking

\[r_i = \frac{|C_X(s_i)|}{|C_X(s_i)|+|I_X(s_i)|} = \frac{|C_X(s_i)|}{|A_X(s_i)|}\]

for all \(s_i\) with \(A_X(s_i)\ne \emptyset\). Note that this is exactly the score assigned to \(s_i\) by the Proportion operator \(\phi_\text{Prop}\). This discussion is summarised as follows.

Theorem

\(\phi\) is an MLE operator if and only if \(\phi(X, s) = \phi_\text{Prop}(X, s)\) for all \(X\) and \(s\) such that \(A_X(s) \ne \emptyset\).

Of course, the model here is extremely simple. An interesting extension would be to allow variability in the difficulty of questions. For example one could consider a difficulty level \(d_q \in \N_0\) for each question \(q\); the probability of \(s_i\) correctly answering \(q\) could then be \(\alpha \cdot r_i^{d_q}\) to reflect the idea that difficult questions require a higher reliability level in order to achieve the same probability of correctness.

Axioms

N.B: The following is a collection is reasonable-ish properties; not all are necessarily desirable axioms for all cases. Axiom names are all provisional :)

The classic notion of independence can be expressed in the framework: the ranking of \(s\) and \(t\) should depend only on the answers provided by \(s\) and \(t\). 5

Axiom (Independence of Irrelevant Answers (IIA))

\(s \le_X^\phi t\) iff \(s \le_{X \rs \{s,t\}}^\phi t\).

IIA may not be desirable if we want to incorporate the difficulty of questions into the ranking. E.g. suppose \(s\) only correctly answers the questions that everyone got right (i.e. the ‘easy’ questions), and \(t\) correctly answers questions which no one else got (the ‘hard’ questions). Intuitively this means \(t\) is better than \(s\), but IIA prevents this playing a role in the ranking. A weaker independence axiom is therefore required.

Axiom (Weak IIA)

Write \(Q = A_X(S) \cup A_X(t)\). Then \(s \le_X^\phi t\) iff \(s \le_{X \rs Q}^\phi t\).

Weak IIA allows the ranking of \(s\) and \(t\) to depend on answers given by other sources, but only those for questions which \(s\) or \(t\) answer themselves. Note that a case could be made for weakening this even further, e.g. to penalise sources for not answering ‘easy’ questions.

We can also state an axiom which explicitly ensures difficulty plays a role in the source ranking.

Axiom (Difficulty Awareness (DA))

Suppose \(X \rs s = Y \cup \{\tuple{s, q, \tau_q}\}\) and \(X \rs t = Y \cup \{\tuple{t, q', \tau_{q'}}\}\) for some \(q, q' \in \Q\) and \(Y \subseteq X\). Then \(|\hat{C}_X(q)| >|\hat{C}_X(q')|\) implies \(s <_X^\phi t\).

A naive monotonicity-type axiom concerns questions answered correctly.

Axiom (C-monotonicty)

If \(C_X(s) \subseteq C_X(t)\) then \(s \le_X^\phi t\).

The dual axiom concerns questions answered incorrectly.

Axiom (I-antimonotonicty)

If \(I_X(s) \supseteq I_X(t)\) then \(s \le_X^\phi t\).

C-monotonicty and I-antimonotonicty are somewhat naive since they do not consider which questions were attempted. E.g. consider a situation where \(s\) and \(t\) answer exactly the same questions correctly, \(t\) answers 10 more incorrectly, and \(s\) answers none incorrectly. Then C-monotonicty implies \(s \simeq_X^\phi t\), but intuitively \(t\) should be ranked below \(s\).

A similar property accounts for this somewhat (suggested by Richard).

Axiom (C-cardinality)

If \(A_X(s) = A_X(t)\) and \(|C_X(s)| \le|C_X(t)|\), then \(s \le_X^\phi t\).

In a similar vein, we can state a positive responsiveness axiom.

Axiom (Correct Positive Responsiveness (CPR))

Suppose \(s \le_X^\phi t\) and \(q \not\in A_X(t)\). Write \(X' = X \cup \{\tuple{t, q, \tau_q}\}\). Then \(s <_{X'}^\phi t\).

…and the dual property…

Axiom (Incorrect Negative Responsiveness (INR))

Suppose \(s \le_X^\phi t\) and \(q \not\in A_X(s)\). Let \(v \in D_q \setminus \{\tau_q\}\), and set \(X' = X \cup \{\tuple{s, q, v}\}\). Then \(s <_{X'}^\phi t\).

Of course, there is also a symmetry axiom available. Following [KEFQ14], we extend a permutation \(\pi: \S \to \S\) to \(\X\) element-wise: \(\pi(X) = \{\tuple{\pi(s), q, v} : \tuple{s, q, v} \in X\}\).

Axiom (Source-symmetry)

If \(\pi\) is a permutation on \(\S\) then \(s \le_X^\phi t\) iff \(\pi(s) \le_{\pi(X)}^\phi \pi(t)\).

Note that other symmetries (question-symmetry, value-symmetry) are more tricky/impossible to state. If trying to permute questions in \(X\) we must also permute the claimed values, which requires (bijective) mappings between domains. This is not always possible. E.g. consider \(D_1 = \{a,b\}\), \(D_2 = \R\). If there are more than two answers put forward for question 2, we cannot permute questions 1 and 2 without squishing some of these values together, thus not preserving the structure of the answer-set. 6

Profile-based Operators

For a function \(f: \P \to \R\), define an operator \(\phi_f\) by

\[\phi_f(X, s) = f(P_X(s)) = f(|C_X(s)|, |I_X(s)|)\]

Since our axioms generally have an ordinal quality to them (we place restrictions on the ordering \({\le_X^\phi}\), not the numeric values \(\phi(X, s)\)), we will be interested in the wider class of operators whose ranking is the same as \(\phi_f\) for some \(f\). Formally, write

\[\phi \sim \phi' \iff \forall X \in \X, \forall s, t \in \S: (s \le_X^\phi t \iff s \le_X^{\phi'} t)\]

and define the class of profile-based operators by

\[\Phiprofile = \{\phi \mid \exists f: \P \to \R \text{ such that } \phi \sim \phi_f \}\]

Characterisation

Each \(\phi \in \Phiprofile\) clearly satisfies the following axiom.

Axiom (Question-value Independence (QVI))

If for all \(s \in \S\) we have \(P_X(s) = P_{X'}(s)\), then \({\le_X^\phi} = {\le_{X'}^\phi}\).

If \(\phi\) satisfies QVI then the particular questions which were correctly/incorrect answered – and the particular incorrect values put forward – are irrelevant to the source ranking.

This is a very strong axiom. Together with IIA and Source-symmetry, it characterises \(\Phiprofile\).

Theorem

\(\phi\) satisfies QVI, IIA and Source-symmetry if and only if \(\phi \in \Phiprofile\).

Proof.

First we show the ‘if’ statement. Suppose \(\phi \sim \phi_f\) for some \(f\). To show QVI, suppose \(X, X' \in \X\) satisfy \(P_X(s) = P_{X'}(s)\) for all \(s \in \S\). Then, for \(s, t \in \S\) we have

\[\begin{aligned} s \le_X^\phi t & \iff s \le_X^{\phi_f} t \\ & \iff f(P_X(s)) \le f(P_X(t)) \\ & \iff f(P_{X'}(s)) \le f(P_{X'}(t)) \\ & \iff s \le_{X'}^\phi t \end{aligned}\]

as required for QVI.

For IIA, let \(X \in \X\) and \(s, t \in \S\). Note that

\[P_{X \rs \{s,t\}}(s) = P_X(s) \\ P_{X \rs \{s,t\}}(t) = P_X(t)\]

Hence

\[\begin{aligned} s \le_X^\phi t & \iff f(P_X(s)) \le f(P_X(t)) \\ & \iff f(P_{X \rs \{s,t\}}(s)) \le f(P_{X \rs \{s,t\}}(t)) \\ & \iff s \le_{X \rs \{s,t\}}^\phi t \end{aligned}\]

Finally, for Source-symmetry let \(\pi\) be a permutation on sources. It is easily checked that \(P_X(s) = P_{\pi(X)}(\pi(s))\) for each \(s\). Consequently

\[\begin{aligned} s \le_X^\phi t & \iff f(P_X(s)) \le f(P_X(t)) \\ & \iff f(P_{\pi(X)}(\pi(s))) \le f(P_{\pi(X)}(\pi(t))) \\ & \iff \pi(s) \le_{\pi(X)}^\phi \pi(t) \end{aligned}\]

Now we show the ‘only if’ statement. Suppose \(\phi\) satsifes QVI, IIA and Source-symmetry. TODO: briefly outline proof strategy here, to make it easier to follow?

For distinct \(s, t \in \S\) and \(\vec{u}, \vec{v} \in \P\), write

\[\X_{s,t}(\vec{u}, \vec{v}) = \{ X \in \X \mid P_X(s) = \vec{u}, P_X(t) = \vec{v} \} \ne \emptyset\]

It follows from QVI and IIA that the ranking of \(s\) and \(t\) is constant across \(\X_{s,t}(\vec{u}, \vec{v})\). Indeed, suppose \(X, X' \in \X_{s,t}(\vec{u}, \vec{v})\). Then the profiles are the same for all sources in \(X \rs \{s, t\}\) and \(X' \rs \{s, t\}\): the profiles of \(s\) and \(t\) are \(\vec{u}\) and \(\vec{v}\) respectively, and the profile of \(z \in \S \setminus \{s,t\}\) is \(\tuple{0, 0}\). Thus

\[\begin{aligned} s \le_X^\phi t & \iff s \le_{X \rs \{s,t\}}^\phi t \\ & \iff s \le_{X' \rs \{s,t\}}^\phi t \\ & \iff s \le_{X'}^\phi t \end{aligned}\]

where IIA is applied in the first and third steps, and QVI in the second.

We can therefore define a binary relation \(\sqsubseteq_{s,t}\) on \(\P\) by

\[\vec{u} \sqsubseteq_{s, t} \vec{v} \iff s \le_X^\phi t\]

where \(X\) is an arbitrary member of \(\X_{s,t}(\vec{u}, \vec{v})\).

Next, we apply Source-symmetry to show that \(\sqsubseteq_{s, t}\) does not actually depend on \(s\) and \(t\). Indeed, let \(s', t'\) be another pair of distinct sources. We show that \({\sqsubseteq_{s, t}} = {\sqsubseteq_{s', t'}}\).

Let \(\vec{u}, \vec{v} \in \P\). Take \(X \in \X_{s,t}(\vec{u}, \vec{v})\) and let \(\pi\) be any permutation on \(\S\) with \(\pi(s) = s', \pi(t) = t'\). 7 It is easily seen that \(\pi(X) \in \X_{s', t'}(\vec{u}, \vec{v})\). Consequently

\[\begin{aligned} \vec{u} \sqsubseteq_{s, t} \vec{v} & \iff s \le_X^\phi t \\ & \iff \pi(s) \le_{\pi(X)}^\phi \pi(t) \\ & \iff s' \le_{\pi(X)}^\phi t' \\ & \iff \vec{u} \sqsubseteq_{s', t'} \vec{v} \end{aligned}\]

where we have applied Source-symmetry in the second step. This shows that \(\sqsubseteq_{s, t}\) and \(\sqsubseteq_{s', t'}\) are in fact the same relation, which we now denote simply by \(\sqsubseteq\).

Despite the suggestive notation, we have not shown that \(\sqsubseteq\) is transitive. Next we show that it is transitive and total, i.e. a total preorder.

  • Transitivity. Suppose \(\vec{u} \sqsubseteq \vec{v}\) and \(\vec{v} \sqsubseteq \vec{w}\). Take arbitrary distinct sources \(s, t, z \in \S\). Let \(X\) be an answer-set with

    \[\begin{aligned} P_X(s) &= \vec{u} \\ P_X(z) &= \vec{v} \\ P_X(t) &= \vec{w} \end{aligned}\]

    Note that \(X \in \X_{s,z}(\vec{u}, \vec{v})\). Thus \(\vec{u} \sqsubseteq \vec{v}\) implies \(\vec{u} \sqsubseteq_{s,z} \vec{v}\) which implies \(s \le_X^\phi z\). Similarly, \(X \in \X_{z,t}(\vec{v}, \vec{w})\), so \(\vec{v} \sqsubseteq \vec{w}\) implies \(z \le_X^\phi t\).

    Transitivity of \({\le_X^\phi}\) therefore gives \(s \le_X^\phi t\). But \(X \in \X_{s,t}(\vec{u}, \vec{w})\), so this means \(\vec{u} \sqsubseteq_{s,t} \vec{w}\) and \(\vec{u} \sqsubseteq \vec{w}\), as required.

  • Totality. Suppose \(\vec{u} \not\sqsubseteq \vec{v}\). Take arbitrary distinct \(s, t \in \S\), and let \(X \in \X_{s,t}(\vec{u}, \vec{v})\). Then \(t <_X^\phi s\). Writing \(\pi = (s, t)\), Source-symmetry gives \(s <_{\pi(X)}^\phi t\). Since \(\pi(X) \in \X_{s, t}(\vec{v}, \vec{u})\), we have \(\vec{v} \sqsubset \vec{u}\). Therefore \(\sqsubseteq\) is total.

Finally, we are ready to define \(f\):

\[f(\vec{u}) = |\{\vec{v} \in \P \mid \vec{v} \sqsubseteq \vec{u}\}|\]

It is easily seen that, due to the finiteness of \(\P\) and the fact that \(\sqsubseteq\) is a total preorder, we have \(\vec{u} \sqsubseteq \vec{v}\) iff \(f(\vec{u}) \le f(\vec{v})\).

It only remains to show that \(\phi \sim \phi_f\). Indeed, let \(X \in \X\) and \(s, t \in \S\). If \(s = t\) then clearly \(s \le_X^\phi t\) iff \(s \le_X^{\phi_f} t\). Otherwise, noting that \(X \in \X_{s,t}(P_X(s), P_X(t))\) we have

\[\begin{aligned} s \le_X^\phi t & \iff P_X(s) \sqsubseteq_{s,t} P_X(t) \\ & \iff P_X(s) \sqsubseteq P_X(t) \\ & \iff f(P_X(s)) \le f(P_X(t)) \\ & \iff s \le_X^{\phi_f} t \end{aligned}\]

and we are done.

Axiom Satisfaction

The axioms satisfied by operators in \(\Phiprofile\) can be determined by considering \(\phi_f\) for different \(f\).

As intuition would suggest, \(\phi_f\) never satisfies Difficulty Awareness.

Proposition

\(\phi_f\) does not satisfy Difficulty Awareness for any \(f\).

Proof.

Take three distinct sources \(s, t, z \in \S\) and two distinct questions \(q, q' \in \Q\). Set

\[X = \{\tuple{s, q, \tau_q}, \tuple{z, q, \tau_q}, \tuple{t, q', \tau_{q'}} \}\]

Clearly the conditions for DA are satisfied (with \(Y = \emptyset\)) and \(|\hat{C}_X(q)| = 2 > 1 = |\hat{C}_X(q')|\). DA would require that \(s <_X^{\phi_f} t\), but clearly \(P_X(s)=P_X(t)=\tuple{1,0}\), so \(s \simeq_X^{\phi_f} t\).

We can characterise the other axioms by some condition on \(f\).

Proposition

\(\phi_f\) satisfies C-monotonicity if and only if there is \(g: \{0,\ldots,n\} \to \R\) monotone increasing such that \(f(x, y) = g(x)\) for all \(\tuple{x,y} \in \P\).

Proof.

For the ‘if’ statement, suppose such \(g\) exists. To show C-monotonicity, suppose \(C_X(s) \subseteq C_X(t)\) for some \(X \in \X\) and \(s, t \in \S\). Then \(|C_X(s)| \le|C_X(t)|\), so

\[\begin{aligned} \phi_f(X, s) & = f(|C_X(s)|, |I_X(s)|) \\ & = g(|C_X(s)|) \\ & \le g(|C_X(t)|) \\ & = f(|C_X(t)|, |I_X(t)) \\ & = \phi_f(X, t) \end{aligned}\]

so \(s \le_X^{\phi_f} t\).

Now for the ‘only if’ statement. Suppose \(\phi_f\) has C-monotonicity. Let \(x \in \{0,\ldots,n\}\) and \(\tuple{x, y}, \tuple{x, y'} \in \P\). We will show that \(f(x, y) = f(x, y')\). Indeed, let \(Q_C, Q_I, Q'_I \subseteq \Q\) be sets of questions with \(|Q_C| = x,|Q_I|=y\) and \(|Q'_I|=y'\), where \(Q_C\) does not intersect with \(Q_I\) or \(Q'_I\) (this is possible since \(x+y, x+y' \le n\)). Construct \(X\) such that \(C_X(s)=C_X(t)=Q_C\), \(I_X(s) = Q_I\) and \(I_X(t) = Q'_I\). Clearly \(P_X(s) = \tuple{x, y}\) and \(P_X(t) = \tuple{x, y'}\).

Now, we have both \(C_X(s) \subseteq C_X(t)\) and \(C_X(t) \subseteq C_X(s)\), so C-monotonicity gives \(s \simeq_X^{\phi_f} t\), i.e. \(f(x, y) = f(x, y')\).

We have shown that \(f(x, y)\) is independent of \(y\); define \(g(x)\) by this unqiue value \(f(x, y)\). It only remains to show that \(g\) is monotone increasing. For this, suppose \(x_1 \le x_2\). Let \(Q_2 \subseteq \Q\) be a set of \(x_2\) questions, and let \(Q_1 \subseteq Q_2\) be a subset of \(x_1\) questions. Take arbitrary \(s, t \in \S\) and construct \(X\) such that \(C_X(s)=Q_1, C_X(t)=Q_2\) and \(I_X(s)=I_X(t)=\emptyset\). Clearly C-monotonicity implies \(s \le_X^{\phi_f} t\).

Now, we have \(P_X(s)=\tuple{x_1,0}\) and \(P_X(t)=\tuple{x_2,0}\). Consequently

\[\begin{aligned} g(x_1) & = f(x_1, 0) \\ & = \phi_f(X, s) \\ & \le \phi_f(X, t) \\ & = f(x_2, 0) \\ & = g(x_2) \end{aligned}\]

so \(g\) is monotone increasing.

A similar result holds for I-antimonotonicty.

Proposition

\(\phi_f\) satisfies I-antimonotonicity if and only if there is \(h: \{0,\ldots,n\} \to \R\) monotone decreasing such that \(f(x, y) = h(y)\) for all \(\tuple{x,y} \in \P\).

Proof. Similar.

Corollary

\(\phi_f\) satisfies C-monotonicity and I-antimonotonicity if and only if \(f\) is constant.

Proof.

The ‘if’ statement is clear. For the ‘only if’ direction, suppose \(\phi_f\) satisfies C-monotonicity and I-antimonotonicity. Then there exist functions \(g, h: \{0\,\ldots,n\} \to \R\) as per the two previous propositions, and for any \(\tuple{x,y} \in \P\),

\[f(x,y)=g(x)=f(x,0)=h(0)\]

i.e. \(f\) is constant with value \(h(0)\).

For C-cardinality…

Proposition

\(\phi_f\) satisfies C-cardinality if and only if \(f(x,y) \le f(x+1,y-1)\) whenever \(\tuple{x,y}, \tuple{x+1,y-1} \in \P\).

Proof.

First, the ‘only if’ statement. Suppose \(\phi_f\) has C-cardinality and \(\tuple{x,y}, \tuple{x+1,y-1} \in \P\). Then we can construct an answer-set \(X\) with \(P_X(s) = \tuple{x, y}\), \(P_X(t) = \tuple{x+1,y-1}\) for some arbitrary sources \(s, t \in \S\). Clearly \(|A_X(s)| = |A_X(t)| = x + y\), and \(|C_X(s)| = x \le x + 1 = |C_X(t)|\). C-cardinality therefore gives \(s \le_X^{\phi_f} t\), i.e. \(f(x, y) \le f(x+1,y-1)\).

Now for the ‘if’ statement. Suppose \(f\) has the stated property, and suppose \(X \in \X\), \(s, t \in \S\) such that \(|A_X(s)| = |A_X(t)|\) and \(|C_X(s)| \le |C_X(t)|\). We must show \(s \le_X^{\phi_f} t\).

Write \(P_X(s) = \tuple{x,y}\), \(P_X(t) = \tuple{x', y'}\). Then \(x+y=x'+y'\) and \(x \le x'\). It is easy to see that our condition on \(f\) implies \(f(x,y) \le f(x+k,y-k)\) for all \(k \ge 0\) such that \(\tuple{x+k,y-k} \in \P\).

Writing \(k = x' - x \ge 0\) we find \(x+k = x'\) and

\[y-k = y - (x' - x) = (x + y) - x' = (x' + y') - x' = y'\]

Consequently

\[\phi_f(X, s) = f(x,y) \le f(x+k, y-k) = f(x', y') = \phi_f(X, t)\]

i.e. \(s \le_X^{\phi_f} t\) as required.

For the positive/negative responsiveness axioms…

Proposition

\(\phi_f\) satisfies CPR if and only if \(f(x, y) < f(x+1,y)\) for all \(x\) and \(y\).

Proof.

We start with the ‘if’ statement. Suppose \(f\) has the stated property, and let \(X, s, t, q\) be such that \(s \le_X^{\phi_f} t\) and \(q \not\in A_X(t)\). Write \(X' = X \cup \{\tuple{t,q,\tau_q}\}\). We must show \(s <_{X'}^{\phi_f} t\).

Write \(\tuple{x,y} = P_X(s)\) and \(\tuple{x',y'} = P_X(t)\). Then \(f(x,y) \le f(x',y')\). Note that

\[\begin{aligned} P_{X'}(s) & = P_X(s) = \tuple{x, y} \\ P_{X'}(t) & = \tuple{x' + 1, y'} \end{aligned}\]

By hypothesis we have \(f(x',y') < f(x' + 1, y')\) and hence

\[\begin{aligned} \phi_f(X', s) & = f(x, y) \\ & \le f(x', y') \\ & < f(x' + 1, y') \\ & = \phi_f(X', t) \end{aligned}\]

so \(s <_{X'}^{\phi_f} t\) as required.

For the ‘only if’ statement, suppose \(\phi_f\) has CPR and let \(\tuple{x,y}, \tuple{x+1,y} \in \P\). Construct an answer-set \(X\) such that, for some distinct sources \(s, t \in \S\),

\[P_X(s) = P_X(t) = \tuple{x, y}\]

Note that \(\phi_f(X,s) = f(x,y) = \phi_f(X, t)\). In particular, \(s \le_X^{\phi_f} t\).

Since \(\tuple{x+1,y} \in \P\) it must hold that \(x + y < n\), i.e. \(|A_X(t)| < n\). Therefore there is some \(q \in \Q \setminus A_X(t)\). Write \(X' = X \cup \{\tuple{t, q, \tau_q}\}\). Clearly CPR gives \(s <_{X'}^{\phi_f} t\). Noting that \(P_{X'}(s) = P_X(s) = \tuple{x, y}\) and \(P_{X'}(t) = \tuple{x + 1, y}\), we must have \(f(x,y) < f(x + 1, y)\) as required.

And the corresponding result for INR…

Proposition

\(\phi_f\) satisfies INR if and only if \(f(x, y) > f(x,y+1)\) for all \(x\) and \(y\).

Proof. Similar.

Corollary

If \(\phi_f\) satisfies CPR and INR, then \(\phi_f\) satisfies C-cardinality.

Proof.

Suppose \(\tuple{x,y}, \tuple{x+1,y-1} \in \P\). By CPR we have \(f(x,y) < f(x+1, y)\), and by INR we have \(f(x+1,y) < f(x+1,y-1)\). Consequently \(f(x, y) \le f(x+1,y-1)\). By an earlier proposition this is equivalent to C-cardinality, so we are done.

The above results make it simple to assess which axioms are satisfied by the specific examples of profile-based operators from the example operators section. For each profile-based operator in that section, we write its corresponding \(f\) function below.

NumCorrect: \(f(x, y) = x\)

NumIncorrect: \(f(x, y) = -y\)

Balance: \(f(x, y) = x - {\alpha}{y}\) for some parameter \(\alpha > 0\).

Proportion: \(f(x, y) = \frac{x}{x + y}\) for \(\tuple{x, y} \ne \tuple{0, 0}\), and \(f(0, 0) = 0\).

PropLog: \(f(x, y) = \frac{x}{x + y} \cdot \log(x + y)\), and \(f(0, 0) = 0\).

Theorem

The axioms satisfied by each of the above operators is given according to the following table. Note that IIA, Weak IIA and Source-symmetry are satisfied by all examples, and DA is satisfied by none; as such these axioms are omitted from the table.

Axiom NumCorrect NumIncorrect Balance Prop PropLog
C-mon.
I-antimon.
C-card.
CPR
INR
Proof.

Most of these are trivial given the propositions above and the \(f\) function associated with each operator. However, it is worth briefly explaining the counterexamples for CPR and IPR for Prop and PropLog.

Let \(f\) denote Prop. Then \(f(1, 0) = f(2, 0) = 1\), but our proposition for CPR requires that \(f(1, 0) < f(2, 0)\). Similarly, we have \(f(0, 0) = f(0, 1) = 0\), but INR requires \(f(0, 0) > f(0, 1)\).

If \(f\) now denotes PropLog, then \(f(0, 0) = f(1, 0) = 0\) provides the counterexample for CPR. The same counterexample for Prop for INR works for PropLog too.

Basic Impossibility Result

The characterisation result and the above properties lead to an impossibility result. It is not particularly surprising, since a lot of quite strict axioms are involved.

Theorem

There is no operator satisfying QVI, IIA, Source-symmetry, C-monotonicity, I-antimonotonicity and CPR (and similar for INR).

Proof.

If not, there is an operator \(\phi\) satisfying all the above axioms. By QVI, IIA and Source-symmetry, there must be \(f\) such that \(\phi \sim \phi_f\). Then \(\phi_f\) also satisfies C-monotonicity, I-antimonotonicity and CPR, since its ranking is the same as \(\phi\) for all \(X\).

Since C-monotonicity and I-antimonotonicity are satisfied together, \(f\) must be constant by an above corollary. But then we cannot have \(f(x, y) < f(x + 1, y)\) for any \(x, y\). Since this is equivalent to CNR, we have arrived at a contradiction.

Relation to Preference Aggregation?

Our problem can be converted into one of preference aggregation. Indeed, for each \(q \in \Q\), define a mapping \(\psi_q: \X \times \S \to \{-1, 0, 1\}\) by

\[\psi_q(X, s) = \begin{cases} 1, & q \in C_X(s) \\ -1, & q \in I_X(s) \\ 0, & \text{ otherwise} \end{cases}\]

Let \(\preceq_X^q\) be the induced ranking on \(\S\). Now we can view the questions as agents, and the sources as alternatives. Each \(q\) simply ranks the sources who answered correctly above those who declined to answer, and in turn ranks those above sources who answered incorrectly. This seems like a natural enough choice.

Our source ranking problem now reduces to aggregating \(\preceq_X^1,\ldots,\preceq_X^n\) to a single total preorder, for which we can consider the many SWFs discussed in the social choice literature.

Directions for Work

  • Find more reasonable axioms and investigate their consequences.

  • Try to find weaker axioms that imply QVI, for the \(\Phiprofile\) characterisation. QVI is very strong, and already goes most of the way to ensuring the behaviour of profile-based operators.

  • Focus on the difficulty-awareness stuff. Could look at ‘GenDifficulty’ and see how properties of \(\delta\) align with axioms for the corresponding operator.

  • Focus on the potential link with preference aggregation. See how axioms on the SWF correspond to axioms for \(\phi\).

  • Consider the distance from incorrect answers to the truth: axioms to state that providing close incorrect answers is better than far-off incorrect answers.

  • Look at an MLE-based operator in a more interesting probability model.


1

A slight difference is that in the TD algorithm the truths are only estimated, not completely known.

2

Condition (1) is called category-exclusiveness in [KEFQ14].

3

Inspired by AverageLog.

4

Specifically, \(\mathcal{E}(X) = \sum_{i=1}^{m}{\left\{ |A_X(s_i)|\log{\alpha} + |U_X(s_i)|\log(1 - \alpha) - \sum_{q \in I_X(s_i)}{\log{K_q}} \right\}}\).

5

Also of interest is a stronger, numerical (as opposed to ordinal) version of this axiom, which states that \(\phi(X, s) = \phi(X \rs s, s)\).

6

Maybe this is a deficiency of the framework, and we should do away with separate domains to resolve this problem.

Here’s an idea for question-value symmetry in this case. Let \(\sigma\) be a permutation on \(\Q\), and for each \(q \in \Q\) let \(\sigma_q\) be a permutation on \(\D\) such that \(\sigma_q(\tau_q) = \tau_{\sigma(q)}\). For \(X \in \X\) set

\[\sigma(X) = \{\tuple{s, \sigma(q), \sigma_q(v)} \mid \tuple{s, q, v} \in X \}\]

The \(\phi\) has question-value symmetry if \({\le_X^\phi} = {\le_{\sigma(X)}^\phi}\).

Note that multiple value permutations \(\sigma_q\) for each question \(q\) are needed since values may overlap between questions. For example consider \(\D = \{1,2\}\) and

\[X = \{\tuple{s, q_1, 1} ,\tuple{s, q_2, 2} ,\tuple{s, q_3, 2} \}\]

where \(\tau_{q_1} = 1\), \(\tau_{q_2} = 2\), \(\tau_{q_3} = 1\). That is, \(s\) gets \(q_1\) and \(q_2\) correct, but gets \(q_3\) incorrect.

Let \(\sigma\) swap \(q_1\) and \(q_2\), and suppose values are permuted according to a single permutation. We need to swap values 1 and 2 to retain the stucture of \(X\) regarding the first two tuples (\(s\) got these correct in \(X\), so should get the corresponding tuples correct in \(\sigma(X)\)). But this means \(\tuple{s, q_3, 2}\) maps to \(\tuple{s, q_3, 1}\), i.e. \(s\) was incorrect on \(q_3\) in \(X\) but is correct on \(\sigma(q_3\)) in \(\sigma(X)\). This means \(\sigma(X)\) does not have the same structure as \(X\), which is not good when we are trying to state a symmetry axiom.

7

E.g. \(\pi = (s, s')(t, t')\) if \(s \ne t'\) and \(t \ne s'\). In other cases the particular permuation may be different, but is guaranteed to exist.