ptrnsai

Consensus Voting

Intermediate · Multi-Agent Patterns · Academic research / Industry practice

Intent

Multiple agents independently solve the same problem, then vote to determine the best answer.

Problem

Any single agent can produce a wrong answer. Relying on one output for critical decisions is risky. You need a way to increase confidence without simply retrying the same approach.

Solution

Deploy multiple agents (potentially with different models, prompts, or temperatures) to solve the same problem independently. Collect all outputs and apply a voting mechanism: majority vote for discrete answers, or a scoring or ranking system for complex outputs. When the agents' errors are not strongly correlated, the consensus answer tends to be more reliable than any individual response. This extends Self-Consistency from a single-model technique to a multi-agent architecture.
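The two aggregation modes mentioned above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names `majority_vote` and `best_by_score`, the tie rule (no winner on a tie for first place), and the use of a mean over judge scores are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean
from typing import Optional


def majority_vote(answers: list[str]) -> tuple[Optional[str], float]:
    """Return (winner, vote share); winner is None on a tie for first place."""
    if not answers:
        return None, 0.0
    ranked = Counter(answers).most_common()
    top_answer, top_count = ranked[0]
    # A tie for first place means there is no consensus to report.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None, top_count / len(answers)
    return top_answer, top_count / len(answers)


def best_by_score(scores: dict[str, list[float]]) -> str:
    """Pick the candidate with the highest mean score across judges."""
    return max(scores, key=lambda name: mean(scores[name]))


# Discrete answers: three of four agents agree.
print(majority_vote(["A", "A", "B", "A"]))  # ('A', 0.75)

# Complex outputs: hypothetical judge scores for two candidate drafts.
print(best_by_score({"draft_1": [7, 8, 6], "draft_2": [9, 7, 8]}))  # draft_2
```

Returning `None` on a tie keeps the caller from treating an arbitrary answer as the consensus; other tie-breaking rules (for example, preferring the most trusted agent) are equally plausible.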

Diagram

                      ┌→ [Agent 1 (GPT-4)] → Answer A
                      │
Task → [Distribute] ──┼→ [Agent 2 (Claude)] → Answer A
                      │
                      ├→ [Agent 3 (Gemini)] → Answer B
                      │
                      └→ [Agent 4 (GPT-4, different prompt)] → Answer A

                      Vote: A wins (3:1) → Final Answer: A

When to Use

  • Critical decisions where accuracy is paramount
  • When you have access to multiple models or configurations
  • Quality assurance for high-stakes outputs
  • Reducing model-specific biases through diversity

When NOT to Use

  • Low-stakes tasks where one agent is sufficient
  • When cost is more important than accuracy
  • Creative tasks where diversity of output is desired

Pros & Cons

Pros

  • Often higher accuracy than any single agent
  • Reduces model-specific biases
  • Vote margin provides a confidence signal
  • Simple aggregation logic

Cons

  • N× cost increase
  • Majority voting needs discrete, comparable answers; open-ended outputs require a scoring or ranking step
  • All agents might share the same blind spots
  • Latency limited by the slowest agent
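The accuracy trade-off can be made concrete with a small calculation. The sketch below assumes an idealized setting that the pattern itself does not guarantee: a binary question, agents that err independently, and each agent correct with the same probability p. Under those assumptions, the chance that a strict majority of n agents is correct is a binomial tail sum.

```python
from math import comb


def majority_correct_probability(p: float, n: int) -> float:
    """P(strict majority of n independent agents is correct), each correct w.p. p."""
    threshold = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(threshold, n + 1))


for n in (1, 3, 5):
    print(n, round(majority_correct_probability(0.8, n), 3))
# 1 0.8
# 3 0.896
# 5 0.942
```

The gain depends on both assumptions. With p below 0.5 the vote amplifies errors instead (p = 0.4 with three agents gives about 0.352), and agents that share the same blind spots make correlated errors, so the real improvement is smaller than the independent case suggests.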

Implementation Steps

  1. Select 3-5 diverse agents (different models, prompts, or configurations)
  2. Send the same task to all agents in parallel
  3. Collect responses and extract comparable answers
  4. Apply voting: majority vote, weighted vote, or ranked choice
  5. Use vote margin as confidence: unanimous = high, split = low
  6. For low-confidence results, escalate to human review
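The steps above can be sketched end to end with stub agents in place of real model calls. Everything named here is hypothetical: the `Agent` callable interface, the `run_consensus` function, the per-agent `weights` mapping, and the `escalate_below` threshold of 0.75 are illustrative choices, not part of the pattern's definition.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Agent = Callable[[str], str]  # hypothetical interface: task text in, answer text out


def run_consensus(task: str, agents: dict[str, Agent], weights: dict[str, float],
                  escalate_below: float = 0.75) -> dict:
    # Step 2: send the same task to every agent in parallel (agents must be non-empty).
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = {name: pool.submit(fn, task) for name, fn in agents.items()}
        # Step 3: collect and normalize so equivalent answers compare equal.
        answers = {name: f.result().strip().lower() for name, f in futures.items()}

    # Step 4: weighted vote; an agent without an explicit weight counts as 1.0.
    tally: dict[str, float] = defaultdict(float)
    for name, answer in answers.items():
        tally[answer] += weights.get(name, 1.0)

    winner = max(tally, key=tally.get)
    # Step 5: the winner's share of the total weight serves as the confidence.
    confidence = tally[winner] / sum(tally.values())
    # Step 6: flag low-confidence results for human review.
    return {"answer": winner, "confidence": confidence,
            "needs_human_review": confidence < escalate_below}


# Stub agents standing in for real model calls (step 1 would pick diverse models).
agents = {
    "agent_1": lambda task: "A",
    "agent_2": lambda task: "a ",
    "agent_3": lambda task: "B",
    "agent_4": lambda task: "A",
}
result = run_consensus("Which option?", agents, weights={})
print(result)  # {'answer': 'a', 'confidence': 0.75, 'needs_human_review': False}
```

A thread pool fits here because model calls are I/O-bound; total latency is then roughly that of the slowest agent rather than the sum of all calls. An exception raised by any agent propagates from `f.result()`, so a production version would likely add timeouts and per-agent error handling.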

Real-World Example

Medical Document Classification

A clinical document needs to be classified by urgency. Three agents with different specializations independently classify it. Two say 'urgent,' one says 'routine.' Majority vote: 'urgent.' The split vote triggers a flag for human review.
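A minimal sketch of the decision rule in this example, assuming the three agents have already returned their labels as plain strings. The function name `classify_with_review` and the rule "flag anything short of unanimous" are assumptions chosen to match the scenario; the input list must be non-empty.

```python
from collections import Counter


def classify_with_review(labels: list[str]) -> dict:
    """Majority label plus a review flag whenever the vote is not unanimous."""
    votes = Counter(labels)
    label, count = votes.most_common(1)[0]
    return {
        "label": label,
        "margin": f"{count}:{len(labels) - count}",
        "flag_for_human_review": count < len(labels),
    }


# Labels as the three hypothetical specialist agents might return them.
print(classify_with_review(["urgent", "routine", "urgent"]))
# {'label': 'urgent', 'margin': '2:1', 'flag_for_human_review': True}
```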

Python: Multi-Model Consensus Voting
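from collections import Counter

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def consensus_vote(question: str, n_agents: int = 5) -> dict:
    # Each "agent" here is an independent sample from the same model at
    # temperature 0.7; vary the model or prompt per call for more diversity.
    raw_answers = []
    for _ in range(n_agents):
        response = client.chat.completions.create(
            model="gpt-4o",
            temperature=0.7,
            messages=[{"role": "user", "content": f"{question}\n\nGive a concise final answer after ANSWER:"}],
        )
        # content can be None (for example on a refusal), so fall back to "".
        raw_answers.append(response.choices[0].message.content or "")

    # Keep only responses that follow the ANSWER: convention, normalized so that
    # differences in case or a trailing period do not split the vote.
    answers = []
    for text in raw_answers:
        if "ANSWER:" in text:
            answers.append(text.split("ANSWER:")[-1].strip().rstrip(".").lower())

    if not answers:
        # No parseable answer: report zero confidence instead of raising IndexError.
        return {"answer": None, "confidence": 0.0, "votes": {}}

    votes = Counter(answers)
    winner, count = votes.most_common(1)[0]
    # Dividing by n_agents means unparseable responses lower the confidence.
    return {"answer": winner, "confidence": count / n_agents, "votes": dict(votes)}
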
References