Adding manual and various other fixes for the 0.3 release (#36)

* cleaning some assertions
* changes
* bumping the release version

Co-authored-by: Tim Hunter <tjhunter@cs.stanford.edu>
tjhunter · Oct 24, 2022 · 4e00df4
1 parent fa4c49f
commit 4e00df4
Showing 13 changed files with 502 additions and 81 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,7 +1,7 @@
 # Generated by Cargo
 # will have compiled files and executables
 /target/
-
+/ranked_voting/target/
 # Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
 # More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
 Cargo.lock

diff --git a/Cargo.toml b/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
-name = "ranked-voting"
-version = "0.2.0"
+name = "timrcv"
+version = "0.3.0"
 edition = "2021"
 # author = ["Tim Hunter <tjhunter@cs.stanford.edu>"]
 

diff --git a/ranked_voting/src/config.rs b/ranked_voting/src/config.rs
@@ -91,6 +91,7 @@ pub enum VotingErrors {
     ///
     // TODO: explain when it may happen
     NoConvergence,
+    NoCandidateToEliminate,
 }
 
 impl Error for VotingErrors {}
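
Note on the new variant: the sketch below shows, under stated assumptions, how a caller might surface `NoCandidateToEliminate` as an ordinary error value instead of a panic. The enum is a two-variant stand-in for the crate's `VotingErrors`, the `Display` messages are invented for illustration, and `pick_eliminated` is a hypothetical helper that does not exist in the crate.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for the crate's error enum; only two variants are reproduced.
#[derive(Debug, PartialEq)]
pub enum VotingErrors {
    NoConvergence,
    NoCandidateToEliminate,
}

impl fmt::Display for VotingErrors {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumption: these messages are illustrative, not the crate's wording.
        match self {
            VotingErrors::NoConvergence => write!(f, "the tabulation did not converge"),
            VotingErrors::NoCandidateToEliminate => {
                write!(f, "no candidate could be eliminated in this round")
            }
        }
    }
}

impl Error for VotingErrors {}

/// Hypothetical helper: returns the candidate with the fewest votes, or an
/// error (rather than panicking) when the tally is empty.
/// On ties, the first candidate in the slice is returned.
pub fn pick_eliminated(tally: &[(String, u64)]) -> Result<String, VotingErrors> {
    tally
        .iter()
        .min_by_key(|(_, count)| *count)
        .map(|(name, _)| name.clone())
        .ok_or(VotingErrors::NoCandidateToEliminate)
}
```

Returning a `Result` here lets the `?` operator propagate the failure up to the caller, which is the same shape as the change made to `run_one_round` in this commit.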

diff --git a/ranked_voting/src/lib.rs b/ranked_voting/src/lib.rs
@@ -1,6 +1,98 @@
+/*!
+The `ranked_voting` crate provides a thoroughly tested implementation of the
+[Instant-Runoff Voting algorithm](https://en.wikipedia.org/wiki/Instant-runoff_voting),
+which is also called ranked-choice voting in the United States, preferential voting
+in Australia or alternative vote in the United Kingdom.
+
+This library can be used in multiple flavours:
+- as a simple library for most cases (see the [run_election1] function)
+
+- as a command-line utility that provides fast and easy election results that can then
+be displayed or exported. The section [timrcv](#timrcv) provides a manual.
+
+- as a more complex library that can handle all the diversity of implementations. It provides
+for example multiple ways to deal with blank or absentee ballots, undeclared candidates, etc.
+If you are attempting to replicate the results of a specific election, you should
+carefully check its voting rules and set the configuration accordingly.
+See [run_election] and [VoteRules] for the available options.
+
+# timrcv
+
+`timrcv` is a command-line program to run an instant-runoff election. It can accommodate all common formats from vendors
+or public offices. This document presents a tutorial on how to use it.
+
+## Installation
+
+Download the latest release from the [releases page](https://github.com/tjhunter/timrcv/releases).
+Pre-compiled versions are available for Windows, macOS and Linux.
+
+
+## Quick start with existing data
+
+If you are running a poll and collecting data with Microsoft Forms,
+Google Forms or Qualtrics, see the [quick start using Google Forms](quick_start/index.html).
+
+If you have very simple needs and you can collect data in a
+small text file, `timrcv` accepts a simple format of
+comma-separated values.
+
+
+To get started, suppose you have a file with the following records of votes ([example.csv](https://github.com/tjhunter/timrcv/raw/main/tests/csv_simple_2/example.csv)). Each line corresponds to a vote, and A, B, C and D are the candidates:
+
+```text
+A,B,,D
+A,C,B,
+B,A,D,C
+B,C,A,D
+C,A,B,D
+D,B,A,C
+```
+The first line `A,B,,D` says that this voter ranked candidate A first and B second, left the third choice blank, and ranked D last.
+
+Running a vote with the default options is simply:
+
+```bash
+timrcv --input example.csv
+```
+
+Output:
+
+```text
+[ INFO  ranked_voting] run_voting_stats: Processing 6 votes
+[ INFO  ranked_voting] Processing 6 aggregated votes
+[ INFO  ranked_voting] Candidate: 1: A
+[ INFO  ranked_voting] Candidate: 2: B
+[ INFO  ranked_voting] Candidate: 3: C
+[ INFO  ranked_voting] Candidate: 4: D
+[ INFO  ranked_voting] Round 1 (winning threshold: 4)
+[ INFO  ranked_voting]       2 B -> running
+[ INFO  ranked_voting]       2 A -> running
+[ INFO  ranked_voting]       1 C -> running
+[ INFO  ranked_voting]       1 D -> eliminated:1 -> B,
+[ INFO  ranked_voting] Round 2 (winning threshold: 4)
+[ INFO  ranked_voting]       3 B -> running
+[ INFO  ranked_voting]       2 A -> running
+[ INFO  ranked_voting]       1 C -> eliminated:1 -> A,
+[ INFO  ranked_voting] Round 3 (winning threshold: 4)
+[ INFO  ranked_voting]       3 A -> running
+[ INFO  ranked_voting]       3 B -> eliminated:3 -> A,
+[ INFO  ranked_voting] Round 4 (winning threshold: 4)
+[ INFO  ranked_voting]       6 A -> elected
+```
+
+`timrcv` supports many options (input and output formats, validation of the candidates, configuration of the tabulating process, ...).
+ Look at the [configuration section](manual/index.html#configuration) of the manual for more details.
+
+
+
+
+ */
+
 mod builder;
 mod config;
 pub use builder::Builder;
+pub mod manual;
+pub mod quick_start;
 use log::{debug, info};
 
 use std::{
@@ -217,6 +309,28 @@ pub fn run_election1(
     run_election(&builder)
 }
 
+fn candidates_from_ballots(ballots: &[Ballot]) -> Vec<config::Candidate> {
+    // Take everyone from the election as a valid candidate.
+    let mut cand_set: HashSet<String> = HashSet::new();
+    for ballot in ballots.iter() {
+        for choice in ballot.candidates.iter() {
+            if let BallotChoice::Candidate(name) = choice {
+                cand_set.insert(name.clone());
+            }
+        }
+    }
+    let mut cand_vec: Vec<String> = cand_set.iter().cloned().collect();
+    cand_vec.sort();
+    cand_vec
+        .iter()
+        .map(|n| config::Candidate {
+            name: n.clone(),
+            code: None,
+            excluded: false,
+        })
+        .collect()
+}
+
 /// Runs the voting algorithm with the given rules for the given votes.
 ///
 /// Arguments:
@@ -227,17 +341,20 @@ pub fn run_election1(
 fn run_voting_stats(
     coll: &Vec<Ballot>,
     rules: &config::VoteRules,
-    candidates: &Option<Vec<config::Candidate>>,
+    candidates_o: &Option<Vec<config::Candidate>>,
 ) -> Result<VotingResult, VotingErrors> {
     info!("run_voting_stats: Processing {:?} votes", coll.len());
+    let candidates = candidates_o
+        .to_owned()
+        .unwrap_or_else(|| candidates_from_ballots(coll));
+
     debug!(
         "run_voting_stats: candidates: {:?}, rules: {:?}",
         coll.len(),
         candidates,
     );
 
-    // TODO: ensure candidates
-    let cr: CheckResult = checks(coll, &candidates.clone().unwrap(), rules)?;
+    let cr: CheckResult = checks(coll, &candidates, rules)?;
     let checked_votes = cr.votes;
     debug!(
         "run_voting_stats: Checked votes: {:?}, detected UWIs {:?}",
@@ -596,13 +713,15 @@ fn run_one_round(
     }
 
     // Find the candidates to eliminate
-    let p = find_eliminated_candidates(&tally, rules, candidate_names, num_round);
+    let p = find_eliminated_candidates(&tally, rules, candidate_names, num_round)?;
     let resolved_tiebreak: TiebreakSituation = p.1;
     let eliminated_candidates: HashSet<CandidateId> = p.0.iter().cloned().collect();
 
     // TODO strategy to pick the winning candidates
 
-    assert!(!eliminated_candidates.is_empty(), "No candidate eliminated");
+    if eliminated_candidates.is_empty() {
+        return Err(VotingErrors::NoCandidateToEliminate);
+    }
     debug!("run_one_round: tiebreak situation: {:?}", resolved_tiebreak);
     debug!("run_one_round: eliminated_candidates: {:?}", p.0);
 
@@ -728,22 +847,22 @@ fn find_eliminated_candidates(
     rules: &config::VoteRules,
     candidate_names: &[(String, CandidateId)],
     num_round: u32,
-) -> (Vec<CandidateId>, TiebreakSituation) {
+) -> Result<(Vec<CandidateId>, TiebreakSituation), VotingErrors> {
     // Try to eliminate candidates in batch
     if rules.elimination_algorithm == EliminationAlgorithm::Batch {
         if let Some(v) = find_eliminated_candidates_batch(tally) {
-            return (v, TiebreakSituation::Clean);
+            return Ok((v, TiebreakSituation::Clean));
         }
     }
 
     if let Some((v, tb)) =
         find_eliminated_candidates_single(tally, rules.tiebreak_mode, candidate_names, num_round)
     {
-        return (v, tb);
+        return Ok((v, tb));
     }
     // No candidate to eliminate.
     // TODO check the conditions for this to happen.
-    unimplemented!("find_eliminated_candidates: No candidate to eliminate");
+    Err(VotingErrors::EmptyElection)
 }
 
 fn find_eliminated_candidates_batch(
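
The quick-start tally in the module documentation can be reproduced with a much smaller routine. What follows is a minimal sketch, not the crate's implementation: `instant_runoff` is a hypothetical function, the winning threshold is assumed to be a strict majority of all ballots cast, and ties for last place are assumed to be broken by eliminating the name that sorts last. That tiebreak happens to match the sample output; the crate itself selects the rule through `rules.tiebreak_mode`.

```rust
use std::collections::BTreeMap;

/// Simplified instant-runoff count over ranked ballots.
/// Each ballot lists candidate names in order of preference; empty strings
/// stand for blank choices and are skipped.
/// Returns the winner, or `None` when no candidate appears on any ballot.
pub fn instant_runoff(ballots: &[Vec<&str>]) -> Option<String> {
    // Assumption: the threshold is a strict majority of all ballots cast.
    let threshold = ballots.len() / 2 + 1;

    // Every name seen on a ballot is a candidate; BTreeMap keeps them sorted.
    let mut active: BTreeMap<String, usize> = BTreeMap::new();
    for ballot in ballots {
        for name in ballot.iter().filter(|n| !n.is_empty()) {
            active.insert(name.to_string(), 0);
        }
    }

    while !active.is_empty() {
        // Re-tally: each ballot counts for its highest-ranked active candidate.
        for count in active.values_mut() {
            *count = 0;
        }
        for ballot in ballots {
            if let Some(name) = ballot.iter().find(|n| active.contains_key(**n)) {
                *active.get_mut(*name).unwrap() += 1;
            }
        }

        // Elect the leader if they reach the threshold or stand alone.
        let (leader, top) = active
            .iter()
            .max_by_key(|(_, n)| **n)
            .map(|(c, n)| (c.clone(), *n))?;
        if top >= threshold || active.len() == 1 {
            return Some(leader);
        }

        // Otherwise eliminate the lowest tally.
        // Assumption: among tied candidates, the name sorting last is dropped.
        let lowest = *active.values().min()?;
        let loser = active
            .iter()
            .rev()
            .find(|(_, n)| **n == lowest)
            .map(|(c, _)| c.clone())?;
        active.remove(&loser);
    }
    None
}
```

With the six ballots of `example.csv`, this sketch eliminates D, then C, then B, and returns A, which matches the rounds in the logged output.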

diff --git a/ranked_voting/src/manual.rs b/ranked_voting/src/manual.rs
@@ -0,0 +1,126 @@
+/*!
+
+This is the long-form manual for `ranked_voting` and `timrcv`.
+
+## Input formats
+
+The following formats are supported:
+* `ess` ES&S company
+* `dominion` Dominion company
+* `cdf` NIST CDF
+* `csv`, `csv_likert` Comma Separated Values in various flavours
+* `msforms`, `msforms_likert`, `msforms_likert_transpose` Input from Microsoft Forms and Google Forms products.
+
+### `ess`
+
+Votes recorded in the ES&S format (Excel spreadsheet).
+
+### `dominion`
+
+Votes recorded in the format from the Dominion company.
+
+### `cdf`
+
+Votes recorded in the Common Data Format from NIST.
+
+Notes:
+- only the JSON notation is currently supported (not the XML)
+- only one election is supported
+
+### `msforms`
+
+Results from Microsoft Forms when using the ranking widget.
+The input file is expected to be in Excel (.xlsx) format.
+See the example in the `tests` directory.
+
+### `msforms_likert`
+
+Results from Microsoft Forms when using the 'Likert' input. It is also compatible with
+Google Forms when candidates are the rows and choices are the columns.
+The input file is expected to be in Excel (.xlsx) format.
+
+See the example in the `tests` directory. Your form is expected to be formatted as follows:
+
+
+|             | choice 1 | choice 2 | ... |
+|-------------|----------|----------|-----|
+| candidate A |          | x        |     |
+| candidate B | x        |          |     |
+| ...         |          |          |     |
+
+In this example, this vote would mark `candidate B` as the first choice and then `candidate A` as a second choice.
+
+In this case, both the names of the choices and of the candidates are mandatory. See the example `msforms_likert` for an example of a configuration file.
+
+### `msforms_likert_transpose`
+
+Results from Microsoft Forms when using the 'Likert' input with the candidates in the first row.
+It is also compatible with Google Forms when the rows are the choices and the columns are
+the candidates. The input file is expected to be in Excel (.xlsx) format.
+See the example in the `tests` directory. Your form is expected to be formatted as follows:
+
+|               | candidate A | candidate B | ... |
+|---------------|-------------|-------------|-----|
+| first choice  |             | x           |     |
+| second choice | x           |             |     |
+| ...           |             |             |     |
+
+In this example, this vote would mark `candidate B` as the first choice and then `candidate A` as a second choice.
+
+In this case, both the names of the choices and of the candidates are mandatory. See the example `msforms_likert_transpose` for an example of a configuration file.
+
+### `csv`
+
+Simple CSV reader. Each column (in order) is considered to be a choice. The name of the choice in the header is not significant.
+
+```text
+id,count,choice 1,choice 2,choice 3,choice 4
+id1,20,A,B,C,D
+id2,20,A,C,B,D
+```
+
+The `id` and `count` columns are optional. The header row is also optional.
+See the [Configuration section](#configuration) for how to control the optional rows and columns.
+
+### `csv_likert`
+
+Simple CSV reader sorted by candidates. This format is also created by Qualtrics polls. The file is expected to look as follows:
+
+```text
+id,count,A,B,C,D
+id1,20,1,2,3,
+id2,20,1,3,2,4
+```
+
+The `id` and `count` columns are optional. Every candidate must have its own column, named in the first row of the CSV file. The numbers in the rows below are the rank given to each candidate on that ballot (or empty if the candidate was not ranked).
+
+## Configuration
+
+`timrcv` comes with sensible defaults but users may want to apply specific rules
+(for example, how to treat blank choices). The program accepts a configuration file in JSON that follows the specification of the [RCVTab program]()
+
+See the [complete documentation](https://github.com/BrightSpots/rcv/blob/develop/config_file_documentation.txt) for more details.
+ Note that not all options are supported and that some options have been added to better control the use of CSV.
+ Contributions are welcome in this area.
+
+The deviations from the specification of RCVTab are documented below.
+
+> Note: this documentation is incomplete for now.
+
+Deviations for FileSource:
+ - added `count_column_index` (string or number, optional): the location of the column that
+ indicates the counts. If not provided, every vote will be assigned a count of 1.
+
+ - added `excel_worksheet_name` (string, optional): for Excel-based inputs, the name of
+ the worksheet in Excel.
+
+ - added `choices` (array of strings, optional): The list of labels for the choices. For example, if
+   the list is `["First choice", "Second choice"]`, then seeing `First choice` will be
+   interpreted as choice #1, and so on.
+
+
+Deviations for OutputSettings:
+- removed `generateCdfJson`: feature not supported
+- removed `tabulateByPrecinct`: feature not supported
+
+ */
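
The `csv_likert` layout described in the manual can be turned into an ordered ballot by sorting the candidates on their rank cell. The following is a minimal sketch, assuming the `id` and `count` columns have already been stripped from the row; `likert_row_to_ballot` is a hypothetical helper and not part of `ranked_voting`.

```rust
/// Converts one `csv_likert`-style row into an ordered list of candidate names.
/// `candidates` holds the header names; `ranks` holds the matching cells
/// (a rank number, or an empty string when the candidate was not ranked).
pub fn likert_row_to_ballot(
    candidates: &[&str],
    ranks: &[&str],
) -> Result<Vec<String>, String> {
    let mut ranked: Vec<(u32, &str)> = Vec::new();
    for (name, cell) in candidates.iter().zip(ranks.iter()) {
        let cell = cell.trim();
        if cell.is_empty() {
            // Unranked candidate: leave it off the ballot.
            continue;
        }
        let rank = cell
            .parse::<u32>()
            .map_err(|_| format!("invalid rank '{}' for candidate {}", cell, name))?;
        ranked.push((rank, *name));
    }
    // Lower rank number means higher preference; the sort is stable,
    // so candidates sharing a rank keep their header order.
    ranked.sort_by_key(|(rank, _)| *rank);
    Ok(ranked.into_iter().map(|(_, name)| name.to_string()).collect())
}
```

For the row `1,3,2,4` under the header `A,B,C,D`, the function yields A, C, B, D; for `1,2,3,` it yields A, B, C. How duplicate or skipped ranks should be treated is left open here, since that depends on the voting rules being applied.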