CS50 (Computer Science 50) is an on-campus and online introductory course on computer science taught at Harvard University and, as of 2015, Yale University as well. In 2016, CS50 became available to high school students as an AP course. The course material is available online for free on EdX with a range of certificates available for a fee. The on-campus version is Harvard’s largest class with 800 students, 102 staff and up to 2,200 participants in their regular hackathons.

CS50, Wikipedia

* 此课程笔记根据 Harvard CS50 Spring 2020 完成。

虽然已经有一段时间了,但那天父亲突然走进房间,问我知不知道 Harvard 的 CS50,让我一头雾水。原来是电视上在放一套反思美国高校教育体系的纪录片,似乎是把 CS50 的创意性吹得天花乱坠。看来应该找时间看一看原片,到底是怎样的情况呢。

第 3 周介绍了算法,三种符号表示都提到了。从搜索:线性搜索和二分搜索开始。然后中途简单地说明了 C 的结构体。主要讲了排序方法:冒泡排序(包括改良 flag 版)、选择排序、通过递归思想引入了归并排序。大部分内容引用自 CS50 课程官方的笔记

Searching

  • Last time, we talked about memory in a computer, or RAM, and how our data can be stored as individual variables or as arrays of many items, or elements.
  • We can think of an array with a number of items as a row of lockers, where a computer can only open one locker to look at an item, one at a time.
  • For example, if we want to check whether a number is in an array, with an algorithm that took in an array as input and produce a boolean as a result, we might:
    • look in each locker, or at each element, one at a time, from the beginning to the end.
      • This is called linear search, where we move in a line, since our array isn’t sorted.
    • start in the middle and move left or right depending on what we’re looking for, if our array of items is sorted.
      • This is called binary search, since we can divide our problem in two with each step, like what David did with the phone book in week 0.
  • We might write pseudocode for linear search with:
 For i from 0 to n–1
    If i'th element is 50
        Return true
Return false
  • We can label each of n lockers from 0 to n–1, and check each of them in order.

  • For binary search, our algorithm might look like:
If no items
    Return false
If middle item is 50
    Return true
Else if 50 < middle item
    Search left half
Else if 50 > middle item
    Search right half
  • Eventually, we won’t have any parts of the array left (if the item we want wasn’t in it), so we can return false.
    • Otherwise, we can search each half depending on the value of the middle item.

Big O

  • In week 0, we saw different types of algorithms and their running times: chart with: "size of problem" as x–axis; "time to solve" as y–axis; red, steep straight line from origin to top of graph labeled "n"; yellow, less steep straight line from origin to top of graph labeled "n/2"; green, curved line that gets less and less steep from origin to right of graph labeled "log_2 n"
  • The more formal way to describe this is with big O notation, which we can think of as “on the order of”. For example, if our algorithm is linear search, it will take approximately O(n) steps, “on the order of n”. In fact, even an algorithm that looks at two items at a time and takes n/2 steps has O(n). This is because, as n gets bigger and bigger, only the largest term, n, matters.
  • Similarly, a logarithmic running time is O(log n), no matter what the base is, since this is just an approximation of what happens with n is very large.
  • There are some common running times:
    • O(n2)
    • O(n log n)
    • O(n)
      • (linear search)
    • O(log n)
      • (binary search)
    • O(1)
  • Computer scientists might also use big Ω, big Omega notation, which is the lower bound of number of steps for our algorithm. (Big O is the upper bound of number of steps, or the worst case, and typically what we care about more.) With linear search, for example, the worst case is n steps, but the best case is 1 step since our item might happen to be the first item we check. The best case for binary search, too, is 1 since our item might be in the middle of the array.
  • And we have a similar set of the most common big Ω running times:
    • Ω(n2)
    • Ω(n log n)
    • Ω(n)
      • (counting the number of items)
    • Ω(log n)
    • Ω(1)
      • (linear search, binary search)

Linear Search

  • Let’s take a look at numbers.c:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    // An array of numbers
    int numbers[] = {4, 8, 15, 16, 23, 42};

    // Search for 50
    for (int i = 0; i < 6; i++)
    {
        if (numbers[i] == 50)
        {
            printf("Found\n");
            return 0;
        }
    }
    printf("Not found\n");
    return 1;
}
  • Here we initialize an array with some values, and we check the items in the array one at a time, in order.
  • And in each case, depending on whether the value was found or not, we can return an exit code of either 0 (for success) or 1 (for failure).
  • We can do the same for names:
#include <cs50.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    // An array of names
    string names[] = {"EMMA", "RODRIGO", "BRIAN", "DAVID"};

    // Search for EMMA
    for (int i = 0; i < 4; i++)
    {
        if (strcmp(names[i], "EMMA") == 0)
        {
            printf("Found\n");
            return 0;
        }
    }
    printf("Not found\n");
    return 1;
}
  • We can’t compare strings directly, since they’re not a simple data type but rather an array of many characters, and we need to compare them differently. Luckily, the string library has a strcmp function which compares strings for us and returns 0 if they’re the same, so we can use that.
  • Let’s try to implement a phone book with the same ideas:
#include <cs50.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    string names[] = {"EMMA", "RODRIGO", "BRIAN", "DAVID"};
    string numbers[] = {"617–555–0100", "617–555–0101", "617–555–0102", "617–555–0103"};

    for (int i = 0; i < 4; i++)
    {
        if (strcmp(names[i], "EMMA") == 0)
        {
            printf("Found %s\n", numbers[i]);
            return 0;
        }
    }
    printf("Not found\n");
    return 1;
}
  • We’ll use strings for phone numbers, since they might include formatting or be too long for a number.
  • Now, if the name at a certain index in the names array matches who we’re looking for, we’ll return the phone number in the numbers array, at the same index. But that means we need to particularly careful to make sure that each number corresponds to the name at each index, especially if we add or remove names and numbers.

Structs

  • It turns out that we can make our own custom data types called structs:
#include <cs50.h>
#include <stdio.h>
#include <string.h>

typedef struct
{
    string name;
    string number;
}
person;

int main(void)
{
    person people[4];

    people[0].name = "EMMA";
    people[0].number = "617–555–0100";

    people[1].name = "RODRIGO";
    people[1].number = "617–555–0101";

    people[2].name = "BRIAN";
    people[2].number = "617–555–0102";

    people[3].name = "DAVID";
    people[3].number = "617–555–0103";

    // Search for EMMA
    for (int i = 0; i < 4; i++)
    {
        if (strcmp(people[i].name, "EMMA") == 0)
        {
            printf("Found %s\n", people[i].number);
            return 0;
        }
    }
    printf("Not found\n");
    return 1;
}
  • We can think of structs as containers, inside of which are multiple other data types.
  • Here, we create our own type with a struct called person, which will have a string called name and a string called number. Then, we can create an array of these struct types and initialize the values inside each of them, using a new syntax, ., to access the properties of each person.
  • In our loop, we can now be more certain that the number corresponds to the name since they are from the same person element.

Bubble Sort

  • If our input is an unsorted list of numbers, there are many algorithms we could use to produce an output of a sorted list.
  • With eight volunteers on the stage with the following numbers, we might consider swapping pairs of numbers next to each other as a first step.
  • Our volunteers start in the following random order:
6 3 8 5 2 7 4 1
  • We look at the first two numbers, and swap them so they are in order:
6 3 8 5 2 7 4 1
– –
3 6 8 5 2 7 4 1
  • The next pair, 6 and 8, are in order, so we don’t need to swap them.
  • The next pair, 8 and 5, need to be swapped:
3 6 8 5 2 7 4 1
    – –
3 6 5 8 2 7 4 1

We continue until we reach the end of the list:

3 6 5 2 8 7 4 1
        – –
3 6 5 2 7 8 4 1
          – –
3 6 5 2 7 4 8 1
            – –
3 6 5 2 7 4 1 8
  • Our list isn’t sorted yet, but we’re slightly closer to the solution because the biggest value, 8, has been shifted all the way to the right.
  • We repeat this with another pass through the list:
3 6 5 2 7 4 1 8
– –
3 6 5 2 7 4 1 8
  – –
3 5 6 2 7 4 1 8
    – –
3 5 2 6 7 4 1 8
      – –
3 5 2 6 7 4 1 8
        – –
3 5 2 6 4 7 1 8
            – –
3 5 2 6 4 1 7 8
  • Note that we didn’t need to swap the 3 and 6, or the 6 and 7.
  • Now, the next biggest value, 7, moved all the way to the right. If we repeat this, more and more of the list becomes sorted, and pretty quickly we have a fully sorted list.
  • This algorithm is called bubble sort, where large values “bubble” to the right. The pseudocode for this might look like:
Repeat n–1 times
    For i from 0 to n–2
        If i'th and i+1'th elements out of order
            Swap them
  • Since we are comparing the i'th and i+1'th element, we only need to go up to n – 2 for i. Then, we swap the two elements if they’re out of order.
  • And we can stop after we’ve made n – 1 passes, since we know the largest n–1 elements will have bubbled to the right.
  • We have n – 2 steps for the inner loop, and n – 1 loops, so we get n2 – 3n + 2 steps total. But the largest factor, or dominant term, is n2, as n gets larger and larger, so we can say that bubble sort is O(n2).
  • We’ve seen running times like the following, and so even though binary search is much faster than linear search, it might not be worth the one–time cost of sorting the list first, unless we do lots of searches over time:
    • O(n2)
      • bubble sort
    • O(n log n)
    • O(n)
      • linear search
    • O(log n)
      • binary search
    • O(1)
  • And Ω for bubble sort is still n2, since we still check each pair of elements for n – 1 passes.

Selection Sort

  • We can take another approach with the same set of numbers: 6 3 8 5 2 7 4 1
  • First, we’ll look at each number, and remember the smallest one we’ve seen. Then, we can swap it with the first number in our list, since we know it’s the smallest:
6 3 8 5 2 7 4 1
–             –
1 3 8 5 2 7 4 6
  • Now we know at least the first element of our list is in the right place, so we can look for the smallest element among the rest, and swap it with the next unsorted element (now the second element):
1 3 8 5 2 7 4 6
  –     –
1 2 8 5 3 7 4 6
  • We can repeat this over and over, until we have a sorted list.
  • This algorithm is called selection sort, and we might write pseudocode like this:
For i from 0 to n–1
    Find smallest item between i'th item and last item
    Swap smallest item with i'th item
  • With big O notation, we still have running time of O(n2), since we were looking at roughly all n elements to find the smallest, and making n passes to sort all the elements.
  • More formally, we can use some formulas to show that the biggest factor is indeed n2:
  • n + (n – 1) + (n – 2) + ... + 1
    n(n + 1)/2
    (n^2 + n)/2
    n^2/2 + n/2
    O(n^2)
  • So it turns out that selection sort is fundamentally about the same as bubble sort in running time:
    • O(n2)
      • bubble sort, selection sort
    • O(n log n)
    • O(n)
      • linear search
    • O(log n)
      • binary search
    • O(1)
  • The best case, Ω, is also n2.
  • We can go back to bubble sort and change its algorithm to be something like this, which will allow us to stop early if all the elements are sorted:
Repeat until no swaps
    For i from 0 to n–2
        If i'th and i+1'th elements out of order
            Swap them
  • Now, we only need to look at each element once, so the best case is now Ω(n):
    • Ω(n2)
      • selection sort
    • Ω(n log n)
    • Ω(n)
      • bubble sort
    • Ω(log n)
    • Ω(1)
      • linear search, binary search
  • We look at a visualization online comparing sorting algorithms with animations for how the elements move within arrays for both bubble sort and insertion sort.

Recursion

  • Recall that in week 0, we had pseudocode for finding a name in a phone book, where we had lines telling us to “go back” and repeat some steps:
1  Pick up phone book
2  Open to middle of phone book
3  Look at page
4  If Smith is on page
5      Call Mike
6  Else if Smith is earlier in book
7      Open to middle of left half of book
8      **Go back to line 3**
9  Else if Smith is later in book
10     Open to middle of right half of book
11     **Go back to line 3**
12 Else
13     Quit
  • We could instead just repeat our entire algorithm on the half of the book we have left:
1  Pick up phone book
2  Open to middle of phone book
3  Look at page
4  If Smith is on page
5      Call Mike
6  Else if Smith is earlier in book
7      **Search left half of book**
8
9  Else if Smith is later in book
10     **Search right half of book**
11
12 Else
13     Quit
  • This seems like a cyclical process that will never end, but we’re actually dividing the problem in half each time, and stopping once there’s no more book left.
  • Recursion occurs when a function or algorithm refers to itself, as in the new pseudocode above. In week 1, too, we implemented a “pyramid” of blocks in the following shape:
#
##
###
####
  • And we might have had iterative code like this:
#include <cs50.h>
#include <stdio.h>

void draw(int h);

int main(void)
{
    // Get height of pyramid
    int height = get_int("Height: ");

    // Draw pyramid
    draw(height);
}

void draw(int h)
{
    // Draw pyramid of height h
    for (int i = 1; i <= h; i++)
    {
        for (int j = 1; j <= i; j++)
        {
            printf("#");
        }
        printf("\n");
    }
}
  • Here, we use for loops to print each block in each row.

But notice that a pyramid of height 4 is actually a pyramid of height 3, with an extra row of 4 blocks added on. And a pyramid of height 3 is a pyramid of height 2, with an extra row of 3 blocks. A pyramid of height 2 is a pyramid of height 1, with an extra row of 2 blocks. And finally, a pyramid of height 1 is just a pyramid of height 0, or nothing, with another row of a single block added on. With this idea in mind, we can write:

#include <cs50.h>
#include <stdio.h>

void draw(int h);

int main(void)
{
    // Get height of pyramid
    int height = get_int("Height: ");

    // Draw pyramid
    draw(height);
}

void draw(int h)
{
    // If nothing to draw
    if (h == 0)
    {
        return;
    }

    // Draw pyramid of height h – 1
    draw(h – 1);

    // Draw one more row of width h
    for (int i = 0; i < h; i++)
    {
        printf("#");
    }
    printf("\n");
}
  • Now, our draw function first calls itself recursively, drawing a pyramid of height h – 1. But even before that, we need to stop if h is 0, since there won’t be anything left to drawn.
  • After, we draw the next row, or a row of width h.

Merge Sort

  • We can take the idea of recusion to sorting, with another algorithm called merge sort. The pseudocode might look like:
If only one item
  Return
Else
    Sort left half of items
    Sort right half of items
    Merge sorted halves
  • We’ll best see this in practice with an unsorted list:
7 4 5 2 6 3 8 1
  • First, we’ll sort the left half (the first four elements):
7 4 5 2 | 6 3 8 1
– – – –
  • Well, to sort that, we need to sort the left half of the left half first:
7 4 | 5 2 | 6 3 8 1
– –
  • Now, we have just one item, 7, in the left half, and one item, 4, in the right half. So we’ll merge that together, by taking the smallest item from each list first:
– – | 5 2 | 6 3 8 1
4 7
  • And now we go back to the right half of the left half, and sort it:
– – | – – | 6 3 8 1
4 7 | 2 5
  • Now, both halves of the left half are sorted, so we can merge the two of them together. We look at the start of each list, and take 2 since it’s smaller than 4. Then, we take 4, since it’s now the smallest item at the front of both lists. Then, we take 5, and finally, 7, to get:
– – – – | 6 3 8 1
– – – –
2 4 5 7
  • We now sort the right half the same way. First, the left half of the right half:
– – – – | – – | 8 1
– – – – | 3 6 |
2 4 5 7
  • Then, the right half of the right half:
– – – – | – – | – –
– – – – | 3 6 | 1 8
2 4 5 7
  • We can merge the right half together now:
– – – – | – – – –
– – – – | – – – –
2 4 5 7 | 1 3 6 8
  • And finally, we can merge both halves of the whole list, following the same steps as before. Notice that we don’t need to check all the elements of each half to find the smallest, since we know that each half is already sorted. Instead, we just take the smallest element of the two at the start of each half:
– – – – | – – – –
– – – – | – – – –
2 4 5 7 | – 3 6 8
1
– – – – | – – – –
– – – – | – – – –
– 4 5 7 | – 3 6 8
1 2
– – – – | – – – –
– – – – | – – – –
– 4 5 7 | – – 6 8
1 2 3
– – – – | – – – –
– – – – | – – – –
– – 5 7 | – – 6 8
1 2 3 4
– – – – | – – – –
– – – – | – – – –
– – – 7 | – – 6 8
1 2 3 4   5
– – – – | – – – –
– – – – | – – – –
– – – 7 | – – – 8
1 2 3 4   5 6
– – – – | – – – –
– – – – | – – – –
– – – – | – – – 8
1 2 3 4   5 6 7
– – – – | – – – –
– – – – | – – – –
– – – – | – – – –
1 2 3 4   5 6 7 8
  • It took a lot of steps, but it actually took fewer steps than the other algorithms we’ve seen so far. We broke our list in half each time, until we were “sorting” eight lists with one element each:
7 | 4 | 5 | 2 | 6 | 3 | 8 | 1
4   7 | 2   5 | 3   6 | 1   8
2   4   5   7 | 1   3   6   8
1   2   3   4   5   6   7   8
  • Since our algorithm divided the problem in half each time, its running time is logarithmic with O(log n). And after we sorted each half (or half of a half), we needed to merge together all the elements, with n steps since we had to look at each element once.
  • So our total running time is O(n log n):
    • O(n2)
      • bubble sort, selection sort
    • O(n log n)
      • merge sort
    • O(n)
      • linear search
    • O(log n)
      • binary search
    • O(1)
  • Since log n is greater than 1 but less than n, n log n is in between n (times 1) and n2.
  • The best case, Ω, is still n log n, since we still sort each half first and then merge them together:
    • Ω(n2)
      • selection sort
    • Ω(n log n)
      • merge sort
    • Ω(n)
      • bubble sort
    • Ω(log n)
    • Ω(1)
      • linear search, binary search
  • Finally, there is another notation, Θ, Theta, which we use to describe running times of algorithms if the upper bound and lower bound is the same. For example, merge sort has Θ(n log n) since the best and worst case both require the same number of steps. And selection sort has Θ(n2).
  • We look at a final visualization of sorting algorithms with a larger number of inputs, running at the same time.

习题个人解答

Plurality

这回变成填空游戏了,给代码框架,要求实现其中内容,以完成多数制选举的统计。

  • Let’s now take a look at plurality.c and read through the distribution code that’s been provided to you.
  • The line #define MAX 9 is some syntax used here to mean that MAX is a constant (equal to 9) that can be used throughout the program. Here, it represents the maximum number of candidates an election can have.
  • The file then defines a struct called a candidate. Each candidate has two fields: a string called name representing the candidate’s name, and an int called votes representing the number of votes the candidate has. Next, the file defines a global array of candidates, where each element is itself a candidate.
  • Now, take a look at the main function itself. See if you can find where the program sets a global variable candidate_count representing the number of candidates in the election, copies command-line arguments into the array candidates, and asks the user to type in the number of voters. Then, the program lets every voter type in a vote (see how?), calling the vote function on each candidate voted for. Finally, main makes a call to the print_winner function to print out the winner (or winners) of the election.
  • If you look further down in the file, though, you’ll notice that the vote and print_winner functions have been left blank. This part is up to you to complete!

Complete the implementation of plurality.c in such a way that the program simulates a plurality vote election.

  • Complete the vote function.
    • vote takes a single argument, a string called name, representing the name of the candidate who was voted for.
    • If name matches one of the names of the candidates in the election, then update that candidate’s vote total to account for the new vote. The vote function in this case should return true to indicate a successful ballot.
    • If name does not match the name of any of the candidates in the election, no vote totals should change, and the vote function should return false to indicate an invalid ballot.
    • You may assume that no two candidates will have the same name.
  • Complete the print_winner function.
    • The function should print out the name of the candidate who received the most votes in the election, and then print a newline.
    • It is possible that the election could end in a tie if multiple candidates each have the maximum number of votes. In that case, you should output the names of each of the winning candidates, each on a separate line.

You should not modify anything else in plurality.c other than the implementations of the vote and print_winner functions (and the inclusion of additional header files, if you’d like).

#include <cs50.h>
#include <stdio.h>
#include <string.h>

// Max number of candidates
#define MAX 9

// Candidates have name and vote count
typedef struct
{
    string name;
    int votes;
}
candidate;

// Array of candidates
candidate candidates[MAX];

// Number of candidates
int candidate_count;

// Function prototypes
bool vote(string name);
void print_winner(void);

int main(int argc, string argv[])
{
    // Check for invalid usage
    if (argc < 2)
    {
        printf("Usage: plurality [candidate ...]\n");
        return 1;
    }

    // Populate array of candidates
    candidate_count = argc - 1;
    if (candidate_count > MAX)
    {
        printf("Maximum number of candidates is %i\n", MAX);
        return 2;
    }
    for (int i = 0; i < candidate_count; i++)
    {
        candidates[i].name = argv[i + 1];
        candidates[i].votes = 0;
    }

    int voter_count = get_int("Number of voters: ");

    // Loop over all voters
    for (int i = 0; i < voter_count; i++)
    {
        string name = get_string("Vote: ");

        // Check for invalid vote
        if (!vote(name))
        {
            printf("Invalid vote.\n");
        }
    }

    // Display winner of election
    print_winner();
}

// Update vote totals given a new vote
bool vote(string name)
{
    for (int i = 0; i < candidate_count; i++)
    {
        if (strcmp(candidates[i].name, name) == 0)
        {
            candidates[i].votes++;
            return true;
        }
    }
    return false;
}

// Print the winner (or winners) of the election
void print_winner(void)
{
    int max = 0;
    for (int i = 1; i < candidate_count; i++)
    {
        if (candidates[i].votes > candidates[max].votes)
            max = i;
    }
    for (int i = 0; i < candidate_count; i++)
    {
        if (candidates[i].votes == candidates[max].votes)
            printf("%s\n", candidates[i].name);
    }
    return;
}

Runoff

依旧是选举,不过是排序复选制(Instant-Runoff Voting, IRV),即一次按志愿次序多投,从投票名次大者开始分配选票给候选人;当未有多数(majority)出现时,淘汰最少得票者,将其选票顺延至下一志愿;如此直到多数候选者出现。

  • Let’s open up runoff.c to take a look at what’s already there. We’re defining two constants: MAX_CANDIDATES for the maximum number of candidates in the election, and MAX_VOTERS for the maximum number of voters in the election.
  • Next up is a two-dimensional array preferences. The array preferences[i] will represent all of the preferences for voter number i, and the integer preferences[i][j] here will store the index of the candidate who is the jth preference for voter i.
  • Next up is a struct called candidate. Every candidate has a string field for their name, and int representing the number of votes they currently have, and a bool value called eliminated that indicates whether the candidate has been eliminated from the election. The array candidates will keep track of all of the candidates in the election.
  • The program also has two global variables: voter_count and candidate_count.
  • Now onto main. Notice that after determining the number of candidates and the number of voters, the main voting loop begins, giving every voter a chance to vote. As the voter enters their preferences, the vote function is called to keep track of all of the preferences. If at any point, the ballot is deemed to be invalid, the program exits.
  • Once all of the votes are in, another loop begins: this one’s going to keep looping through the runoff process of checking for a winner and eliminating the last place candidate until there is a winner.
  • The first call here is to a function called tabulate, which should look at all of the voters’ preferences and compute the current vote totals, by looking at each voter’s top choice candidate who hasn’t yet been eliminated. Next, the print_winner function should print out the winner if there is one; if there is, the program is over. But otherwise, the program needs to determine the fewest number of votes anyone still in the election received (via a call to find_min). If it turns out that everyone in the election is tied with the same number of votes (as determined by the is_tie function), the election is declared a tie; otherwise, the last-place candidate (or candidates) is eliminated from the election via a call to the eliminate function.
  • If you look a bit further down in the file, you’ll see that these functions — vote, tabulate, print_winner, find_min, is_tie, and eliminate — are all left to up to you to complete!

Complete the implementation of runoff.c in such a way that it simulates a runoff election. You should complete the implementations of the vote, tabulate, print_winner, find_min, is_tie, and eliminate functions, and you should not modify anything else in plurality.c (and the inclusion of additional header files, if you’d like).

Complete the vote function.

  • The function takes arguments voter, rank, and name. If name is a match for the name of a valid candidate, then you should update the global preferences array to indicate that the voter voter has that candidate as their rank preference (where 0 is the first preference, 1 is the second preference, etc.).
  • If the preference is successfully recorded, the function should return true; the function should return false otherwise (if, for instance, name is not the name of one of the candidates).
  • You may assume that no two candidates will have the same name.
  • Hints
    • Recall that candidate_count stores the number of candidates in the election.
    • Recall that you can use strcmp to compare two strings.
    • Recall that preferences[i][j] stores the index of the candidate who is the jth ranked preference for the ith voter.

Complete the tabulate function.

  • The function should update the number of votes each candidate has at this stage in the runoff.
  • Recall that at each stage in the runoff, every voter effectively votes for their top-preferred candidate who has not already been eliminated.
  • Hints
    • Recall that voter_count stores the number of voters in the election.
    • Recall that for a voter i, their top choice candidate is represented by preferences[i][0], their second choice candidate by preferences[i][1], etc.
    • Recall that the candidate struct has a field called eliminated, which will be true if the candidate has been eliminated from the election.
    • Recall that the candidate struct has a field called votes, which you’ll likely want to update for each voter’s preferred candidate.

Complete the print_winner function.

  • If any candidate has more than half of the vote, their name should be printed to stdout and the function should return true.
  • If nobody has won the election yet, the function should return false.
  • Hints
    • Recall that voter_count stores the number of voters in the election. Given that, how would you express the number of votes needed to win the election?

Complete the find_min function.

  • The function should return the minimum vote total for any candidate who is still in the election.
  • Hints
    • You’ll likely want to loop through the candidates to find the one who is both still in the election and has the fewest number of votes. What information should you keep track of as you loop through the candidates?

Complete the is_tie function.

  • The function takes an argument min, which will be the minimum number of votes that anyone in the election currently has.
  • The function should return true if every candidate remaining in the election has the same number of votes, and should return false otherwise.
  • Hints
    • Recall that a tie happens if every candidate still in the election has the same number of votes. Note, too, that the is_tie function takes an argument min, which is the smallest number of votes any candidate currently has. How might you use that information to determine if the election is a tie (or, conversely, not a tie)?

Complete the eliminate function.

  • The function takes an argument min, which will be the minimum number of votes that anyone in the election currently has.
  • The function should eliminate the candidate (or candidates) who have min number of votes.
#include <cs50.h>
#include <stdio.h>
#include <string.h>

// Max voters and candidates
#define MAX_VOTERS 100
#define MAX_CANDIDATES 9

// preferences[i][j] is jth preference for voter i
int preferences[MAX_VOTERS][MAX_CANDIDATES];

// Candidates have name, vote count, eliminated status
typedef struct
{
    string name;
    int votes;
    bool eliminated;
}
candidate;

// Array of candidates
candidate candidates[MAX_CANDIDATES];

// Numbers of voters and candidates
int voter_count;
int candidate_count;

// Function prototypes
bool vote(int voter, int rank, string name);
void tabulate(void);
bool print_winner(void);
int find_min(void);
bool is_tie(int min);
void eliminate(int min);

int main(int argc, string argv[])
{
    // Check for invalid usage
    if (argc < 2)
    {
        printf("Usage: runoff [candidate ...]\n");
        return 1;
    }

    // Populate array of candidates
    candidate_count = argc - 1;
    if (candidate_count > MAX_CANDIDATES)
    {
        printf("Maximum number of candidates is %i\n", MAX_CANDIDATES);
        return 2;
    }
    for (int i = 0; i < candidate_count; i++)
    {
        candidates[i].name = argv[i + 1];
        candidates[i].votes = 0;
        candidates[i].eliminated = false;
    }

    voter_count = get_int("Number of voters: ");
    if (voter_count > MAX_VOTERS)
    {
        printf("Maximum number of voters is %i\n", MAX_VOTERS);
        return 3;
    }

    // Keep querying for votes
    for (int i = 0; i < voter_count; i++)
    {

        // Query for each rank
        for (int j = 0; j < candidate_count; j++)
        {
            string name = get_string("Rank %i: ", j + 1);

            // Record vote, unless it's invalid
            if (!vote(i, j, name))
            {
                printf("Invalid vote.\n");
                return 4;
            }
        }

        printf("\n");
    }

    // Keep holding runoffs until winner exists
    while (true)
    {
        // Calculate votes given remaining candidates
        tabulate();

        // Check if election has been won
        bool won = print_winner();
        if (won)
        {
            break;
        }

        // Eliminate last-place candidates
        int min = find_min();
        bool tie = is_tie(min);

        // If tie, everyone wins
        if (tie)
        {
            for (int i = 0; i < candidate_count; i++)
            {
                if (!candidates[i].eliminated)
                {
                    printf("%s\n", candidates[i].name);
                }
            }
            break;
        }

        // Eliminate anyone with minimum number of votes
        eliminate(min);

        // Reset vote counts back to zero
        for (int i = 0; i < candidate_count; i++)
        {
            candidates[i].votes = 0;
        }
    }
    return 0;
}

// Record preference if vote is valid
bool vote(int voter, int rank, string name)
{
    for (int i = 0; i < candidate_count; i++)
    {
        if (strcmp(candidates[i].name, name) == 0)
        {
            preferences[voter][rank] = i;
            return true;
        }
    }
    return false;
}

// Tabulate votes for non-eliminated candidates
void tabulate(void)
{
    for (int i = 0; i < candidate_count; i++)
    {
        candidates[i].votes = 0;
    }
    for (int i = 0; i < voter_count; i++)
    {
        int elect;
        int chose = -1;
        do
        {
            chose++;
            elect = preferences[i][chose];
        } while (candidates[elect].eliminated);
        candidates[elect].votes++;
    }
    return;
}

// Print the winner of the election, if there is one
bool print_winner(void)
{
    int threshold = voter_count / 2 + 1;
    for (int i = 0; i < candidate_count; i++)
    {
        if (candidates[i].votes >= threshold)
        {
            printf("%s\n", candidates[i].name);
            return true;
        }
    }
    return false;
}

// Return the minimum number of votes any remaining candidate has
int find_min(void)
{
    int min = 0;
    for (int i = 1; i < candidate_count; i++)
        if (!candidates[i].eliminated && candidates[i].votes < candidates[min].votes)
            min = i;
    return candidates[min].votes;
}

// Return true if the election is tied between all candidates, false otherwise
bool is_tie(int min)
{
    for (int i = 0; i < candidate_count; i++)
        if (!candidates[i].eliminated && candidates[i].votes != min)
            return false;
    return true;
}

// Eliminate the candidate (or candidiates) in last place
void eliminate(int min)
{
    for (int i = 0; i < candidate_count; i++)
        if (candidates[i].votes == min)
            candidates[i].eliminated = true;
    return;
}

Tideman

一种新型的排序选举制,通过选举者两两组队分析胜负关系生成有向图,按照胜利程度(碾压还是险胜)次序在避免形成环的情况下添加边,若产生冲突则略去该边;最后,将有向图的源元素(全部为出度、无入度)作为胜利者输出。有个比较 tricky 的地方是如何判断环,可以用递归来实现。添加 A->B 后若能形成闭环,说明有 B 通过某些边与 A 连通;对于其中包含关系 C、D->A,则有 B 通过某些边与 C 或 D 连通……如此递归逆向寻找端点关系,最终返回查到与否;若能查到则有环,反之则无。

  • Let’s open up tideman.c to take a look at what’s already there.
  • First, notice the two-dimensional array preferences. The integer preferences[i][j] will represent the number of voters who prefer candidate i over candidate j.
  • The file also defines another two-dimensional array, called locked, which will represent the candidate graph. locked is a boolean array, so locked[i][j] being true represents the existence of an edge pointing from candidate i to candidate j; false means there is no edge. (If curious, this representation of a graph is known as an “adjacency matrix”).
  • Next up is a struct called pair, used to represent a pair of candidates: each pair includes the winner’s candidate index and the loser’s candidate index.
  • The candidates themselves are stored in the array candidates, which is an array of strings representing the names of each of the candidates. There’s also an array of pairs, which will represent all of the pairs of candidates (for which one is preferred over the other) in the election.
  • The program also has two global variables: pair_count and candidate_count, representing the number of pairs and number of candidates in the arrays pairs and candidates, respectively.
  • Now onto main. Notice that after determining the number of candidates, the program loops through the locked graph and initially sets all of the values to false, which means our initial graph will have no edges in it.
  • Next, the program loops over all of the voters and collects their preferences in an array called ranks (via a call to vote), where ranks[i] is the index of the candidate who is the ith preference for the voter. These ranks are passed into the record_preference function, whose job it is to take those ranks and update the global preferences variable.
  • Once all of the votes are in, the pairs of candidates are added to the pairs array via a called to add_pairs, sorted via a call to sort_pairs, and locked into the graph via a call to lock_pairs. Finally, print_winner is called to print out the name of the election’s winner!
  • Further down in the file, you’ll see that the functions vote, record_preference, add_pairs,sort_pairs, lock_pairs, and print_winner are left blank. That’s up to you!

Complete the implementation of tideman.c in such a way that it simulates a Tideman election.

  • Complete the vote function.
    • The function takes arguments rank, name, and ranks. If name is a match for the name of a valid candidate, then you should update the ranks array to indicate that the voter has the candidate as their rank preference (where 0 is the first preference, 1 is the second preference, etc.)
    • Recall that ranks[i] here represents the user’s ith preference.
    • The function should return true if the rank was successfully recorded, and false otherwise (if, for instance, name is not the name of one of the candidates).
    • You may assume that no two candidates will have the same name.
  • Complete the record_preferences function.
    • The function is called once for each voter, and takes as argument the ranks array, (recall that ranks[i] is the voter’s ith preference, where ranks[0] is the first preference).
    • The function should update the global preferences array to add the current voter’s preferences. Recall that preferences[i][j] should represent the number of voters who prefer candidate i over candidate j.
    • You may assume that every voter will rank each of the candidates.
  • Complete the add_pairs function.
    • The function should add all pairs of candidates where one candidate is preferred to the pairs array. A pair of candidates who are tied (one is not preferred over the other) should not be added to the array.
    • The function should update the global variable pair_count to be the number of pairs of candidates. (The pairs should thus all be stored between pairs[0] and pairs[pair_count - 1], inclusive).
  • Complete the sort_pairs function.
    • The function should sort the pairs array in decreasing order of strength of victory, where strength of victory is defined to be the number of voters who prefer the preferred candidate. If multiple pairs have the same strength of victory, you may assume that the order does not matter.
  • Complete the lock_pairs function.
    • The function should create the locked graph, adding all edges in decreasing order of victory strength so long as the edge would not create a cycle.
  • Complete the print_winner function.
    • The function should print out the name of the candidate who is the source of the graph. You may assume there will not be more than one source.

You should not modify anything else in tideman.c other than the implementations of the vote, record_preferences, add_pairs, sort_pairs, lock_pairs, and print_winner functions (and the inclusion of additional header files, if you’d like). You are permitted to add additional functions to tideman.c, so long as you do not change the declarations of any of the existing functions.

#include <cs50.h>
#include <stdio.h>
#include <string.h>

// Max number of candidates
#define MAX 9

// preferences[i][j] is number of voters who prefer i over j
int preferences[MAX][MAX];

// locked[i][j] means i is locked in over j
bool locked[MAX][MAX];

// Each pair has a winner, loser
typedef struct
{
    int winner;
    int loser;
}
pair;

// Array of candidates
string candidates[MAX];
pair pairs[MAX * (MAX - 1) / 2];

int pair_count;
int candidate_count;

// Function prototypes
bool vote(int rank, string name, int ranks[]);
void record_preferences(int ranks[]);
void add_pairs(void);
void sort_pairs(void);
void lock_pairs(void);
void print_winner(void);
bool check_cycle(int s, int f);

int main(int argc, string argv[])
{
    // Check for invalid usage
    if (argc < 2)
    {
        printf("Usage: tideman [candidate ...]\n");
        return 1;
    }

    // Populate array of candidates
    candidate_count = argc - 1;
    if (candidate_count > MAX)
    {
        printf("Maximum number of candidates is %i\n", MAX);
        return 2;
    }
    for (int i = 0; i < candidate_count; i++)
    {
        candidates[i] = argv[i + 1];
    }

    // Clear graph of locked in pairs
    for (int i = 0; i < candidate_count; i++)
    {
        for (int j = 0; j < candidate_count; j++)
        {
            locked[i][j] = false;
        }
    }

    pair_count = 0;
    int voter_count = get_int("Number of voters: ");

    // Query for votes
    for (int i = 0; i < voter_count; i++)
    {
        // ranks[i] is voter's ith preference
        int ranks[candidate_count];

        // Query for each rank
        for (int j = 0; j < candidate_count; j++)
        {
            string name = get_string("Rank %i: ", j + 1);

            if (!vote(j, name, ranks))
            {
                printf("Invalid vote.\n");
                return 3;
            }
        }

        record_preferences(ranks);

        printf("\n");
    }

    add_pairs();
    sort_pairs();
    lock_pairs();
    print_winner();
    return 0;
}

// Update ranks given a new vote
bool vote(int rank, string name, int ranks[])
{
    for (int i = 0; i < candidate_count; i++)
    {
        if (strcmp(candidates[i], name) == 0)
        {
            ranks[rank] = i;
            return true;
        }
    }
    return false;
}

// Update preferences given one voter's ranks
void record_preferences(int ranks[])
{
    // Skip self-comparison
    for (int i = 0; i < candidate_count; i++)
        for (int j = i + 1; j < candidate_count; j++)
            preferences[ranks[i]][ranks[j]]++;
    return;
}

// Record pairs of candidates where one is preferred over the other
void add_pairs(void)
{
    for (int i = 0; i < candidate_count; i++)
        for (int j = 0; j < candidate_count; j++)
            if (preferences[i][j] > preferences[j][i])
            {
                pairs[pair_count].winner = i;
                pairs[pair_count].loser = j;
                pair_count++;
            }
    return;
}

// Sort pairs in decreasing order by strength of victory
void sort_pairs(void)
{
    int max;
    for (int i = 0; i < pair_count; i++)
    {
        max = i;
        for (int j = i + 1; j < pair_count; j++)
            if (preferences[pairs[j].winner][pairs[j].loser] > preferences[pairs[max].winner][pairs[max].loser])
                max = j;
        pair temp = pairs[i];
        pairs[i] = pairs[max];
        pairs[max] = temp;
    }
    return;
}

// Lock pairs into the candidate graph in order, without creating cycles
void lock_pairs(void)
{
    for (int i = 0; i < pair_count; i++)
    {
        int a = pairs[i].winner;
        int b = pairs[i].loser;
        if (check_cycle(a, b))
            continue;
        else
            locked[a][b] = true;
    }
    return;
}

// Print the winner of the election
void print_winner(void)
{
    bool indegree;
    for (int i = 0; i < candidate_count; i++)
    {
        indegree = false;
        for (int j = 0; j < candidate_count; j++)
        {
            if (locked[j][i])
            {
                indegree = true;
                break;
            }
        }
        if (indegree)
            continue;
        else
        {
            printf("%s\n", candidates[i]);
            return;
        }
    }
    return;
}

bool check_cycle(int s, int f)
{
    // Try to find f->s
    // Directly success as base condition
    if (locked[f][s])
        return true;
    // Reduce f->s by finding i->s and recursive to find f->i
    for (int i = 0; i < candidate_count; i++)
    {
        if (locked[i][s])
            if (check_cycle(i, f))
                return true;
    }
    return false;
}