CS50 (Computer Science 50) is an on-campus and online introductory course on computer science taught at Harvard University and, as of 2015, Yale University as well. In 2016, CS50 became available to high school students as an AP course. The course material is available online for free on EdX with a range of certificates available for a fee. The on-campus version is Harvard’s largest class with 800 students, 102 staff and up to 2,200 participants in their regular hackathons.

CS50, Wikipedia

* 此课程笔记根据 Harvard CS50 Spring 2020 完成。

第 4 周揭开了 CS50 库中字符串的本质是字符数组,传参时则是字符指针的秘密。介绍了十六进制、指针、字符串、Valgrind 检测内存泄漏、函数传值与传指针的区别,以及文件读写的内容。大部分内容引用自 CS50 课程官方的笔记

Hexadecimal

  • In week 0, we learned binary, a counting system with 0s and 1s.
  • In week 2, we talked about memory and how each byte has an address, or identifier, so we can refer to where our variables are actually stored.
  • It turns out that, by convention, the addresses for memory use the counting system hexadecimal, where there are 16 digits, 0-9 and A-F.
  • Recall that, in binary, each digit stood for a power of 2:
    128 64 32 16 8 4 2 1
    1 1 1 1 1 1 1 1
    • With 8 bits, we can count up to 255.
  • It turns out that, in hexadecimal, we can perfectly count up to 8 binary bits with just 2 digits:
    16^1 16^0
    F F
    • Here, the F is a value of 15 in decimal, and each place is a power of 16, so the first F is 16^1 * 15 = 240, plus the second F with the value of 16^0 * 15 = 15, for a total of 255.
  • And 0A is the same as 10 in decimal, and 0F the same as 15. 10 in hexadecimal would be 16, and we would say it as “one zero in hexadecimal” instead of “ten”, if we wanted to avoid confusion.
  • The RGB color system also conventionally uses hexadecimal to describe the amount of each color. For example, 000000 in hexadecimal means 0 of each red, green, and blue, for a color of black. And FF0000 would be 255, or the highest possible, amount of red. With different values for each color, we can represent millions of different colors.
  • In writing, we can also indicate a value is in hexadecimal by prefixing it with 0x, as in 0x10, where the value is equal to 16 in decimal, as opposed to 10.

Pointers

  • We might create a value n, and print it out:
#include <stdio.h>

int main(void)
{
    int n = 50;
    printf("%i\n", n);
}
  • In our computer’s memory, there are now 4 bytes somewhere that have the binary value of 50, labeled n:
    grid representing bytes, with four boxes together containing 50 with small n underneath
  • It turns out that, with the billions of bytes in memory, those bytes for the variable n starts at some unique address that might look like 0x12345678.
  • In C, we can actually see the address with the & operator, which means “get the address of this variable”:
#include <stdio.h>

int main(void)
{
    int n = 50;
    printf("%p\n", &n);
}
  • And in the CS50 IDE, we might see an address like 0x7ffe00b3adbc, where this is a specific location in the server’s memory.
  • The address of a variable is called a pointer, which we can think of as a value that “points” to a location in memory. The * operator lets us “go to” the location that a pointer is pointing to.
  • For example, we can print *&n, where we “go to” the address of n, and that will print out the value of n, 50, since that’s the value at the address of n:
#include <stdio.h>

int main(void)
{
    int n = 50;
    printf("%i\n", *&n);
}
  • We also have to use the * operator (in an unfortunately confusing way) to declare a variable that we want to be a pointer:
#include <stdio.h>

int main(void)
{
   int n = 50;
   int *p = &n;
   printf("%p\n", p);
}
  • Here, we use int *p to declare a variable, p, that has the type of *, a pointer, to a value of type int, an integer. Then, we can print its value (something like 0x12345678), or print the value at its location with printf("%i\n", *p);.
  • In our computer’s memory, the variables might look like this:
    grid representing bytes, with four boxes together containing 50 with small 0x12345678 underneath, and eight boxes together containing 0x12345678 with small p underneath
    • We have a pointer, p, with the address of some variable.
  • We can abstract away the actual value of the addresses now, since they’ll be different as we declare variables in our programs, and simply think of p as “pointing at” some value:
    one box containing p pointing at smaller box containing 50
  • Let’s say we have a mailbox labeled “123”, with the number “50” inside it. The mailbox would be int n, since it stores an integer. We might have another mailbox with the address “456”, inside of which is the value “123”, which is the address of our other mailbox. This would be int *p, since it’s a pointer to an integer.
  • With the ability to use pointers, we can create different data structures, or different ways to organize data in memory that we’ll see next week.
  • Many modern computer systems are “64-bit”, meaning that they use 64 bits to address memory, so a pointer will be 8 bytes, twice as big as an integer of 4 bytes.

string

  • We might have a variable string s for a name like EMMA, and be able to access each character with s[0] and so on:
    boxes side by side, containing: E labeled s[0], M labeled s[1], M labeled s[2], A labeled s[3], \0 labeled s[4]
  • But it turns out that each character is in stored in memory at a byte with some address, and s is actually just a pointer with the address of the first character:
    box containing 0x123 labeled s, boxes side by side containing E labeled 0x123, M labeled 0x124, M labeled 0x125, A labeled 0x126, \0 labeled 0x127
  • And since s is just a pointer to the beginning, only the \0 indicates the end of the string.
  • In fact, the CS50 Library defines a string with typedef char *string, which just says that we want to name a new type, string, as a char *, or a pointer to a character.
  • Let’s print out a string:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    string s = "EMMA";
    printf("%s\n", s);
}
  • This is familiar, but we can just say:
#include <stdio.h>

int main(void)
{
    char *s = "EMMA";
    printf("%s\n", s);
}
  • This will also print EMMA.
  • With printf("%p\n", s);, we can print s as its value as a pointer, like 0x42ab52. (printf knows to go to the address and print the entire string when we use %s and pass in s, even though s only points to the first character.)
  • We can also try printf("%p\n", &s[0]);, which is the address of the first character of s, and it’s exactly the same as printing s. And printing &s[1], &s[2], and &s[3] gets us the addresses that are the next characters in memory after &s[0], like 0x42ab53, 0x42ab54, and 0x42ab55, exactly one byte after another.
  • And finally, if we try to printf("%c\n", *s);, we get a single character E, since we’re going to the address contained in s, which has the first character in the string.
  • In fact, s[0], s[1], and s[2] actually map directly to *s, *(s+1), and *(s+2), since each of the next characters are just at the address of the next byte.

Compare and Copy

  • Let’s look at compare0:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    // Get two integers
    int i = get_int("i: ");
    int j = get_int("j: ");

    // Compare integers
    if (i == j)
    {
        printf("Same\n");
    }
    else
    {
        printf("Different\n");
    }
}
  • We can compile and run this, and our program works as we’d expect, with the same values of the two integers giving us “Same” and different values “Different”.
  • In compare1, we see that the same string values are causing our program to print “Different”:
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    // Get two strings
    string s = get_string("s: ");
    string t = get_string("t: ");

    // Compare strings' addresses
    if (s == t)
    {
        printf("Same\n");
    }
    else
    {
        printf("Different\n");
    }
}
  • Given what we now know about strings, this makes sense because each “string” variable is pointing to a different location in memory, where the first character of each string is stored. So even if the values of the strings are the same, this will always print “Different”.
  • For example, our first string might be at address 0x123, our second might be at 0x456, and s will be 0x123 and t will be 0x456, so those values will be different.
  • And get_string, this whole time, has been returning just a char *, or a pointer to the first character of a string from the user.
  • Now let’s try to copy a string:
#include <cs50.h>
#include <ctype.h>
#include <stdio.h>

int main(void)
{
    string s = get_string("s: ");

    string t = s;

    t[0] = toupper(t[0]);

    // Print string twice
    printf("s: %s\n", s);
    printf("t: %s\n", t);
}
  • We get a string s, and copy the value of s into t. Then, we capitalize the first letter in t.
  • But when we run our program, we see that both s and t are now capitalized.
  • Since we set s and t to the same values, they’re actually pointers to the same character, and so we capitalized the same character!
  • To actually make a copy of a string, we have to do a little more work:
#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    char *s = get_string("s: ");

    char *t = malloc(strlen(s) + 1);

    for (int i = 0, n = strlen(s); i < n + 1; i++)
    {
        t[i] = s[i];
    }

    t[0] = toupper(t[0]);

    printf("s: %s\n", s);
    printf("t: %s\n", t);
}
  • We create a new variable, t, of the type char *, with char *t. Now, we want to point it to a new chunk of memory that’s large enough to store the copy of the string. With malloc, we can allocate some number of bytes in memory (that aren’t already used to store other values), and we pass in the number of bytes we’d like. We already know the length of s, so we add 1 to that for the terminating null character. So, our final line of code is char *t = malloc(strlen(s) + 1);.
  • Then, we copy each character, one at a time, and now we can capitalize just the first letter of t. And we use i < n + 1, since we actually want to go up to n, to ensure we copy the terminating character in the string.
  • We can actually also use the strcpy library function with strcpy(t, s) instead of our loop, to copy the string s into t. To be clear, the concept of a “string” is from the C language and well-supported; the only training wheels from CS50 are the type string instead of char *, and the get_string function.
  • If we didn’t copy the null terminating character, \0, and tried to print out our string t, printf will continue and print out the unknown, or garbage, values that we have in memory, until it happens to reach a \0, or crashes entirely, since our program might end up trying to read memory that doesn’t belong to it!

valgrind

  • It turns out that, after we’re done with memory that we’ve allocated with malloc, we should call free (as in free(t)), which tells our computer that those bytes are no longer useful to our program, so those bytes in memory can be reused again.
  • If we kept running our program and allocating memory with malloc, but never freed the memory after we were done using it, we would have a memory leak, which will slow down our computer and use up more and more memory until our computer runs out.
  • valgrind is a command-line tool that we can use to run our program and see if it has any memory leaks. We can run valgrind on our program above with help50 valgrind ./copy and see, from the error message, that line 10, we allocated memory that we never freed (or “lost”).
  • So at the end, we can add a line free(t), which won’t change how our program runs, but no errors from valgrind.
  • Let’s take a look at memory.c:
// http://valgrind.org/docs/manual/quick-start.html#quick-start.prepare

#include <stdlib.h>

void f(void)
{
    int *x = malloc(10 * sizeof(int));
    x[10] = 0;
}

int main(void)
{
    f();
    return 0;
}
  • This is an example from valgrind’s documentation (valgrind is a real tool, while help50 was written specifically to help us in this course).
    • The function f allocates enough memory for 10 integers, and stores the address in a pointer called x. Then we try to set the 11th value of x with x[10] to 0, which goes past the array of memory we’ve allocated for our program. This is called buffer overflow, where we go past the boundaries of our buffer, or array, and into unknown memory.
  • valgrind will also tell us there’s an “Invalid write of size 4” for line 8, where we are indeed trying to change the value of an integer (of size 4 bytes).
  • And this whole time, the CS50 Library has been freeing memory it’s allocated in get_string, when our program finishes!

Swap

  • We have two colored drinks, purple and green, each of which is in a cup. We want to swap the drinks between the two cups, but we can’t do that without a third cup to pour one of the drink into first.
  • Now, let’s say we wanted to swap the values of two integers.
void swap(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;
}
  • With a third variable to use as temporary storage space, we can do this pretty easily, by putting a into tmp, and then b to a, and finally the original value of a, now in tmp, into b.
  • But, if we tried to use that function in a program, we don’t see any changes:
#include <stdio.h>

void swap(int a, int b);

int main(void)
{
    int x = 1;
    int y = 2;

    printf("x is %i, y is %i\n", x, y);
    swap(x, y);
    printf("x is %i, y is %i\n", x, y);
}

void swap(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;
}
  • It turns out that the swap function gets its own variables, a and b when they are passed in, that are copies of x and y, and so changing those values don’t change x and y in the main function.

Memory Layout

  • Within our computer’s memory, the different types of data that need to be stored for our program are organized into different sections:
    Grid with sections, from top to bottom: machine code, globals, heap (with arrow pointing downward), stack (with arrow pointing upward)
    • The machine code section is our compiled program’s binary code. When we run our program, that code is loaded into the “top” of memory.
    • Globals are global variables we declare in our program or other shared variables that our entire program can access.
    • The heap section is an empty area where malloc can get free memory from, for our program to use.
    • The stack section is used by functions in our program as they are called. For example, our main function is at the very bottom of the stack, and has the local variables x and y. The swap function, when it’s called, has its own frame, or slice, of memory that’s on top of main’s, with the local variables a, b, and tmp:
      Stack section with (a, b, tmp) above (x, y)
      • Once the function swap returns, the memory it was using is freed for the next function call, and we lose anything we did, other than the return values, and our program goes back to the function that called swap.
      • So by passing in the addresses of x and y from main to swap, we can actually change the values of x and y: Stack section with (a, b, tmp) above (x, y), and a pointing to x and b pointing to y
  • By passing in the address of x and y, our swap function can actually work:
#include <stdio.h>

void swap(int *a, int *b);

int main(void)
{
    int x = 1;
    int y = 2;

    printf("x is %i, y is %i\n", x, y);
    swap(&x, &y);
    printf("x is %i, y is %i\n", x, y);
}

void swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
  • The addresses of x and y are passed in from main to swap, and we use the int *a syntax to declare that our swap function takes in pointers. We save the value of x to tmp by following the pointer a, and then take the value of y by following the pointer b, and store that to the location a is pointing to (x). Finally, we store the value of tmp to the location pointed to by b (y), and we’re done.
  • If we call malloc too many times, we will have a heap overflow, where we end up going past our heap. Or, if we have too many functions being called, we will have a stack overflow, where our stack has too many frames of memory allocated as well. And these two types of overflow are generally known as buffer overflows, after which our program (or entire computer) might crash.

get_int

  • We can implement get_int ourselves with a C library function, scanf:
#include <stdio.h>

int main(void)
{
    int x;
    printf("x: ");
    scanf("%i", &x);
    printf("x: %i\n", x);
}
  • scanf takes a format, %i, so the input is “scanned” for that format, and the address in memory where we want that input to go. But scanf doesn’t have much error checking, so we might not get an integer.
  • We can try to get a string the same way:
#include <stdio.h>

int main(void)
{
    char *s = NULL;
    printf("s: ");
    scanf("%s", s);
    printf("s: %s\n", s);
}
  • But we haven’t actually allocated any memory for s (s is NULL, or not pointing to anything), so we might want to call char s[5] to allocate an array of 5 characters for our string. Then, s will be treated as a pointer in scanf and printf.
    • Now, if the user types in a string of length 4 or less, our program will work safely. But if the user types in a longer string, scanf might be trying to write past the end of our array into unknown memory, causing our program to crash.

Files

  • With the ability to use pointers, we can also open files:
#include <cs50.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    // Open file
    FILE *file = fopen("phonebook.csv", "a");

    // Get strings from user
    char *name = get_string("Name: ");
    char *number = get_string("Number: ");

    // Print (write) strings to file
    fprintf(file, "%s,%s\n", name, number);

    // Close file
    fclose(file);
}
  • fopen is a new function we can use to open a file. It will return a pointer to a new type, FILE, that we can read from and write to. The first argument is the name of the file, and the second argument is the mode we want to open the file in (r for read, w for write, and a for append, or adding to).
    • After we get some strings, we can use fprintf to print to a file.
    • Finally, we close the file with fclose.
  • Now we can create our own CSV files, files of comma-separated values (like a mini-spreadsheet), programmatically.

JPEG

  • We can also write a program that opens a file and tells us if it’s a JPEG (image) file:
#include <stdio.h>

int main(int argc, char *argv[])
{
    // Check usage
    if (argc != 2)
    {
        return 1;
    }

    // Open file
    FILE *file = fopen(argv[1], "r");
    if (!file)
    {
        return 1;
    }

    // Read first three bytes
    unsigned char bytes[3];
    fread(bytes, 3, 1, file);

    // Check first three bytes
    if (bytes[0] == 0xff && bytes[1] == 0xd8 && bytes[2] == 0xff)
    {
        printf("Maybe\n");
    }
    else
    {
        printf("No\n");
    }

    // Close file
    fclose(file);
}
  • Now, if we run this program with ./jpeg brian.jpg, our program will try to open the file we specify (checking that we indeed get a non-NULL file back), and read the first three bytes from the file with fread.
    • We can compare the first three bytes (in hexadecimal) to the three bytes required to begin a JPEG file. If they’re the same, then our file is likely to be a JPEG file (though, other types of files may still begin with those bytes). But if they’re not the same, we know it’s definitely not a JPEG file.
  • We can use these abilities to read and write files, in particular images, and modify them by changing the bytes in them, in this week’s problem set!

习题个人解答

Filter

该题分为 lessmore。二者均包含灰阶、镜面翻转与模糊的图像处理;less 中有褐色(sepia)滤镜,而 more 中则更换为基于 Sobel 算子的边缘处理。

bmp.h

本文件实现了 bmp DIB 的文件头,具体字段功能可以参阅这篇:DIBs and Their Use。__attribute__((__packed__)) 告知编译器在编译过程中取消字节对齐的优化,按照实际实现排列字段。

// BMP-related data types based on Microsoft's own

#include <stdint.h>

/**
 * Common Data Types
 *
 * The data types in this section are essentially aliases for C/C++
 * primitive data types.
 *
 * Adapted from http://msdn.microsoft.com/en-us/library/cc230309.aspx.
 * See http://en.wikipedia.org/wiki/Stdint.h for more on stdint.h.
 */
typedef uint8_t  BYTE;
typedef uint32_t DWORD;
typedef int32_t  LONG;
typedef uint16_t WORD;

/**
 * BITMAPFILEHEADER
 *
 * The BITMAPFILEHEADER structure contains information about the type, size,
 * and layout of a file that contains a DIB [device-independent bitmap].
 *
 * Adapted from http://msdn.microsoft.com/en-us/library/dd183374(VS.85).aspx.
 */
typedef struct
{
    WORD   bfType;
    DWORD  bfSize;
    WORD   bfReserved1;
    WORD   bfReserved2;
    DWORD  bfOffBits;
} __attribute__((__packed__))
BITMAPFILEHEADER;

/**
 * BITMAPINFOHEADER
 *
 * The BITMAPINFOHEADER structure contains information about the
 * dimensions and color format of a DIB [device-independent bitmap].
 *
 * Adapted from http://msdn.microsoft.com/en-us/library/dd183376(VS.85).aspx.
 */
typedef struct
{
    DWORD  biSize;
    LONG   biWidth;
    LONG   biHeight;
    WORD   biPlanes;
    WORD   biBitCount;
    DWORD  biCompression;
    DWORD  biSizeImage;
    LONG   biXPelsPerMeter;
    LONG   biYPelsPerMeter;
    DWORD  biClrUsed;
    DWORD  biClrImportant;
} __attribute__((__packed__))
BITMAPINFOHEADER;

/**
 * RGBTRIPLE
 *
 * This structure describes a color consisting of relative intensities of
 * red, green, and blue.
 *
 * Adapted from http://msdn.microsoft.com/en-us/library/aa922590.aspx.
 */
typedef struct
{
    BYTE  rgbtBlue;
    BYTE  rgbtGreen;
    BYTE  rgbtRed;
} __attribute__((__packed__))
RGBTRIPLE;

helpers.h

定义了滤镜函数原型。

#include "bmp.h"

// Convert image to grayscale
void grayscale(int height, int width, RGBTRIPLE image[height][width]);

// Convert image to sepia
void sepia(int height, int width, RGBTRIPLE image[height][width]);

// Reflect image horizontally
void reflect(int height, int width, RGBTRIPLE image[height][width]);

// Detect edges
void edges(int height, int width, RGBTRIPLE image[height][width]);

// Blur image
void blur(int height, int width, RGBTRIPLE image[height][width]);

filter.c

给出了打开与写入 BMP 的整个流程,值得一读。亮点是使用 POSIX 的 getopt() 读取命令行参数,并且有技巧地使用了指向数组的指针进行二维数组的空间分配。fread 和 fseek 算是 C 语言文件读写比较常见的函数了,没有接触过的可以了解一下。

#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>

#include "helpers.h"

int main(int argc, char *argv[])
{

    // Define allowable filters
    char *filters = "begrs";

    // Get filter flag and check validity
    char filter = getopt(argc, argv, filters);
    if (filter == '?')
    {
        fprintf(stderr, "Invalid filter.\n");
        return 1;
    }

    // Ensure only one filter
    if (getopt(argc, argv, filters) != -1)
    {
        fprintf(stderr, "Only one filter allowed.\n");
        return 2;
    }

    // Ensure proper usage
    if (argc != optind + 2)
    {
        fprintf(stderr, "Usage: filter [flag] infile outfile\n");
        return 3;
    }

    // Remember filenames
    char *infile = argv[optind];
    char *outfile = argv[optind + 1];

    // Open input file
    FILE *inptr = fopen(infile, "r");
    if (inptr == NULL)
    {
        fprintf(stderr, "Could not open %s.\n", infile);
        return 4;
    }

    // Open output file
    FILE *outptr = fopen(outfile, "w");
    if (outptr == NULL)
    {
        fclose(inptr);
        fprintf(stderr, "Could not create %s.\n", outfile);
        return 5;
    }

    // Read infile's BITMAPFILEHEADER
    BITMAPFILEHEADER bf;
    fread(&bf, sizeof(BITMAPFILEHEADER), 1, inptr);

    // Read infile's BITMAPINFOHEADER
    BITMAPINFOHEADER bi;
    fread(&bi, sizeof(BITMAPINFOHEADER), 1, inptr);

    // Ensure infile is (likely) a 24-bit uncompressed BMP 4.0
    if (bf.bfType != 0x4d42 || bf.bfOffBits != 54 || bi.biSize != 40 ||
        bi.biBitCount != 24 || bi.biCompression != 0)
    {
        fclose(outptr);
        fclose(inptr);
        fprintf(stderr, "Unsupported file format.\n");
        return 6;
    }

    int height = abs(bi.biHeight);
    int width = bi.biWidth;

    // Allocate memory for image
    RGBTRIPLE(*image)[width] = calloc(height, width * sizeof(RGBTRIPLE));
    if (image == NULL)
    {
        fprintf(stderr, "Not enough memory to store image.\n");
        fclose(outptr);
        fclose(inptr);
        return 7;
    }

    // Determine padding for scanlines
    int padding = (4 - (width * sizeof(RGBTRIPLE)) % 4) % 4;

    // Iterate over infile's scanlines
    for (int i = 0; i < height; i++)
    {
        // Read row into pixel array
        fread(image[i], sizeof(RGBTRIPLE), width, inptr);

        // Skip over padding
        fseek(inptr, padding, SEEK_CUR);
    }

    // Filter image
    switch (filter)
    {
        // Blur
        case 'b':
            blur(height, width, image);
            break;

        // Edges
        case 'e':
            edges(height, width, image);
            break;

        // Grayscale
        case 'g':
            grayscale(height, width, image);
            break;

        // Reflection
        case 'r':
            reflect(height, width, image);
            break;

        // Sepia
        case 's':
            sepia(height, width, image);
            break;
    }

    // Write outfile's BITMAPFILEHEADER
    fwrite(&bf, sizeof(BITMAPFILEHEADER), 1, outptr);

    // Write outfile's BITMAPINFOHEADER
    fwrite(&bi, sizeof(BITMAPINFOHEADER), 1, outptr);

    // Write new pixels to outfile
    for (int i = 0; i < height; i++)
    {
        // Write row to outfile
        fwrite(image[i], sizeof(RGBTRIPLE), width, outptr);

        // Write padding at end of row
        for (int k = 0; k < padding; k++)
        {
            fputc(0x00, outptr);
        }
    }

    // Free memory for image
    free(image);

    // Close infile
    fclose(inptr);

    // Close outfile
    fclose(outptr);

    return 0;
}

helper.c

需要学生实现的滤镜及卷积核主体部分,感觉写得稍微有些笨拙了。另外今天才知道可以通过 CS50X 的渠道提交,试着交了一下这次的 more 和上一章的 tideman,全部都通过了。

#include "helpers.h"
#include <stdlib.h>
#include <math.h>

// Convert image to grayscale
void grayscale(int height, int width, RGBTRIPLE image[height][width])
{
    for (int i = 0; i < height; i++)
    {
        for (int j = 0; j < width; j++)
        {
            int r = image[i][j].rgbtRed;
            int g = image[i][j].rgbtGreen;
            int b = image[i][j].rgbtBlue;
            int avg = round((r + g + b) / 3.0);
            image[i][j].rgbtRed = image[i][j].rgbtGreen = image[i][j].rgbtBlue = (BYTE) avg;
        }
    }
    return;
}

// Convert image to sepia
void sepia(int height, int width, RGBTRIPLE image[height][width])
{
    for (int i = 0; i < height; i++)
    {
        for (int j = 0; j < width; j++)
        {
            int r = image[i][j].rgbtRed;
            int g = image[i][j].rgbtGreen;
            int b = image[i][j].rgbtBlue;
            double sepia_r = round(0.393 * r + 0.769 * g + 0.189 * b);
            double sepia_g = round(0.349 * r + 0.686 * g + 0.168 * b);
            double sepia_b = round(0.272 * r + 0.534 * g + 0.131 * b);
            sepia_r = (sepia_r > 255) ? 255 : sepia_r;
            sepia_g = (sepia_g > 255) ? 255 : sepia_g;
            sepia_b = (sepia_b > 255) ? 255 : sepia_b;
            image[i][j].rgbtRed = (BYTE) sepia_r;
            image[i][j].rgbtGreen = (BYTE) sepia_g;
            image[i][j].rgbtBlue = (BYTE) sepia_b;
        }
    }
    return;
}

// Reflect image horizontally
void reflect(int height, int width, RGBTRIPLE image[height][width])
{
    for (int i = 0; i < height; i++)
    {
        for (int j = 0; j < width / 2; j++)
        {
            RGBTRIPLE * temp = malloc(sizeof(RGBTRIPLE));
            *temp = image[i][width - 1 - j];
            image[i][width - 1 - j] = image[i][j];
            image[i][j] = *temp;
            free(temp);
        }
    }
    return;
}

// Blur image
void blur(int height, int width, RGBTRIPLE image[height][width])
{
    RGBTRIPLE (*newimage)[width] = calloc(height, width * sizeof(RGBTRIPLE));
    int dxy[3] = {-1, 0, 1};
    for (int i = 0; i < height; i++)
    {
        for (int j = 0; j < width; j++)
        {
            double blocks;
            int r = 0, g = 0, b = 0;
            for (int dx = 0; dx < 3; dx++)
            {
                int nx = i + dxy[dx];
                if (nx < 0 || nx >= height)
                    continue;
                for (int dy = 0; dy < 3; dy++)
                {
                    int ny = j + dxy[dy];
                    if (ny < 0 || ny >= width)
                        continue;
                    r += image[nx][ny].rgbtRed;
                    g += image[nx][ny].rgbtGreen;
                    b += image[nx][ny].rgbtBlue;
                }
            }
            if ((i == 0 || i == height - 1) && (j == 0 || j == width - 1))
                blocks = 4;
            else if (i == 0 || i == height - 1 || j == 0 || j == width - 1)
                blocks = 6;
            else
                blocks = 9;
            r = round(r / blocks);
            g = round(g / blocks);
            b = round(b / blocks);
            newimage[i][j].rgbtRed = (BYTE) r;
            newimage[i][j].rgbtGreen = (BYTE) g;
            newimage[i][j].rgbtBlue = (BYTE) b;
        }
    }
    for (int i = 0; i < height; i++)
        for (int j = 0; j < width; j++)
            image[i][j] = newimage[i][j];
    free(newimage);
    return;
}

// Detect edges
void edges(int height, int width, RGBTRIPLE image[height][width])
{
    RGBTRIPLE (*newimage)[width] = calloc(height, width * sizeof(RGBTRIPLE));
    int Gx[3][3] = {{-1,0,1},{-2,0,2},{-1,0,1}};
    int Gy[3][3] = {{-1,-2,-1},{0,0,0},{1,2,1}};
    int dxy[3] = {-1, 0, 1};
    for (int i = 0; i < height; i++)
    {
        for (int j = 0; j < width; j++)
        {
            int rx = 0, gx = 0, bx = 0;
            int ry = 0, gy = 0, by = 0;
            for (int dx = 0; dx < 3; dx++)
            {
                int nx = i + dxy[dx];
                if (nx < 0 || nx >= height)
                    continue;
                for (int dy = 0; dy < 3; dy++)
                {
                    int ny = j + dxy[dy];
                    if (ny < 0 || ny >= width)
                        continue;
                    rx += image[nx][ny].rgbtRed * Gx[dx][dy];
                    gx += image[nx][ny].rgbtGreen * Gx[dx][dy];
                    bx += image[nx][ny].rgbtBlue * Gx[dx][dy];
                    ry += image[nx][ny].rgbtRed * Gy[dx][dy];
                    gy += image[nx][ny].rgbtGreen * Gy[dx][dy];
                    by += image[nx][ny].rgbtBlue * Gy[dx][dy];
                }
            }
            int rsqrt = round(sqrt(rx * rx + ry * ry));
            int gsqrt = round(sqrt(gx * gx + gy * gy));
            int bsqrt = round(sqrt(bx * bx + by * by));
            rsqrt = (rsqrt > 255) ? 255 : rsqrt;
            gsqrt = (gsqrt > 255) ? 255 : gsqrt;
            bsqrt = (bsqrt > 255) ? 255 : bsqrt;
            newimage[i][j].rgbtRed = (BYTE) rsqrt;
            newimage[i][j].rgbtGreen = (BYTE) gsqrt;
            newimage[i][j].rgbtBlue = (BYTE) bsqrt;
        }
    }
    for (int i = 0; i < height; i++)
        for (int j = 0; j < width; j++)
            image[i][j] = newimage[i][j];
    free(newimage);
    return;
}

Recover

每次读 512 Bytes(FAT32 的扇区大小)然后匹配开头是否为 JPEG 文件头,将一整个磁盘文件切分为若干 JPEG。调试了半个小时都崩溃,最后发现是声明指针的时候想着反正一开始用不到,就没有初始化,而之后直接与 NULL 进行的比较,最常见的那种垃圾值问题。初始化害人啊。

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#define MAXSIZE 512

typedef uint8_t BYTE;

int main(int argc, char *argv[])
{
    int pic = 0;
    FILE * outfile = NULL;
    BYTE buf[MAXSIZE];
    char filename[20];
    size_t readSize;

    if (argc != 2)
    {
        printf("Usage: ./recover image\n");
        return 1;
    }

    FILE * infile = fopen(argv[1], "r");

    if (infile == NULL)
    {
        printf("Usage: ./recover image\n");
        return 1;
    }

    do
    {
        readSize = fread(buf, sizeof(BYTE), MAXSIZE, infile);
        bool header = (buf[0] == 0xff) && (buf[1] == 0xd8) && (buf[2] == 0xff) && ((buf[3] & 0xf0) == 0xe0);
        if (header)
        {
            if (outfile != NULL)
                fclose(outfile);
            sprintf(filename, "%03i.jpg", pic);
            outfile = fopen(filename, "w");
            pic++;
        }
        if (outfile != NULL)
        {
            fwrite(buf, sizeof(BYTE), readSize, outfile);
        }
    } while (readSize != 0 && !feof(infile));
    fclose(outfile);
    fclose(infile);
    return 0;
}