### 11.2. 初窥算法

Before covering the structure of the algorithms library, let's look at a couple of examples. We've already seen the use of find; in this section we'll use a few additional algorithms. To use a generic algorithm, we must include the algorithm header:

```    #include <algorithm>
```

The library also defines a set of generalized numeric algorithms, using the same conventions as the generic algorithms. To use these algorithms we include the numeric header:

```    #include <numeric>
```

With only a handful of exceptions, all the algorithms operate over a range of elements. We'll refer to this range as the "input range." The algorithms that take an input range always use their first two parameters to denote that range. These parameters are iterators used to denote the first and one past the last element that we want to process.

Although most algorithms are similar in that they operate over an input range, they differ in how they use the elements in that range. The most basic way to understand the algorithms is to know whether they read elements, write elements, or rearrange the order of the elements. We'll look at samples of each kind of algorithm in the remainder of this section.

#### 11.2.1. 只读算法

A number of the algorithms read, but never write to, the elements in their input range. find is one such algorithm. Another simple, read-only algorithm is accumulate, which is defined in the numeric header. Assuming vec is a vector of int values, the following code

```     // sum the elements in vec starting the summation with the value 42
int sum = accumulate(vec.begin(), vec.end(), 42);
```

sets sum equal to the sum of the elements in vec plus 42. accumulate takes three parameters. The first two specify a range of elements to sum. The third is an initial value for the sum. The accumulate function sets an internal variable to the initial value. It then adds the value of each element in the range to that initial value. The algorithm returns the result of the summation. The return type of accumulate is the type of its third argument.

 The third argument, which specifies the starting value, is necessary because accumulate knows nothing about the element types that it is accumulating. Therefore, it has no way to invent an appropriate starting value or associated type. 用于指定累加起始值的第三个实参是必要的，因为 accumulate 对将要累加的元素类型一无所知，因此，除此之外，没有别的办法创建合适的起始值或者关联的类型。

There are two implications of the fact that accumulate doesn't know about the types over which it sums. First, we must pass an initial starting value because otherwise accumulate cannot know what starting value to use. Second, the type of the elements in the container must match or be convertible to the type of the third argument. Inside accumulate, the third argument is used as the starting point for the summation; the elements in the container are successively added into this sum. It must be possible to add the element type to the type of the sum.

accumulate 对要累加的元素类型一无所知，这个事实有两层含义。首先，调用该函数时必须传递一个起始值，否则，accumulate 将不知道使用什么起始值。其次，容器内的元素类型必须与第三个实参的类型匹配，或者可转换为第三个实参的类型。在 accumulate 内部，第三个实参用作累加的起点；容器内的元素按顺序连续累加到总和之中。因此，必须能够将元素类型加到总和类型上。

As an example, we could use accumulate to concatenate the elements of a vector of strings:

```     // concatenate elements from v and store in sum
string sum = accumulate(v.begin(), v.end(), string(""));
```

The effect of this call is to concatenate each element in vec onto a string that starts out as the empty string. Note that we explicitly create a string as the third parameter. Passing a character-string literal would be a compile-time error. If we passed a string literal, the summation type would be const char* but the string addition operator (Section 3.2.3, p. 86) for operands of type string and const char* yields a string not a const char*.

##### find_first_of 的使用

In addition to find, the library defines several other, more complicated searching algorithms. Several of these are similar to the find operations of the string class (Section 9.6.4, p. 343). One such is find_first_of. This algorithm takes two pairs of iterators that denote two ranges of elements. It looks in the first range for a match to any element from the second range and returns an iterator denoting the first element that matches. If no match is found, it returns the end iterator of the first range. Assuming that roster1 and roster2 are two lists of names, we could use find_first_of to count how many names are in both lists:

```     // program for illustration purposes only:
// there are much faster ways to solve this problem
size_t cnt = 0;
list<string>::iterator it = roster1.begin();
// look in roster1 for any name also in roster2
while   ((it = find_first_of(it, roster1.end(),
roster2.begin(), roster2.end()))
!= roster1.end()) {
++cnt;
// we got a match, increment it to look in the rest of roster1
++it;
}
cout << "Found " << cnt
<< " names on both rosters" << endl;
```

The call to find_first_of looks for any element in roster2 that matches an element from the first rangethat is, it looks for an element in the range from it to roster1.end(). The function returns the first element in that range that is also in the second range. On the first trip through the while, we look in the entire range of roster1. On second and subsequent trips, we look in that part of roster1 that has not already been matched.

In the condition, we check the return from find_first_of to see whether we found a matching name. If we got a match, we increment our counter. We also increment it so that it refers to the next element in roster1. We know we're done when find_first_of returns roster1.end(), which it does if there is no match.

## 关键概念：迭代器实参类型

In general, the generic algorithms operate on iterator pairs that denote a range of elements in a container (or other sequence). The types of the arguments that denote the range must match exactly, and the iterators themselves must denote a range: They must refer to elements in the same container (or to the element just past the end of that container), and if they are unequal, then it must be possible to reach the second iterator by repeatedly incrementing the first iterator.

Some algorithms, such as find_first_of, take two pairs of iterators. The iterator types in each pair must match exactly, but there is no requirement that the type of the two pairs match each other. In particular, the elements can be stored in different kinds of sequences. What is required is that we be able to compare elements from the two sequences.

In our program, the types of roster1 and roster2 need not match exactly: roster1 could be a list while roster2 was a vector, deque, or other sequence that we'll learn about later in this chapter. What is required is that we be able to compare the elements from these two sequences using the == operator. If roster1 is a list<string>, then roster2 could be a vector<char*> because the string library defines == on a string and a char*.

## Exercises Section 11.2.1

 Exercise 11.3: Use accumulate to sum the elements in a vector. 用 accumulate 统计 vector 容器对象中的元素之和 Exercise 11.4: Assuming v is a vector what, if anything, is wrong with calling accumulate 类型的对象，则调用 accumulate

#### 11.2.2. 写容器元素的算法

Some algorithms write element values. When using algorithms that write elements, we must take care to ensure that the sequence into which the algorithm writes is at least as large as the number of elements being written.

Some algorithms write directly into the input sequence. Others take an additional iterator that denotes a destination. Such algorithms use the destination iterator as a place in which to write output. Still a third kind writes a specified number of elements to some sequence.

##### 写入输入序列的元素

The algorithms that write to their input sequence are inherently safethey write only as many elements as are in the specified input range.

A simple example of an algorithm that writes to its input sequence is fill:

```     fill(vec.begin(), vec.end(), 0); // reset each element to 0
// set subsequence of the range to 10
fill(vec.begin(), vec.begin() + vec.size()/2, 10);
```

fill takes a pair of iterators that denote a range in which to write copies of its third parameter. It executes by setting each element in the range to the given value. Assuming that the input range is valid, then the writes are safe. The algorithm writes only to elements known to exist in the input range itself.

fill 带有一对迭代器形参，用于指定要写入的范围，而所写的值是它的第三个形参的副本。执行时，将该范围内的每个元素都设为给定的值。如果输入范围有效，则可安全写入。这个算法只会对输入范围内已存在的元素进行写入操作。

##### 不检查写入操作的算法

The fill_n function takes an iterator, a count, and a value. It writes the specified number of elements with the given value starting at the element referred to by the iterator. The fill_n function assumes that it is safe to write the specified number of elements. It is a fairly common beginner mistake to call fill_n (or similar algorithms that write to elements) on a container that has no elements:

fill_n 函数带有的参数包括：一个迭代器、一个计数器以及一个值。该函数从迭代器指向的元素开始，将指定数量的元素设置为给定的值。fill_n 函数假定对指定数量的元素做写操作是安全的。初学者常犯的错误的是：在没有元素的空容器上调用 fill_n 函数（或者类似的写元素算法）。

```     vector<int> vec; // empty vector
// disaster: attempts to write to 10 (nonexistent) elements in vec
fill_n(vec.begin(), 10, 0);
```

This call to fill_n is a disaster. We specified that ten elements should be written, but there are no such elements vec is empty. The result is undefined and will probably result in a serious run-time failure.

 Algorithms that write a specified number of elements or that write to a destination iterator do not check whether the destination is large enough to hold the number of elements being written. 对指定数目的元素做写入运算，或者写到目标迭代器的算法，都不检查目标的大小是否足以存储要写入的元素。
##### 引入 back_inserter

One way to ensure that an algorithm has enough elements to hold the output is to use an insert iterator. An insert iterator is an iterator that adds elements to the underlying container. Ordinarily, when we assign to a container element through an iterator, we assign to the element to which the iterator refers. When we assign through an insert iterator, a new element equal to the right-hand value is added to the container.

We'll have more to say about insert iterators in Section 11.3.1 (p. 406). However, in order to illustrate how to safely use algorithms that write to a container, we will use back_inserter. Programs that use back_inserter must include the iterator header.

The back_inserter function is an iterator adaptor. Like the container adaptors (Section 9.7, p. 348), an iterator adaptor takes an object and generates a new object that adapts the behavior of its argument. In this case, the argument to back_inserter is a reference to a container. back_inserter generates an insert iterator bound to that container. When we attempt to assign to an element through that iterator, the assignment calls push_back to add an element with the given value to the container. We can use back_inserter to generate an iterator to use as the destination in fill_n:

back_inserter 函数是迭代器适配器。与容器适配器（第 9.7 节）一样，迭代器适配器使用一个对象作为实参，并生成一个适应其实参行为的新对象。在本例中，传递给 back_inserter 的实参是一个容器的引用。back_inserter 生成一个绑定在该容器上的插入迭代器。在试图通过这个迭代器给元素赋值时，赋值运算将调用 push_back 在容器中添加一个具有指定值的元素。使用 back_inserter 可以生成一个指向 fill_n 写入目标的迭代器：

```     vector<int> vec; // empty vector
// ok: back_inserter creates an insert iterator that adds elements to vec
fill_n (back_inserter(vec), 10, 0); // appends 10 elements to vec
```

Now, each time fill_n writes a value, it will do so through the insert iterator generated by back_inserter. The effect will be to call push_back on vec, adding ten elements to the end of vec, each of which has the value 0.

##### 写入到目标迭代器的算法

A third kind of algorithm writes an unknown number of elements to a destination iterator. As with fill_n, the destination iterator refers to the first element of a sequence that will hold the output. The simplest such algorithm is copy. This algorithm takes three iterators: The first two denote an input range and the third refers to an element in the destination sequence. It is essential that the destination passed to copy be at least as large as the input range. Assuming ilst is a list holding ints, we might copy it into a vector:

```     vector<int> ivec; // empty vector
// copy elements from ilst into ivec
copy (ilst.begin(), ilst.end(), back_inserter(ivec));
```

copy reads elements from the input range, copying them to the destination.

copy 从输入范围中读取元素，然后将它们复制给目标 ivec。

Of course, this example is a bit inefficient: Ordinarily if we want to create a new container as a copy of an existing container, it is better to use an input range directly as the initializer for a newly constructed container:

```     // better way to copy elements from ilst
vector<int> ivec(ilst.begin(), ilst.end());
```
##### 算法的 _copy 版本

Several algorithms provide so-called "copying" versions. These algorithms do some processing on the elements of their input sequence but do not change the original elements. Instead, a new sequence is written that contains the result of processing the elements of the original.

The replace algorithm is a good example. This algorithm reads and writes to an input sequence, replacing a given value by a new value. The algorithm takes four parameters: a pair of iterators denoting the input range and a pair of values. It substitutes the second value for each element that is equal the first:

replace 算法就是一个很好的例子。该算法对输入序列做读写操作，将序列中特定的值替换为新的值。该算法带有四个形参：一对指定输入范围的迭代器和两个值。每一个等于第一值的元素替换成第二个值。

```     // replace any element with value of 0 by 42
replace(ilst.begin(), ilst.end(), 0, 42);
```

This call replaces all instances of 0 by the 42. If we wanted to leave the original sequence unchanged, we would call replace_copy. That algorithm takes a third iterator argument denoting a destination in which to write the adjusted sequence:

```     // create empty vector to hold the replacement
vector<int> ivec;

// use back_inserter to grow destination as needed
replace_copy (ilst.begin(), ilst.end(),
back_inserter(ivec), 0, 42);
```

After this call, ilst is unchanged, and ivec contains a copy of ilst with the exception that every element in ilst with the value 0 has the value 42 in ivec.

## Exercises Section 11.2.2

 Exercise 11.6: Using fill_n, write a program to set a sequence of int values to 0. 使用 fill_n 编写程序，将一个 int 序列的值设为 0。 Exercise 11.7: Determine if there are any errors in the following programs and, if so, correct the error(s): 判断下面的程序是否有错，如果有，请改正之： ``` (a) vector vec; list lst; int i; while (cin >> i) lst.push_back(i); copy(lst.begin(), lst.end(), vec.begin()); (b) vector vec; vec.reserve(10); fill_n(vec.begin(), 10, 0); ``` Exercise 11.8: We said that algorithms do not change the size of the containers over which they operate. Why doesn't the use of back_inserter invalidate this claim? 前面说过，算法不改变它所操纵的容器的大小，为什么使用 back_inserter 也不能突破这个限制？

#### 11.2.3. 对容器元素重新排序的算法

Suppose we want to analyze the words used in a set of children's stories. For example, we might want know how many words contain six or more characters. We want to count each word only once, regardless of how many times it appears or whether it appears in multiple stories. We'd like to be able to print the words in size order, and we want the words to be in alphabetic order within a given size.

We'll assume that we have read our input and stored the text of each book in a vector of strings named words. How might we solve the part of the problem that involves counting word occurrences? To solve this problem, we'd need to:

1. Eliminate duplicate copies of each word

去掉所有重复的单词。

2. Order the words based on size

按单词的长度排序。

3. Count the words whose size is 6 or greater

统计长度等于或超过 6 个字符的单词个数。

We can use generic algorithms in each of these steps.

For purposes of illustration, we'll use the following simple story as our input:

```     the quick red fox jumps over the slow red turtle
```

Given this input, our program should produce the following output:

```     1 word 6 characters or longer
```
##### 去除重复

Assuming our input is in a vector named words, our first subproblem is to eliminate duplicates from the words:

```     // sort words alphabetically so we can find the duplicates
sort(words.begin(), words.end());
/* eliminate duplicate words:
* unique reorders words so that each word appears once in the
*    front portion of words and returns an iterator one past the unique range;
* erase uses a vector operation to remove the nonunique elements
*/
vector<string>::iterator end_unique =
unique(words.begin(), words.end());
words.erase(end_unique, words.end());
```

Our input vector contains a copy of every word used in each story. We start by sorting this vector. The sort algorithm takes two iterators that denote the range of elements to sort. It uses the < operator to compare the elements. In this call we ask that the entire vector be sorted.

vector 对象包含每个故事中使用的所有单词。首先对此 vector 对象排序。sort 算法带有两个迭代器实参，指出要排序的元素范围。这个算法使用小于（<）操作符比较元素。在本次调用中，要求对整个 vector 对象排序。

After the call to sort, our vector elements are ordered:

```     fox jumps over quick red red slow the the turtle
```

Note that the words red and the are duplicated.

##### unique 的使用

Once words is sorted, our problem is to keep only one copy of each word that is used in our stories. The unique algorithm is well suited to this problem. It takes two iterators that denote a range of elements. It rearranges the elements in the input range so that adjacent duplicated entries are eliminated and returns an iterator that denotes the end of the range of the unique values.

After the call to unique, the vector holds

Note that the size of words is unchanged. It still has ten elements; only the order of these elements has changed. The call to unique "removes" adjacent duplicates. We put remove in quotes because unique doesn't remove any elements. Instead, it overwrites adjacent duplicates so that the unique elements are copied into the front of the sequence. The iterator returned by unique denotes one past the end of the range of unique elements.

##### 使用容器操作删除元素

If we want to eliminate the duplicate items, we must use a container operation, which we do in the call to erase. This call erases the elements starting with the one to which end_unique refers through the end of words. After this call, words contains the eight unique words from the input.

 Algorithms never directly change the size of a container. If we want to add or remove elements, we must use a container operation. 算法不直接修改容器的大小。如果需要添加或删除元素，则必须使用容器操作。

It is worth noting that this call to erase would be safe even if there were no duplicated words in our vector. If there were no duplicates, then unique would return words.end(). Both arguments in the call to erase would have the same value, words.end(). The fact that the iterators are equal would mean that the range to erase would be empty. Erasing an empty range has no effect, so our program is correct even if the input has no duplicates.

##### 定义需要的实用函数

Our next subproblem is to count how many words are of length six or greater. To solve this problem, we'll use two additional generic algorithms: stable_sort and count_if. To use each of these algorithms we'll need a companion utility function, known as a predicates. A predicate is a function that performs some test and returns a type that can be used in a condition to indicate success or failure.

The first predicate we need will be used to sort the elements based on size. To do this sort, we need to define a predicate function that compares two strings and returns a bool indicating whether the first is shorter in length than the second:

```     // comparison function to be used to sort by word length
bool isShorter(const string &s1, const string &s2)
{
return s1.size() < s2.size();
}
```

The other function we need will determine whether a given string is of length six or greater:

```     // determine whether a length of a given word is 6 or more
bool GT6(const string &s)
{
return s.size() >= 6;
}
```

Although this function solves our problem, it is unnecessarily limitedthe function hardwires the size into the function itself. If we wanted to find out how many words were of another length, we'd have to write another function. We could easily write a more general comparison function that took two parameters, the string and the size. However, the function we pass to count_if takes a single argument, so we cannot use the more general approach in this program. We'll see a better way to write this part of our solution in Section 14.8.1 (p. 531).

##### 排序算法

The library defines four different sort algorithms, of which we've used the simplest, sort, tosort words into alphabetical order. In addition to sort, the library also defines a stable_sort algorithm. A stable_sort maintains the original order among equal elements. Ordinarily, we don't care about the relative order of equal elements in a sorted sequence. After all, they're equal. However, in this case, we have defined "equal" to mean "the same length." Elements that have the same length can still be distinct when viewed alphabetically. By calling stable_sort, we maintain alphabetic order among those elements that have the same length.

Both sort and stable_sort are overloaded functions. One version uses the < operator for the element type to do the comparison. We used this version of sort to sort words before looking for duplicate elements. The second version takes a third parameter: the name of a predicate to use when comparing elements. That function must take two arguments of the same type as the element type and return a value that can be tested in a condition. We will use this second version, passing our isShorter function to compare elements:

sort 和 stable_sort 都是重载函数。其中一个版本使用元素类型提供的小于（<）操作符实现比较。在查找重复元素之前，我们就是用这个 sort 版本对元素排序。第二个重载版本带有第三个形参：比较元素所使用的谓词函数的名字。这个谓词函数必须接受两个实参，实参的类型必须与元素类型相同，并返回一个可用作条件检测的值。下面将比较元素的 isShorter 函数作为实参，调用第二个版本的排序函数：

```     // sort words by size, but maintain alphabetic order for words of the same size
stable_sort(words.begin(), words.end(), isShorter);
```

After this call, words is sorted by element size, but the words of each length are also still in alphabetical order:

##### 统计长度不小于 6 的单词

Now that we've reordered our vector by word size, our remaining problem is to count how many words are of length six or greater. The count_if algorithm handles this problem:

```     vector<string>::size_type wc =
count_if(words.begin(), words.end(), GT6);
```

count_if executes by reading the range denoted by its first two parameters. It passes each value that it reads to the predicate function represented by its third argument. That function must take a single argument of the element type and must return a value that can be tested as a condition. The algorithm returns a count of the number of elements for which the function succeeded. In this case, count_if passes each word to GT6, which returns the bool value true if the word's length is six or more.

##### 将全部程序段放在一起

Having looked at the program in detail, here is the program as a whole:

```     // comparison function to be used to sort by word length
bool isShorter(const string &s1, const string &s2)
{
return s1.size() < s2.size();
}
// determine whether a length of a given word is 6 or more
bool GT6(const string &s)
{
return s.size() >= 6;
}
int main()
{
vector<string> words;
// copy contents of each book into a single vector
string next_word;
while (cin >> next_word) {
// insert next book's contents at end of words
words.push_back(next_word);
}
// sort words alphabetically so we can find the duplicates
sort (words.begin(), words.end());
/* eliminate duplicate words:
* unique reorders words so that each word appears once in the
*       front portion of words and returns an iterator one past the unique range;
* erase uses a vector operation to remove the nonunique elements
*/
vector<string>::iterator end_unique =
unique(words.begin(), words.end());
words.erase(end_unique, words.end());
// sort words by size, but maintain alphabetic order for words of the same size
stable_sort(words.begin(), words.end(), isShorter);
vector<string>::size_type wc =
count_if (words.begin(), words.end(), GT6);
cout << wc << " " << make_plural(wc, "word", "s")
<< " 6 characters or longer" << endl;
return 0;
}
```

We leave as an exercise the problem of printing the words in size order.

## Exercises Section 11.2.3

 Exercise 11.9: Implement the program to count words of size 4 or greater, including printing the list of unique words in the input. Test your program by running it on the program's source file. 编写程序统计长度不小于 4 的单词，并输出输入序列中不重复的单词。在程序源文件上运行和测试你自己编写的程序。 Exercise 11.10: The library defines a find_if function. Like find, the find_if function takes a pair of iterators that indicates a range over which to operate. Like count_if, it also takes a third parameter that names a predicate that can be used to test each element in the range. find_if returns an iterator that refers to the first element for which the function returns a nonzero value. It returns its second iterator argument if there is no such element. Use the find_if function to rewrite the portion of our program that counted how many words are greater than length six. 标准库定义了一个 find_if 函数。与 find 一样，find_if 函数带有一对迭代器形参，指定其操作的范围。与 count_if 一样，该函数还带有第三个形参，表明用于检查范围内每个元素的谓词函数。find_if 返回一个迭代器，指向第一个谓词函数返回非零值的元素。如果这样的元素不存在，则返回第二个迭代器实参。使用 find_if 函数重写上述例题中统计长度大于 6 的单词个数的程序部分。 Exercise 11.11: Why do you think the algorithms don't change the size of containers? 你认为为什么算法不改变容器的大小？ Exercise 11.12: Why was it be necessary to use erase rather than define a generic algorithm that could remove elements from the container? 为什么必须使用 erase，而不是定义一个泛型算法来删除容器中的元素？