
2.2. Literal Constants

A value, such as 42, in a program is known as a literal constant: literal because we can speak of it only in terms of its value; constant because its value cannot be changed. Every literal has an associated type. For example, 0 is an int and 3.14159 is a double. Literals exist only for the built-in types. There are no literals of class types. Hence, there are no literals of any of the library types.

Rules for Integer Literals

We can write a literal integer constant using one of three notations: decimal, octal, or hexadecimal. These notations, of course, do not change the bit representation of the value, which is always binary. For example, we can write the value 20 in any of the following three ways:

     20     // decimal
     024    // octal
     0x14   // hexadecimal

Literal integer constants that begin with a leading 0 (zero) are interpreted as octal; those that begin with either 0x or 0X are interpreted as hexadecimal.
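As a small sketch of the three notations, the checks below compare the spellings of 20 shown above. The function names `leading_zero_value` and `check_integer_notations` are made up for this illustration and are not part of the original text.

```cpp
#include <cassert>

// Returns the literal 010, which the compiler reads as octal (value eight).
int leading_zero_value() {
    return 010;
}

// Checks that the three spellings from the text denote the same value.
void check_integer_notations() {
    assert(20 == 024);     // octal: 2*8 + 4
    assert(20 == 0x14);    // hexadecimal: 1*16 + 4
    assert(0x14 == 0X14);  // the x may be written in either case
    assert(leading_zero_value() == 8);
}
```

The `leading_zero_value` case is the practical pitfall: padding a decimal number with a leading zero silently changes its value.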

By default, the type of a literal integer constant is either int or long. The precise type depends on the value of the literal: values that fit in an int are type int and larger values are type long. By adding a suffix, we can force the type of a literal integer constant to be type long or unsigned or unsigned long. We specify that a constant is a long by immediately following the value with either L or l (the letter "ell" in either uppercase or lowercase).

When specifying a long, use the uppercase L: the lowercase letter l is too easily mistaken for the digit 1.



In a similar manner, we can specify unsigned by following the literal with either U or u. We can obtain an unsigned long literal constant by following the value with both L and U, in either order. The suffix must appear with no intervening space:

     128u     /* unsigned   */          1024UL    /* unsigned long   */
     1L       /* long    */             8Lu        /* unsigned long   */

There are no literals of type short.
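The effect of each suffix can be checked at compile time. The sketch below assumes a C++11 compiler (for `decltype` and `static_assert`), which is newer than the language version this section describes; `check_suffix_values` is a hypothetical helper name.

```cpp
#include <cassert>
#include <type_traits>

// Compile-time checks on the type each suffix produces.
static_assert(std::is_same<decltype(42), int>::value,
              "no suffix, fits in int: int");
static_assert(std::is_same<decltype(128u), unsigned int>::value,
              "u: unsigned");
static_assert(std::is_same<decltype(1L), long>::value,
              "L: long");
static_assert(std::is_same<decltype(1024UL), unsigned long>::value,
              "UL: unsigned long");
static_assert(std::is_same<decltype(8Lu), unsigned long>::value,
              "Lu: unsigned long, suffix order and case do not matter");

// A suffix changes the type of the literal, not its value.
void check_suffix_values() {
    assert(128u == 0x80u);
    assert(1024UL == 0x400UL);
    assert(8Lu == 010UL);
}
```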

Rules for Floating-Point Literals

We can use either common decimal notation or scientific notation to write floating-point literal constants. Using scientific notation, the exponent is indicated either by E or e. By default, floating-point literals are type double. We indicate single precision by following the value with either F or f. Similarly, we specify extended precision by following the value with either L or l (again, use of the lowercase l is discouraged). Each pair of literals below denotes the same underlying value:

     3.14159F            .001f          12.345L            0.
     3.14159E0f          1E-3F          1.2345E1L          0e0
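The pairs above can be compared directly. This is a sketch that assumes a C++11 compiler for the type checks and assumes the compiler converts both spellings of a value to the same floating-point representation, which mainstream compilers do; `check_floating_pairs` is a hypothetical name.

```cpp
#include <cassert>
#include <type_traits>

// Compile-time checks on the type each suffix produces.
static_assert(std::is_same<decltype(0.), double>::value,
              "no suffix: double");
static_assert(std::is_same<decltype(0e0), double>::value,
              "scientific notation, no suffix: double");
static_assert(std::is_same<decltype(.001f), float>::value,
              "f: float");
static_assert(std::is_same<decltype(12.345L), long double>::value,
              "L: long double");

// Each pair writes one value in decimal and in scientific notation.
void check_floating_pairs() {
    assert(3.14159F == 3.14159E0f);
    assert(.001f == 1E-3F);
    assert(12.345L == 1.2345E1L);
    assert(0. == 0e0);
}
```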

Boolean and Character Literals

The words true and false are literals of type bool:

     bool test = false;

Printable character literals are written by enclosing the character within single quotation marks:

     'a'         '2'         ','         ' ' // blank

Such literals are of type char. We can obtain a wide-character literal of type wchar_t by immediately preceding the character literal with an L, as in

     L'a'
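A brief sketch of the difference between the two kinds of character literal, again assuming a C++11 compiler for the compile-time checks. The helper `digit_value` is invented for this example; it relies on the language guarantee that the digit characters '0' through '9' have consecutive values.

```cpp
#include <cassert>
#include <type_traits>

static_assert(std::is_same<decltype('a'), char>::value,
              "plain character literal: char");
static_assert(std::is_same<decltype(L'a'), wchar_t>::value,
              "L prefix: wchar_t");
static_assert(sizeof('a') == 1, "a char always has size 1");

// Converts a digit character such as '2' to the integer it depicts.
int digit_value(char c) {
    return c - '0';
}

// The character literal '2' is not the integer 2.
void check_character_literals() {
    assert('2' != 2);
    assert(digit_value('2') == 2);
    assert(sizeof(L'a') == sizeof(wchar_t));
}
```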

Escape Sequences for Nonprintable Characters

Some characters are nonprintable. A nonprintable character is a character for which there is no visible image, such as backspace or a control character. Other characters have special meaning in the language, such as the single and double quotation marks, and the backslash. Nonprintable characters and special characters are written using an escape sequence. An escape sequence begins with a backslash. The language defines the following escape sequences:

     newline            \n
     horizontal tab     \t
     vertical tab       \v
     backspace          \b
     carriage return    \r
     formfeed           \f
     alert (bell)       \a
     backslash          \\
     question mark      \?
     single quote       \'
     double quote       \"

We can write any character as a generalized escape sequence of the form

     \ooo

where ooo represents a sequence of as many as three octal digits. The value of the octal digits represents the numerical value of the character. The following examples are representations of literal constants using the ASCII character set:

     \7 (bell)      \12 (newline)     \40 (blank)
     \0 (null)      \062 ('2')        \115 ('M')

The character represented by '\0' is often called a "null character," and has special significance, as we shall soon see.

We can also write a character using a hexadecimal escape sequence

     \xddd

consisting of a backslash, an x, and one or more hexadecimal digits.
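The octal and hexadecimal forms can be checked against the named escapes and printable characters they stand for. The sketch below assumes an ASCII execution character set, as the examples in the text do; `check_escape_sequences` is a hypothetical name.

```cpp
#include <cassert>

// Compares numeric escape sequences with the characters they denote.
// All comparisons assume an ASCII execution character set.
void check_escape_sequences() {
    assert('\7' == '\a');    // bell
    assert('\12' == '\n');   // newline
    assert('\40' == ' ');    // blank
    assert('\0' == 0);       // null character
    assert('\062' == '2');   // octal 62 is decimal 50
    assert('\115' == 'M');   // octal 115 is decimal 77
    assert('\x4d' == 'M');   // hexadecimal 4d is also decimal 77
    assert('\x4D' == '\115');
}
```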

Character String Literals

All of the literals we've seen so far have primitive built-in types. There is one additional literal, the string literal, that is more complicated. String literals are arrays of constant characters, a type that we'll discuss in more detail in Section 4.3 (p. 130).

String literal constants are written as zero or more characters enclosed in double quotation marks. Nonprintable characters are represented by their underlying escape sequence.

     "Hello World!"                 // simple string literal
     ""                             // empty string literal
     "\nCC\toptions\tfile.[cC]\n"   // string literal using newlines and tabs

For compatibility with C, string literals in C++ have one character in addition to those typed in by the programmer. Every string literal ends with a null character added by the compiler. A character literal

     'A' // single quote: character literal

represents the single character A, whereas

     "A" // double quote: character string literal

represents an array of two characters: the letter A and the null character.
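The extra null character shows up in the size of the literal. A minimal sketch, assuming a C++11 compiler for `static_assert`; `check_null_terminator` is a name chosen for this example.

```cpp
#include <cassert>
#include <cstring>

static_assert(sizeof('A') == 1, "character literal: one char");
static_assert(sizeof("A") == 2, "string literal: 'A' plus the null character");
static_assert(sizeof("") == 1, "empty string literal: only the null character");

// Inspects the two elements of the array that "A" initializes.
void check_null_terminator() {
    const char text[] = "A";
    assert(text[0] == 'A');
    assert(text[1] == '\0');
    assert(std::strlen(text) == 1);  // strlen does not count the null character
}
```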

Just as there is a wide character literal, such as

        L'a'

there is a wide string literal, again preceded by L, such as

      L"a wide string literal"

The type of a wide string literal is an array of constant wide characters. It is also terminated by a wide null character.
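The same size check works for wide string literals, with wchar_t as the element type. This is a sketch assuming a C++11 compiler; the size of wchar_t itself is machine dependent, so the checks are written in terms of `sizeof(wchar_t)`. `check_wide_literal` is a hypothetical name.

```cpp
#include <cassert>
#include <cwchar>

static_assert(sizeof(L"a") == 2 * sizeof(wchar_t),
              "wide string literal: L'a' plus the wide null character");

// Inspects the two elements of the array that L"a" initializes.
void check_wide_literal() {
    const wchar_t text[] = L"a";
    assert(text[0] == L'a');
    assert(text[1] == L'\0');
    assert(std::wcslen(text) == 1);  // wcslen does not count the wide null
}
```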

Concatenated String Literals

Two string literals (or two wide string literals) that appear adjacent to one another and separated only by spaces, tabs, or newlines are concatenated into a single new string literal. This usage makes it easy to write long literals across separate lines:

     // concatenated long string literal
     std::cout << "a multi-line "
                  "string literal "
                  "using concatenation"
               << std::endl;

When executed, this statement prints:

     a multi-line string literal using concatenation
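Because the compiler joins adjacent literals, the result can be compared with the same text written as a single literal. The function names below are invented for this sketch.

```cpp
#include <cassert>
#include <string>

// Builds a string from three adjacent literals; the compiler joins them
// into one literal before the std::string is constructed.
std::string concatenated_message() {
    return "a multi-line "
           "string literal "
           "using concatenation";
}

void check_concatenation() {
    assert(concatenated_message() ==
           "a multi-line string literal using concatenation");
    // No separator is inserted between the pieces.
    assert(std::string("ab" "cd") == "abcd");
}
```

Note that the spaces between words come from the literals themselves, which is why each piece above ends with a blank.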

What happens if you attempt to concatenate a string literal and a wide string literal? For example:

     // Concatenating plain and wide character strings is undefined
     std::cout << "multi-line " L"literal " << std::endl;

The result is undefined; that is, there is no standard behavior defined for concatenating the two different types. The program might appear to work, but it also might crash or produce garbage values. Moreover, the program might behave differently under one compiler than under another. (This describes the C++98/03 rules; from C++11 on, the unprefixed literal is treated as a wide string literal.)

Multi-Line Literals

There is a more primitive (and less useful) way to handle long strings that depends on an infrequently used program formatting feature: putting a backslash as the last character on a line causes that line and the next to be treated as a single line.

As noted on page 14, C++ programs are largely free-format. There are only a few places where we may not insert whitespace, and one of these is in the middle of a word: we may not break a line in the middle of a word. We can circumvent this rule by using a backslash:

      // ok: A \ before a newline ignores the line break
      std::cou\
      t << "Hi" << st\
      d::endl;

is equivalent to

      std::cout << "Hi" << std::endl;

We could use this feature to write a long string literal:

     // multiline string literal
     std::cout << "a multi-line \
string literal \
using a backslash"
               << std::endl;

Note that the backslash must be the last thing on the line; no comments or trailing blanks are allowed. Also, any leading spaces or tabs on the subsequent lines are part of the literal. For this reason, the continuation lines of the long literal do not have the normal indentation.
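The effect of indentation on a continued literal can be seen by comparing two versions. This is only a sketch: both function names are hypothetical, and the second function deliberately indents its continuation line to show the problem.

```cpp
#include <cassert>
#include <string>

// The continuation lines start at column 0, so nothing extra is added.
std::string continued_message() {
    return "a multi-line \
string literal \
using a backslash";
}

// Here the continuation line is indented; those spaces join the literal.
std::string indented_continuation() {
    return "one \
    two";
}

void check_line_continuation() {
    assert(continued_message() ==
           "a multi-line string literal using a backslash");
    assert(indented_continuation() != "one two");
    assert(indented_continuation().size() > std::string("one two").size());
}
```

Adjacent-literal concatenation avoids this problem entirely, which is one reason it is usually the better choice.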

Advice: Don't Rely on Undefined Behavior

Programs that use undefined behavior are in error. If they work, it is only by coincidence. Undefined behavior results from a program error that the compiler cannot detect or from an error that would be too much trouble to detect.

Unfortunately, programs that contain undefined behavior can appear to execute correctly in some circumstances and/or on one compiler. There is no guarantee that the same program, compiled under a different compiler or even a subsequent release of the current compiler, will continue to run correctly. Nor is there any guarantee that what works with one set of inputs will work with another.

Programs should not (knowingly) rely on undefined behavior. Similarly, programs usually should not rely on machine-dependent behavior, such as assuming that the size of an int is a fixed and known value. Such programs are said to be nonportable. When the program is moved to another machine, any code that relies on machine-dependent behavior may have to be found and corrected. Tracking down these sorts of problems in previously working programs is, mildly put, a profoundly unpleasant task.
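One way to avoid assuming the size of an int is to ask the implementation. The sketch below is an illustration and not something the original text prescribes: `int_bits` and `check_int_limits` are hypothetical names, and `std::int32_t` is a C++11 type that exists only on platforms that have an exactly 32-bit integer type.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>

// Number of bits in an int on this machine, computed rather than assumed.
int int_bits() {
    return static_cast<int>(sizeof(int) * CHAR_BIT);
}

// When exactly 32 bits are required, say so in the type.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32,
              "int32_t is exactly 32 bits where it is available");

void check_int_limits() {
    // The language only promises that an int holds at least 16 bits.
    assert(int_bits() >= 16);
    assert(std::numeric_limits<int>::max() >= 32767);
}
```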

Exercises Section 2.2

Exercise 2.7:

Explain the difference between the following sets of literal constants:

  (a) 'a', L'a', "a", L"a"
  (b) 10, 10u, 10L, 10uL, 012, 0xC
  (c) 3.14, 3.14f, 3.14L
Exercise 2.8:

Determine the type of each of these literal constants:

      (a) -10 (b) -10u (c) -10. (d) -10e-2
Exercise 2.9:

Which, if any, of the following are illegal?

      (a) "Who goes with F\145rgus?\012"
      (b) 3.14e1L          (c) "two" L"some"
      (d) 1024f            (e) 3.14UL
      (f) "multiple line
           comment"
Exercise 2.10:

Using escape sequences, write a program to print 2M followed by a newline. Modify the program to print 2, then a tab, then an M, followed by a newline.
