2.1. Primitive Built-in Types

C++ defines a set of arithmetic types, which represent integers, floating-point numbers, and individual characters and boolean values. In addition, there is a special type named void. The void type has no associated values and can be used in only a limited set of circumstances. The void type is most often used as the return type for a function that has no return value.

The size of the arithmetic types varies across machines. By size, we mean the number of bits used to represent the type. The standard guarantees a minimum size for each of the arithmetic types, but it does not prevent compilers from using larger sizes. Indeed, almost all compilers use a larger size for int than is strictly required. Table 2.1 lists the built-in arithmetic types and the associated minimum sizes.

Table 2.1. C++: Arithmetic Types

Type         Meaning                            Minimum Size
bool         boolean                            NA
char         character                          8 bits
wchar_t      wide character                     16 bits
short        short integer                      16 bits
int          integer                            16 bits
long         long integer                       32 bits
float        single-precision floating-point    6 significant digits
double       double-precision floating-point    10 significant digits
long double  extended-precision floating-point  10 significant digits


Because the number of bits varies, the maximum (or minimum) values that these types can represent also vary by machine.
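
As a minimal sketch of how to inspect these sizes and ranges on a particular machine, the following uses the standard `<climits>` and `<limits>` headers. It assumes a C++11 or later compiler; the names `bits_in`, `report`, and `report_integer_types` are illustrative and do not come from the text.

```cpp
#include <climits>   // CHAR_BIT: number of bits in a byte
#include <cstddef>   // std::size_t
#include <iostream>
#include <limits>    // std::numeric_limits

// Number of bits this implementation uses to store a T.
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}

// Print the size and value range of one integer type.
// The unary + promotes narrow types to int so they print as numbers.
template <typename T>
void report(const char* name) {
    std::cout << name << ": " << bits_in<T>() << " bits, range "
              << +std::numeric_limits<T>::min() << " to "
              << +std::numeric_limits<T>::max() << '\n';
}

// Report three of the integer types listed in Table 2.1.
void report_integer_types() {
    report<short>("short");
    report<int>("int");
    report<long>("long");
}
```

Calling `report_integer_types()` from `main` prints whatever the current compiler uses; the output is likely to differ between, say, a 16-bit embedded target and a 64-bit desktop.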



2.1.1. Integral Types

The arithmetic types that represent integers, characters, and boolean values are collectively referred to as the integral types.

There are two character types: char and wchar_t. The char type is guaranteed to be big enough to hold numeric values that correspond to any character in the machine's basic character set. As a result, chars are usually a single machine byte. The wchar_t type is used for extended character sets, such as those used for Chinese and Japanese, in which some characters cannot be represented within a single char.

The types short, int, and long represent integer values of potentially different sizes. Typically, shorts are represented in half a machine word, ints in a machine word, and longs in either one or two machine words (on 32-bit machines, ints and longs are usually the same size).

Machine-Level Representation of the Built-in Types

The C++ built-in types are closely tied to their representation in the computer's memory. Computers store data as a sequence of bits, each of which holds either 0 or 1. A segment of memory might hold

     00011011011100010110010000111011 ...

At the bit level, memory has no structure and no meaning.

The most primitive way we impose structure on memory is by processing it in chunks. Most computers deal with memory as chunks of bits of particular sizes, usually powers of 2. They usually make it easy to process 8, 16, or 32 bits at a time, and chunks of 64 and 128 bits are becoming more common. Although the exact sizes can vary from one machine to another, we usually refer to a chunk of 8 bits as a "byte" and 32 bits, or 4 bytes, as a "word."

Most computers associate a number, called an address, with each byte in memory. Given a machine that has 8-bit bytes and 32-bit words, we might represent a word of memory as follows:

     736424    0 0 0 1 1 0 1 1
     736425    0 1 1 1 0 0 0 1
     736426    0 1 1 0 0 1 0 0
     736427    0 0 1 1 1 0 1 1


In this illustration, each byte's address is shown on the left, with the 8 bits of the byte following the address.

We can use an address to refer to any of several variously sized collections of bits starting at that address. It is possible to speak of the word at address 736424 or the byte at address 736426. We can say, for example, that the byte at address 736425 is not equal to the byte at address 736427.

To give meaning to the byte at address 736425, we must know the type of the value stored there. Once we know the type, we know how many bits are needed to represent a value of that type and how to interpret those bits.

If we know that the byte at location 736425 has type "unsigned 8-bit integer," then we know that the byte represents the number 113 (binary 01110001). On the other hand, if that byte is a character in the ISO-Latin-1 character set, then it represents the lower-case letter q. The bits are the same in both cases, but by ascribing different types to them, we interpret them differently.
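
The same idea can be sketched in code: one bit pattern, two interpretations. This is only an illustration, assuming a C++11 or later compiler and an ASCII-compatible execution character set (ISO-Latin-1 agrees with ASCII for the letter q); the names `kByte`, `as_number`, `as_character`, and `show_interpretations` are made up for the example.

```cpp
#include <iostream>

// The bit pattern 01110001 stored at address 736425, written in hexadecimal.
constexpr unsigned char kByte = 0x71;

// Interpret the bits as an unsigned integer.
constexpr unsigned as_number(unsigned char b) { return b; }

// Interpret the same bits as a character code.
constexpr char as_character(unsigned char b) { return static_cast<char>(b); }

void show_interpretations() {
    std::cout << as_number(kByte) << '\n';     // prints 113
    std::cout << as_character(kByte) << '\n';  // prints q (assumes ASCII-compatible charset)
}
```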


The type bool represents the truth values, true and false. We can assign any of the arithmetic types to a bool. An arithmetic type with value 0 yields a bool that holds false. Any nonzero value is treated as true.
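
A short sketch of these conversions, assuming a C++11 or later compiler; `to_bool` and `show_bool_conversions` are hypothetical helpers that exist only to make the implicit conversion visible.

```cpp
#include <iostream>

// Returning an arithmetic value from a function declared to return bool
// triggers the implicit conversion: zero becomes false, anything else true.
template <typename T>
constexpr bool to_bool(T value) { return value; }

void show_bool_conversions() {
    std::cout << std::boolalpha
              << to_bool(0) << ' '      // false
              << to_bool(42) << ' '     // true
              << to_bool(-1) << ' '     // true
              << to_bool(0.5) << '\n';  // true: any nonzero value
}
```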

Signed and Unsigned Types

The integral types, except the boolean type, may be either signed or unsigned. As its name suggests, a signed type can represent both negative and positive numbers (including zero), whereas an unsigned type represents only values greater than or equal to zero.

The integers, int, short, and long, are all signed by default. To get an unsigned type, the type must be specified as unsigned, such as unsigned long. The unsigned int type may be abbreviated as unsigned. That is, unsigned with no other type implies unsigned int.

Unlike the other integral types, there are three distinct types for char: plain char, signed char, and unsigned char. Although there are three distinct types, there are only two ways a char can be represented. The char type is represented using either the signed char or unsigned char version. Which representation is used for char varies by compiler.
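
These relationships between the type names can be checked at compile time. The sketch below assumes a C++11 or later compiler and the standard `<type_traits>` header; `plain_char_is_signed` is an illustrative name, not something defined by the language.

```cpp
#include <type_traits>

// unsigned with no other specifier names the same type as unsigned int.
static_assert(std::is_same<unsigned, unsigned int>::value,
              "unsigned means unsigned int");

// int, short, and long are signed unless declared otherwise.
static_assert(std::is_signed<int>::value && std::is_signed<short>::value &&
                  std::is_signed<long>::value,
              "the plain integer types are signed");

// Plain char is a distinct type from both signed char and unsigned char.
static_assert(!std::is_same<char, signed char>::value,
              "char and signed char are different types");
static_assert(!std::is_same<char, unsigned char>::value,
              "char and unsigned char are different types");

// Which representation plain char uses is up to the compiler.
constexpr bool plain_char_is_signed = std::is_signed<char>::value;
```

If the file compiles, every assertion holds on that compiler; the value of `plain_char_is_signed` is the one thing that may differ from one implementation to another.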

How Integral Values Are Represented

In an unsigned type, all the bits represent the value. If a type is defined for a particular machine to use 8 bits, then the unsigned version of this type could hold the values 0 through 255.
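
Because every bit contributes to the value, the largest value of an n-bit unsigned type is 2^n - 1. A minimal sketch, assuming a C++11 or later compiler; `max_unsigned_value` is a hypothetical helper, and it is only valid for widths from 1 to 31 bits because unsigned long is guaranteed to have no more than 32.

```cpp
// Largest value an unsigned type of the given width can hold: 2^bits - 1.
// Valid for 1 <= bits <= 31 (unsigned long is only guaranteed 32 bits).
constexpr unsigned long max_unsigned_value(unsigned bits) {
    return (1UL << bits) - 1;
}

// An 8-bit unsigned type holds 0 through 255, as described above.
static_assert(max_unsigned_value(8) == 255UL, "8 bits hold 0 through 255");
```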

The C++ standard does not define how signed types are represented at the bit level. Instead, each compiler is free to decide how it will represent signed types. These representations can affect the range of values that a signed type can hold. We are guaranteed that an 8-bit signed type will hold at least the values from -127 through 127; many implementations allow values from -128 through 127.

Under the most common strategy for representing signed integral types, we can view one of the bits as a sign bit. Whenever the sign bit is 1, the value is negative; when it is 0, the value is either 0 or a positive number. An 8-bit integral signed type represented using a sign bit can hold values from -128 through 127.

Assignment to Integral Types

The type of an object determines the values that the object can hold. This fact raises the question of what happens when one tries to assign a value outside the allowable range to an object of a given type. The answer depends on whether the type is signed or unsigned.

For unsigned types, the compiler must adjust the out-of-range value so that it will fit. The compiler does so by taking the remainder of the value modulo the number of distinct values the unsigned target type can hold. An object that is an 8-bit unsigned char, for example, can hold values from 0 through 255 inclusive. If we assign a value outside this range, the compiler actually assigns the remainder of the value modulo 256. For example, if we try to store 336 in an 8-bit unsigned char, the actual value assigned will be 80, because 80 is equal to 336 modulo 256.
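
The arithmetic can be checked directly. This sketch assumes a C++11 or later compiler and a machine with 8-bit bytes (the block refuses to compile otherwise); `to_unsigned_char` and `show_wraparound` are illustrative names.

```cpp
#include <climits>
#include <iostream>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Converting to an unsigned type keeps the value modulo the number of
// distinct values the target can hold (256 for an 8-bit unsigned char).
constexpr unsigned char to_unsigned_char(int value) {
    return static_cast<unsigned char>(value);
}

void show_wraparound() {
    // Cast to int so the result prints as a number, not as a character.
    std::cout << static_cast<int>(to_unsigned_char(336)) << '\n';  // prints 80
    std::cout << 336 % 256 << '\n';                                // prints 80
}
```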

For the unsigned types, a negative value is always out of range. An object of unsigned type may never hold a negative value. Some languages make it illegal to assign a negative value to an unsigned type, but C++ does not.

In C++ it is perfectly legal to assign a negative number to an object with unsigned type. The result is the negative value modulo the number of distinct values the type can hold. So, if we assign -1 to an 8-bit unsigned char, the resulting value will be 255, which is -1 modulo 256.
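
One consequence is that converting -1 to any unsigned type yields that type's largest value. A small sketch, assuming a C++11 or later compiler; `from_negative_one` is a hypothetical helper used only for illustration.

```cpp
#include <limits>

// Convert -1 to the unsigned type U. The result is -1 modulo 2^N, where N
// is the number of bits in U, which is the largest value U can hold.
template <typename U>
constexpr U from_negative_one() {
    return static_cast<U>(-1);
}

static_assert(from_negative_one<unsigned char>() ==
                  std::numeric_limits<unsigned char>::max(),
              "-1 becomes the largest unsigned char (255 when bytes have 8 bits)");
static_assert(from_negative_one<unsigned int>() ==
                  std::numeric_limits<unsigned int>::max(),
              "-1 becomes the largest unsigned int");
```

The same rule explains a common pitfall: a loop condition such as `i >= 0` is always true when `i` is unsigned, because decrementing 0 wraps around to the largest value rather than producing -1.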



When assigning an out-of-range value to a signed type, it is up to the compiler to decide what value to assign. In practice, many compilers treat signed types similarly to how they are required to treat unsigned types. That is, they do the assignment as the remainder modulo the size of the type. However, we are not guaranteed that the compiler will do so for the signed types.
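
Since the result is not guaranteed for signed types, one defensive option is to test the value before assigning it. The following is a sketch of that idea rather than a prescribed technique, assuming a C++11 or later compiler; `fits_in_short` and `assign_checked` are hypothetical helpers.

```cpp
#include <limits>

// True when value lies inside the range a short can represent here.
constexpr bool fits_in_short(long value) {
    return value >= std::numeric_limits<short>::min() &&
           value <= std::numeric_limits<short>::max();
}

// Store value in target only when it is representable; report success.
bool assign_checked(long value, short& target) {
    if (!fits_in_short(value)) {
        return false;  // leave target untouched instead of relying on the compiler
    }
    target = static_cast<short>(value);
    return true;
}
```

A caller can then decide what to do when `assign_checked` returns false, instead of silently receiving whatever value the compiler happens to choose.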

2.1.2. Floating-Point Types

The types float, double, and long double represent floating-point single-, double-, and extended-precision values. Typically, floats are represented in one word (32 bits), doubles in two words (64 bits), and long double in either three or four words (96 or 128 bits). The size of the type determines the number of significant digits a floating-point value might contain.

The float type is usually not precise enough for real programs: float is guaranteed to offer only 6 significant digits. The double type guarantees at least 10 significant digits, which is sufficient for most calculations.
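
To see what a given compiler actually provides, `std::numeric_limits` reports the number of decimal digits each type preserves. A minimal sketch, assuming a C++11 or later compiler; `float_digits`, `double_digits`, and `show_precision` are illustrative names.

```cpp
#include <iomanip>
#include <iostream>
#include <limits>

// Decimal digits each type is guaranteed to preserve on this implementation.
constexpr int float_digits = std::numeric_limits<float>::digits10;
constexpr int double_digits = std::numeric_limits<double>::digits10;

void show_precision() {
    // Nine significant digits: more than float is guaranteed to keep.
    const float f = 123456789.0f;
    const double d = 123456789.0;

    std::cout << "float keeps " << float_digits << " digits, double keeps "
              << double_digits << '\n';
    std::cout << std::fixed << std::setprecision(1)
              << "as float:  " << f << '\n'
              << "as double: " << d << '\n';
}
```

On typical hardware the float line no longer shows the original nine digits, while the double line does.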



Advice: Using the Built-in Arithmetic Types

The number of integral types in C++ can be bewildering. C++, like C, is designed to let programs get close to the hardware when necessary, and the integral types are defined to cater to the peculiarities of various kinds of hardware. Most programmers can (and should) ignore these complexities by restricting the types they actually use.

In practice, many uses of integers involve counting. For example, programs often count the number of elements in a data structure such as a vector or an array. We'll see in Chapters 3 and 4 that the library defines a set of types to use when dealing with the size of an object. When counting such elements it is always right to use the library-defined type intended for this purpose. When counting in other circumstances, it is usually right to use an unsigned value. Doing so avoids the possibility that a value that is too large to fit results in a (seemingly) negative result.
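
As a sketch of counting with a library-defined type, the function below uses `std::vector<int>::size_type` for both the index and the running count. The function name `count_matches` and its parameters are hypothetical, chosen only for this example.

```cpp
#include <vector>

// Count how many elements of values are equal to target.
// size_type is the unsigned type the library defines for vector sizes.
std::vector<int>::size_type count_matches(const std::vector<int>& values,
                                          int target) {
    std::vector<int>::size_type matches = 0;
    for (std::vector<int>::size_type i = 0; i != values.size(); ++i) {
        if (values[i] == target) {
            ++matches;
        }
    }
    return matches;
}
```

Using the container's own size type means the counter can always represent every possible index, whatever size the implementation picks for it.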

When performing integer arithmetic, it is rarely right to use shorts. In most programs, using shorts leads to mysterious bugs when a value is assigned to a short that is bigger than the largest number it can hold. What happens depends on the machine, but typically the value "wraps around" so that a number too large to fit turns into a large negative number. For the same reason, even though char is an integral type, the char type should be used to hold characters and not for computation. The fact that char is signed on some implementations and unsigned on others makes it problematic to use it as a computational type.

On most machines, integer calculations can safely use int. Technically speaking, an int can be as small as 16 bits, which is too small for most purposes. In practice, almost all general-purpose machines use 32 bits for ints, which is often the same size used for long. The difficulty in deciding whether to use int or long occurs on machines that have 32-bit ints and 64-bit longs. On such machines, the run-time cost of doing arithmetic with longs can be considerably greater than doing the same calculation using a 32-bit int. Deciding whether to use int or long requires detailed understanding of the program and the actual run-time performance cost of using long versus int.

Determining which floating-point type to use is easier: It is almost always right to use double. The loss of precision implicit in float is significant, whereas the cost of double precision calculations versus single precision is negligible. In fact, on some machines, double precision is faster than single. The precision offered by long double usually is unnecessary and often entails considerable extra run-time cost.


Exercises Section 2.1.2

Exercise 2.1:

What is the difference between an int, a long, and a short value?

Exercise 2.2:

What is the difference between an unsigned and a signed type?

Exercise 2.3:

If a short on a given machine has 16 bits then what is the largest number that can be assigned to a short? To an unsigned short?

Exercise 2.4:

What value is assigned if we assign 100,000 to a 16-bit unsigned short? What value is assigned if we assign 100,000 to a plain 16-bit short?

Exercise 2.5:

What is the difference between a float and a double?

Exercise 2.6:

To calculate a mortgage payment, what types would you use for the rate, principal, and payment? Explain why you selected each type.

