Aliasing (computing)

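Aliasing is the situation in which a data location in memory can be accessed through more than one name in a program. Modifying the data through one name implicitly changes the value associated with every other alias, which the programmer may not expect. Aliasing therefore makes programs harder to understand, analyze, and optimize. Alias analysis computes and processes information about aliasing in a program.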

Examples

Buffer overflow

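Most implementations of C do not perform bounds checking on array indices. It is therefore possible to write data outside the bounds of an array (a buffer overflow). According to the C standard this is undefined behaviour, but in most implementations without bounds checking it produces the aliasing effect described above: changing the data through one name changes the value seen through the alias.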

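If an array is created on the call stack and a variable happens to be laid out immediately before or after it, writing to an element outside the array's index range may change that variable. For example, given an int array of two elements (named arr) followed by an int variable (named i), if arr[2] (the third element) occupies the same location as i, the two are aliases of each other.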

#include <stdio.h>

int main(void)
{
    int arr[2] = { 1, 2 };
    int i = 10;

    /* Write beyond the end of arr. Undefined behaviour in standard C; will write to i in some implementations. */
    arr[2] = 20;

    printf("element 0: %d \t", arr[0]); // outputs 1
    printf("element 1: %d \t", arr[1]); // outputs 2
    printf("element 2: %d \t", arr[2]); // outputs 20, if aliasing occurred
    printf("i: %d \t\t", i); // might also output 20, not 10, because of aliasing, but the compiler might have i stored in a register and print 10
    /* arr size is still 2. sizeof yields a size_t, so it is printed with %zu. */
    printf("arr size: %zu \n", sizeof(arr) / sizeof(arr[0]));
    return 0;
}

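In some C implementations the output above can occur because the array is allocated as one contiguous block of memory, and an element is accessed by taking the array's address, adding the index multiplied by the element size, and dereferencing the result. C has no bounds checking, so an access can land outside the array. The aliasing effect described here is undefined behaviour. Some implementations may not place stack variables directly next to the array, for example because variables are aligned to a multiple of the processor's native word size. The C standard does not generally specify how data is laid out in memory (ISO/IEC 9899:1999, section 6.2.6.1).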

It is equally acceptable for a C compiler to produce no aliasing effect for accesses beyond the bounds of an array.

Aliased pointers

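Another kind of aliasing can occur in any language in which different variables (for example pointers) can refer to the same memory location. One example is the XOR swap algorithm, whose arguments are two pointers and which assumes that they point to different locations. If the two pointers are in fact equal (that is, aliases of each other), the function fails. This is a common problem for functions that accept pointer arguments; whether aliasing is tolerated must be documented explicitly, particularly for functions that perform complex manipulation of the memory regions passed to them.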


Specified aliasing

Controlled aliasing behaviour may be desirable in some cases (that is, aliasing behaviour that is specified, unlike that enabled by memory layout in C). It is common practice in Fortran. The Perl programming language specifies, in some constructs, aliasing behaviour, such as in foreach loops. This allows certain data structures to be modified directly with less code. For example,

my @array = (1, 2, 3);

foreach my $element (@array) {
   # Increment $element, thus automatically
   # modifying @array, since $element is "aliased"
   # to each of @array's elements in turn.
   $element++;
}

 print "@array \n";

will print out "2 3 4" as a result. If one wanted to bypass aliasing effects, one could copy the contents of the index variable into another and change the copy.

Conflicts with optimization

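When pointers are present, optimizing compilers often have to make conservative assumptions about variables, for example about whether constant propagation can be applied. Code reordering is also affected by aliasing: if the compiler can establish that two accesses do not alias, it may reorder them, which can improve instruction scheduling or enable more loop optimizations.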

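The C99 standard for C introduced the strict aliasing rule (section 6.5, paragraph 7). It states that, with some exceptions, accessing the same memory location through pointers of different types is invalid. A compiler may therefore assume that pointers of different types do not alias, which can yield a large performance gain.[1] Some well-known projects, such as Python 2, violated this rule.[2] The Linux kernel has dealt with similar problems.[3] The gcc option -fno-strict-aliasing disables optimizations based on this rule.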

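C++11 permits the stored value of an object to be accessed through a glvalue of the following types, as exceptions to the strict aliasing rule:

  • the dynamic type of the object
  • a cv-qualified version of the dynamic type
  • the signed or unsigned type corresponding to the dynamic type
  • an aggregate type (such as a struct or class) or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, members of nested types)
  • a base class type of the dynamic type
  • char or unsigned char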

Hardware aliasing

The term aliasing is also used to describe the situation where, due to either a hardware design choice or a hardware failure, one or more of the available address bits is not used in the memory selection process.[4] This may be a design decision if there are more address bits available than are necessary to support the installed memory device(s). In a failure, one or more address bits may be shorted together, or may be forced to ground (logic 0) or the supply voltage (logic 1).

Example

For this example, we assume a memory design with 8 locations, requiring only 3 address lines (or bits), since 2^3 = 8. Address bits (named A2 through A0) are decoded to select unique memory locations as follows, in standard binary counter fashion:

A2 A1 A0 Memory location
0 0 0 0
0 0 1 1
0 1 0 2
0 1 1 3
1 0 0 4
1 0 1 5
1 1 0 6
1 1 1 7

In the table above, each of the 8 unique combinations of address bits selects a different memory location. However, if one address bit (say A2) were to be shorted to ground, the table would be modified as follows:

A2 A1 A0 Memory location
0 0 0 0
0 0 1 1
0 1 0 2
0 1 1 3
0 0 0 0
0 0 1 1
0 1 0 2
0 1 1 3

In this case, with A2 always being zero, the first four memory locations are duplicated and appear again as the second four. Memory locations 4 through 7 have become inaccessible.

If this change occurred to a different address bit, the decoding results would be different, but in general the effect would be the same: the loss of a single address bit cuts the available memory space in half, with resulting duplication (aliasing) of the remaining space.


References

  1. Mike Acton. Understanding Strict Aliasing. 2006-06-01 [retrieved 2017-11-20]. (Archived from the original on 2013-05-08.)
  2. Neil Schemenauer. ANSI strict aliasing and Python. 2003-07-17 [retrieved 2017-11-20]. (Archived from the original on 2020-06-05.)
  3. Linus Torvalds. Re: Invalid compilation without -fno-strict-aliasing. 2003-02-26 [retrieved 2017-11-20]. (Archived from the original on 2020-11-12.)
  4. Michael Barr. Software Based Memory Testing. 2012-07-27 [retrieved 2017-11-20]. (Archived from the original on 2020-11-29.)
