通过定制编程来最大化Java程序性能-技术开发专区

通过定制编程来最大化Java程序性能

作者：IT168 东方蜘蛛编译编辑：覃里 2008-09-10 13:35 来源：IT168�

　　普通排序VS鸽巢(Pigeon-Hole)排序

　　接下来，我们放宽对输入序列的一个限制，允许输入元组作为输入数组的元素，这意味着我们必须在输入数列中保存元素的标识。在这个例子中，我们需要对n个有序整数对进行排序，其中整数对中的第一个值作为排序键(key)。整数对的第二个值被看做“相关数据”。

　　在Listing2中我们显示了使用Collections和Arrays类如何进行排序，从例子中我们看到了OrderedPair类，它包含两个整数，其中k是排序的键值。我们的排序程序所要求的接口是PigeonHoleable。

1 public interface PigeonHoleable
2 {
3     /**
4      * Returns the integer key
5      */
6     int getKey();
7 }
8
9 public class OrderedPair implements PigeonHoleable
10 {
11 private int k;
12     private int x;
13
14 public OrderedPair(int k, int x)
15 {
16 this.k = k;
17 this.x = x;
18 }
19
20     public int getKey()
21 {
22 return k;
23 }
24
25     public String toString()
26     {
27         return "(" + k + "," + x + ")";
28     }
29 }

　　在Listing 3中我们同样使用了Collections和Arrays类来对这些有序对进行排序，但是这次我们还使用了一个可以比较键值的比较器comparator。

　　Listing 3

1 /**
2 * Sorts a list of ordered pairs using java Collections sort
3 */
4 public ArrayList<PigeonHoleable>
5 pairSort(ArrayList<PigeonHoleable> inputSequence)
6 {
7     Collections.sort(inputSequence, new Comparator<PigeonHoleable>(){
8         public int compare(PigeonHoleable o1,PigeonHoleable o2)
9         {
10             return (o1.getKey() - o2.getKey());
11         }
12     });
13
14     return inputSequence;
15 }
16
17 /**
18 * Sorts an array of integers pairs using the Arrays class sort
19 */
20 public PigeonHoleable[]
21 pairSort(PigeonHoleable[] inputSequence)
22 {
23     Arrays.sort(inputSequence, new Comparator<PigeonHoleable>(){
24         public int compare(PigeonHoleable o1,PigeonHoleable o2)
25         {
26             return (o1.getKey() - o2.getKey());
27         }
28     });
29
30     return inputSequence;
31 }

　　在我们的定制程序中包含了鸽巢排序的改进，如Listing 4所示。

1 /**
2 * Implements the pigeon-hole sort algorithm, and maintains
3 * the identity of the elements, so it can deal with other types
4 * than integers. This means that whole records can be sorted
5 * effectively, with the associated fields remaining with the
6 * original element. This is a stable sort.
7 */
8 public PigeonHoleable[] pigeonHoleSort(PigeonHoleable[] inputSequence)
9 {
10     // Find the maximum value
11     int maxValue = -1;
12     for(int i=0; i<inputSequence.length; i++)
13     {
14         int x = inputSequence[i].getKey();
15         if(x > maxValue)
16         {
17             maxValue = x;
18         }
19     }
20
21     // Allocate an array, 1 for each value
22     int[] counts = new int[maxValue + 1];
23
24     // Count the occurences of each key value
25     for(PigeonHoleable b : inputSequence)
26     {
27         counts[b.getKey()] += 1;
28     }
29
30     // Allocate a ragged array of arrays
31     PigeonHoleable[][] slots = new PigeonHoleable[maxValue + 1][];
32     for(int i=0; i<counts.length; i++)
33     {
34         int size = counts[i];
35         if(size>0)
36         {
37             slots[i] = new PigeonHoleable[size];
38         }
39
40         // Zero the counts array as we go so we can reuse them
41         // for indexes
42         counts[i]=0;
43     }
44
45     // Now spin through the input inserting elements directly
46     // into the correct slots
47     for(int i=0; i<inputSequence.length; i++)
48     {
49         PigeonHoleable bs  = inputSequence[i];
50         int k = bs.getKey();
51         int j = counts[k]++;
52         slots[k][j]=bs;
53     }
54
55     // Spin through re-inserting the slot references back into
56     // the input sequence in the correct order
57     for(int i = 0, j = 0; i <= maxValue; i++)
58     {
59         int c = counts[i];
60         for(int k = 0; k < c; k++, j++)
61         {
62             inputSequence[j] = slots[i][k];
63         }
64     }
65
66     return inputSequence;
67 }

　　和前面我们所使用的计数排序一样，这个程序开始先查找输入序列中的最大键值，然后使用它来创建一个计数数组counts。这个输入数列然后再次被检查，在计数数组中相应位置再累加每一个键值出现的次数。接下来是与前面计数排序的不同之处，利用counts数组中的数值，我们创建一个元素为数组的数组slots，并将counts数组的所有值归零。如Figure 1所示。

　　Figure 1

　　然后程序对输入数列进行遍历，并直接将每一个输入序列中元素的插入到slots数组中，使用键值作为第一个索引，counts数组中的值作为第二个索引。在这个循环中，counts数组被用于其它用处，来保存“下一个索引”。这就是为什么每一个counts数组的值要被归零的原因。过程如Figure 2所示。

　　Figure 2

　　最后对这个二维数组的数组进行遍历，按顺序读出每一个指向，就得到了我们要排序的结果。即，如果我们最初的输入序列是(3,6)(1,2)(4,8)(7,1)(1,5) (4,9)(0,4)，则得到排序的结果为(0,4)(1,2)(1,5)(3,6)(4,8)(4,9)(7,1)。

　　这个鸽巢排序算法的实现共对原始序列进行了四次遍历，每一次所需要的时间复杂度为O(n).一旦发现最大值(k)后，第二次累加不同数值出现的次数，第三次则插入指向到子数组中，最后将循环读取子数组的指向到排序后的列表中。在空间方面，这个算法分配了两个数组，counts和slots，大小均为k，另外slots中的每一个子数组也分配空间，空间复杂度为O(n)。这样总体时间复杂度和内存使用约为O(n+k)。和我们前面描述的技术排序一样，这个鸽巢排序是一个可靠的排序，不仅与排序元素数量有关，而且和输入元素中数值的大小有关。

　　现在，我们再次对三种排序方法通过图形来比较其性能，同样分别取不同数量、不同最大关键值的情况进行对比。

　　图4中显示关键值大小在0到100，排序元素数量在20000和1000000之间的时间。

　
图4

　　图5中关键值范围在0到10000之间

图5

　　图6中关键值k=1000000

　　
图6

　　从图6中我们可以看到鸽巢排序在20000

　　Java中Arrays对有序对的排序明显要性能低的多，而Collections排序则表现好一些，比Arrays排序快很多。但是它们两个都比鸽巢排序要慢的多，尤其对于键值k比较小的情况下，鸽巢排序要比Arrays排序性能提高15到17倍，比Collections排序要提高12到13倍。当键值k变大的时候，性能提高倍数不再那么明显，大约有3-4倍，原因可能是指定内存的时间随着k值的增大而变得非常耗时。

　　通过预先分配必要内存来执行排序或重用的方法，还有可能提高这个排序算法的性能，不过这是否可行要取决于要排序的元素的特点。

第1页：高效编程第2页：普通排序VS计数排序第3页：普通排序VS鸽巢(Pigeon-Hole)排序第4页：结论

关注我们