This is why it takes me so long to write (or rewrite) a book. Examples like this, which I think are done, rear up and gobble lots of time. The good thing is that I end up understanding all the problems in much more depth. But it makes books -- especially one the size of
Thinking in Java fourth edition -- take forever. The following was taken directly from the book:
To prevent code duplication and to provide consistency among
tests, Ive put the basic functionality of the test process into a framework.
The following code establishes a base class from which you create a list of anonymous inner classes, one for each different test. Each of these inner classes is
called as part of the testing process. This approach allows you to easily add
and remove new kinds of tests.
This is another example of the Template Method design
pattern. Although you follow the typical approach of overriding the template
method Test.test( ) for each particular test, in this case the core
code (that doesnt change) is in a separate Tester class. The
type of container under test is the generic parameter C:
//: containers/Test.java
// Framework for performing timed tests of containers.
public abstract class Test<C> {
String name;
public Test(String name) { this.name = name; }
// Override this Template Method for different tests.
// Returns actual number of repetitions of test.
abstract int test(C container, TestParam tp);
} ///:~
Each Test object stores the name of that test. When
you call the test() method, it must be given the container to be tested
along with a messenger or data transfer object that holds the various parameters for that particular test. The parameters include size,
indicating the number of elements in the container, and loops, which
control the number of iterations for that test. These parameters may or may not
be used in every test.
Each container will undergo a sequence of calls to test(),
each with a different TestParam, so TestParam also contains static
array() methods that make it easy to create arrays of TestParam
objects. The first version of array() takes a variable argument list
containing alternating size and loops values, and the second
version takes the same kind of list except that the values are inside Stringsthis
way it can be used to parse command-line arguments:
//: containers/TestParam.java
// A "data transfer object."
import java.util.*;
public class TestParam {
public final int size;
public final int loops;
public TestParam(int size, int loops) {
this.size = size;
this.loops = loops;
}
// Create an array of TestParam from a varargs sequence:
public static TestParam[] array(int... values) {
int size = values.length/2;
TestParam[] result = new TestParam[size];
int n = 0;
for(int i = 0; i < size; i++)
result[i] = new TestParam(values[n++], values[n++]);
return result;
}
// Convert a String array to a TestParam array:
public static TestParam[] array(String[] values) {
int[] vals = new int[values.length];
for(int i = 0; i < vals.length; i++)
vals[i] = Integer.decode(values[i]);
return array(vals);
}
} ///:~
To use the framework, you pass the container to be tested
along with a List of Test objects to a Tester.run() method
(these are overloaded generic convenience methods which reduce the amount of
typing necessary to use them). Tester.run() calls the appropriate
overloaded constructor, then calls timedTest(), which executes each test
in the list for that container. timedTest() repeats each test for each
of the TestParam objects in paramList. Because paramList
is initialized from the static defaultParams array, you can change the paramList
for all tests by reassigning defaultParams, or you can change the paramList
for one test by passing in a custom paramList for that test:
//: containers/Tester.java
// Applies Test objects to lists of different containers.
import java.util.*;
public class Tester<C> {
public static int fieldWidth = 8;
public static TestParam[] defaultParams= TestParam.array(
10, 5000, 100, 5000, 1000, 5000, 10000, 500);
// Override this to modify pre-test initialization:
protected C initialize(int size) { return container; }
protected C container;
private String headline = "";
private List<Test<<C>> tests;
private static String stringField() {
return "%" + fieldWidth + "s";
}
private static String numberField() {
return "%" + fieldWidth + "d";
}
private static int sizeWidth = 5;
private static String sizeField = "%" + sizeWidth + "s";
private TestParam[] paramList = defaultParams;
public Tester(C container, List<Test<<C>> tests) {
this.container = container;
this.tests = tests;
if(container != null)
headline = container.getClass().getSimpleName();
}
public Tester(C container, List<Test<<C>> tests,
TestParam[] paramList) {
this(container, tests);
this.paramList = paramList;
}
public void setHeadline(String newHeadline) {
headline = newHeadline;
}
// Generic methods for convenience :
public static <C> void run(C cntnr, List<Test<<C>> tests){
new Tester<C>(cntnr, tests).timedTest();
}
public static <C> void run(C cntnr,
List<Test<<C>> tests, TestParam[] paramList) {
new Tester<C>(cntnr, tests, paramList).timedTest();
}
private void displayHeader() {
// Calculate width and pad with '-':
int width = fieldWidth * tests.size() + sizeWidth;
int dashLength = width - headline.length() - 1;
StringBuilder head = new StringBuilder(width);
for(int i = 0; i < dashLength/2; i++)
head.append('-');
head.append(' ');
head.append(headline);
head.append(' ');
for(int i = 0; i < dashLength/2; i++)
head.append('-');
System.out.println(head);
// Print column headers:
System.out.format(sizeField, "size");
for(Test test : tests)
System.out.format(stringField(), test.name);
System.out.println();
}
// Run the tests for this container:
public void timedTest() {
displayHeader();
for(TestParam param : paramList) {
System.out.format(sizeField, param.size);
for(Test<C> test : tests) {
C kontainer = initialize(param.size);
long start = System.nanoTime();
// Call the template method:
int reps = test.test(kontainer, param);
long duration = System.nanoTime() - start;
long timePerRep = duration / reps; // Nanoseconds
System.out.format(numberField(), timePerRep);
}
System.out.println();
}
}
} ///:~
The stringField() and numberField() methods
produce formatting strings for outputting the results. The standard width for
formatting can be changed by modifying the static fieldWidth value. The displayHeader()
method formats and prints the header information for each test.
If you need to perform special initialization, override the initialize( )
method. This produces an initialized container object of the appropriate
sizeyou can either modify the existing container object or create a new one.
You can see in test( ) that the result is captured in a local
reference called kontainer, which allows you to replace the
stored member container with a completely different initialized
container.
The return value of each Test.test() method must be
the number of operations performed by that test, which is used to calculate the
number of nanoseconds required for each operation. You should be aware that System.nanoTime()
typically produces values with a granularity that is greater than one (and this
granularity will vary with machines and operating systems), and this will
produce a certain amount of rattle in the results.
The results may vary from machine to machine; these tests
are only intended to compare the performance of the different containers.
Here is a performance test for the most essential of the List operations. For comparison, it also shows the most important Queue operations. Two separate lists of tests are created for testing each
class of container. In this case, Queue operations only apply to LinkedLists.
//: containers/ListPerformance.java
// Demonstrates performance differences in Lists.
// {Args: 100 500} Small to keep build testing short
import java.util.*;
import net.mindview.util.*;
public class ListPerformance {
static Random rand = new Random();
static int reps = 1000;
static List<Test<List<Integer>>> tests =
new ArrayList<Test<List<Integer>>>();
static List<Test<LinkedList<Integer>>> qTests =
new ArrayList<Test<LinkedList<Integer>>>();
static {
tests.add(new Test<List<Integer>>("add") {
int test(List<Integer> list, TestParam tp) {
int loops = tp.loops;
int listSize = tp.size;
for(int i = 0; i < loops; i++) {
list.clear();
for(int j = 0; j < listSize; j++)
list.add(j);
}
return loops * listSize;
}
});
tests.add(new Test<List<Integer>>("get") {
int test(List<Integer> list, TestParam tp) {
int loops = tp.loops * reps;
int listSize = list.size();
for(int i = 0; i < loops; i++)
list.get(rand.nextInt(listSize));
return loops;
}
});
tests.add(new Test<List<Integer>>("set") {
int test(List<Integer> list, TestParam tp) {
int loops = tp.loops * reps;
int listSize = list.size();
for(int i = 0; i < loops; i++)
list.set(rand.nextInt(listSize), 47);
return loops;
}
});
tests.add(new Test<List<Integer>>("iteradd") {
int test(List<Integer> list, TestParam tp) {
final int LOOPS = 1000000;
int half = list.size() / 2;
ListIterator<Integer> it = list.listIterator(half);
for(int i = 0; i < LOOPS; i++)
it.add(47);
return LOOPS;
}
});
tests.add(new Test<List<Integer>>("insert") {
int test(List<Integer> list, TestParam tp) {
int loops = tp.loops;
for(int i = 0; i < loops; i++)
list.add(5, 47); // Minimize random access cost
return loops;
}
});
tests.add(new Test<List<Integer>>("remove") {
int test(List<Integer> list, TestParam tp) {
int loops = tp.loops;
int size = tp.size;
for(int i = 0; i < loops; i++) {
list.clear();
list.addAll(new CountingIntegerList(size));
while(list.size() > 5)
list.remove(5); // Minimize random access cost
}
return loops * size;
}
});
// Tests for queue behavior:
qTests.add(new Test<LinkedList<Integer>>("addFirst") {
int test(LinkedList<Integer> list, TestParam tp) {
int loops = tp.loops;
int size = tp.size;
for(int i = 0; i < loops; i++) {
list.clear();
for(int j = 0; j < size; j++)
list.addFirst(47);
}
return loops * size;
}
});
qTests.add(new Test<LinkedList<Integer>>("addLast") {
int test(LinkedList<Integer> list, TestParam tp) {
int loops = tp.loops;
int size = tp.size;
for(int i = 0; i < loops; i++) {
list.clear();
for(int j = 0; j < size; j++)
list.addLast(47);
}
return loops * size;
}
});
qTests.add(
new Test<LinkedList<Integer>>("rmFirst") {
int test(LinkedList<Integer> list, TestParam tp) {
int loops = tp.loops;
int size = tp.size;
for(int i = 0; i < loops; i++) {
list.clear();
list.addAll(new CountingIntegerList(size));
while(list.size() > 0)
list.removeFirst();
}
return loops * size;
}
});
qTests.add(new Test<LinkedList<Integer>>("rmLast") {
int test(LinkedList<Integer> list, TestParam tp) {
int loops = tp.loops;
int size = tp.size;
for(int i = 0; i < loops; i++) {
list.clear();
list.addAll(new CountingIntegerList(size));
while(list.size() > 0)
list.removeLast();
}
return loops * size;
}
});
}
static class ListTester extends Tester<List<Integer>> {
public ListTester(List<Integer> container,
List<Test<List<Integer>>> tests) {
super(container, tests);
}
// Fill to the appropriate size before each test:
@Override protected List<Integer> initialize(int size){
container.clear();
container.addAll(new CountingIntegerList(size));
return container;
}
// Convenience method:
public static void run(List<Integer> list,
List<Test<List<Integer>>> tests) {
new ListTester(list, tests).timedTest();
}
}
public static void main(String[] args) {
if(args.length > 0)
Tester.defaultParams = TestParam.array(args);
// Can only do these two tests on an array:
Tester<List<Integer>> arrayTest =
new Tester<List<Integer>>(null, tests.subList(1, 3)){
// This will be called before each test. It
// produces a non-resizeable array-backed list:
@Override protected
List<Integer> initialize(int size) {
Integer[] ia = Generated.array(Integer.class,
new CountingGenerator.Integer(), size);
return Arrays.asList(ia);
}
};
arrayTest.setHeadline("Array as List");
arrayTest.timedTest();
Tester.defaultParams= TestParam.array(
10, 5000, 100, 5000, 1000, 1000, 10000, 200);
ListTester.run(new ArrayList<Integer>(), tests);
ListTester.run(new LinkedList<Integer>(), tests);
ListTester.run(new Vector<Integer>(), tests);
Tester.fieldWidth = 12;
Tester<LinkedList<Integer>> qTest =
new Tester<LinkedList<Integer>>(
new LinkedList<Integer>(), qTests);
qTest.setHeadline("Queue tests");
qTest.timedTest();
}
} /* Output: (Sample)
--- Array as List ---
size get set
10 130 183
100 130 164
1000 129 165
10000 129 165
--------------------- ArrayList ---------------------
size add get set iteradd insert remove
10 121 139 191 435 3952 446
100 72 141 191 247 3934 296
1000 98 141 194 839 2202 923
10000 122 144 190 6880 14042 7333
--------------------- LinkedList ---------------------
size add get set iteradd insert remove
10 182 164 198 658 366 262
100 106 202 230 457 108 201
1000 133 1289 1353 430 136 239
10000 172 13648 13187 435 255 239
----------------------- Vector -----------------------
size add get set iteradd insert remove
10 129 145 187 290 3635 253
100 72 144 190 263 3691 292
1000 99 145 193 846 2162 927
10000 108 145 186 6871 14730 7135
-------------------- Queue tests --------------------
size addFirst addLast rmFirst rmLast
10 199 163 251 253
100 98 92 180 179
1000 99 93 216 212
10000 111 109 262 384
*///:~
Each test requires careful thought to ensure that you are
producing meaningful results. For example, the add test clears the List
and then refills it to the specified list size. The call to clear() is
thus part of the test, and may have an impact on the time, especially for small
tests. Although the results here seem fairly reasonable, you could imagine
rewriting the test framework so that there is a call to a preparation method
(which would, in this case, include the clear() call) outside of
the timing loop.
Note that for each test, you must accurately calculate the
number of operations that occur and return that value from test(), so
the timing is correct.
The get and set tests both use the random
number generator to perform random accesses to the List. In the output,
you can see that, for a List backed by an array and for an ArrayList,
these accesses are fast and very consistent regardless of the list size,
whereas for a LinkedList the access times grow very significantly for
larger lists. Clearly, linked lists are not a good choice if you will be
performing many random accesses.
The iteradd test uses an iterator in the middle of
the list to insert new elements. For an ArrayList this gets expensive as
the list gets bigger, but for a LinkedList it is relatively cheap, and
constant regardless of size. This makes sense because an ArrayList must
create space and copy all its references forward during an insertion. This
becomes expensive as the ArrayList gets bigger. A LinkedList only
needs to link in a new element, and doesnt have to modify the rest of the
list, so you expect the cost to be roughly the same regardless of the list
size.
The insert and remove tests both use
location number 5 as the point of insertion or removal, rather than either end
of the List. A LinkedList treats the end points of the List
speciallythis improves the speed when using a LinkedList as a Queue.
However, if you add or remove elements in the middle of the list, you include
the cost of random access, which weve already seen varies with the different List
implementations. By performing the insertions and removals at location five,
the cost of the random access should be negligible and we should see only the
cost of insertion and removal, but we will not see any specialized optimization
for the end of a LinkedList. You can see from the output that the cost
of insertion and removal in a LinkedList is quite cheap and doesnt vary
with the list size, but with an ArrayList, insertions especially are very
expensive, and the cost increases with list size.
From the Queue tests, you can see how quickly a LinkedList
can insert and remove elements from the endpoints of the list, which is optimal
for Queue behavior.
Normally, you can just call Tester.run(), passing the
container and the tests list. Here, however, we must override the initialize()
method so that the List is cleared and refilled before each
testotherwise the List control over the size of the List would
be lost during the various tests. ListTester inherits from Tester
and performs this initialization using CountingIntegerList. The run()
convenience method is also overridden.
Wed also like to compare array access to container access
(primarily against ArrayList). In the first test in main(), a
special Test object is created using an anonymous inner class. The initialize()
method is overridden to create a new object each time it is called (ignoring
the stored container object, so null is the container
argument for this Tester constructor). The new object is created using Generated.array( )
(which was defined in the Arrays chapter) and Arrays.asList().
Only two of the tests can be performed in this case, because you cannot insert
or remove elements when using a List backed by an array, so the List.subList()
method is used to select the desired tests from the tests list.
For random-access get( ) and set( )
operations, a List backed by an array is slightly faster than an ArrayList,
but the same operations are dramatically more expensive for a LinkedList
because it is not designed for random-access operations.
Vector should be avoided; its only in the library
for legacy code support (the only reason it works in this program is because it
was adapted to be a List for forward compatibility).
The best approach is probably to choose an ArrayList
as your default and to change to a LinkedList if you need its extra
functionality or you discover performance problems due to many insertions and
removals from the middle of the list. If you are working with a fixed-sized
group of elements, either use a List backed by an array (as produced by Arrays.asList( )),
or if necessary, an actual array.
CopyOnWriteArrayList is a special implementation of List
used in concurrent programming, and will be discussed in the Concurrency
chapter.
Depending on the behavior you desire, you can choose a TreeSet, a HashSet, or a LinkedHashSet. The
following test program gives an indication of the performance trade-off between these implementations:
//: containers/SetPerformance.java
// Demonstrates performance differences in Sets.
// {Args: 100 5000} Small to keep build testing short
import java.util.*;
public class SetPerformance {
static List<Test<Set<Integer>>> tests =
new ArrayList<Test<Set<Integer>>>();
static {
tests.add(new Test<Set<Integer>>("add") {
int test(Set<Integer> set, TestParam tp) {
int loops = tp.loops;
int size = tp.size;
for(int i = 0; i < loops; i++) {
set.clear();
for(int j = 0; j < size; j++)
set.add(j);
}
return loops * size;
}
});
tests.add(new Test<Set<Integer>>("contains") {
int test(Set<Integer> set, TestParam tp) {
int loops = tp.loops;
int span = tp.size * 2;
for(int i = 0; i < loops; i++)
for(int j = 0; j < span; j++)
set.contains(j);
return loops * span;
}
});
tests.add(new Test<Set<Integer>>("iterate") {
int test(Set<Integer> set, TestParam tp) {
int loops = tp.loops * 10;
for(int i = 0; i < loops; i++) {
Iterator<Integer> it = set.iterator();
while(it.hasNext())
it.next();
}
return loops * set.size();
}
});
}
public static void main(String[] args) {
if(args.length > 0)
Tester.defaultParams = TestParam.array(args);
Tester.fieldWidth = 10;
Tester.run(new TreeSet<Integer>(), tests);
Tester.run(new HashSet<Integer>(), tests);
Tester.run(new LinkedHashSet<Integer>(), tests);
}
} /* Output: (Sample)
------------- TreeSet -------------
size add contains iterate
10 746 173 89
100 501 264 68
1000 714 410 69
10000 1975 552 69
------------- HashSet -------------
size add contains iterate
10 308 91 94
100 178 75 73
1000 216 110 72
10000 711 215 100
---------- LinkedHashSet ----------
size add contains iterate
10 350 65 83
100 270 74 55
1000 303 111 54
10000 1615 256 58
*///:~
The performance of HashSet is generally superior to TreeSet,
but especially when adding elements and looking them up, which are the two most
important operations. TreeSet exists because it maintains its elements
in sorted order, so you use it only when you need a sorted Set. Because
of the internal structure necessary to support sorting and because
iteration is something youre more likely to do, iteration is usually faster
with a TreeSet than a HashSet.
Note that LinkedHashSet is more expensive for
insertions than HashSet; this is because of the extra cost of
maintaining the linked list along with the hashed container.
This program gives an indication of the trade-off between Map implementations:
//: containers/MapPerformance.java
// Demonstrates performance differences in Maps.
// {Args: 100 5000} Small to keep build testing short
import java.util.*;
public class MapPerformance {
static List<Test<Map<Integer,Integer>>> tests =
new ArrayList<Test<Map<Integer,Integer>>>();
static {
tests.add(new Test<Map<Integer,Integer>>("put") {
int test(Map<Integer,Integer> map, TestParam tp) {
int loops = tp.loops;
int size = tp.size;
for(int i = 0; i < loops; i++) {
map.clear();
for(int j = 0; j < size; j++)
map.put(j, j);
}
return loops * size;
}
});
tests.add(new Test<Map<Integer,Integer>>("get") {
int test(Map<Integer,Integer> map, TestParam tp) {
int loops = tp.loops;
int span = tp.size * 2;
for(int i = 0; i < loops; i++)
for(int j = 0; j < span; j++)
map.get(j);
return loops * span;
}
});
tests.add(new Test<Map<Integer,Integer>>("iterate") {
int test(Map<Integer,Integer> map, TestParam tp) {
int loops = tp.loops * 10;
for(int i = 0; i < loops; i ++) {
Iterator it = map.entrySet().iterator();
while(it.hasNext())
it.next();
}
return loops * map.size();
}
});
}
public static void main(String[] args) {
if(args.length > 0)
Tester.defaultParams = TestParam.array(args);
Tester.run(new TreeMap<Integer,Integer>(), tests);
Tester.run(new HashMap<Integer,Integer>(), tests);
Tester.run(new LinkedHashMap<Integer,Integer>(),tests);
Tester.run(
new IdentityHashMap<Integer,Integer>(), tests);
Tester.run(new WeakHashMap<Integer,Integer>(), tests);
Tester.run(new Hashtable<Integer,Integer>(), tests);
}
} /* Output: (Sample)
---------- TreeMap ----------
size put get iterate
10 748 168 100
100 506 264 76
1000 771 450 78
10000 2962 561 83
---------- HashMap ----------
size put get iterate
10 281 76 93
100 179 70 73
1000 267 102 72
10000 1305 265 97
------- LinkedHashMap -------
size put get iterate
10 354 100 72
100 273 89 50
1000 385 222 56
10000 2787 341 56
------ IdentityHashMap ------
size put get iterate
10 290 144 101
100 204 287 132
1000 508 336 77
10000 767 266 56
-------- WeakHashMap --------
size put get iterate
10 484 146 151
100 292 126 117
1000 411 136 152
10000 2165 138 555
--------- Hashtable ---------
size put get iterate
10 264 113 113
100 181 105 76
1000 260 201 80
10000 1245 134 77
*///:~
Insertions for all the Map implementations except for
IdentityHashMap get significantly slower as the size of the Map
gets large. In general, however, lookup is much cheaper than insertion, which
is good because youll typically be looking items up much more often than you
insert them.
Hashtable performance is roughly the same as HashMap.
Since HashMap is intended to replace Hashtable, and thus uses the
same underlying storage and lookup mechanism (which you will learn about later)
this is not too surprising.
A TreeMap is generally slower than a HashMap.
As with TreeSet, a TreeMap is a way to create an ordered list.
The behavior of a tree is such that its always in order and doesnt have to be
specially sorted. Once you fill a TreeMap, you can call keySet( ) to get a Set view of the keys, then toArray( ) to produce an array of
those keys. You can then use the static method Arrays.binarySearch( )
to rapidly find objects in your sorted array. Of course, this only makes sense
if the behavior of a HashMap is unacceptable, since HashMap is
designed to rapidly find keys. Also, you can easily create a HashMap from
a TreeMap with a single object creation or call to putAll(). In
the end, when youre using a Map, your first choice should be HashMap,
and only if you need a constantly sorted Map will you need TreeMap.
LinkedHashMap tends to be slower than HashMap
for insertions because it maintains the linked list (to preserve insertion
order) in addition to the hashed data structure. Because of this list,
iteration is faster.
IdentityHashMap has different performance because it
uses == rather than equals( ) for comparisons. WeakHashMap
is described later in this chapter.