One of the things that make numerical work on computers so much fun is that often, with a bit of cleverness, you can improve a computation tremendously. You may have a problem that your best computer can not solve in anywhere near an acceptable time, but reformulate it and you may be able to do it easily on your PC or Mac at home.
In introductory numerical classes, the often mentioned example is using Cramer's rule to solve for 25 unknown numbers. Many students have learned in their math classes how to solve 3 or 4 unknowns using Cramers's rule, but nobody warned them to use a different method for more unknowns. Solving for 25 unknowns is no big deal, there are many simple methods that can do it very quickly , but Cramer's rule is not one of them! It is about 100,000,000,000,000,000,000,000 times slower than the standard method of doing it. The best supercomputers would be dust before they finished the computation.
As you might guess, this sort of difference is exceptional. It is due to choosing a particularly poor numerical method, Cramer's rule. The problem that Elke and I faced did not look open to improvement at all. We wanted to compute flows of air or water around obstacles such as wires or wings. The numerical method was to be a `vortex method', in which thousands of small tornados, called vortices, are used to represent the motion of the air or water. The difficulty was, we had to find the velocity of each vortex to follow their motion, and the velocity of each depends on all other vortices.
To see the difficulty, imagine that we have shouting people instead of vortices. For example, suppose that there are 10,000 shouting people scattered throughout a very large field. The question is, how much noise does each person hear. To answer the question, you would for each person determine how much noise he receives from every other person, by finding the distance from that person and computing the noise level at that distance. Sum the noises from all 9,999 other persons to get the total noise.
This works, but since we have to compute 9,999 distances for each person, the amount of computational work is proportional to 10,000 times 9,999. It is proportional to the square of the number of people. If there would be 20,000 people instead of 10,000, we would need to ompute about four times as many distances, not two times. For large number of people, the work per person simply becomes too much to handle.
You might wonder how we could avoid computing all distances, since clearly all noises need to be summed to get the total noise. But suppose that there is a group of people all fairly close to each other, and that we replace the noise produced by all these people by a crying baby in the middle which is as loud as the whole group of people together. Of course, that would not work if you stood nearby the group of people; you would hear the difference between the noises coming from different directions and the one coming from the baby at the center. But it would work if you stood too far away to make out the difference, and under such conditions, only the distance to the baby has to be computed, not all the distances to the individual people.
So using the baby, we can eliminate some of the distances we would have to compute. We can repeat this for other groups of nearby people. And there are still other games we can play. Suppose there are two nearby babies replacing different groups of people. At really large distances, we could replace the two babies by a superbaby which is again twice as loud. (He may need some spinach.)
As you might guess, the basic idea turned out to be not particularly
original; the real test was in doing it effectively for people (vortices)
spread about in an iregular pattern.
Here is a picture how Elke and I grouped vortices together:
The picture is for a flow around a wire, seen at the left. The vortices may be seen as dark or light points mainly within the smallest squares.
Did it work? Well, here are some computational times, in seconds, first if we compute all distances and then using the new method we developed:
Vortices: 400 800 1600 3200 6400 12800 25600 Old time: 9.1 36.3 151.9 586.8 2354.1 9404.8 37556.0 New time: 4.7 10.6 25.1 55.4 128.5 286.3 597.8 Ratio: 2 3 6 11 18 33 63So, the new method is already 63 times faster for 25,000 vortices and this just will get more for more vortices.
To put this in perspective, it is the difference between getting the results of your computation back the next day or getting it back in two months. You may be able to live with the first, but probably not with the second. I am still using this scheme for my computations, although other researchers such as Carrier, Greengard and Rokhlin have formulated similar schemes that are still faster. They are also more complex: the people are wearing headphones and receive the noise of the babies over the radio. Someday I will write or get a version of one of those procedures.
Van Dommelen, L. L. (1985) Adaptive-panel vortex summation for the CYBER 205. 38th Annual Meeting Div. Fluid Dynamics, Bulletin of the American Physical Society, 30.
Van Dommelen, L. L. & Rundensteiner, E. A. (1989) Fast solution of the two-dimensional Poisson equation with point-wise forcing. Journal of Computational Physics 83, 126-147.
Carrier, J., Greengard, L. \& Rokhlin, V. (1986) A fast adaptive multipole algorithm for particle simulations. Yale University Research Report YALEU/DCS/RR-496.
Greengard, L. \& Rokhlin, V. (1988) On the efficient implementation of the fast multipole algorithm. Yale University Research Report YALEU/DCS/RR-602.