# The evolution of massively parallel processes in geophysics

From compression to pre-stack volumes

Evolution of processed data volumes (a) and computer power (b) over the past decade. The vertical scales are expressed relative to the 1990 figures.

Almost all the major theoretical groundwork in the field of geophysics was laid down well before the birth of any of the present geophysical contractors or oil companies. In this regard, geophysical data processing is a field that was brought to maturity by an external factor: the evolution of digital computers. The parallel development of computer technology and commercial geophysics has thus underpinned the profession.

For the most part, seismic data processing has its roots in the wave equation, Fourier theory, and signal processing. The origins of the wave equation date back to the work by Hooke on springs (1678), with the formulations as we know them today resulting from subsequent works in the mid-1800s by Navier, Cauchy, Green, Stokes, and Kelvin. In fact, not much fundamental work has been added to this body of knowledge since the works on elastic wave propagation by Rayleigh, Love, and Zoeppritz almost a century ago.

Data transforms (primarily that of Fourier) have been at the heart of many mathematical applications since their introduction in the 1800s. Additionally, the works of Laplace, Bessel, and Hilbert have also contributed to the ability to represent phenomena with mathematical models. Again, little fundamental development has occurred in these domains in recent decades, except perhaps that of Morlet et al (1982) and others, who introduced the wavelet transform.

Signal processing owes much of its fundamental background to the work of Wiener (1930, 1949), and also to later contributions by Levinson (1947), Robinson (1954), and Treitel (1970). Associated with the topic of signal processing, there has been significant development in inversion theory (Backus & Gilbert, 1967), and its associated topic: deconvolution.

Inversion has lent itself to many topics in contemporary geophysics, from solving statics problems to constructing velocity-depth models for migration, while deconvolution has allowed waveforms to be compressed, permitting consistent interpretation and de-reverberation of multiple-contaminated data. Some of the more notable methods in inversion include Monte Carlo simulation techniques, notably simulated annealing. The latter has been applied to the statics problem (Rothman, 1986) and to acoustic impedance inversion (Gluck et al, 1997).

#### Digital computer

Time slice through a multi-azimuth common receiver gather before (a) and after (b) 3D least squares filtering to remove the ground roll (red).

Setting aside the early attempts in the 1800s by Babbage to build a "difference engine," the real onset of the digital computer age was in the late 1950s and early 1960s. This was driven in part by the US space exploration program, when programmable digital computers began to be used for both military and scientific/engineering applications. The theoretical background had been set a little earlier by such prominent scientists as John von Neumann and Alan Turing.

The availability of fast, accurate computing power permitted the field of geophysics to blossom: computations that would have been inconceivable were suddenly tractable on a routine basis. The strength of the geophysical community was in keeping pace with the available tools, moving from single-trace processes (such as deconvolution and NMO) to multi-channel filtering as the hardware permitted.

The earliest computers were more massive than they were powerful, and as computational speed increased, many physical limitations had to be overcome.

Present day machines are mostly either massively parallel or clustered. In other words, the industry relies on many small (but individually fast) processors to handle the computations, where the data is broken into small chunks, each processed separately by the processors, and the final results combined at a later stage.
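The split, process, and combine pattern described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names (`split_into_chunks`, `scale_chunk`, `process_in_parallel`) and the per-trace operation (peak normalisation) are hypothetical, and a thread pool stands in here for the separate processors or cluster nodes a production system would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(traces, n_chunks):
    """Split a list of traces into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(traces) // n_chunks))  # ceiling division
    return [traces[i:i + size] for i in range(0, len(traces), size)]


def scale_chunk(chunk):
    """Hypothetical per-trace process: normalise each trace by its peak amplitude."""
    out = []
    for trace in chunk:
        # Dead (all-zero) or empty traces are left unscaled.
        peak = max((abs(s) for s in trace), default=0.0) or 1.0
        out.append([s / peak for s in trace])
    return out


def process_in_parallel(traces, n_workers=4):
    """Process each chunk independently, then recombine in the original order."""
    chunks = split_into_chunks(traces, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(scale_chunk, chunks))  # map preserves chunk order
    # Combine step: flatten the per-chunk results back into one list of traces.
    return [trace for chunk in results for trace in chunk]
```

The pattern only works this simply when each trace can be processed without reference to its neighbours; multi-channel processes need overlapping chunks or communication between workers.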

The current problem remains the sheer volume of data being processed. As the hardware has improved (disc storage is now acceptably cheap for each terabyte of data), we have grown more ambitious in what we try to accomplish (and client expectations have grown in line with these capabilities). In addition, with the advent of multi-component data acquisition, we have also tripled the amount of data we acquire. For example, a 3D marine multi-component survey can easily consist of about 15 terabytes of data.

#### Algorithmic development

The real expansion in our field was not so much in the underlying theory as in the development of discrete (digital) solutions to the long-established equations. In addition, certain problems (for example, extraction of the eigenvalues of a complex matrix) were considered intractable until the advent of digital computers.

Originally, the sheer data volumes involved still required various forms of data compression to be used. The most basic of these was stacking. In other words, rather than processing all the elemental traces acquired in the field, the industry transformed the traces from the finite offsets obtained with the given recording geometry to zero-offset (via the NMO transform; Dix, 1955). In this way, all subsequent processes were simplified in two ways: there were fewer traces, and the equations to be numerically solved were simplified from the finite-offset case to the zero-offset case.
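To make the compression step concrete, the sketch below applies the standard hyperbolic moveout relation, t(x) = sqrt(t0² + x²/v²), to each trace of a gather and then averages the corrected traces into a single zero-offset trace. It is an illustration under simplifying assumptions (one constant velocity, linear interpolation, no stretch mute), and the function names are hypothetical rather than taken from any processing package.

```python
import math


def nmo_correct(trace, offset, velocity, dt):
    """Map a finite-offset trace to zero-offset time.

    For each output time t0, read the input at the hyperbolic time
    t = sqrt(t0**2 + (offset / velocity)**2), using linear interpolation.
    Output samples whose input time falls off the end of the trace stay zero.
    """
    n = len(trace)
    out = [0.0] * n
    for i in range(n):
        t0 = i * dt
        t = math.sqrt(t0 * t0 + (offset / velocity) ** 2)
        pos = t / dt
        j = int(pos)
        if j + 1 < n:
            frac = pos - j
            out[i] = (1.0 - frac) * trace[j] + frac * trace[j + 1]
    return out


def stack(gather, offsets, velocity, dt):
    """NMO-correct every trace in the gather and average them sample by sample."""
    corrected = [nmo_correct(tr, x, velocity, dt) for tr, x in zip(gather, offsets)]
    fold = len(corrected)
    return [sum(samples) / fold for samples in zip(*corrected)]
```

A gather of N traces is reduced to one trace, which is the N-fold reduction in data volume that made subsequent processing affordable.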

Subsequent improvements such as DMO moved to two-dimensional treatment of data using a partial migration to permit more accurate imaging while still avoiding full pre-stack processing. DMO was first introduced in the pioneering work of Sherwood in his 'Devilish' program (1978), and later improved by Yilmaz & Claerbout (1980) and Hale (1983).

The fast Fourier transform (Herbert, 1962; Cooley & Tukey, 1965) enabled access to the frequency domain for solving all manner of signal processing problems: perhaps this development alone was the single-most influential step in allowing access to computational solutions of geophysical problems. Fast transforms allowed us to work on data in three dimensions (rather than needing to split the data into subsets of dimensions). With such techniques at our disposal, advanced multi-dimensional algorithms could be implemented, for example, noise filtering techniques.
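The saving offered by the fast transform can be seen by comparing a direct evaluation of the discrete Fourier transform, which costs on the order of N² operations, with a radix-2 decimation-in-time recursion of the kind described by Cooley & Tukey, which costs on the order of N log N. The following is a teaching sketch in plain Python, restricted to power-of-two lengths; the function names are illustrative.

```python
import cmath


def dft(x):
    """Direct O(N^2) discrete Fourier transform, for reference."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for m in range(n // 2):
        # Twiddle factor combines the two half-length transforms.
        tw = cmath.exp(-2j * cmath.pi * m / n) * odd[m]
        out[m] = even[m] + tw
        out[m + n // 2] = even[m] - tw
    return out
```

Both functions return the same spectrum to within rounding error; only the operation count differs, and that difference grows rapidly with trace length.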

In addition, Claerbout's application of finite differencing to the solution of the acoustic wave equation (1970) was among the first major advances in formulating geophysical equations in ways that computers could handle (see also Loewenthal et al, 1976).
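As an illustration of what finite differencing means in practice, the sketch below advances the one-dimensional wave equation, u_tt = c² u_xx, with the textbook explicit second-order scheme. It is not Claerbout's migration scheme, which addressed a different form of the equation; the one-dimensional setting, the fixed boundaries, and the function names are all assumptions made to keep the example short.

```python
def step_wave(u_prev, u_curr, courant):
    """Advance the 1D wave equation u_tt = c^2 u_xx by one time step.

    courant = c * dt / dx; this explicit scheme is stable for courant <= 1.
    The two end points are held at zero (fixed boundaries).
    """
    c2 = courant * courant
    n = len(u_curr)
    u_next = [0.0] * n
    for i in range(1, n - 1):
        # Second spatial difference replaces the second derivative in x.
        laplacian = u_curr[i + 1] - 2.0 * u_curr[i] + u_curr[i - 1]
        # Second time difference, rearranged to give the next time level.
        u_next[i] = 2.0 * u_curr[i] - u_prev[i] + c2 * laplacian
    return u_next


def propagate(u_prev, u_curr, courant, n_steps):
    """Apply step_wave repeatedly and return the wavefield after n_steps."""
    for _ in range(n_steps):
        u_prev, u_curr = u_curr, step_wave(u_prev, u_curr, courant)
    return u_curr
```

The derivatives are replaced by differences between neighbouring grid values, so each time step needs only additions and multiplications, which is exactly the kind of arithmetic a digital computer performs well.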

Again, the trick employed by processing geophysicists lay in keeping up with the available hardware technology. For example, in order to provide an algorithmic solution to a physical problem, technologists first have to describe the problem by some approximate set of equations that can be solved numerically.

An example of this is migration. For a constant velocity medium, an expanding wavefront is spherical. Thus, a migration impulse response should appear as a semicircle in a seismic section. However, solving the correct form of the wave equation was not viable with the computer power available 20 years ago, hence approximations were made. The solution taken was to represent the wave equation with another (more easily solved) equation; namely that of a heart shape (cardioid). The approximation is acceptable for small dips, but is noisy. However, this was all we could afford to do with the hardware available.
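The exact constant-velocity response mentioned above is simple to write down. Assuming zero-offset data and two-way travel time, an impulse recorded at surface position x0 and time t0 could have come from any point at distance v·t0/2, that is, from the semicircle (x − x0)² + z² = (v·t0/2)². The short sketch below evaluates that depth curve; the function name and the choice to return `None` outside the semicircle are illustrative, and the cardioid approximation is not reproduced here.

```python
import math


def migration_semicircle(x0, t0, velocity, xs):
    """Depth of the zero-offset impulse response at each lateral position in xs.

    Assumes constant velocity and two-way time t0, so the radius is
    velocity * t0 / 2. Positions outside the semicircle return None.
    """
    radius = velocity * t0 / 2.0
    depths = []
    for x in xs:
        h2 = radius * radius - (x - x0) ** 2
        depths.append(math.sqrt(h2) if h2 >= 0.0 else None)
    return depths
```

For example, with a velocity of 2000 m/s and an impulse at 1 s, the radius is 1000 m, so the response is 1000 m deep directly beneath the impulse and reaches the surface 1000 m to either side.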

As hardware speed increased, we were able to code more accurate algorithms (still for our compressed data: the stack), and in recent years, we have been able to work directly on the uncompressed (pre-stack) data. Moving from the stack to full processing of all the individual elemental traces has permitted greater accuracy and precision in forming images of the subsurface. Migrating all the pre-stack data with a 3D pre-stack depth migration algorithm, we are better able to image the sub-salt structure.

Impulse response of a 3D migration operator using (a) a fast and inexpensive cardioid approximation to the wave equation versus (b) a more exact solution.

The evolution of seismic data processing has been inextricably tied to the emergence of massive computing power. Long-understood and well-formulated problems could only be addressed once rapid numerical solutions could be implemented on digital computers. In addition, the availability of massive computing power has encouraged us to move beyond the necessity of data compression (stacking) and to work directly on the full pre-stack data volumes, as acquired from the field recordings.

*A complete list of references is available from the authors.*