Seismic Data Label Design
for Machine Learning

  • Designed a multi-feature federated filtering algorithm that creates novel machine learning labels for seismic data decoupling and noise reduction

  • Applied a support vector machine (SVM) model to test the classification results of the new algorithm, improving accuracy by 3% at an acceptable cost

  • Completed the mathematical theory of the new algorithm and implemented the core code of the seismic data processing APIs in C++

Background

Seismic data decoupling plays a significant role in seismic imaging. Seismic data usually contain a great deal of noise, such as surface waves, multiples, scattered noise, and random noise. Among these, the surface wave is the most prominent interference: it is characterized by strong amplitude, low frequency, low apparent velocity, and dispersion. In reflection seismic exploration, surface waves and reflected waves are coupled in the seismic record, which severely reduces the signal-to-noise ratio of the data and affects the quality of seismic imaging. Because the imaging results reflect the geological structure, it is necessary to suppress or eliminate the surface wave during data preprocessing.

Keywords: Data Decoupling, Label Design, Federated Filter, Support Vector Machine

Introduction

In this project, by studying the frequency, direction, energy, and dispersion characteristics of the noise (surface wave), a federated filter using multiple features is designed to improve filtering accuracy. Meanwhile, a machine learning algorithm is tested with the new labels, demonstrating their feasibility in seismic data processing.

Raw Data

As shown in the raw data images, the record is distance-time data with distance (meters) on the x-axis and time (milliseconds) on the y-axis. Events (1), (2), and (3) are the data we want to keep, while (4) is noise that is coupled with the useful data. In the space-time domain, we cannot eliminate the noise directly. With traditional methods, we can separate useful data from noise by data transformation techniques. However, each single method has its own drawbacks and cannot suppress the noise with high accuracy or with little harm to the useful signals. Therefore, we design a multi-feature filter to improve the accuracy of noise recognition.

Frequency Feature

Now let us look at the data from some single receivers. In the time domain, the data in red circles are noise, and the rest are useful signals. It is obvious that noise accounts for a large portion and that the useful data are sparsely distributed. In practice, there are several hundred receivers in one seismic record and hundreds of records in a survey, which is what provides detailed underground information. Picking useful data record by record and receiver by receiver would take an impractical amount of time and money. Thus, we need data transformation techniques to denoise the data efficiently.

We notice that if we transform time into frequency via the Fourier Transform and obtain space-frequency domain data, it becomes much easier to decouple the data, even though a little coupling remains, because the noise occupies a relatively low frequency range. In the frequency domain, the red square marks the noise data and the yellow square marks the useful data. We can set a threshold to eliminate the noise directly and apply the inverse transform to return to the space-time domain. However, some important data share the same low-frequency range as the noise, so this approach harms the useful data and loses some information.

Single Receiver in Time Domain
Single Receiver in Frequency Domain

As shown in the filtered results, some useful data are clearly lost in the filtered part. In fact, the accuracy of the frequency-based method is around 70%, which is unsatisfactory.

Label Creation

dictionary: {(space, time): (space, frequency)}

set a frequency threshold F

  • {(space, time): (space, frequency > F)} -> label 0
  • {(space, time): (space, frequency <= F)} -> label 1

0 for useful data

1 for noise data
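The frequency-threshold labeling above can be sketched as follows. This is a minimal NumPy illustration, not the project's actual C++ implementation; the function name, array shapes, and the way muted bins are inverse-transformed are assumptions.

```python
import numpy as np

def frequency_labels(traces, dt, f_threshold):
    """Label each frequency bin per receiver: 1 = noise (frequency <= F),
    0 = useful (frequency > F), then mute noise-labeled bins and
    inverse-transform back to the space-time domain.

    traces:      (n_receivers, n_samples) array in the space-time domain
    dt:          time sampling interval in seconds
    f_threshold: frequency threshold F in Hz
    """
    n_samples = traces.shape[1]
    freqs = np.fft.rfftfreq(n_samples, d=dt)         # frequency axis in Hz
    bin_labels = (freqs <= f_threshold).astype(int)  # 1 marks the low-frequency noise band
    labels = np.broadcast_to(bin_labels, (traces.shape[0], freqs.size))

    spectrum = np.fft.rfft(traces, axis=1)           # space-frequency domain
    spectrum[:, bin_labels == 1] = 0.0               # mute bins labeled as noise
    filtered = np.fft.irfft(spectrum, n=n_samples, axis=1)
    return labels, filtered
```

The drawback described above shows up directly here: any useful energy inside the muted band is lost along with the noise.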

Maintained Part
Filtered Part

Actually, not only time has a frequency; space does as well, which is called wavenumber. Since we can create a space-frequency domain, we can also create a wavenumber-frequency domain, a.k.a. the F-K domain, using a 2-dimensional Fourier Transform. In this domain, the noise can also be separated from the useful data more easily.

The blue circle marks the useful data, the red circle marks the noise, and the yellow circle marks a kind of numerical error introduced by the transformation, which is a drawback of this method. As we can see, in this domain the separation metric should be slope.

Raw Data
Raw Data in F-K Domain

We can use a slope threshold to remove the noise effectively with accuracy around 87%, though at the cost of more computation. In the filtering result, we keep the majority of the useful data with a little noise left.

Result of F-K Method
Raw Data in F-K Domain
Useful Data in F-K Domain
Label Creation

dictionary: {(space, time): (wave number, frequency)}, set a slope threshold S

  • {(space, time): (wave number, frequency)} with slope > S -> label 0, 0 for useful data
  • {(space, time): (wave number, frequency)} with slope <= S -> label 1, 1 for noise data
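The slope f/k in the F-K domain is the apparent velocity of an event, so the slope threshold S can be expressed as a velocity cutoff: slow events (the surface-wave fan) sit below it. A hedged sketch of this labeling, assuming the threshold is given as an apparent velocity in m/s:

```python
import numpy as np

def fk_labels(traces, dt, dx, v_threshold):
    """F-K domain labels per (wavenumber, frequency) cell:
    0 = useful (slope f/k above threshold, high apparent velocity),
    1 = noise  (slope f/k at or below threshold, slow surface waves).

    traces: (n_x, n_t) space-time array; dt, dx: sampling intervals.
    """
    n_x, n_t = traces.shape
    fk = np.fft.fft2(traces)              # 2-D FFT: axis 0 -> wavenumber, axis 1 -> frequency
    k = np.fft.fftfreq(n_x, d=dx)         # wavenumber axis (cycles per meter)
    f = np.fft.fftfreq(n_t, d=dt)         # frequency axis (Hz)
    kk, ff = np.meshgrid(k, f, indexing="ij")
    with np.errstate(divide="ignore", invalid="ignore"):
        velocity = np.abs(ff) / np.abs(kk)  # apparent velocity |f/k|
    velocity[np.abs(kk) == 0] = np.inf      # k = 0: infinite apparent velocity, keep
    labels = (velocity <= v_threshold).astype(int)  # 1 = noise fan
    fk[labels == 1] = 0.0                   # mute the noise fan
    filtered = np.real(np.fft.ifft2(fk))
    return labels, filtered
```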
Direction Feature

From the raw data image, the most significant difference between the useful data and the noise is that they travel along different directions, which means they have different velocities. Accordingly, a transformation technique that collects this direction information (the Radon Transform, with accuracy around 89%) is also a good way to split the data.

The red circle marks the noise, the yellow circle marks the useful data, and the blue circle marks the numerical error of this transformation, called energy leakage, which is its drawback. After the transformation, we can set a region R to keep the useful data while removing the noise. There is no doubt that this still harms the useful data to some degree.

Raw Data
Raw Data Direction Feature
Label Creation

dictionary: {(space, time): (ray parameter, time)}

set a region threshold R

  • {(space, time): (ray parameter, time)} in region R -> label 0
  • {(space, time): (ray parameter, time)} not in region R -> label 1

0 for useful data

1 for noise data
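A common way to collect the direction information is a linear Radon (tau-p) transform implemented as a slant stack: each ray parameter p sums the record along lines t = tau + p*x, so a linear event focuses at its own p. The sketch below uses nearest-sample shifts and defines the noise region R simply as "ray parameter at or above a cutoff" (slow events such as surface waves have large p); both simplifications are assumptions, not the project's actual filter.

```python
import numpy as np

def slant_stack(traces, dt, x, p_values):
    """Linear Radon (tau-p) transform by slant stacking.
    traces: (n_x, n_t) space-time array; x: receiver offsets in meters.
    Returns a (len(p_values), n_t) panel: stacks along t = tau + p * x."""
    n_x, n_t = traces.shape
    panel = np.zeros((len(p_values), n_t))
    for ip, p in enumerate(p_values):
        for ix in range(n_x):
            shift = int(round(p * x[ix] / dt))   # nearest-sample time shift
            if 0 <= shift < n_t:
                panel[ip, :n_t - shift] += traces[ix, shift:]
    return panel

def radon_labels(radon_panel, p_values, p_noise_min):
    """Label each (ray parameter, tau) cell: 1 = noise (p >= p_noise_min,
    i.e. low apparent velocity), 0 = useful (inside the kept region R)."""
    labels = np.zeros_like(radon_panel, dtype=int)
    labels[p_values >= p_noise_min, :] = 1
    return labels
```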

Maintained Part
Filtered Part
Energy Feature

The energy feature extraction method is similar to the frequency feature extraction method. Typically, in seismic data, the energy of some noise is even larger than that of the effective signals, which means the noise will cover the useful data, so noise reduction is necessary. In the frequency method, we only use the frequency difference as an indicator, but the energy of the noise, represented as amplitude in the frequency domain, also differs dramatically from that of the useful data, yielding accuracy around 74%. Therefore, based on the statistical information of the amplitude, we can set an energy threshold to discriminate useful data from noise.

Label Creation

dictionary: {(space, time): (space, (frequency, amplitude))}

set an energy threshold E

  • {(space, time): (space, (frequency, amplitude < E))} -> label 0, 0 for useful data
  • {(space, time): (space, (frequency, amplitude >= E))} -> label 1, 1 for noise data
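The energy labeling can be sketched in the same shape as the frequency filter, with the threshold applied to spectral amplitude instead of frequency. A fixed threshold E is used here for simplicity; in practice E would come from the amplitude statistics mentioned above.

```python
import numpy as np

def energy_labels(traces, e_threshold):
    """Label each (receiver, frequency) cell by spectral amplitude:
    1 = noise (amplitude >= E, high-energy components such as surface waves),
    0 = useful (amplitude < E). Muted bins are inverse-transformed back.

    traces: (n_receivers, n_samples) array in the space-time domain.
    """
    spectrum = np.fft.rfft(traces, axis=1)
    amplitude = np.abs(spectrum)
    labels = (amplitude >= e_threshold).astype(int)
    spectrum[labels == 1] = 0.0                      # mute high-energy noise bins
    filtered = np.fft.irfft(spectrum, n=traces.shape[1], axis=1)
    return labels, filtered
```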
Dispersion Feature

In seismic data, the noise also has a special feature called dispersion, which makes it look like a broom in the space-time domain. This feature can be extracted as dispersion curves by mathematical transformation. As shown in the figure, the red line is the dispersion curve of the noise. However, since it is very hard to pick this feature automatically, we drop it here.

Federated Filter
Workflow
Final Labels
Maintained Part
Filtered Part

Now we combine the frequency, energy, and direction features of the seismic data to find the overlapping samples labeled as noise. The resulting filter reaches 91% accuracy. Each of the methods we used has more advanced variants that further improve accuracy, but they are time-consuming. By combining these methods, even using only their naive versions to save time and cost, we obtain a satisfying result.
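The federated combination can be sketched as a vote over the single-feature labels, assuming each label set has already been mapped back to per-(space, time) samples. "Overlapping data labeled as noise" is read here as requiring all filters to agree (min_votes=3); a looser majority vote is exposed as a parameter, since the exact combination rule is an assumption.

```python
import numpy as np

def federated_labels(freq_labels, energy_labels, direction_labels, min_votes=3):
    """Combine per-sample labels from the single-feature filters
    (each array uses 0 = useful, 1 = noise) into final labels.
    A sample is marked as noise only when at least `min_votes`
    of the filters agree that it is noise."""
    votes = freq_labels + energy_labels + direction_labels
    return (votes >= min_votes).astype(int)
```

Requiring agreement is what keeps the federated filter conservative: a sample that only one weak filter flags (e.g. low-frequency useful data) is preserved, which is how the combined accuracy can exceed each individual method's.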