## Sparse Regression and Data Preparation for Machine Learning in Seismic Data Processing

**Maxim Ryabinskiy (speaker), Dr. Dmitriy Finikov**

*Russia, Yandex.Terra*

Building regression models of various kinds is a frequent task in seismic data processing. In particular, such models are used to filter different types of noise, both regular (coherent) and random. Today the most common method of processing and analysing seismic observations is the reflection method, which treats every recorded wave type other than primary (singly reflected) waves, such as direct waves, surface waves and multiple reflections, as regular noise. Random noise is a consequence of non-ideal acquisition conditions and equipment. In addition, data are very often recorded on an irregular grid, while traditional filtering methods require regularly sampled input, so some resampling procedure is needed. To solve these problems we use specially tuned regression models. The talk outlines two of the most popular examples. The first is sparse regression, which assigns greater weights to those fragments of seismic records that deviate more strongly from the general level and, at the same time, show a certain regularity. The second is the approximation of the desired signal sequences by a multivariate autoregressive model of finite order. In this case we tune the model parameters with various heuristics, in particular pre-filtering the input data to attenuate noise and selecting the autoregression parameters from the filtered data. The resulting model is then applied to the raw data, so that the desired signal is not suppressed by the pre-filtering. This approach can be used either to model the desired signal directly or to model the interference so that it can be subtracted from the raw data.
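
As an illustration of the sparse regression idea, the following minimal sketch fits a small Fourier dictionary to a noisy, irregularly sampled trace and then evaluates the fit on a regular grid. The abstract does not specify the algorithm, so everything here is an assumption: the Fourier dictionary, the iteratively reweighted least-squares scheme (in which coefficients that stand out from the general level are penalised less on the next pass), and the hypothetical names `fourier_matrix`, `irls_sparse_fit`, `lam` and `eps`.

```python
import numpy as np

def fourier_matrix(x, kmax):
    """Columns are complex harmonics exp(2*pi*i*k*x) for k = -kmax..kmax."""
    k = np.arange(-kmax, kmax + 1)
    return np.exp(2j * np.pi * np.outer(x, k))

def irls_sparse_fit(A, d, lam=0.5, n_iter=20, eps=1e-6):
    """Sparsity-promoting least squares via iteratively reweighted least squares."""
    d = np.asarray(d, dtype=complex)
    m = np.linalg.lstsq(A, d, rcond=None)[0]   # plain least-squares start
    AhA = A.conj().T @ A
    Ahd = A.conj().T @ d
    for _ in range(n_iter):
        # Weak coefficients get the largest penalty; strong ones are barely damped.
        penalty = lam / (np.abs(m) + eps)
        m = np.linalg.solve(AhA + np.diag(penalty), Ahd)
    return m

def signal(x):
    """Synthetic test signal (an assumption): two harmonics of a unit interval."""
    return np.cos(2 * np.pi * 3 * x) + 0.5 * np.sin(2 * np.pi * 7 * x)

rng = np.random.default_rng(0)
n = 60
# Irregular grid: nominal positions j/n, jittered by up to 0.3 of a sample interval.
x_irr = (np.arange(n) + rng.uniform(-0.3, 0.3, n)) / n
x_reg = np.arange(2 * n) / (2 * n)             # regular output grid

d = signal(x_irr) + 0.02 * rng.standard_normal(n)

m = irls_sparse_fit(fourier_matrix(x_irr, 10), d)
d_reg = (fourier_matrix(x_reg, 10) @ m).real   # resampled, denoised trace

rel_err = np.linalg.norm(d_reg - signal(x_reg)) / np.linalg.norm(signal(x_reg))
print("significant coefficients:", int(np.sum(np.abs(m) > 0.05)))
print("relative reconstruction error:", rel_err)
```

The same fit handles both problems named in the abstract at once: the model is estimated from irregular samples and can be evaluated at any regular positions, and the reweighting suppresses coefficients that look like random noise. The penalty strength `lam` trades noise suppression against a small amplitude bias on the retained coefficients.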

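
The autoregressive workflow (pre-filter, select the parameters from the filtered data, apply the model to the raw data) can be sketched as follows. This is a deliberately simplified, single-trace illustration: the abstract describes a multivariate model, and the moving-average pre-filter, the least-squares fit, the model order and the hypothetical helpers `fit_ar` and `apply_ar` are all assumptions made for the example.

```python
import numpy as np

def fit_ar(x, order):
    """Least-squares AR coefficients a such that x[t] ~ sum_k a[k] * x[t-1-k]."""
    n = len(x)
    lagged = np.column_stack([x[order - 1 - k : n - 1 - k] for k in range(order)])
    return np.linalg.lstsq(lagged, x[order:], rcond=None)[0]

def apply_ar(x, a):
    """One-step prediction of each sample from the preceding len(a) samples."""
    order, n = len(a), len(x)
    pred = np.array(x, dtype=float)            # first `order` samples are copied as-is
    pred[order:] = 0.0
    for k, coeff in enumerate(a):
        pred[order:] += coeff * x[order - 1 - k : n - 1 - k]
    return pred

rng = np.random.default_rng(0)
t = np.arange(400)
clean = np.sin(2 * np.pi * 0.03 * t)           # synthetic "desired signal"
raw = clean + 0.3 * rng.standard_normal(t.size)

# Step 1: pre-filter to attenuate noise (a 5-point moving average, as an assumption).
filtered = np.convolve(raw, np.ones(5) / 5, mode="same")

# Step 2: select the autoregression parameters from the filtered data.
coeffs = fit_ar(filtered, order=12)

# Step 3: apply the resulting model to the raw data, not to the filtered data.
modelled = apply_ar(raw, coeffs)

# Alternative use: treat the model output as interference and subtract it.
residual = raw - modelled

def rms(v):
    return float(np.sqrt(np.mean(v ** 2)))

print("RMS error of raw data:       ", rms(raw - clean))
print("RMS error of AR model output:", rms(modelled - clean))
```

Whether the model output is actually closer to the desired signal than the raw trace depends on the model order, the pre-filter and the noise level, which is why the abstract speaks of heuristics for tuning these parameters. The script prints both errors so the choices can be compared.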