PNEC 2019

The Movement of Seismic Data from "Random SEGY0-ish" to Machine Learning Ready SEG-Y_r2 to Auto-Read Your Data #seismic #bestpractices #digitize #metadata

22 May 19
1:30 PM - 2:00 PM

Tracks: Master and Reference Data Management

Computer processing, interpretation and analysis of seismic data have traditionally been batch processes tailored by a human user and applied to individual datasets. Artificial intelligence technology has produced machine learning (ML) systems that are increasingly used across diverse scientific fields to automatically mine data for hidden patterns, build models, and make decisions with minimal human intervention. This technology has considerable potential to support exploration and production (E&P) activities by efficiently extracting new information from seismic data of all types, whether legacy 2D and 3D volumes or the latest 4D monitoring and microseismic datasets. However, even the most powerful computer systems and advanced ML algorithms are worthless unless the data presented to them are easily decodable and carry reliable metadata such as positioning information and date of recording. For analysis of an entire oilfield or licence area, the input should ideally include all available vintages and types of seismic data, and these should all be readily accessible and compatible.

The Society of Exploration Geophysicists (SEG) defined its SEG-Y format in 1973, and it has since been the default standard for storing and exchanging processed seismic data. It was developed at a time when most exploration was onshore and the usual recording medium was 9-track tape. A revision in 2002 extended the standard to handle 3D acquisition and some high-capacity media. Other technological developments and proprietary system requirements led E&P companies and suppliers of seismic services to develop their own deviations from and/or additions to the standard SEG-Y format, leading to inconsistencies that can prevent problem-free input to ML systems.

In 2017 the SEG formally published the specification of its SEG-Y revision 2.0 data exchange format. This was the result of several years' work by the SEG Technical Standards Committee, a team of representatives from E&P companies and seismic service providers chaired by Jill Lewis, Managing Director of Troika International. The new format is designed to address the extensive recent, and predicted further, developments in data acquisition and processing capabilities such as time-lapse (4D), multicable, multicomponent and seabed data. It aims to ensure accurate capture of these various types of data and their associated metadata, future-proofing the valuable information whether it resides on disk or on modern tape media.

The revised format provides for up to 65,535 additional 240-byte trace headers for metadata, more than 4 billion samples per trace, variable sample intervals and trillions of traces per line or ensemble. It also supports additional data sample formats, including IEEE double precision (64-bit), little-endian and pair-wise byte swapping to improve I/O performance, microsecond accuracy in time and date stamps, and additional precision on coordinates, depths and elevations. SEG-Y_r2.0 also accommodates an unlimited number and size of self-defining stanzas (sets of related groups of information), not only in the extended textual file headers but also after the last data trace in a file. These stanzas can contain useful metadata such as the processing history and the coordinate reference system, which defines the specific map projection used for the positioning information. The format can also include depth, velocity, electromagnetic, gravity and rotational sensor data.
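Because the binary file header carries the fields that tell a reader which revision, sample format and byte order it is dealing with, a short sketch of probing those fields may help illustrate what "easily decodable" means in practice. The byte positions below follow the published SEG-Y rev 2.0 layout; the file name and function name are illustrative, and a production reader would also handle pair-wise byte swapping and legacy rev 0 quirks that this sketch ignores.

    import struct

    TEXT_HEADER_BYTES = 3200      # 40 "card images" of 80 characters each
    BINARY_HEADER_BYTES = 400

    def probe_segy(path):
        """Read a few SEG-Y binary file header fields (no trace data)."""
        with open(path, "rb") as f:
            f.seek(TEXT_HEADER_BYTES)
            binhdr = f.read(BINARY_HEADER_BYTES)

        # Bytes 3297-3300 (rev 2.0): the constant 0x01020304 written in the file's
        # native byte order, so a reader can detect big- or little-endian files
        # without guessing.  Pre-rev 2 files leave this field unset, and pair-wise
        # swapped files (0x02010403) are not handled in this sketch.
        order_word = binhdr[96:100]
        if order_word == b"\x04\x03\x02\x01":
            endian = "<"                  # little-endian
        else:
            endian = ">"                  # big-endian (classic SEG-Y)

        fmt_code = struct.unpack(endian + "h", binhdr[24:26])[0]        # bytes 3225-3226
        major, minor = binhdr[300], binhdr[301]                         # bytes 3501-3502
        n_ext_text = struct.unpack(endian + "h", binhdr[304:306])[0]    # bytes 3505-3506
        n_ext_trace = struct.unpack(endian + "i", binhdr[306:310])[0]   # bytes 3507-3510 (rev 2)

        return {
            "revision": f"{major}.{minor}",
            "sample_format_code": fmt_code,          # e.g. 1 = IBM float, 5 = IEEE float
            "extended_textual_headers": n_ext_text,
            "additional_240_byte_trace_headers": n_ext_trace,
            "byte_order": "little" if endian == "<" else "big",
        }

    if __name__ == "__main__":
        print(probe_segy("line_001_final_stack.sgy"))    # illustrative file name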
To take full advantage of current and future developments in technologies such as ML, the author encourages owners and suppliers of seismic data to adopt SEG-Y_r2.0 for their new data and to migrate existing datasets to the new standard. The process of migration should include quality control procedures to verify the integrity of the seismic data and check the accuracy of positioning and other important metadata.
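As one illustration of the kind of quality control the author recommends during migration, the following sketch scans trace headers for missing coordinates, inconsistent coordinate scalars and absent recording dates. It assumes the open-source segyio package; the fields checked, the thresholds and the file name are illustrative rather than a complete QC procedure.

    import segyio

    def qc_report(path):
        """Collect a few basic metadata problems before migrating a file."""
        issues = []
        with segyio.open(path, ignore_geometry=True) as f:
            interval = f.bin[segyio.BinField.Interval]     # sample interval, microseconds
            if interval <= 0:
                issues.append("binary header sample interval is not positive")

            scalars, years, zero_coord = set(), set(), 0
            for i in range(f.tracecount):
                hdr = f.header[i]
                scalars.add(hdr[segyio.TraceField.SourceGroupScalar])
                years.add(hdr[segyio.TraceField.YearDataRecorded])
                # Zero coordinates usually mean positioning was never loaded.
                if hdr[segyio.TraceField.CDP_X] == 0 and hdr[segyio.TraceField.SourceX] == 0:
                    zero_coord += 1

            if zero_coord:
                issues.append(f"{zero_coord} traces carry no CDP or source coordinates")
            if len(scalars) > 1:
                issues.append(f"inconsistent coordinate scalars: {sorted(scalars)}")
            if years == {0}:
                issues.append("no recording year stamped in the trace headers")
        return issues

    if __name__ == "__main__":
        for problem in qc_report("legacy_line_1987.sgy"):   # illustrative file name
            print(problem)

Checks like these, run before and after conversion to SEG-Y_r2.0, give a simple audit trail that the migrated file still carries the positioning and date metadata the original held.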