Big Data Growth Continues in Seismic Surveys

The amount of data gathered in oil and gas marine and land surveys continues to grow, according to an official with CGG.

Geophysical firm CGG officially entered the Big Data market in 1971 with its first 3D seismic acquisition survey for the oil and gas industry.

The company had no commercial system to store or analyze data, and had to create systems to ferret out information critical for clients, said Hovey Cox, senior vice president of marketing & strategy for geology, geophysics and reservoir at CGG. Since then, the company has been pushing the edge in terms of its ability to collect and analyze data, helping its clients make better decisions.

Historically, most of CGG’s analytical capabilities have been deterministic, but today the company has a broad spectrum of statistical and geostatistical analysis tools and new Big Data techniques to glean more insight from data. Techniques such as wide azimuth – which illuminates a target from as many angles as possible – are also generating significantly more data.

The amount of data that CGG collects from its land and marine seismic surveys has grown considerably in recent years, Cox said during a webinar on how Big Data is driving innovation and business success in oil and gas. A land seismic survey conducted in 2005 had 400,000 sensors per square kilometer; by 2009, that number had reached 36 million. From 2005 to 2009, the average volume of data gathered on an eight-hour shift grew from 100 gigabytes to more than 2 terabytes, Cox said.

The number of channels – or pixels – on a crew grew from 8,000 in 2005 to 40,000 in 2009, and the number of computers used for processing data grew from three desktop PCs in 2005 to a 72-PC cluster in 2009. This year, CGG had more than 100,000 channels for a land seismic crew, with the goal of reaching one million, said Cox.
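To put the 2005 and 2009 land-survey figures side by side, the growth factors can be worked out directly. The sketch below is illustrative only: the numbers are the ones quoted by Cox, "more than 2 terabytes" is taken as exactly 2,000 gigabytes (decimal units, an assumption), and the `figures` table and `growth_factor` helper are names introduced here.

```python
# Figures quoted in the article for CGG land seismic crews: (2005, 2009).
figures = {
    "sensors per square kilometer": (400_000, 36_000_000),
    "data per eight-hour shift (GB)": (100, 2_000),  # assumes 2 TB = 2,000 GB
    "channels per crew": (8_000, 40_000),
    "processing PCs": (3, 72),
}


def growth_factor(start, end):
    """Return how many times larger `end` is than `start`."""
    return end / start


for name, (v2005, v2009) in figures.items():
    print(f"{name}: {growth_factor(v2005, v2009):.0f}x")
```

Run as written, this reports growth of 90x in sensor density, 20x in data per shift, 5x in channels and 24x in processing PCs. Since the quoted shift volume was "more than" 2 terabytes, the 20x figure is a lower bound.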

Marine seismic surveys are more complex, with a vessel towing a streamer spread a kilometer wide and about 10 kilometers long, the distance of a 10K race. CGG uses multiple vessels to create wide and full azimuth views to open the second eye and see in stereo below the sea floor surface. These streamers represent the largest moving infrastructure on earth, and can be seen from space as they move and collect data.

“These surveys are intense from all points of operations, requiring a tremendous amount of logistics and management of data,” Cox noted.

The data gathered in seismic surveys is big from a quantity standpoint.

“Every seven days, data the equivalent of a U.S. Library of Congress in size is gathered by each seismic vessel,” said Cox.

In its StagSeis survey of the Gulf of Mexico, more than 1.5 petabytes of data was gathered, and CGG received four copies of the data from its vessels. These copies represent very large data sets.

The information CGG gathers in its seismic surveys contains both Big Data and Fast Data. In 2009, the amount of Fast Data flowing from its seismic operations was coming in between 200 and 400 megabytes per second, Cox noted.
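As a rough sense of scale, the quoted Fast Data rate can be converted to a daily volume. This is a back-of-the-envelope sketch, not a figure from the webinar: it assumes the rate is sustained around the clock and uses decimal units (1 TB = 1,000,000 MB). The helper name `daily_volume_tb` is introduced here for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_volume_tb(rate_mb_per_s):
    """Convert a sustained rate in MB/s to TB/day (decimal units)."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1_000_000


# The 200 and 400 MB/s bounds come from the article; continuous
# operation for a full day is an assumption.
for rate in (200, 400):
    print(f"{rate} MB/s sustained -> {daily_volume_tb(rate):.2f} TB/day")
```

Under those assumptions, 200 to 400 megabytes per second works out to roughly 17 to 35 terabytes per day.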

A significant amount of time is also involved in crunching these data sets, with resolution and speed critical. The raw data gathered in these surveys looks like a red mat, and requires a lengthy processing sequence to go from solid red to a format in which patterns can be detected. The company has 40 computing centers worldwide; one of its larger computing centers is within the range of major processing centers such as Sequoia, with more than 100 petabytes/day of active data.

Finer sampling and fuller azimuth mean more resolution with raw data, and a better view of legacy and target data for better decision-making. Step-changes in subsalt imaging allowed CGG to get more detail in 2014 out of Gulf of Mexico seismic data from 2006, allowing more confidence in 3D interpretation for well placing and identifying offsetting prospects, Cox said. CGG can now see details such as channels from historic rivers.

“Taking data and being able to bring it together is allowing CGG to better understand rock properties in seismic surveys,” said Cox.

CGG’s data continues to grow in volume, variety and velocity, said Cox, adding that the company will continue to invest in Big Data technologies to address this data. This includes technology from Objectivity, which CGG has used for the past decade to store processed data, metadata and derived data.

Object Modelling Behind Objectivity Technology

Founded in the late 1980s, Objectivity is a developer of high-performance distributed object data technology, with deep domain expertise in fast data fusion and decades of experience in “beyond petabyte” data volumes, said Brian Clark, corporate VP of product at Objectivity, during the webinar.

The technology is based on object data modelling, which came out of the telecommunication and manufacturing industries’ need to deal with a diverse array of data. Object data modelling provided the agility needed to support complex, multidimensional queries. Because of the proliferation of data models and associated techniques, this technology filled a niche role. But the phenomenon of Big Data and complex sensor data has again triggered wider use of object data modelling, which provides support for what is called multi-dimensional indexing.

Like the other companies with whom Objectivity works, CGG needed a technology that would allow it to handle lots of data very quickly. The company’s technology fuses Big Data and Fast Data together so that data can be analyzed and generate business value for a company.

CGG and Schlumberger, through its WesternGeco division, are facing similar challenges. Methodologies for collecting data such as wide azimuth scans are generating much more high-density data than previous methods did, and overall, seismic surveys are gathering more data than ever. The larger volumes of overall data and high-density data are challenging the cost-effectiveness of current systems. Besides cost, another driver behind the adoption of Big Data technology is the need to do more interesting analytics on a particular data set or small data set.

Objectivity’s technology combines Big Data – data gathered from mobile devices that historically is used for batch analytics and can wait – with Fast Data, or sensor data, flowing from many sources of Internet of Things (IoT) technologies. This data is transformed through filtering, refining, cleaning and value-add steps, and then stored beside Objectivity’s application alongside existing data on Hadoop as needed.

Objectivity’s “secret sauce” is a federated database, which is actually a collection of databases that can be spread around the network. The processing can be spread around as well. Due to its architecture, a client application can access data anywhere in the federation – it doesn’t need to know the actual location, said Clark.

“Within the database, we organized the objects and relationship into containers, which are a way of clustering a logical group of objects together for efficient physical access,” Clark said. “In the object database world, the relationships between the data are as important if not more important than the data itself. Each data has its own unique 64-bit ID, meaning that we can address thousands of trillions of unique objects, and thousands of petabytes of storage.”

Finding an object by its ID is very fast, regardless of the number of objects or the amount of storage. As a result, data and processing can be placed where they are needed. Because the architecture is designed to be distributed, data and processing can be spread across the network, allowing for a more cost-effective, scalable solution.
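The size of a 64-bit ID space, and the reason an ID lookup need not slow down as a store grows, can be sketched in a few lines. The layout below is hypothetical: the article says only that each object has a unique 64-bit ID, so the four 16-bit fields and the names `database`, `container`, `page` and `slot` are assumptions chosen to show how an ID could encode a location directly.

```python
ID_BITS = 64
total_ids = 2 ** ID_BITS  # upper bound on distinct 64-bit identifiers

# Hypothetical layout, for illustration only: four 16-bit fields.
FIELD_BITS = 16
FIELD_MASK = (1 << FIELD_BITS) - 1  # 0xFFFF


def pack_id(database, container, page, slot):
    """Pack four 16-bit location fields into one 64-bit integer ID."""
    for value in (database, container, page, slot):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError("each field must fit in 16 bits")
    return (database << 48) | (container << 32) | (page << 16) | slot


def unpack_id(object_id):
    """Recover the four location fields from a packed 64-bit ID."""
    return tuple((object_id >> shift) & FIELD_MASK for shift in (48, 32, 16, 0))


object_id = pack_id(database=1, container=2, page=3, slot=4)
print(f"{total_ids:,} possible IDs")
print(hex(object_id), unpack_id(object_id))
```

When an ID carries its own location in this way, resolving it is a handful of shifts and masks rather than a search, so the cost does not depend on how many objects exist. Whether Objectivity's IDs are structured like this is not stated in the article.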

Why are objects a good fit? Originally, data fusion dealt with hard data or sensor data, and objects are a good way to express this information. Information fusion brings much more soft, or unstructured data, such as email, social media, text, audio and video, into the picture. Expressing this data as objects allows for a better understanding of this data, such as meaning, context, and ontologies. Both objects and relationships are needed to express this information, Clark said.

Besides CGG, the company has worked with companies that have been doing the equivalent of information fusion for many years. These include Boeing, government agencies, telco and network customers, technology partners such as Databricks and Intel, and SI partners like Raytheon.

The coming market for the industrial Internet of Things is huge not only in volume, but also in the commercial value of using data to solve business problems. The vast majority of data for industrial IoT will come from sensors delivering streaming and Fast Data, said Clark. At the moment, the biggest obstacle is making sensor data useful to analytics.

“There’s been lots of data collected, but really getting the value out of it is critical.”

Big Data About More Than The Three Vs

Big Data is about more than the three Vs of volume, velocity and variety; it is about finding hidden patterns that are meaningful to a business and making data actionable, said Noel Yuhanna, principal analyst at Forrester Research, during the webinar.

The explosion of structured and unstructured data from many directions in recent years has resulted in more than 3.5 zettabytes of data available on the public net, Yuhanna said.

Oil and gas firms that leverage Big Data are experiencing better business growth and innovation versus their peers, Yuhanna said.

“When you look at data in different directions and the values they bring, it’s fundamentally different from what oil and gas has done in the past. Previously, oil and gas companies primarily worked with structured data, and structured analysis and reviews.”

Today, sources of data such as social, cloud, mobile, video, sensors and the Internet of Things bring new opportunities, such as gathering and analyzing data on potential customers, not just existing ones.

Big Data encompasses structured, unstructured and binary data. More than 60 percent of data is unstructured; this data group includes free-form text, emails and tweets, and will continue to grow as the use of mobile devices, sensors and other data sources increases.

Right now, only 15 percent of available data is being used for insights and analysis, Yuhanna said. A lack of tools, approach and initiatives is why so little of the available data is used. By “flipping the iceberg of data” – or using tools available now such as Hadoop – the remaining 85 percent of that data could be accessed, which could make a huge difference.

IoT is driving new types of use cases that enable new insights and prediction. IoT is all about gathering sensor data from different types of machines, engines and factories. Cisco Systems predicts that, by 2020, 50 billion of these devices will be connected. Forrester Research estimates IoT usage in manufacturing at 30 percent; by 2019, that percentage is expected to double.

Yuhanna sees lots of advantages of IoT, such as the ability to predict machine failures. The costs of these failures can run into the millions, and could impact other businesses. The predictive capabilities of IoT technologies are being used in John Deere’s future farming vision. John Deere combines weather, satellite, seed and soil data for better crop management.

Companies in the railroad industry are using temperature, vibration and acoustic sensor data to predict maintenance needs, increase safety and reduce costs. Automobile manufacturers also are leveraging Big Data, using geospatial and analysis data, aggregated data sets and telematics/sensor data to optimize routes for self-driving cars and communicate with other cars, as well as link to biometrics and notify local authorities of potholes.

Yuhanna sees numerous use cases for IoT technology in manufacturing, including supply chain management, asset, cargo and container management, machine diagnostics, machine telemetry, industrial automation and real-time equipment monitoring. Technologies like Hadoop are allowing these use cases in manufacturing to happen because of Hadoop’s ability to scale across billions and billions of petabytes of data.

What’s driving the need for IoT technology? The need for faster real-time access to information, the use of mobile devices, new sensors that track data in real time, and competitive pressure to support real-time data to access new insights and advanced analytics. But concerns over the security and compliance of data, the integration of data, and support from executives are holding up implementation of IoT technology.

“What we need is to integrate data from systems of engagement, operation and record to drive new insight, interactions and growth,” said Yuhanna. “Just gathering Big Data is not enough. Big Data and fast data must be combined with other types of data. That’s what we see happening in larger, successful organizations.”

