Big Data More Than Just About Numbers
Gathering insight from Big Data means analyzing not just numbers, but words and phrases as well. To make sense of Big Data, industries such as oil and gas need search engine technology that can glean information from structured and unstructured data sets, not only to enhance safety, efficiency and productivity, but also to predict events before they happen.
The founders of Maana Inc. – the Greek word for meaning – have incorporated machine learning and semantic search engine algorithms to meet the need they saw for search engine technology to help companies in the manufacturing, health care and oil and gas industries get more out of Big Data.
Maana’s solution is not a traditional document-based retrieval engine of the kind that is the mainstay of open source Solr/Lucene. Instead, it combines machine learning and natural language processing into a search paradigm that lets users search data sets by whole text.
The Palo Alto, California-based company recently came out of stealth mode with the announcement of more than $14 million in funding from strategic investors, which include Chevron Technology Ventures, ConocoPhillips Technology Ventures, Intel Capital, GE Ventures and Frost Data Capital.
When they started Maana, its founders sought to address a common problem for companies: how to bring together and make sense of data currently locked behind applications, log files, or data generated by machinery or sensors.
“Even though innovations like Hadoop provide a place to put that data, it’s still very difficult to make sense of it,” Maana Chief Technology Officer Donald Thompson told Rigzone in an interview.
Thompson spent 15 years at Microsoft, where he served as director of engineering, architect, and development manager on projects such as SQL Server 2012 Semantic Engine. Thompson also created Microsoft’s first contextual ad delivery system.
Making sense of the data still requires a lot of manual labor to bring together information across applications. In some companies, a single division can have 50 to 60 different applications, each generating data for a specific purpose and each covering a slightly different aspect of the same asset, whether it’s a well, a patient or a piece of equipment.
As a result, existing solutions don’t provide the horizontal visibility across all the data in order to solve problems, Thompson noted.
“All sorts of tools are available to address one aspect or another of Big Data, but nothing pulls together different aspects of the data needed to generate and operationalize a solution.”
To address this issue, Maana’s founders designed an industry-agnostic platform with a search engine that crawls, indexes, joins, classifies and mines data. As the search engine works, it automatically builds and grows a graph structure that represents concepts and relationships. Meanwhile, raw values such as terms, numbers, dates and geo-coordinates are stored in specialized indexes that support search (including fuzzy, proximity and range search) and efficient computation. Together, these form the two data structures, the Emergent Semantic Graph and the Liquid Index, that this new type of search requires.
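To make the idea of a concept graph alongside specialized value indexes more concrete, here is a minimal Python sketch. It is not Maana’s implementation, whose internals the company has not described here. The class name ToyIndex, its methods and the sample records are all hypothetical, and the fuzzy matching simply borrows the standard library’s difflib.

```python
import bisect
import difflib
from collections import defaultdict


class ToyIndex:
    """Hypothetical illustration only: a term index, a sorted numeric
    index and a small concept graph, built as records are added."""

    def __init__(self):
        self.terms = defaultdict(set)   # term -> record ids containing it
        self.graph = defaultdict(set)   # concept -> related concepts
        self._values = []               # numeric values, kept sorted
        self._ids = []                  # record ids, parallel to _values

    def add(self, record_id, concept, terms, number):
        for term in terms:
            term = term.lower()
            self.terms[term].add(record_id)
            # Grow the graph: link the record's concept to each term.
            self.graph[concept].add(term)
            self.graph[term].add(concept)
        # Insert the number so that the numeric index stays sorted.
        pos = bisect.bisect_right(self._values, number)
        self._values.insert(pos, number)
        self._ids.insert(pos, record_id)

    def fuzzy(self, query, cutoff=0.8):
        """Return ids of records whose terms approximately match the query."""
        matches = difflib.get_close_matches(
            query.lower(), list(self.terms), n=5, cutoff=cutoff
        )
        ids = set()
        for match in matches:
            ids |= self.terms[match]
        return ids

    def in_range(self, low, high):
        """Return ids of records whose number lies in [low, high]."""
        lo = bisect.bisect_left(self._values, low)
        hi = bisect.bisect_right(self._values, high)
        return set(self._ids[lo:hi])


index = ToyIndex()
index.add("well-7", "well", ["pressure", "valve"], 3200.0)
index.add("pump-2", "pump", ["pressure", "bearing"], 150.0)

print(index.fuzzy("presure"))      # misspelled query still finds both records
print(index.in_range(100, 200))    # only the pump record falls in this range
print(sorted(index.graph["well"])) # concepts linked to "well"
```

The point of the sketch is the separation of concerns: relationships live in the graph, while raw values live in indexes chosen for the kind of lookup they need to support.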
Additionally, users can easily add their own lenses on the data through new graph structures, which can be mapped manually or derived from transformations or queries.
“Generally, these new structures don’t require reindexing the data, thereby providing both highly flexible data transformation without the overhead in time and space,” company officials said.
Search is a powerful paradigm that people are familiar with and that scales to thousands of workers, but the challenge is different when one is searching technical databases, warehouse tables and log files, Thompson said. A document search is not the same as a technical search: a document search looks for keywords, while a technical search looks for information on the relationships among data structures.
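The distinction Thompson draws can be illustrated with a short Python sketch. It is only a toy under stated assumptions: the reports, asset names and relationship labels below are invented for the example, and the breadth-first traversal is a generic stand-in for a relationship query, not a description of Maana’s engine.

```python
from collections import deque

# Hypothetical sample data, invented for illustration.
DOCS = {
    "report-1": "Pump P-2 vibration increased after maintenance",
    "report-2": "Well W-7 production summary",
}

# (source, relationship, target) triples linking assets together.
EDGES = [
    ("W-7", "uses", "P-2"),
    ("P-2", "monitored_by", "sensor-19"),
    ("W-7", "located_in", "field-A"),
]


def keyword_search(docs, keyword):
    """Document search: return ids of documents that contain the keyword."""
    kw = keyword.lower()
    return sorted(
        doc_id for doc_id, text in docs.items() if kw in text.lower().split()
    )


def relationship_search(edges, start, max_hops=2):
    """Technical search: follow relationships outward from one asset."""
    neighbors = {}
    for source, relation, target in edges:
        neighbors.setdefault(source, []).append((relation, target))

    found = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, target in neighbors.get(node, []):
            found.append((node, relation, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return found


print(keyword_search(DOCS, "P-2"))
for triple in relationship_search(EDGES, "W-7"):
    print(triple)
```

The keyword search can only report that a document mentions the pump, whereas the relationship search shows that the well depends on that pump and that the pump is watched by a particular sensor.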