Near real-time queries produce more credible results for big data users


The Climate Corporation app allows farms, soil, and weather to be monitored from anywhere.

 Image: Lyndsey Gilpin/TechRepublic

Much of the focus on real-time big data has been on the capture and processing of real-time unstructured data from machines, sensors, web pages, and other Internet of Things (IoT) sources. For a growing number of businesses, however, it is equally important to be able to receive the results of big data queries in real time or near real time.

EDA delivers real-time big data query capabilities to clients through its ability to aggregate disparate big data sources and to provide near real-time answers to subscribers that are looking to maximize selling opportunities to their own end customers.

"In one case, we had a large equipment manufacturer with an extensive dealer network that wanted to know more about its market share, and the buying behaviors of its customers," said Sonny Rivera, vice president of technology at EDA. "We were able to combine market analytics, lead generation data, and the client's end customer buying histories and credit analyses into a dashboard summary where our client could see at a glance whether a customer was ranked 'high,' 'medium,' or 'low' in terms of likelihood of purchase."

Unlike with many big data queries, EDA's clients don't have to wait hours or days for the results they seek. One of the reasons is EDA's use of innovative CPU utilization technology from Sisense, a purveyor of business intelligence and analytics software.

Sisense's in-chip technology exploits the memory that resides directly on a computer's CPU chip. By doing so, Sisense says that it can process data 50 to 100 times faster than other in-memory data processing that uses RAM. The economy of data processing doesn't stop there. It extends into the database, which stores data in columns instead of in rows like a relational database. This means that if a data query requires only three columns, even if it joins a dozen data tables with 500 columns, only those three columns are read. The columnar database is also optimized for compression, which limits the amount of data that has to move across memory.
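To make the columnar idea concrete, here is a minimal Python sketch. It is not Sisense's implementation; the sample records, field names, and helper functions are all hypothetical and exist only for illustration. It pivots row-oriented records into one list per column, answers a query by reading only the three columns it needs, and uses simple run-length encoding to show why a column of repeated values tends to compress well.

```python
from typing import Any, Dict, List

# Hypothetical sample records; the field names are illustrative assumptions.
ROWS: List[Dict[str, Any]] = [
    {"customer": "A", "region": "west", "units": 3, "price": 10.0},
    {"customer": "B", "region": "west", "units": 1, "price": 12.5},
    {"customer": "C", "region": "east", "units": 4, "price": 10.0},
]


def to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, List[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def revenue_by_region(columns: Dict[str, List[Any]]) -> Dict[str, float]:
    """Aggregate revenue while touching only the three columns the query needs."""
    totals: Dict[str, float] = {}
    for region, units, price in zip(
        columns["region"], columns["units"], columns["price"]
    ):
        totals[region] = totals.get(region, 0.0) + units * price
    return totals


def run_length_encode(values: List[Any]) -> List[List[Any]]:
    """Collapse consecutive repeated values into [value, count] pairs."""
    encoded: List[List[Any]] = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


COLUMNS = to_columns(ROWS)
TOTALS = revenue_by_region(COLUMNS)               # never reads the "customer" column
REGION_RUNS = run_length_encode(COLUMNS["region"])
```

In this toy layout the "customer" column is never read by the revenue query, and the "region" column shrinks from three values to two run-length pairs. A production columnar engine applies the same two ideas at far larger scale and with much more sophisticated encodings.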

Data is not "cleaned" any better with Sisense than it is in other types of reporting and queries, but it is more credible because it is fresher. The near real-time query capability lets users avoid waiting for Hadoop (or other big data processors) to finish data analysis and reporting tasks. Sometimes it can take Hadoop days to finish a query, and by that time the original data used for the query might no longer be as fresh or credible.

"The data actually 'lives' with CPU cache," explained Eldad Farkash, Sisense's CTO. "It looks like a database, but it acts like a compiler.... This technology is about getting as close to the hardware as possible so you can rapidly push business-oriented data down to the business."

It's working in practice.

AGCO, which designs, manufactures, and distributes agricultural solutions, targeted specific owners of new and used equipment models who were likely to buy a new AGCO combine harvester. It developed a highly targeted direct marketing campaign that brought customers into dealerships to preview a new harvester model. Eighty-five percent of the initial buyers of the harvester came from the EDA market intelligence that utilized Sisense technology.

"When we started this business, our initial delivery to our clients was dashboards," said EDA's Rivera. "But our clients quickly let us know that they wanted more. What they wanted was the ability to perform deep queries that could return answers in as little as several seconds. This is where the business intelligence technology from Sisense enters in."
