Hadoop data too big to move? Splunk may have a solution
- By Rutrell Yasin
- Jun 28, 2013
Agencies looking to share big data analytics across different enterprise platforms as well as in cloud infrastructures can run into a problem: What if data is just too big to move?
That is what is happening to some organizations that store, manage and process the vast amounts of data collected from a variety of sources, according to Sanjay Mehta, Splunk’s vice president of marketing.
As a result, Splunk has released a beta version of new software designed to help government agencies and businesses easily visualize and analyze data in Apache’s Hadoop open-source framework.
Hunk: Splunk Analytics for Hadoop is a stand-alone software product that lets organizations give broader user groups insight into their data assets without custom development, data modeling or lengthy batch processing iterations, according to Splunk officials. The software provides interactive data exploration, discovery and analytics so users can more easily gain insights from raw data in Hadoop, they say.
Hunk is another addition to Splunk’s portfolio of software anchored by the company’s flagship platform, Splunk Enterprise, which collects, indexes and harnesses machine data generated by applications, servers and devices, whether they are physical, virtual or in the cloud.
The company has technology to let data analysts move data in and out of Hadoop, but some customers are saying that data is becoming so large they can’t move it any more. So they are looking for ways to move data natively within Hadoop, Mehta said in an interview with GCN.
Hunk lets analysts explore data in Hadoop from one place, allowing them to perform interactive data exploration across large, diverse data sets. They do not have to understand the data upfront. Instead, they can simply point Hunk at the Hadoop cluster and start exploring data immediately, Splunk officials said.
Splunk Virtual Index separates the storage of data from the analytics, which helps speed analysis, Mehta said. The virtual index technology links Hunk into the entire Splunk technology stack. For instance, links to Splunk Search Processing Language let users perform interactive exploration, analysis and visualization of data stored anywhere, as if it were stored in a Splunk software index, Mehta said.
After analysts determine what they want to analyze, they will want to do a deeper dive to explore trends and identify patterns of interest, Mehta said. Hunk helps them detect patterns and find anomalies across large volumes of data. They can correlate data to spot trends, identify patterns and gain further insights by connecting data from external relational databases using Splunk DB Connect. Released in March, DB Connect integrates structured data from relational databases with machine data generated by back-end IT systems, networks, applications and even mobile devices, giving analysts insights about the data and helping them make more informed decisions in real time.
Analysts can use Hunk to build graphs and charts, combining them into customized dashboards that can be shared to laptops, tablets and other mobile devices, Mehta said.
Splunk is looking for beta users among its Fortune 100 and government clients who can apply Hunk to solve problems from vast and complex data sets, leading up to general availability of the software later this year, Mehta said.
Rutrell Yasin is a freelance technology writer for GCN.