Big Data and its potential as a business tool have garnered plenty of attention of late, and it’s not hard to see why. Data volumes are exploding, and ECM companies are being called upon to help their customers leverage data in new ways in their decision-making. So far, the discussion has centered on the structured data that can be turned into reports that increase efficiency and visibility. And while that’s certainly valuable, there’s a vast amount of unstructured data out there waiting to be mined. It’s being called Big Content.
Systemware has a long history of providing enterprise content management solutions to some of the most recognized names in American business. Much of the time, our customers’ needs have focused on structured data. But we are seeing a growing need among corporations for assistance with unstructured data in its many forms: Microsoft Word documents, Explanation of Benefits, images, sound files, instant messages, emails and blog posts.
Big Content goes beyond traditional search
Gartner’s Craig Roth wrote recently about Big Content: “Just as Big Data uses Apache Hadoop (with MapReduce) to go beyond traditional BI, Big Content combines technologies to go beyond traditional search. These technologies are applied to text analytics, sentiment analysis, video analysis, semantic web technologies, and attention management.”
Roth said corporations today want answers to all kinds of new questions. He wrote, “Full-text search is not the answer. There may be too much noise in the search results to make them useful. The results may be desired as semantic linkages or sentiment ratings rather than a list of links. The text to be searched may not be accessible by a public search engine like Google or all within a firewall for enterprise search engines.”
Big Content will prove its worth
Roth is certainly correct that full-text search is not the answer. Because it returns all values that match the input terms, search typically generates a large number of results, most of which may not be relevant or useful to the task at hand. Decision makers don’t have time to sort through hundreds of query results to locate a precise piece of information.
Finding highly specific information quickly and efficiently requires an index-based approach that yields a far more focused search result. That focus is achieved by predetermining and defining the values upon which future searches will be based. That way, corporate users can use those defined value categories, and only those categories, as search criteria.
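To make the contrast concrete, here is a minimal Python sketch of the two approaches. It is an illustration only, not a description of any Systemware product: the sample documents, the field names (doc_type, customer) and the function names are all hypothetical, chosen to show how predefined index values narrow a result set compared with a full-text match.

```python
from collections import defaultdict

# Hypothetical sample content; field names and values are illustrative only.
DOCUMENTS = [
    {"id": 1, "doc_type": "eob", "customer": "acme", "text": "Claim payment summary for Acme"},
    {"id": 2, "doc_type": "email", "customer": "acme", "text": "Question about the claim payment"},
    {"id": 3, "doc_type": "eob", "customer": "globex", "text": "Claim denied, see payment notes"},
    {"id": 4, "doc_type": "blog", "customer": "none", "text": "Why claim payment cycles matter"},
]

# The predefined value categories (assumed for this sketch).
INDEX_FIELDS = ("doc_type", "customer")


def full_text_search(docs, term):
    """Return the id of every document whose text contains the term."""
    term = term.lower()
    return [d["id"] for d in docs if term in d["text"].lower()]


def build_index(docs, fields):
    """Map each (field, value) pair to the set of matching document ids."""
    index = defaultdict(set)
    for doc in docs:
        for field in fields:
            index[(field, doc[field])].add(doc["id"])
    return index


def index_search(index, fields, **criteria):
    """Look up documents using only the predefined index fields."""
    unknown = set(criteria) - set(fields)
    if unknown:
        raise ValueError(f"not an indexed field: {sorted(unknown)}")
    if not criteria:
        return []
    # Each criterion yields a set of ids; a document must satisfy all of them.
    matches = [index.get((field, value), set()) for field, value in criteria.items()]
    return sorted(set.intersection(*matches))
```

In this sketch, full_text_search(DOCUMENTS, "payment") returns all four document ids, while index_search(build_index(DOCUMENTS, INDEX_FIELDS), INDEX_FIELDS, doc_type="eob", customer="acme") returns only document 1. Rejecting criteria outside INDEX_FIELDS mirrors the idea that users search on the defined categories and only those categories.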
With robust indexing — on both mainframe and distributed platforms — corporations can start to pose those new questions and unlock the potential of Big Content to get their answers. That’s when Big Content will start to show that it’s the next step in Big Data.