From Text Analytics to Data Warehousing - InformationWeek

Commentary
5/18/2008
11:08 AM
Seth Grimes

From Text Analytics to Data Warehousing

IBM recently posted a quite nice page on extracting business value from "unstructured" data. The page describes the use of IBM's own products and formats, to be sure, but it is potentially helpful for anyone who wishes to learn about information extraction from textual sources for data warehousing.

IBM's page starts with a brief text-analytics overview. It then dives into implementation with the OmniFind Analytics Edition for DB2 and its pureXML capabilities. It describes a process flow that includes XML tagging of document features and the alternatives of mapping the XML schema to relational database structures or using the XML structures directly for analyses. This text-analytics workflow, and the choices involved in dealing with text-sourced information, are not specific to IBM's tools, however. So while IBM provides diagrams, code listings, and an analysis of the alternative approaches that relate to its own products, the lessons apply much more generally.

The premise is that because much valuable business information originates in "unstructured" form (e-mail, Web pages, news and blog articles, corporate reports, etc.), you need to look at text analytics as a technology that can unlock value. And naturally, if you already have a BI program and a data warehouse, you'll want to explore integrating text-sourced information into your existing data-analysis infrastructure. You'll want to explore unified analytics.
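
To make the two alternatives concrete (mapping XML-tagged document features to relational structures versus analyzing the XML directly), here is a minimal sketch in Python. It is only an illustration, not IBM's method or format: it uses the standard-library sqlite3 and xml.etree.ElementTree modules in place of DB2 and OmniFind, and the XML tag names, the doc_feature table, and the function names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical annotated document. The tag and attribute names are invented
# for illustration; they are not the output format of any IBM product or GATE.
ANNOTATED = """
<document id="msg-001" source="e-mail">
  <entity type="product">Widget X</entity>
  <entity type="company">Acme Corp</entity>
  <sentiment polarity="negative">arrived damaged</sentiment>
</document>
""".strip()


def shred_to_rows(xml_text):
    """Alternative 1: flatten tagged features into relational rows."""
    doc = ET.fromstring(xml_text)
    doc_id = doc.get("id")
    rows = []
    for elem in doc:
        # Use whichever qualifying attribute the element carries.
        kind = elem.get("type") or elem.get("polarity")
        rows.append((doc_id, elem.tag, kind, elem.text))
    return rows


def load_rows(conn, rows):
    """Store the flattened features in a (hypothetical) doc_feature table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS doc_feature "
        "(doc_id TEXT, feature TEXT, kind TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO doc_feature VALUES (?, ?, ?, ?)", rows)


def query_xml_directly(xml_text, entity_type):
    """Alternative 2: query the XML structure itself, with no relational mapping."""
    doc = ET.fromstring(xml_text)
    return [e.text for e in doc.findall(f"./entity[@type='{entity_type}']")]


conn = sqlite3.connect(":memory:")
load_rows(conn, shred_to_rows(ANNOTATED))

# Relational route: plain SQL over the extracted features.
products = conn.execute(
    "SELECT value FROM doc_feature WHERE feature = 'entity' AND kind = 'product'"
).fetchall()
print(products)

# XML route: a path expression over the tagged document.
print(query_xml_directly(ANNOTATED, "company"))
```

Once features sit in a table like this, they can be joined to existing warehouse tables with ordinary SQL, which is the appeal of the relational mapping. The direct XML route skips the mapping step and keeps the document's nesting, at the cost of querying with path expressions rather than SQL alone.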

Information extraction to databases enables unified analytics. I cover approaches in my own text-analytics courses and presentations — I use open-source GATE (General Architecture for Text Engineering) software for illustrations and examples in order to remain independent of any product — but IBM's is the first clear, freely available, and practical technical exposition that I have seen on this topic. If you want to learn more about unified analytics, do visit IBM's From Text Analytics to Data Warehousing page.

Disclosure: IBM is a sponsor of an editorially independent text-analytics report I am writing, which is unrelated to my Intelligent Enterprise writing.