Whenever you're trying to design a BI solution, one of the first tasks is discovering where the underlying data for your solution resides. If you're lucky, the majority of the data will be held within existing source systems and applications, where it's stored in a well-structured database that you can interrogate to extract the data you need. If you're unlucky, this data will be held in a mishmash of formats and sources, in everything from desktop databases (such as Access) to spreadsheets, flat files, Word documents, and more.
The prevalence of Microsoft Office on users' desktops has lead to data being stored locally within existing documents, giving rise to new technology terms such as "spreadmart" (the equivalent of a data mart, except built with spreadsheets). When it comes time to bring the data from these data sources together, it's almost impossible to track down and incorporate all these bits of information. Although some BI vendors have started to help you consolidate and report from these "desktop" data sources, a much better way to collect and consolidate information is available.
Enter Microsoft InfoPath, the latest member of the Office family and new to the Office 2003 release. InfoPath (formerly known as "XDocs") is designed to help users and administrators gather and consolidate information that may have previously been held in electronic documents or paper forms. Using InfoPath, you can create electronic forms that can collect data and submit it to a variety of back-end data sources and systems, which you can then use for the basis of your BI solution.
InfoPath is targeted primarily at Office "power users" who may have used spreadsheets, documents, and other forms to gather information. Instead of manually collecting and consolidating this information, they can now create electronic forms to do a large part of the work for them. The other target audience is application developers who have previously used custom Web forms or applications to gather data they can now quickly develop forms to capture this information in a fraction of the time it would have taken to create and maintain a custom application.
When you install InfoPath, a number of sample forms are included (including expense reports, timesheets, and more.) so you can immediately start using the product. When designing your own forms, you can either create them from scratch or use one of the sample forms as a template. When you design a form, you'll be able to specify where you want to put the data that is entered at the most basic level you can create forms that simply save the form results to an XML document, which can then by used in other systems or applications.
A consolidation feature lets you consolidate data from multiple forms into one XML file, and an export feature lets you export form data to Excel. So power users who have basic requirements for data collection and consolidation may be able to use InfoPath almost as a stand-alone application without tying the tool into any back-end databases or applications.
But the real strength of InfoPath is its tight integration with Microsoft SQL Server and the ability to push data back into both SQL Server and Access databases. This will let you create a form that can be submitted to a database table held on either of these platforms. If you don't use either of these database formats in your organization, don't fret. You can also use InfoPath to submit form data directly back to an XML Web service, letting you push data back into a variety of database formats and applications. And there's also a software development kit (SDK) for InfoPath that includes samples of different Web services that demonstrate how this works.
Once you have the data side of things sorted out, the process of creating forms is simplified with a drag-and-drop interface and intuitive controls. The design process resembles creating forms within Visual Studio or other development tools except that no coding is required. One of the real standout features within InfoPath is the data validation, which lets you establish validation rules to ensure that the data entered into your form is correct and valid. Although you can't possibly trap every data entry error or typo that could occur, these validation rules should ensure that the data entered is clean and correct, which is a major concern.
Using InfoPath, you can create forms to meet just about any use. There are a number of specialized controls for creating forms with repeating tables or sections, and you can include optional sections that are collapsible so the user only expands and completes the section where required.
InfoPath also supports multiple "views" within a single form, letting you neatly organize form content and eliminating forms that require you to scroll down through multiple pages.
For power users, the base functionality provided within the InfoPath form designer should meet the majority of their needs, but for hard-core developers looking to create some code, InfoPath supports both Jscript and VBScript scripting. This feature lets developers augment existing InfoPath functionality using a familiar scripting language and environment. Developers will need to have a good understanding of XML as well as some patience, as the debugging tools provided with InfoPath are weak.
You can also create simple workflow applications with InfoPath forms using Outlook 2003 and the built-in email functionality to send forms between users (see Figure 1), but if you want to create more complex workflow or processing applications you may need to consider implementing InfoPath alongside Microsoft BizTalk Server.
And when developing and distributing InfoPath forms, there's one catch each user who will be filling out an InfoPath form will to have a license and will need to have InfoPath installed locally on his or her machine. This can be a deterrent for larger, price-sensitive organizations that may prefer to continue to create custom Web forms or applications rather than buy an InfoPath license for everyone in the organization. When evaluating the cost of the licensing, keep in mind that with one license you can create and distribute as many forms as you like. When you factor in the cost of creating, hosting, and maintaining custom applications for data collection, a one-time InfoPath license may be cheaper in the long run.
Another feature that needs some work is the database support provided within InfoPath. For organizations that don't use Microsoft SQL Server or Access, some extra work is required to integrate InfoPath forms with other database formats. It would be nice to see Microsoft open up this functionality so you could use InfoPath forms to directly update other database formats as well. Although it's a great selling point for Microsoft's database platform if organizations that don't currently use SQL Server consider deploying at least one SQL Server to make data collection easier. (And once the data is on the SQL Server platform you can then use the built-in Data Transformation Services or other tools to update other core systems.)
So for organizations that use Microsoft back-end technology, implementing and using InfoPath will be a breeze. For organizations that use other back-end databases or systems, a little more work will be required to integrate InfoPath (using Web services, XML files, and so on), but it is still a viable solution for data collection and consolidation. Only time will tell if organizations will adopt InfoPath the major hurdle will not be in recognizing the need for such a product, it will be weaning users off storing data in their existing Office documents.
David McAmis [[email protected]] is an IT consultant, journalist, author, broadcaster, and expert in business intelligence who lives and works in Sydney, Australia.
"A Quick Alternative," Sept. 1, 2003
Visit the Business Intelligence Information Center at www.intelligentinterprise.com/info_centers/bi.