Open-Source BI Stretches Beyond Reporting

Can open-source software crack into the lucrative but ultra-competitive business intelligence marketplace by offering packages that include more than a reporting tool?

The Business Intelligence and Reporting Tools initiative (BIRT), from BI vendor Actuate and open-source community The Eclipse Foundation, is one of the more ambitious open BI reporting applications. BIRT is cross-platform, Eclipse-based, XML-driven and dedicated to delivering standardized output. But fast on BIRT's heels, two new organizations are promising broader BI application frameworks.

The first of the upstarts, Pentaho, is adopting the BIRT reporting tool as part of a broader framework. The other, JasperSoft, is counting on building from the bottom up, starting with its existing reporting tool base. Here we'll examine the components of the two open-source BI offerings and see how they hope to enter a very competitive market.

Pentaho: A Complete BI Stack

The impressive thing about Pentaho is that its team of BI veterans from Cognos, Hyperion, IBM, Lawson, Oracle and SAS has designed a complete BI stack with reporting, OLAP analysis, data mining, dashboards and workflow capabilities. The only thing missing is an ETL (extract, transform and load) framework. This system will be built on the Eclipse Integrated Development Environment (IDE) and use J2EE servers and XML-based Web services. The software already has a number of key components available as open-source projects. The fly in the ointment is that Pentaho's own key framework and integration software will appear only gradually over the summer and fall of this year:

OLAP analysis: beta in August/September, release in November. OLAP analysis, like reporting, starts with an existing open-source tool, Mondrian, which is written in Java and implements the Microsoft MDX language and XML for Analysis. It also uses a Java OLAP (JOLAP) interface for three development APIs and JPivot, a series of JavaServer Pages (JSP) tag libraries tied in with the Eclipse IDE. Pentaho adds enhanced Scalable Vector Graphics (SVG) output, a dashboard widget and a portlet template. The second phase of the project will add a whole series of "designers" including OLAP Model Designer, Analyzer Pivot Table Designer, and centralized content management with framework security.

Dashboard: beta in September, release in November. Pentaho's dashboard component is tied closely to the development of the OLAP analysis tool. There will be a dashboard component and integration of external content into the dashboard. The dashboard also will integrate the reporting tool and a broader JSP widget for dashboard operations. For the November release, the dashboard will gain an Analysis Dashboard Designer along with a series of templates. This will be critical given the three APIs available for analysis, because each will require its own template.

Business Framework: beta in July/August, release in October. The Business Framework will have two major parts - a business rules engine with JavaScript and SQL support, plus a document repository with metadata on components, rules, workflow, Web services and portlets. These will be combined in a Framework Solution Engine with security, Java Message Service (JMS), e-mail, and a portlet manager. Interfacing to this will be the Pentaho Framework Workbench, a desktop application for both developing and monitoring dashboard workflow and overall Pentaho operations.

Data mining: beta in September, release in December. The Pentaho data mining component uses the Weka engine and adds connectivity to the OLAP analysis tool plus portlets and Web services through the Business Framework. Knowledge Explorer and KnowledgeFlow Environments set up and execute the mining workflow. The release phase will add a Data Mining Console and a component to the dashboard tool. The dashboard interface plus templates and how-to examples will complete the data mining tool.

Reporting: beta in July, release in October. Pentaho's reporting capabilities will include report bursting, which allows large reports to be divided up and automatically routed, with the necessary details plus a summary, to specific parties. Bursting can deliver over a network to a designated directory or by e-mail. In addition to e-mail, Pentaho supports HTML and PDF output within a Reporting Workbench that itself includes reusable parts plus dashboard integration. The final release will add an editor for report activity plus templates and final dashboard integration.

Workflow: beta in July, release in October. Workflow is the integrating control mechanism for the BI Framework. It will have a "Shark Tool Agent" for controlling workflow actions and a graphical editor. For the final release, there will be a console and a tool agent editor that allow users to configure and manage workflows better. Look for Pentaho's workflow component to match up with the BI Framework's Solution Engine and document repository.
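The report bursting described in the Reporting entry above can be illustrated with a minimal Python sketch. It is not Pentaho code: the function `burst_report`, the sample rows and the routing table are all hypothetical, and the sketch only shows the general idea of splitting one report into per-recipient sections that carry detail rows plus a summary.

```python
from collections import defaultdict


def burst_report(rows, key):
    """Split one large report into per-recipient sections.

    Each section holds the detail rows for that recipient plus a summary total.
    Hypothetical helper for illustration; not part of any Pentaho API.
    """
    sections = defaultdict(list)
    for row in rows:
        sections[row[key]].append(row)
    return {
        recipient: {
            "details": details,
            "summary": {"total": sum(r["amount"] for r in details)},
        }
        for recipient, details in sections.items()
    }


# Hypothetical sales rows; "region" decides who receives which slice.
rows = [
    {"region": "east", "product": "A", "amount": 120},
    {"region": "west", "product": "A", "amount": 80},
    {"region": "east", "product": "B", "amount": 45},
]

# Hypothetical routing table: deliver by e-mail or to a network directory.
routes = {
    "east": "mailto:east-manager@example.com",
    "west": "file:///reports/west/",
}

for region, section in burst_report(rows, "region").items():
    print(routes[region], section["summary"]["total"], len(section["details"]))
```

In a real deployment the loop would hand each section to a renderer (HTML or PDF, say) and a delivery step; here it only prints the destination, the summary total and the number of detail rows.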

In sum, this is a very ambitious framework. Three of the key components are already available: the reporting tool, the OLAP engine and the data-mining tool. But Pentaho still has substantial software to release over the course of the summer. By adopting the Eclipse, Java, XML, and Web Services approach, Pentaho potentially positions itself well in terms of open standards at every stage -- input, processing and output. Interestingly, Pentaho has adopted Microsoft's MDX language for doing OLAP queries. The OLAP Council agreed to adopt this as a standard.
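For readers who have not seen MDX, the following minimal Python sketch assembles a simple two-axis query as a string. The cube, dimension and measure names (`Sales`, `Product`, `Unit Sales`, `Store Sales`) are illustrative assumptions, and `build_mdx` is a hypothetical helper; it does not call Mondrian or any other OLAP server.

```python
def build_mdx(cube, measures, row_set):
    """Assemble a simple two-axis MDX SELECT statement as a string.

    Hypothetical helper for illustration only; it builds text and does not
    connect to any OLAP engine.
    """
    columns = ", ".join(f"[Measures].[{m}]" for m in measures)
    return (
        f"SELECT {{{columns}}} ON COLUMNS, "
        f"{{{row_set}}} ON ROWS "
        f"FROM [{cube}]"
    )


# Assumed names: a "Sales" cube with a "Product" dimension and two measures.
query = build_mdx("Sales", ["Unit Sales", "Store Sales"], "[Product].Children")
print(query)
```

The result places the measures on the column axis and the children of the product dimension on the row axis, which is the kind of cross-tabular request an OLAP engine turns into a pivot table.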

However, Pentaho will find itself competing against free software from the top three database vendors. For example, Microsoft SQL Server 2005 will be debuting in roughly the same time frame with its own ETL, highly regarded OLAP and data mining, Reporting Services and Maestro real-time connections. But because Pentaho links up with PostgreSQL, MySQL, Firebird and Apache Derby, Pentaho will be able to offer free and increasingly enterprise-caliber databases to level the playing field.

Pentaho plans to make money supplying support, training and consulting services. This open-source model is similar to what Red Hat and JBoss are doing fairly successfully in the OS and application server fields. In contrast, JasperSoft's approach to its offerings is partly open-source freeware and partly proprietary.

Commercial Open-Source BI

JasperSoft is delivering what it calls Commercial Open Source with its Jasper Reports and Jasper Decisions products. JasperSoft describes Commercial Open Source as follows: "A commercial open-source company will go a step further and offer added value product modules that can be purchased for a reasonable price. These modules are not mandatory for gaining benefits from the open-source product -- but rather they can be added if and when needed by an application based on its evolving requirements." Actuate appears to be taking a similar approach with the BIRT open-source reporting tool. And unlike open-source purists, this reviewer sees potential advantages in Commercial Open Source for both developers and users.

Here are the concerns that open-source purists might have. First, there's the possibility that the process of commercial open-source development is just a loss leader, a variation on shareware or freeware. Critics charge that such "freeware" is designed to get users committed to a system. Then, as more and more future developments and modules get charged for, a new proprietary system emerges. But if that happens, there must be some happy customers -- otherwise, why would they pay for the improvements? In effect, they got to try out a system for low initial cost and risk -- and as it evolved they literally "bought into it". Moreover, open source competitors may well provide some of the proprietary components -- as is now happening with some new open source tools for JasperSoft's proprietary extensions.

Customers not satisfied with new developments or their future direction have three courses of action. First, they can band together with like-minded users and try financially inducing commercial open-source developers to produce the modules they want and need. Second, they can do it themselves or make the same proposition to other open-source developers. Third, if the users have chosen well, they should have many standards-based inputs, outputs and even processing options (MDX, JOLAP, CSS, SVG, XML-configuration, for example) so that moving to another open-source or commercial BI stack should be relatively painless.

Finally, open-source purists will insist that a mix of open-source and commercial software from the same developer is simply a contradiction. Don't tell that to BEA, Borland, HP, IBM, Oracle, Sun and maybe even Microsoft, because they all have big open-source components as part of their total software product mix. But again, the solution is immediately at hand: take the fork in the road -- the code is at the users' disposal.

JasperSoft BI

JasperSoft does not cover the BI stack as comprehensively as Pentaho plans to. Rather, JasperSoft specializes in reporting, some OLAP analytics and workflow. But it has complete, usable products today. We look at those now in more detail:

JasperReports: delivers reports to the screen, printer or into PDF, HTML, XLS, CSV and XML files. JasperReports can stand alone or be embedded directly into a user's application to give it advanced reporting capabilities. Included among the report types available is parameterized reporting for simple "what-if" or drill-down analysis/reporting.

JasperDecisions: offers repository, scheduling, dashboard and security modules as well as a full graphical report designer in a server-based system. The Jasper Scope Creation Suite provides a GUI-based Query Designer for the creation of secure, parameterized queries. These in turn can be used in JasperSoft's Scope Designer to create Web or portal applications with tables, cross tabs and charts. Underlying JasperDecisions and JasperReports is an XML-based Report Definition Language (RDL) that allows the added flexibility for user or programmatic control of reports.

One of the strengths of the Jasper system is the RDL file. The RDL file can contain HTML tags and JavaScript for further client-side refinement or added dynamic responsiveness. The RDL file has five XML-based header sections:

Content - specifies the content of the report in multi-data blocks for each data source
Parameters - contains the variables that control the report for content, order, filters, etc.
Pagination - determines the page segments for each data block in the report
Sorting - determines the index and data sort criteria
Layout - determines the visual structure and appearance of the data segments in reports

JasperSoft actively supports the ability to dynamically update reports through RDL changes.
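To make the division of labor among the five sections concrete, here is a minimal Python sketch. It is emphatically not RDL: real RDL is XML, and its element names are not given in this article. The dictionary keys, the `render` function and the sample data are all hypothetical stand-ins that show how content, parameters, sorting and pagination settings might jointly drive a report, and how changing the definition changes the output.

```python
# Hypothetical in-memory stand-in for the five sections; real RDL is XML and
# its actual element names are not shown in the article.
definition = {
    "content": {"source": "orders", "fields": ["customer", "total"]},
    "parameters": {"min_total": 100},
    "pagination": {"rows_per_page": 2},
    "sorting": {"by": "total", "descending": True},
    "layout": {"title": "Large Orders"},
}


def render(definition, data):
    """Apply the definition to the data and return a list of pages."""
    content = definition["content"]
    minimum = definition["parameters"]["min_total"]
    # Content + Parameters: pick the fields and filter the rows.
    rows = [
        {field: row[field] for field in content["fields"]}
        for row in data[content["source"]]
        if row["total"] >= minimum
    ]
    # Sorting: order the remaining rows.
    sort = definition["sorting"]
    rows.sort(key=lambda r: r[sort["by"]], reverse=sort["descending"])
    # Pagination: cut the rows into fixed-size pages.
    size = definition["pagination"]["rows_per_page"]
    return [rows[i:i + size] for i in range(0, len(rows), size)]


data = {"orders": [
    {"customer": "Acme", "total": 250},
    {"customer": "Bolt", "total": 90},
    {"customer": "Cory", "total": 400},
    {"customer": "Dune", "total": 130},
]}

pages = render(definition, data)
print(definition["layout"]["title"], pages)

# Editing the definition alone changes the report; the rendering code is untouched.
definition["parameters"]["min_total"] = 300
print(render(definition, data))
```

The layout section is used here only for the title; in a real report definition it would govern the visual structure of each data segment.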

In sum, JasperSoft has taken the server side of reporting and commercialized it. The basic runtime engine is free, and other third party Open Source providers have written their own Designers and tools to drive JasperSoft's reporting engine. However, JasperSoft, as creators of Report Definition Language, have created a complete BI reporting tool that includes dashboard creation and report scheduling, plus security and management of reporting that's available for testing and deployment now.

Just as is the case in databases and application servers, it appears that BI is evolving into mixed markets -- pure proprietary (Business Objects or Microsoft), mixed open source and proprietary (JasperSoft) and nearly pure open source (Pentaho). It's very safe to say the jury is still out on which model will work best in the BI marketplace.

Jacques Surveyer is a writer and consultant; see his work at the OpenSourcery.
