Sunday, October 24, 2010

Choosing a BI Vendor - Making the Short List

There is no shortage of business intelligence vendors out there. They all claim to be powerful, easy to use, flexible and affordable. So how do you pick the one that is right for you?

To choose the right BI vendor from the abundance out there, the best approach is to first apply high-level yet restrictive criteria, and only then compare the remaining candidates feature by feature. Here are a few tips that will help you do that, as well as avoid common mistakes typically made when choosing a BI solution. This is the 21st century, and BI solutions are completely different from what you may be used to. If you follow these tips, you’ll end up with a very short list of vendors, and then it’ll just be a matter of choosing the one you feel most comfortable with in terms of specific features, pricing, support, etc.

Find a Complete Solution, Not Just Pretty Visualization.
The visualization of data is important, of course, but the biggest mistake you can make is to judge a BI vendor based on the pretty dashboard samples they show you on their website or during a demo. Every BI vendor can do that, because visualization software components are a dime a dozen. The real challenge is customizing these dashboards to your own needs and having them show your own data. This part usually takes most vendors months, and costs you a bundle. If the BI vendor cannot get your own data to show the way you like it within just a few days, you can probably find a better one.

Beware of the Data Warehouse.
A data warehouse is a centralized database filled with all the business’s data, and for years it’s been making a ton of money for BI vendors and bringing nothing but grief to customers. Today’s BI technology does not require a data warehouse, even when there are multiple data sources involved, large amounts of data or multiple users querying the data. There are very specific scenarios where a data warehouse is a good idea, but they are most likely not relevant to you. If the vendor requires a data warehouse to proceed with implementation, you should most likely keep looking.

Beware of the OLAP Cube.
OLAP, which stands for Online Analytical Processing, is 20-year-old technology designed to improve query performance over medium to large datasets. OLAP is also very lengthy and costly to implement, and there is really no need for it anymore. Today’s BI technology can handle even huge amounts of data without OLAP, at a fraction of the time and cost. If the BI vendor requires OLAP to assure you acceptable query performance, you should probably move on.
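To see what OLAP actually buys (and why it takes so long to build), here is a toy sketch of the core idea in Python. The data and names are hypothetical, purely for illustration: a cube pre-computes aggregates for every combination of dimension values up front, so queries become simple lookups instead of scans.

```python
from itertools import product

# Hypothetical fact rows: (region, year, amount).
sales = [
    ("East", 2009, 100), ("East", 2010, 150),
    ("West", 2009, 200), ("West", 2010, 250),
]

# A classic OLAP cube pre-aggregates every combination of dimension members
# (including the "ALL" rollup) -- lengthy processing up front, fast queries later.
cube = {}
for region, year, amount in sales:
    for r, y in product((region, "ALL"), (year, "ALL")):
        cube[(r, y)] = cube.get((r, y), 0) + amount

# Query time is then a dictionary lookup rather than a table scan:
print(cube[("West", "ALL")])  # total for West across all years -> 450
print(cube[("ALL", 2010)])    # total for 2010 across all regions -> 400
```

The cost is the pre-calculation step itself, which must be redone when the data or the model changes; that is the overhead newer technologies aim to eliminate.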

Refuse to Make Significant Upfront Investments.
Many BI vendors will promise you the world, but will demand significant upfront investment in preparation projects, hardware and software before you even get to run a single report on your actual data. Do not agree to this, and demand to have at least one solid report or dashboard running over your own data before you commit to anything significant in advance. If the vendor is not willing to do so, it’s probably because they would have to spend weeks on development before they could reach that point. That typically means this vendor is either using very old technology or is simply trying to pull one over on you.

Be Wary of Vendors Whose Business Is Professional Services.
Vendors who sell real home-grown BI software products (in contrast to OEMing someone else's software) do not like engaging in long professional services projects, because it hurts their margins. That is why they prefer to create software that is easy enough to be used directly by the customer or through a third party (which usually lives off these professional services contracts). If you choose a BI vendor that makes most of its money from professional services (as opposed to software sales), you can be pretty sure that they will take their time building your solution. These types of BI vendors also live off ongoing maintenance services, so what you initially pay for the solution is actually only the beginning. Whenever possible, try to choose a BI vendor that focuses on selling BI software to the end customer, not to the professional services community.

Make the Vendor Prove it To You.
The most important thing is to make the vendor prove their claims before you invest too much money upfront. This proof must come in the form of reports, dashboards or analytics in real-life scenarios, running on real data, used by the actual end users and delivered within a reasonable amount of time. If a vendor is not willing to accommodate this simple request, you really should find one that will. Many vendors provide free trial versions, as well as technology that speeds up implementation tremendously. If the one you're in contact with now doesn't, they shouldn't make your short list.

Monday, October 18, 2010

From OLAP Cubes to ElastiCubes – The Natural Evolution of BI

OLAP (Online Analytical Processing) technology is the most prevalent technology used in corporate BI solutions today. And while it does what it’s supposed to do very well, it has a bad (and accurate) reputation for being very expensive and difficult to implement, as well as extremely challenging to maintain. This fact has prevented OLAP technology from gaining wide popularity outside of Fortune 500-scale companies, which are the only ones who have the budgets for company-wide, OLAP-based BI implementations.

Since the inception of BI and the consequent entrance of OLAP technology into the space, the need for BI has been rapidly growing. Recognizing that OLAP-based solutions were (and still are) hard to introduce into a wider market, thought leaders and visionaries in the space have since been trying to bring BI down to the masses through technological and conceptual innovation.

The most recently recognized innovation (even though it’s been around for quite a while) was in-memory technology, whose main advantage is cutting implementation time and simplifying the process as a whole (a definite step in the right direction). However, as described in my recent article, In-Memory BI is Not the Future, It's the Past, using in-memory technology for speedy BI implementation introduces significant compromises, especially in terms of scalability (both in data volumes and in support for many concurrent users). Now that in-memory technology has been on the market for some time, it is clear that it is not really a replacement for OLAP technology, although it did expand the BI market to a wider audience. In fact, it is probably more accurate to say that in-memory and OLAP technologies complement each other, each with its own advantages and tradeoffs.

In that article I also briefly mentioned the new disk-based ElastiCube technology (invented by SiSense). ElastiCube technology essentially eliminates the inherent IMDB tradeoffs by providing unlimited scalability on off-the-shelf hardware, while delivering implementation and query response times as fast as (or faster than) pure in-memory-based solutions. This claim was the subject of many of the emails and inquiries I received following the article’s publication. I was repeatedly asked how ElastiCube technology achieved what OLAP technology had failed to do for so many years, and what role in-memory technology played in its conception.

Thus, in this article I will describe how ElastiCube technology came to be, what inspired it, what made it possible and how it has already become a game-changer in the BI space, both in large corporations and small startups.

A Brief History of BI and OLAP
OLAP technology started gaining popularity in the late 1990s, and that had a lot to do with Microsoft’s first release of their OLAP Services product (now Analysis Services), based on technology acquired from Panorama Software. At that point in time, computer hardware wasn’t nearly as powerful as it is today; given the circumstances at the time, OLAP was groundbreaking. It introduced a spectacular way for business users (typically analysts) to easily perform multidimensional analysis of large volumes of business data. When Microsoft’s Multidimensional Expressions language (MDX) came closer to becoming a standard, more and more client tools (e.g., Panorama NovaView, ProClarity) started popping up to provide even more power to these users.
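The power these tools exposed was the ability to slice and dice the same facts along any combination of dimensions, which MDX expresses declaratively against a cube. A rough feel for that idea can be sketched in plain Python (the data and the `slice_by` helper are my own hypothetical illustration, not any vendor's API):

```python
# Hypothetical fact rows: (product, region, quarter, revenue).
facts = [
    ("Bikes", "Europe", "Q1", 120), ("Bikes", "Europe", "Q2", 130),
    ("Bikes", "US",     "Q1",  90), ("Cars",  "Europe", "Q1", 300),
    ("Cars",  "US",     "Q2", 310),
]

def slice_by(facts, *dims):
    """Aggregate revenue along any combination of dimension indices."""
    out = {}
    for row in facts:
        key = tuple(row[d] for d in dims)
        out[key] = out.get(key, 0) + row[3]
    return out

# The same facts, diced along different axes -- the kind of pivot an MDX
# query would express by picking members onto rows and columns:
by_product = slice_by(facts, 0)            # revenue per product
by_region_quarter = slice_by(facts, 1, 2)  # revenue per (region, quarter)
print(by_product[("Cars",)])               # -> 610
print(by_region_quarter[("Europe", "Q1")]) # -> 420
```

What OLAP added on top of this was doing it interactively over very large volumes, with the dimensional model defined once and shared by every front-end tool.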

While Microsoft was not the first BI vendor around, their OLAP Services product was unique and significantly helped increase overall awareness of the possibilities offered by BI. Microsoft started gaining market share fairly quickly, as more companies started investing in BI solutions.

But as the years passed by, it became very apparent that while the type of multidimensional BI empowered by OLAP technology was a valuable asset to any organization, it seemed to be used mainly by large corporations. OLAP is just too complex and requires too much time and money to be implemented and maintained, thus eliminating it as a viable option for the majority of the market.

See: Microsoft (SSAS), IBM (Cognos)

The Visualization Front-End Craze
As more companies began investing in BI solutions, many vendors recognized the great opportunity in bringing BI to the mass market of companies with less money to spend than Fortune 500 firms. This is where visualization front-end vendors started popping up like mushrooms after the rain, each of them promising advanced business analytics to the end user, with minimal or no IT projects involved. Their appeal was based on radically reducing the infamous total cost of ownership (TCO) of typical BI solutions. These products, many of which are still available today, are full of useful and advanced visualization features.

However, after years of selling these products, it became very clear that they are incapable of providing a true alternative to OLAP-based solutions. Since they fail to provide similar centralized data integration and management capabilities, they found themselves competing mainly with Excel, and were being used only for analysis and reporting of limited data sets by individuals or small workgroups.

To work around these limitations (and increase revenues), these tools introduced connectivity to OLAP sources in addition to the tabular (e.g., spreadsheet) data they had supported until then. By doing so, these products basically negated the purpose for which they were initially designed – to provide an alternative to expensive OLAP-based BI solutions.

See: Tableau Software, Tibco SpotFire, Panorama Software

The In-Memory Opportunity
The proliferation of cheap and widely available 64-bit PCs during the past few years has somewhat changed the rules of the game. More RAM could be installed in a PC, a boon for those visualization front-end vendors struggling to get more market share. More RAM on a PC means that more data can be quickly queried. If crunching a million rows of data on a machine with only 2GB of RAM was a drag, users could now add more gigabytes of RAM to their PCs and instantly solve the problem. But still, without providing centralized data integration and management, this was not a true alternative to OLAP-based solutions that are still prominent in massive organization-wide (or even inter-departmental) implementations.

Strangely enough, out of all the in-memory technology vendors out there, only one realized that using in-memory technology to empower individual users wasn't enough and that the way to gain more significant market share was to provide an end-to-end solution, from ETL to centralized data sharing to a front-end development environment. This vendor is QlikTech and it is no wonder that the company is flying high above the rest of the non-OLAP BI players. QlikTech used in-memory technology to cover a much wider range of BI solutions than any single front-end visualization tool could ever do.

By providing data integration and centralized data access capabilities, QlikTech was able to provide solutions that, for other vendors (in-memory or otherwise), required at least a lengthy data warehouse project if not a full-blown OLAP implementation. By utilizing in-memory technology in conjunction with 64-bit computing, QlikTech solutions work even on substantial amounts of data (significantly more than their traditional disk-based competitors could).

However, QlikTech has not yet been able to make a case for replacing OLAP. I believe this is not only because of the scalability issues and hardware requirements that arise when large amounts of data and/or users are involved, but also because they do not inherently support dimensional modeling the way OLAP does. Apart from making life simpler for IT when maintaining multiple applications, OLAP’s implementation of a dimensional model also gives end users, via supporting front-end tools, a broader range of flexibility in creating their own BI applications.

Microsoft, the newest entrant into the in-memory BI game, also started marketing its in-memory PowerPivot solution as an alternative to OLAP, effectively admitting that it has given up on its Analysis Services as a viable solution for the wider mid-market.

See: QlikTech (QlikView), Microsoft (PowerPivot)

The SaaS/Cloud BI Hype
The SaaS/cloud hype hasn’t skipped over the BI space, though running BI in the cloud does not dramatically change anything with respect to implementation time or complexity. In fact, cloud BI vendors use the same technologies that are widely used on-premises. There are several startup companies in this space, competing for niche markets.

It’s still hard to tell what impact the cloud will have on the BI space as a whole, as none of these companies has yet proven there’s even a viable business in hosting BI in the cloud. One thing is certain, though: these companies cannot rely on in-memory technology to grow significantly. The costs of hardware and the amount of work required to support the number of customers they would need to thrive are prohibitive, to say the least. For more on the problem with cloud BI, see my earlier post, Would I Use Cloud Business Intelligence?

See: GoodData, YouCalc, Birst, PivotLink, Indicee

ElastiCube: Convergent Technologies for an Optimum Solution
ElastiCube technology was officially introduced to the market in late 2009, after more than five years of research and development conducted in complete secrecy. After it was proven practical and effective in the real world (through successful implementations at over 100 paying customers in numerous industries, from startups to multinational corporations), SiSense secured a $4 million investment to continue developing the ElastiCube technology and to expand awareness of the Prism business intelligence product, which is based on the technology.

ElastiCube is the result of thoroughly analyzing the strengths and weaknesses of both OLAP and in-memory technologies, while taking into consideration the off-the-shelf hardware of today and tomorrow. The vision was to provide a true alternative to OLAP technology, without compromising on the speediness of the development cycle and query response times for which in-memory technologies are lauded. This would allow a single technology to be used in BI solutions of any scale, in any industry.

Here are the 10 main goals on which SiSense focused when designing the ElastiCube technology:
1. A data warehouse must not be assumed to exist for effectively querying multiple sources.

2. A star schema must not be assumed to exist for effectively querying large amounts of data.

3. The solution must provide unlimited scalability, both in terms of number of rows and number of fields, within a finite and reasonable amount of RAM.

4. The solution must be able to operate using off-the-shelf hardware, even for extreme data/user scenarios.

5. The solution must provide high-speed, out-of-the-box query performance, without requiring pre-calculations.

6. There must be a separation between the application layer and the physical data layer via a virtual metadata layer.

7. There must be support for a dimensional model and multidimensional analysis.

8. The same application must be able to scale from a single user with a laptop to thousands of users via a central, server-based data repository.

9. Without running an SQL database, an SQL layer must be available to conform to industry standards.

10. The solution must offer the ability to incorporate additional/changed data (e.g., new rows, new fields) on the fly, without reprocessing the entire data model.
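Goal 10 is worth a small illustration, because it is exactly where classic OLAP cubes fall down: adding data should extend the store, not trigger a full reprocess. The sketch below is my own hypothetical illustration of the principle, not SiSense's internals:

```python
import array

# Hypothetical column store: one compact array per field.
store = {"year":    array.array("i", [2009, 2010]),
         "revenue": array.array("d", [100.0, 150.0])}

def append_rows(store, rows):
    """Incorporate new rows on the fly: cost is proportional to the
    new rows only, never to the data already stored."""
    for row in rows:
        for field, value in row.items():
            store[field].append(value)

append_rows(store, [{"year": 2011, "revenue": 175.0}])
print(sum(store["revenue"]))  # -> 425.0
```

Contrast this with a pre-calculated cube, where the same new rows would invalidate every aggregate that touches them and typically force a rebuild of the model.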

The presently available version of Prism, based on ElastiCube technology, delivers on every one of these requirements. Even though it would be a lot of fun for me, I unfortunately can’t delve into the nuts and bolts of how these goals are technologically achieved. What I can say is that ElastiCube utilizes columnar storage concepts as well as just-in-time in-memory query processing technology. If you want to read a little about it, you can take a look at SiSense’s ElastiCube technology page.
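While the internals stay behind the curtain, the columnar storage concept itself is easy to sketch: keep each field as its own contiguous array, and at query time touch only the columns the query actually needs. The following is a minimal, hypothetical illustration of that general idea (not SiSense's actual implementation):

```python
import array

# Columnar layout: each field lives in its own compact array. A row-oriented
# store would have to read every field of every row; here a query touching
# only 'year' and 'units' never reads 'price', however wide the table gets.
columns = {
    "year":  array.array("i", [2009, 2009, 2010, 2010]),
    "units": array.array("i", [10, 20, 30, 40]),
    "price": array.array("d", [9.99, 9.99, 12.50, 12.50]),
}

def total_units(cols, year):
    # Just-in-time processing: only the needed columns are streamed
    # through memory to answer this particular query.
    return sum(u for y, u in zip(cols["year"], cols["units"]) if y == year)

print(total_units(columns, 2010))  # -> 70
```

This is also why such designs scale with the number of columns a query uses rather than the total width of the data, which is the property that lets disk-based storage compete with pure in-memory engines.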

I can add that the feasibility of ElastiCube was greatly affected by the amazing CPU and disk technologies that now come with any run-of-the-mill personal computer.

ElastiCube is an extremely powerful technology that enables speedy implementation of individual, workgroup and corporate-wide BI. As a solution that delivers the promise of OLAP-style BI without the cost, time and IT overhead of OLAP, it is no surprise that Prism is rapidly gaining popularity in the market. Businesses that use ElastiCube technology include household names such as Target, Yahoo, Cisco, Samsung, Philips and Caterpillar. But a significant portion of the businesses that use ElastiCube are much smaller, such as Wix and other startup companies – who otherwise could not afford BI at all.
See: SiSense (Prism)

By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog