A new age of business intelligence and analytics is upon us. And it's about time!
Friday, December 10, 2010
Thoughts about Business Intelligence and the Cloud
The advantages of the cloud over on-premises are pretty straightforward. However, as far as business intelligence implementations are concerned, the question to me was always whether the benefits outweigh the unique challenges the cloud introduces. If all business data were in the cloud, there would be a definite case for implementing business intelligence software in the cloud. But since most business data isn’t, the benefits of cloud BI are not as obvious.
The blogosphere and analyst community in the business intelligence space are not sparing any words on the subject. There are several startups in this space as well, such as GoodData, PivotLink and others. But is the business intelligence space really heading in the direction of the cloud? I believe the answer is no.
The main reason I do not believe the BI space is headed toward the cloud (at least for now) is that business intelligence backbone technology doesn’t seem to be headed there. In fact, it seems to be going in the opposite direction.
If you take a careful look at the new technology promoted by established business intelligence vendors like SAP, IBM and Microsoft, and even by slightly less established (yet successful) vendors such as QlikTech and Tableau – it is all either ‘desktop enabling’ technology or in-memory technology.
These technologies, in-memory in particular, aren’t very cloud friendly and weren’t designed with the cloud in mind at all. They are designed to extract more juice out of a single computer, but are very hard to distribute across multiple machines, as is the case in most cloud implementations. Also, to benefit significantly from these types of technologies you need very powerful computers, a premise that goes against proper cloud architecture, which dictates that computing operations be parallelized across multiple cheaper machines.
On the other hand, the current cloud BI platform vendors are using the same traditional backbone technology the on-premises vendors do, and so they suffer from the same drawbacks most BI vendors do, such as complexity and long development cycles. And when these drawbacks come into play, whether the data is in the cloud or on-premises isn’t even the main issue.
Even if the ‘pure cloud’ BI platform vendors did develop better technology more suited to running BI in the cloud, it is still years away. So while you can use the cloud for some types of solutions (mainly those built around other cloud data sources), the fact of the matter is that the cloud BI hype is at least a few years too early.
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog
Wednesday, December 1, 2010
Business Intelligence Vendor Websites – How to Read Between the Lines
Assuming the vendor does indeed have a Customers page, the first thing you should look for is whether the featured customers are big corporations or smaller companies. This is an important distinction because business intelligence solutions for big corporations often have very different functional (and other) requirements than business intelligence for smaller companies. You should focus on vendors which sell to companies like the one for which you’re seeking a solution. If you need BI for an SMB, a business intelligence vendor that only lists Fortune 500 corporations on their Customers page probably won’t fit your needs. Their solutions are probably too complicated and/or expensive. Similarly, if you need BI for a large corporation and the business intelligence vendor only lists SMB customers, the solution may not deliver the functionality, performance or scalability you need.
The second thing you should look for is whether you recognize any of the customers listed on this page. Having recognizable names says a lot about the credibility of the business intelligence vendor. Well-known companies with recognizable brand names do not trust their business operations to just anyone. A business intelligence vendor with recognizable names on its Customers page is less likely to disappoint you than a company listing only unknown names.
The second thing you should look for on the Partners page is whether the vendor has multiple software/technology vendors listed as partners. Typical business intelligence applications require several tools and technologies to be fully implemented, and when a vendor lists technology/software partners, it usually means they only provide a portion of the business intelligence stack themselves.
Monday, November 15, 2010
Microsoft’s Announcement of BI Road Map – a Public Relations Nightmare
There are two main reasons for this:
1. While Microsoft relies heavily on its partner network for selling more SQL Server licenses, it has been heavily marketing PowerPivot, which is positioned as self-service BI. Neither the partners nor Microsoft has yet figured out how to position the two offerings in a way that makes sense to a typical customer.
2. Microsoft’s road map clearly shows a shift away from OLAP architecture toward a new paradigm they call the BI Semantic Model, one that doesn’t align with their partners’ existing, hard-earned training and expertise in implementing and selling Microsoft BI solutions. Microsoft partners would need to re-align their entire business around a product that isn’t even released yet, let alone implemented anywhere.
Following this announcement, high-profile evangelists of Microsoft BI solutions have openly expressed their concern regarding this radical move. See examples here and here. Ever since, Microsoft has been working overtime on some major damage control, trying to explain itself and reassure its partner network that OLAP is not going anywhere and that these new plans are complementary to the existing Analysis Services offering (i.e. "bla bla").
Regardless of Microsoft’s after-the-fact attempt to re-establish calm amongst its partner community, from a PR perspective the cat is already out of the bag. And honestly, Microsoft’s move was not unexpected. Check out the article published two months ago titled ‘Business Intelligence Vendors and their Partners – Rough Seas Ahead’, which specifically discusses what PowerPivot (and similar technologies) would mean for the BI implementation business.
Whether Microsoft’s ideas are practical or just pipe dreams remains to be seen. However, one thing is certain – considering how heavily Microsoft relies on its dedicated partners for sales and marketing, I would have expected this to be handled with much more finesse. This announcement was a very poor display of public relations handling.
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog
Friday, November 12, 2010
Microsoft’s BI Roadmap says NO to OLAP Cubes and MDX
“The last few days have been quite emotional for me. I’ve gone from being very angry, to just feeling sad, to being angry again; I’m grateful to the many members of the SSAS dev team who’ve let me rant and rave at them for hours on end and who have patiently explained their strategy – it’s certainly helped me deal with things. So what’s happened to make me feel like this? I’ll tell you: while it’s not true to say that Analysis Services cubes as we know them today and MDX are dead, they have a terminal illness. I’d give them two, maybe three more releases before they’re properly dead, based on the roadmap that was announced yesterday.”
The full post and subsequent comments can be found here.
Readers of The ElastiCube Chronicles may recall a previous post titled ‘Is Microsoft Admitting that Analysis Services is not Fit for the Mid-Market?’ published back in August 2010, in response to the official release of PowerPivot. Well, I believe that question has been officially answered - Yes.
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog
The New Tableau 6.0 Data Engine – First Impressions
"Our new Tableau Data Engine achieves instant query response on hundreds of millions of data rows, even on hardware as basic as a corporate laptop... No other platform allows companies to choose in-memory analytics on gigabytes of data …" Christian Chabot, CEO of Tableau Software, said in a statement.
These are bombastic claims indeed, and the parts of the CEO’s quote about instant query response on hundreds of millions of rows and about in-memory analytics are particularly interesting. So with the help of my friend, colleague and brilliant database technologist Eldad Farkash, I decided to put these claims to a real-life test.
Since this data engine was claimed to be utilizing in-memory technology, we set up a 64-bit computer with adequate amounts of RAM (hardly a corporate laptop) and used a real customer’s data set consisting of 560 million rows of raw internet traffic data. To make it easier, we imported just a single text field out of this entire data set.
Initial Findings:
1. Surprisingly, and contrary to what Tableau’s CEO claims, Tableau’s new data engine is not really in-memory technology. In fact, the entire data set is stored on disk after it is imported, and RAM is hardly utilized.
2. It took Tableau 6.0 approximately 5 hours to import this single text field, of which 1.5 hours was pure import and the rest a process Tableau calls ‘Column Optimization’, which we believe creates an index very similar to that of a regular relational database. For comparison, it took QlikView 50 minutes and ElastiCube 30 minutes to import the same field – roughly a 6x to 10x difference. All products were using their default settings.
3. Once the import process completed, we asked Tableau to count how many distinct values existed in that field, a common query required for business intelligence purposes (see the sketch after this list). That query took 30 minutes to return. For comparison, it took both QlikView and ElastiCube approximately 10 seconds to return – a 180x difference. Again, all products were used with their default settings.
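For reference, the distinct-count test we ran corresponds to a query of the following general shape. This is only an illustrative sketch in SQLite with made-up table and column names; each product, of course, ran its own equivalent through its own engine and interface, not raw SQL.

```python
import sqlite3

# Minimal sketch of the query shape only; names ("traffic", "url") are hypothetical.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE traffic (url TEXT)")
db.executemany("INSERT INTO traffic VALUES (?)",
               [("a.com",), ("b.com",), ("a.com",)])
count = db.execute("SELECT COUNT(DISTINCT url) FROM traffic").fetchone()[0]
print(count)  # 2 distinct values in the field
```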
Initial Conclusions:
Tableau’s new data engine is a step up from their previous engine, which was quite similar to the one Microsoft Access used in Office 2007. That is good news for individual analysts working with non-trivial amounts of data, since earlier versions of Tableau were quite poor in this respect. This release, I imagine, also helps Tableau against SpotFire (Tibco), which until now was the only pure visualization player that could claim to have technology aimed at handling larger data sets.
From a practical perspective, however, the handling of hundreds of millions of rows of data as well as the reference to in-memory analytics are more marketing fluff geared towards riding the in-memory hype than a true depiction of what this technology is or what it is capable of. Tableau’s data engine is not in the same league as in-memory technology, or pure columnar technologies like ElastiCube, when it comes to import times or query response times. In fact, it is slower by several orders of magnitude.
Sunday, October 24, 2010
Choosing a BI Vendor - Making the Short List
Monday, October 18, 2010
From OLAP Cubes to ElastiCubes – The Natural Evolution of BI
Since the inception of BI and the subsequent entrance of OLAP technology into the space, the need for BI has been growing rapidly. Recognizing that OLAP-based solutions were (and still are) hard to introduce into a wider market, thought leaders and visionaries in the space have since been trying to bring BI to the masses through technological and conceptual innovation.
The most recently recognized innovation (even though it’s been around for quite a while) was in-memory technology, whose main advantage was cutting implementation time and simplifying the process as a whole (a definite step in the right direction). However, as described in my recent article, In-Memory BI is Not the Future, It's the Past, using in-memory technology for speedy BI implementation introduces significant compromises, especially in terms of scalability (both for data volumes and support for many concurrent users). Now, after in-memory technology has been on the market for some time, it is clear that it is not really a replacement for OLAP technology, though it did expand the BI market to a wider audience. It is probably more accurate to say that in-memory technology and OLAP technology complement each other, each with its own advantages and tradeoffs.
In that article I also briefly mentioned the new disk-based ElastiCube technology (invented by SiSense). ElastiCube technology basically eliminates the inherent IMDB tradeoffs by providing unlimited scalability using off-the-shelf hardware, while delivering implementation and query response times as fast as (or faster than) pure in-memory solutions. This claim was the subject of many of the emails and inquiries I received following the article’s publication. I was repeatedly asked how ElastiCube technology had achieved what OLAP technology had failed to do for so many years, and what role in-memory technology played in its conception.
Thus, in this article I will describe how ElastiCube technology came to be, what inspired it, what made it possible and how it has already become a game-changer in the BI space, both in large corporations and small startups.
A Brief History of BI and OLAP
OLAP technology started gaining popularity in the late 1990s, and that had a lot to do with Microsoft’s first release of their OLAP Services product (now Analysis Services), based on technology acquired from Panorama Software. At that point in time, computer hardware wasn’t nearly as powerful as it is today; given the circumstances at the time, OLAP was groundbreaking. It introduced a spectacular way for business users (typically analysts) to easily perform multidimensional analysis of large volumes of business data. When Microsoft’s Multidimensional Expressions language (MDX) came closer to becoming a standard, more and more client tools (e.g., Panorama NovaView, ProClarity) started popping up to provide even more power to these users.
While Microsoft was not the first BI vendor around, their OLAP Services product was unique and significantly helped increase overall awareness of the possibilities offered by BI. Microsoft started gaining market share fairly quickly, as more companies started investing in BI solutions.
But as the years passed by, it became very apparent that while the type of multidimensional BI empowered by OLAP technology was a valuable asset to any organization, it seemed to be used mainly by large corporations. OLAP is just too complex and requires too much time and money to be implemented and maintained, thus eliminating it as a viable option for the majority of the market.
See: Microsoft (SSAS), IBM (Cognos)
The Visualization Front-End Craze
As more companies began investing in BI solutions, many vendors recognized the great opportunity in bringing BI to the mass market of companies with less money to spend than Fortune 500 firms. This is where visualization front-end vendors started popping up like mushrooms after the rain, each of them promising advanced business analytics to the end user, with minimal or no IT projects involved. Their appeal was based on radically reducing the infamous total cost of ownership (TCO) of typical BI solutions. These products, many of which are still available today, are full of useful and advanced visualization features.
However, after years of selling these products, it became very clear that they are incapable of providing a true alternative to OLAP-based solutions. Since they fail to provide similar centralized data integration and management capabilities, they found themselves competing mainly with Excel, and were being used only for analysis and reporting of limited data sets by individuals or small workgroups.
In order to work around these limitations (and increase revenues), these tools introduced connectivity to OLAP sources in addition to the tabular (e.g., spreadsheet) data they had supported until then. By doing so, these products basically negated the purpose for which they were initially designed – to provide an alternative to expensive OLAP-based BI solutions.
See: Tableau Software, Tibco SpotFire, Panorama Software
The In-Memory Opportunity
The proliferation of cheap and widely available 64-bit PCs during the past few years has somewhat changed the rules of the game. More RAM could be installed in a PC, a boon for those visualization front-end vendors struggling to get more market share. More RAM on a PC means that more data can be quickly queried. If crunching a million rows of data on a machine with only 2GB of RAM was a drag, users could now add more gigabytes of RAM to their PCs and instantly solve the problem. But still, without providing centralized data integration and management, this was not a true alternative to OLAP-based solutions that are still prominent in massive organization-wide (or even inter-departmental) implementations.
Strangely enough, out of all the in-memory technology vendors out there, only one realized that using in-memory technology to empower individual users wasn't enough and that the way to gain more significant market share was to provide an end-to-end solution, from ETL to centralized data sharing to a front-end development environment. This vendor is QlikTech and it is no wonder that the company is flying high above the rest of the non-OLAP BI players. QlikTech used in-memory technology to cover a much wider range of BI solutions than any single front-end visualization tool could ever do.
By providing data integration and centralized data access capabilities, QlikTech was able to provide solutions that, for other vendors (in-memory or otherwise), required at least a lengthy data warehouse project if not a full-blown OLAP implementation. By utilizing in-memory technology in conjunction with 64-bit computing, QlikTech solutions work even on substantial amounts of data (significantly more than their traditional disk-based competitors could).
However, QlikTech has not yet been able to make a case for replacing OLAP. I believe this is not only because of the scalability issues and hardware requirements that arise when large amounts of data and/or users are involved, but also because they do not inherently support dimensional modeling the way OLAP does. Apart from making life simpler for IT when maintaining multiple applications, OLAP’s implementation of a dimensional model also gives end users, via supporting front-end tools, a broader range of flexibility in creating their own BI applications.
Microsoft, the newest entry into the in-memory BI game, also started marketing its in-memory PowerPivot solution as an alternative to OLAP, basically admitting it gives up on its Analysis Services as a viable solution for the wider mid-market.
See: QlikTech (QlikView), Microsoft (PowerPivot)
The SaaS/Cloud BI Hype
The SaaS/Cloud hype hasn’t skipped over the BI space, though running BI in the cloud does not dramatically change anything in respect to implementation time and/or complexity of implementation. In fact, cloud BI vendors use the same technologies that are widely used on-premises. There are several startup companies in this space, competing for niche markets.
It’s still hard to tell what impact the cloud will have on the BI space as a whole, as none of these companies has yet proven there’s even a viable business in hosting BI in the cloud. One thing is certain, though: these companies cannot rely on in-memory technology to grow significantly. The costs of hardware and the amount of work required to support the number of customers they would need to thrive are prohibitive, to say the least. For more on the problem with cloud BI, see my earlier post, Would I Use Cloud Business Intelligence?
See: GoodData, YouCalc, Birst, PivotLink, Indicee
ElastiCube: Convergent Technologies for an Optimum Solution
ElastiCube technology was officially introduced to the market in late 2009, after more than five years of research and development conducted in complete secrecy. After being proven practical and effective in the real world (successfully implemented at over 100 paying customers across numerous industries, from startups to multinational corporations), SiSense secured a $4 million investment to continue developing the ElastiCube technology and to expand awareness of the Prism Business Intelligence product, which is based on it.
ElastiCube is the result of thoroughly analyzing the strengths and weaknesses of both OLAP and in-memory technologies, while taking into consideration the off-the-shelf hardware of today and tomorrow. The vision was to provide a true alternative to OLAP technology, without compromising on the speediness of the development cycle and query response times for which in-memory technologies are lauded. This would allow a single technology to be used in BI solutions of any scale, in any industry.
Here are the 10 main goals on which SiSense focused when designing the ElastiCube technology:
2. A star schema must not be assumed to exist in order to effectively query large amounts of data.
3. The solution must provide unlimited scalability, both in terms of number of rows and number of fields, within a finite and reasonable amount of RAM.
4. The solution must be able to operate using off-the-shelf hardware, even for extreme data/user scenarios.
5. The solution must provide high-speed, out-of-the-box query performance, without requiring pre-calculations.
6. There must be a separation between the application layer and the physical data layer via a virtual metadata layer.
7. There must be support for a dimensional model and multidimensional analysis.
8. The same application must be able to support anything from a single user with a laptop to thousands of users via a central, server-based data repository.
9. An SQL layer must be available to conform to industry standards, without requiring an SQL database to run underneath.
10. The solution must offer the ability to incorporate additional/changed data (e.g., new rows, new fields) on the fly, without reprocessing the entire data model.
The presently available version of Prism, based on ElastiCube technology, delivers on every one of these requirements. Even though it would be a lot of fun for me, I unfortunately can’t delve into the nuts and bolts of how these goals are technologically achieved. What I can say is that ElastiCube utilizes columnar storage concepts as well as just-in-time in-memory query processing technology. If you want to read a little about it, you can take a look at SiSense’s ElastiCube technology page.
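To give a rough feel for the general idea (and only the general idea; the actual ElastiCube internals are proprietary and far more sophisticated than this), here is a toy sketch of columnar storage with just-in-time loading: each column sits in its own file on disk, and a query pulls into RAM only the columns it actually touches. All names and the file layout here are invented for illustration.

```python
import os
import pickle

# Toy columnar store: one file per column, loaded into memory only on demand.
def write_columns(table: dict, path: str) -> None:
    os.makedirs(path, exist_ok=True)
    for name, values in table.items():
        with open(os.path.join(path, name + ".col"), "wb") as f:
            pickle.dump(values, f)

def load_column(path: str, name: str) -> list:
    with open(os.path.join(path, name + ".col"), "rb") as f:
        return pickle.load(f)

write_columns(
    {"country": ["US", "DE", "US"], "revenue": [100, 80, 50], "user_id": [1, 2, 3]},
    "sales_cols",
)

# Summing revenue per country touches only two columns; the user_id column
# never leaves the disk, which is the point of the columnar layout.
country = load_column("sales_cols", "country")
revenue = load_column("sales_cols", "revenue")
totals = {}
for c, r in zip(country, revenue):
    totals[c] = totals.get(c, 0) + r
print(totals)  # {'US': 150, 'DE': 80}
```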
I can add that the feasibility of ElastiCube was greatly affected by the amazing CPU and disk technologies that now come with any run-of-the-mill personal computer.
ElastiCube is extremely powerful technology that enables speedy implementation of individual, workgroup and corporate-wide BI. As a solution that delivers the promise of OLAP-style BI without the cost, time and IT overhead of OLAP, it is no surprise that Prism is rapidly gaining popularity in the market. Businesses that use ElastiCube technology include household names such as Target, Yahoo, Cisco, Samsung, Philips and Caterpillar. But a significant portion of the businesses that use ElastiCube are much smaller, such as Wix and other startup companies that otherwise could not afford BI at all.
See: SiSense (Prism)
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog
Friday, September 24, 2010
In-memory BI is not the future. It’s the past.
The increasing hype surrounding in-memory BI has caused BI consultants, analysts and even vendors to spew out endless articles, blog posts and white papers on the subject, many of which have also gone the extra mile to describe in-memory technology as the future of business intelligence, the death blow to the data warehouse and the swan song of OLAP technology.
I find one of these in my inbox every couple of weeks.
Just so it is clear - the concept of in-memory business intelligence is not new. It has been around for many years. The only reason it became widely known recently is that it wasn’t feasible before 64-bit computing became commonly available. Before 64-bit processors, the maximum amount of RAM a computer could utilize was barely 4GB, which is hardly enough to accommodate even the simplest of multi-user BI solutions. Only when 64-bit systems became cheap enough did it become possible to consider in-memory technology as a practical option for BI.
The success of QlikTech and the relentless activities of Microsoft’s marketing machine have managed to confuse many in terms of what role in-memory technology plays in BI implementations. And that is why many of the articles out there, which are written by marketers or market analysts who are not proficient in the internal workings of database technology (and assume their readers aren’t either), are usually filled with inaccuracies and, in many cases, pure nonsense.
The purpose of this article is to put both in-memory and disk-based BI technologies in perspective, explain the differences between them and finally lay out, in simple terms, why disk-based BI technology isn’t on its way to extinction. Rather, disk-based BI technology is evolving into something that will significantly limit the use of in-memory technology in typical BI implementations.
But before we get to that, for the sake of those who are not very familiar with in-memory BI technology, here’s a brief introduction to the topic.
Disk and RAM
Generally speaking, your computer has two types of data storage mechanisms – disk (often called a hard disk) and RAM (random access memory). The important differences between them (for this discussion) are outlined in the following table:

              Disk                               RAM
Capacity      Large (15-100x more available)     Small (a few GB on most PCs)
Read speed    Much slower                        Much faster
Cost per GB   Cheap                              Roughly 320x the price of disk
Persistence   Data survives power-off            Data is lost on power-off

Most modern computers have 15-100 times more available disk storage than they do RAM. My laptop, for example, has 8GB of RAM and 300GB of available disk space. However, reading data from disk is much slower than reading the same data from RAM. This is one of the reasons why 1GB of RAM costs approximately 320 times that of 1GB of disk space.
Another important distinction is what happens to the data when the computer is powered down: data stored on disk is unaffected (which is why your saved documents are still there the next time you turn on your computer), but data residing in RAM is instantly lost. So, while you don’t have to re-create your disk-stored Microsoft Word documents after a reboot, you do have to re-load the operating system, re-launch the word processor and reload your document. This is because applications and their internal data are partly, if not entirely, stored in RAM while they are running.
Disk-based Databases and In-memory Databases
Now that we have a general idea of the basic differences between disk and RAM, what are the differences between disk-based and in-memory databases? Well, all data is always kept on hard disks (so that it is saved even when the power goes down). When we talk about whether a database is disk-based or in-memory, we are talking about where the data resides while it is actively being queried by an application: with disk-based databases, the data is queried while stored on disk, and with in-memory databases, the data being queried is first loaded into RAM.

Disk-based databases are engineered to efficiently query data residing on the hard drive. At a very basic level, these databases assume that the entire data set cannot fit inside the relatively small amount of RAM available and therefore must have very efficient disk reads in order for queries to be returned within a reasonable time frame. The engineers of such databases have the benefit of unlimited storage, but must face the challenges of relying on relatively slow disk operations.
On the other hand, in-memory databases work under the opposite assumption that the data can, in fact, fit entirely inside the RAM. The engineers of in-memory databases benefit from utilizing the fastest storage system a computer has (RAM), but have much less of it at their disposal.
That is the fundamental trade-off in disk-based and in-memory technologies: faster reads and limited amounts of data versus slower reads and practically unlimited amounts of data. These are two critical considerations for business intelligence applications, as it is important both to have fast query response times and to have access to as much data as possible.
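One easy way to see this distinction in practice is SQLite, which can run in either mode. The sketch below is purely illustrative; the table and values are made up.

```python
import sqlite3

# The same SQL runs against both engines, but one reads pages from a file on
# disk (data survives restarts) while the other keeps every page in RAM
# (faster, bounded by available memory, and gone when the process exits).
disk_db = sqlite3.connect("example.db")   # disk-based database
ram_db = sqlite3.connect(":memory:")      # in-memory database

for db in (disk_db, ram_db):
    db.execute("CREATE TABLE IF NOT EXISTS sales (region TEXT, amount REAL)")
    db.execute("INSERT INTO sales VALUES ('EMEA', 120.0)")
    db.commit()
    print(db.execute("SELECT SUM(amount) FROM sales").fetchone()[0])
```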
The Data Challenge
A business intelligence solution (almost) always has a single data store at its center. This data store is usually called a database, data warehouse, data mart or OLAP cube. This is where the data that can be queried by the BI application is stored.

The challenges in creating this data store using traditional disk-based technologies are what gave in-memory technology its 15 minutes (ok, maybe 30 minutes) of fame. Having the entire data model stored inside RAM allowed bypassing some of the challenges encountered by their disk-based counterparts, namely the issue of query response times or ‘slow queries.’
Disk-based BI
When saying ‘traditional disk-based’ technologies, we typically mean relational database management systems (RDBMS) such as SQL Server, Oracle, MySQL and many others. It’s true that having a BI solution perform well using these types of databases as their backbone is far more challenging than simply shoving the entire data model into RAM, where performance gains would be immediate simply because RAM is so much faster than disk.

It’s commonly thought that relational databases are too slow for BI queries over data in (or close to) its raw form because they are disk-based. The truth is, however, that it’s because of how they use the disk and how often they use it.
Relational databases were designed with transactional processing in mind. But having a database be able to support high-performance insertions and updates of transactions (i.e., rows in a table) as well as properly accommodating the types of queries typically executed in BI solutions (e.g., aggregating, grouping, joining) is impossible. These are two mutually-exclusive engineering goals, that is to say they require completely different architectures at the very core. You simply can’t use the same approach to ideally achieve both.
In addition, the standard query language used to extract transactions from relational databases (SQL) is syntactically designed for the efficient fetching of rows, while rare are the cases in BI where you would need to scan or retrieve an entire row of data. It is nearly impossible to formulate an efficient BI query using SQL syntax.
So while relational databases are great as the backbone of operational applications such as CRM, ERP or Web sites, where transactions are frequently and simultaneously inserted, they are a poor choice for supporting analytic applications which usually involve simultaneous retrieval of partial rows along with heavy calculations.
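To make the contrast concrete, here are two hypothetical queries against the same (invented) orders table: the row-by-key lookup that relational engines are built for, and the scan-and-aggregate shape that BI workloads actually consist of.

```python
# Illustrative only: the schema and column names are invented for the example.

# Transactional (OLTP) shape: fetch or update one complete row by its key.
oltp_query = "SELECT * FROM orders WHERE order_id = 48210"

# Analytic (BI) shape: scan a few columns of millions of rows, group and aggregate.
bi_query = """
    SELECT region, product_line, SUM(amount) AS revenue, COUNT(*) AS order_count
    FROM orders
    WHERE order_date >= '2010-01-01'
    GROUP BY region, product_line
"""

# A row-oriented engine is laid out for the first shape; the second touches only
# a handful of columns but still forces a pass over every row, which is why it
# tends to be slow on a traditional RDBMS.
print(oltp_query)
print(bi_query)
```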
In-memory BI
In-memory databases approach the querying problem by loading the entire dataset into RAM. In so doing, they remove the need to access the disk to run queries, thus gaining an immediate and substantial performance advantage (simply because scanning data in RAM is orders of magnitude faster than reading it from disk). Some of these databases introduce additional optimizations which further improve performance. Most of them also employ compression techniques to represent even more data in the same amount of RAM.

Regardless of what fancy footwork is used with an in-memory database, storing the entire dataset in RAM has a serious implication: the amount of data you can query with in-memory technology is limited by the amount of free RAM available, and there will always be much less available RAM than available disk space.
The bottom line is that this limited memory space means that the quality and effectiveness of your BI application will be hindered: the more historical data to which you have access and/or the more fields you can query, the better analysis, insight and, well, intelligence you can get.
You could add more and more RAM, but then the hardware you require becomes exponentially more expensive. The fact that 64-bit computers are cheap and can theoretically support unlimited amounts of RAM does not mean they actually do in practice. A standard desktop-class (read: cheap) computer with standard hardware physically supports up to 12GB of RAM today. If you need more, you can move on to a different class of computer which costs about twice as much and will allow you up to 64GB. Beyond 64GB, you can no longer use what is categorized as a personal computer but will require a full-blown server which brings you into very expensive computing territory.
It is also important to understand that the amount of RAM you need is not only affected by the amount of data you have, but also by the number of people simultaneously querying it. Having 5-10 people using the same in-memory BI application could easily double the amount of RAM required for intermediate calculations that need to be performed to generate the query results. A key success factor in most BI solutions is having a large number of users, so you need to tread carefully when considering in-memory technology for real-world BI. Otherwise, your hardware costs may spiral beyond what you are willing or able to spend (today, or in the future as your needs increase).
There are other implications to having your data model stored in memory, such as having to re-load it from disk to RAM every time the computer reboots and not being able to use the computer for anything other than the particular data model you’re using because its RAM is all used up.
A Note about QlikView and PowerPivot In-memory Technologies
QlikTech is the most active in-memory BI player out there, so their QlikView in-memory technology is worth addressing in its own right. It has been repeatedly described as “unique, patented associative technology” but, in fact, there is nothing “associative” about QlikView’s in-memory technology. QlikView uses a simple tabular data model, stored entirely in-memory, with basic token-based compression applied to it. In QlikView’s case, the word associative relates to the functionality of its user interface, not how the data model is physically stored. Associative databases are a completely different beast and have nothing in common with QlikView’s technology.

PowerPivot uses a similar concept, but is engineered somewhat differently due to the fact it’s meant to be used largely within Excel. In this respect, PowerPivot relies on a columnar approach to storage that is better suited for the types of calculations conducted in Excel 2010, as well as for compression. Quality of compression is a significant differentiator between in-memory technologies, as better compression means you can store more data in the same amount of RAM (i.e., more data is available for users to query). In its current version, however, PowerPivot is still very limited in the amounts of data it supports and requires a ridiculous amount of RAM.
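For readers curious what ‘token-based compression’ roughly means, here is a minimal sketch of dictionary encoding applied to a repetitive column. This is the general technique only; it is not QlikView’s or PowerPivot’s actual implementation, which is proprietary.

```python
# Dictionary ("token") encoding: store each distinct value once and replace the
# column with small integer tokens, so repetitive columns shrink dramatically.
def encode(column):
    dictionary = []     # each distinct value stored once
    positions = {}      # value -> token
    tokens = []         # the column rewritten as tokens
    for value in column:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        tokens.append(positions[value])
    return dictionary, tokens

def decode(dictionary, tokens):
    return [dictionary[t] for t in tokens]

column = ["London", "Paris", "London", "London", "Paris"]
dictionary, tokens = encode(column)
print(dictionary, tokens)                    # ['London', 'Paris'] [0, 1, 0, 0, 1]
assert decode(dictionary, tokens) == column  # lossless round trip
```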
The Present and Future Technologies
The destiny of BI lies in technologies that leverage the respective benefits of both disk-based and in-memory technologies to deliver fast query responses and extensive multi-user access without monstrous hardware requirements. Obviously, these technologies cannot be based on relational databases, but they must also not be designed to assume a massive amount of RAM, which is a very scarce resource.

These types of technologies are not theoretical anymore and are already utilized by businesses worldwide. Some are designed to distribute different portions of complex queries across multiple cheaper computers (a good option for cloud-based BI systems, as sketched below) and some are designed to take advantage of 21st-century hardware (multi-core architectures, larger CPU caches, etc.) to extract more juice from off-the-shelf computers.
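As a toy illustration of the first approach, the sketch below splits an aggregation into partial results computed on separate chunks and then merges them. Here the ‘machines’ are just local processes; a real distributed engine does far more, but the split-and-merge principle is the same. All names and data are invented.

```python
from collections import Counter
from multiprocessing import Pool

# Each worker aggregates its own chunk of rows ("map"); the partial results are
# then merged ("reduce"). In a real system the chunks would live on separate,
# cheaper machines rather than local processes.
def partial_count(chunk):
    return Counter(region for region, _amount in chunk)

if __name__ == "__main__":
    rows = [("EMEA", 10), ("APAC", 5), ("EMEA", 7), ("AMER", 3)] * 1000
    chunks = [rows[i::4] for i in range(4)]          # split the data four ways
    with Pool(4) as pool:
        partials = pool.map(partial_count, chunks)   # aggregate each chunk
    total = sum(partials, Counter())                 # merge the partial results
    print(total.most_common())
```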
A Final Note: ElastiCube Technology
The technology developed by the company I co-founded, SiSense, belongs to the latter category. That is, SiSense utilizes technology which combines the best of disk-based and in-memory solutions, essentially eliminating the downsides of each. SiSense’s BI product, Prism, enables a standard PC to deliver a much wider variety of BI solutions, even when very large amounts of data, large numbers of users and/or large numbers of data sources are involved, as is the case in typical BI projects.

When we began our research at SiSense, our technological assumption was that it is possible to achieve in-memory-class query response times, even for hundreds of users simultaneously accessing massive data sets, while keeping the data (mostly) stored on disk. The result of our hybrid disk-based/in-memory technology is a BI solution based on what we now call ElastiCube, after which this blog is named. You can read more about this technological approach, which we call Just-in-Time In-memory Processing, at our BI Software Evolved technology page.
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog
Wednesday, September 1, 2010
Business Intelligence Vendors and their Partners – Rough Seas Ahead
The Relationship between BI Software Vendors and their VARs
As in all partnerships, both sides need to have something significant to gain for their partnership to be successful. In the business intelligence industry, this has indeed been the case for a long time. The software vendors use their channel partners to distribute their software to a larger audience and these, in turn, have made a pretty penny from commissions, consulting and implementation fees.
The software vendors, for their part, want to sell as many software licenses as they can to new customers, as well as to charge software maintenance fees from their existing clientele. This provides them a steady income stream from existing customers while new customers grow the business. Their partners, on the other hand, prefer long and complex implementation projects from which they generate significantly more revenue than they do from commissions on software license sales.
This symbiosis used to be great. Since most traditional BI companies are focused on high-end corporations with huge budgets, there was enough to go around. These customers have large numbers of employees who can benefit from BI (read: big money selling software licenses for the software vendors) and who have no problem spending hundreds of thousands (or millions) of dollars on implementation projects (read: significant income from project fees for the implementer).
Mutually Beneficial Relationships?
It so happens, however, that changing conditions over the past couple of years (and particularly during 2010) have brought the traditional business intelligence industry to a point where the mutual vendor-VAR benefits are not as obvious anymore. While these conditions have contributed to a deterioration in relationships between BI software vendors and their partners, the good news is that companies exploring business intelligence options stand to benefit substantially from the situation.

Let’s take a look at some of the conditions affecting the BI industry in recent years:
1. Tough Economic Times
Obviously, the economic crisis which began in 2008 affected everyone, vendors and customers alike. Business intelligence as a concept was actually positively affected by this crisis as it became painfully obvious how important it is to track a business’s operational and financial performance. On the other hand, available budgets shrank significantly and there was a smaller pie to share between BI software vendors and their partners. This fact has been causing friction between the two sides as each attempts to vigorously protect its own piece of the pie.

2. Too Many Partners
In an attempt to gain more market share, software vendors invested extra effort in recruiting more and more VARs for their partner networks. While this had a positive effect on software vendors’ revenues, it wasn’t as good for those in the partner network. Having more partners leads to more competition which, in turn, means more investment in marketing and sales (and lower profits). To make matters worse, in a further attempt to increase revenues, some software vendors actually began competing with their own partners on implementation deals.

3. QlikTech and their IPO
Ever since QlikTech began gaining popularity, their main sales pitch has been shorter implementation times and reduced ongoing costs (due to the supposedly fewer IT personnel required to maintain their BI solution). While this holds mighty appeal to BI customers, it flies in the face of the entire premise of BI resellers, which rely on project implementation and BI maintenance revenues. QlikTech addressed this issue by providing their VARs higher commissions on software license sales (as compared to those offered by Microsoft, Cognos or Business Objects, for example). Coupled with the implementation and maintenance work a QlikTech solution still requires, the higher commissions provide reasonable revenues for their partners.

Along with their impressive sales and growth numbers, QlikTech’s recent IPO revealed that they generated $157M in revenues during 2009 with total expenses of $150M. The resulting profit of $7M is not great.
Whether QlikTech’s intentions are to be acquired soon or to keep growing their business remains a mystery, but either way their partners should pay close attention. If they do seek a quick exit, their partners face an uncertain future. If they intend on growing their business and improving profitability, they will have to raise their prices and/or expand their partner network significantly and/or increase their direct involvement in both software sales and implementation. Existing partners will not be pleased with either of these alternatives.
QlikTech is the successful pioneer of a newer, faster, easier approach to BI, and its example should be considered carefully by VARs as an indication of what the future may hold for the BI industry as a whole.
4. The Self-Service BI Hype
The hottest thing in the BI industry today is the self-service BI concept. Regardless of whether it’s promoted by vendors providing personal analysis tools or cloud BI platforms, the basic idea behind it is the same: traditional BI is too expensive, takes too long to implement and is a big pain to maintain. Instead, the customer wants tools to enable self-reliance (as opposed to relying on external consultants/implementers who live off service fees). Whether these solutions actually deliver what they promise is beside the point (you can read my opinion about cloud BI here), but the buzz is out there and the market hears it, so it’s getting harder these days to justify long and expensive BI projects.

5. Microsoft PowerPivot
PowerPivot is Microsoft’s attempt to promote the self-service BI concept. By introducing PowerPivot, Microsoft is basically giving up on penetrating the mid-market with SQL Server Analysis Services and is trying instead to do it by introducing stronger BI capabilities in their Office product. While some believe that PowerPivot is just a lot of hot air, the fact remains that Microsoft is investing a lot of effort and money on marketing it. This places their existing partners – who rely on SQL Server sales – in a very problematic situation. These partners prefer SQL Server-based solutions, which provide more license commissions and more project hours, yet they need to fight Microsoft’s own marketing machine which is now essentially promoting self-service BI. Not an enviable situation to be in, to say the least.

What Does the Future Hold?
It’s great that so much emphasis is being placed on simplifying business intelligence and making it accessible to companies that do not have multimillion dollar budgets. Since established players and new startups alike are now beginning to focus on this type of approach, it is actually realistic to expect that self-service BI is on its way to gradually becoming a commodity. Customers will benefit greatly from this trend.

On the other hand, business intelligence VARs must understand that this is where the market is headed – and adjust their business models accordingly. A company selling BI solutions based on existing BI platforms will need to provide real added value to the customer in order to stay in business. In the not-too-distant future, this value will almost certainly come from industry-specific professional knowledge and experience (as opposed to purely technical expertise). More and more customers will no longer accept lengthy R&D projects to achieve BI and, with the new software and technologies now emerging, it is no longer justifiable.
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog
Tuesday, August 24, 2010
Business Intelligence? Yes Minister!
In this episode, the minister asked his assistant, Bernard, to inquire about a new hospital where there are supposedly no patients and a ridiculous amount of administrative staff. Bernard conducts some research and returns to the minister with his results.
Here’s a short transcript:
Bernard- You asked me to find out about an alleged empty hospital in north London.
Now, imagine that instead of a conversation between a minister and his assistant about a new hospital, this is actually a conversation between a CEO and his CIO regarding an ongoing business intelligence project. You've got yourself a conversation that still happens more often than not in BI projects, 30 years later.
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog
Thursday, August 12, 2010
Is Microsoft Admitting that Analysis Services is not Fit for the Mid-Market?
There are, of course, other reasons which the author did not mention, such as Microsoft trying to get a fighting chance against QlikView, which has been constantly beating Microsoft at mid-sized and departmental deals.
In addition, Microsoft is trying to motivate their customers to upgrade to Excel 2010, in which PowerPivot is provided for free in the form of an add-in. Microsoft is not a natural BI company and their cash cows are still Windows and Office, so it only makes sense. Will it work? Who knows. Will it change the BI space? Probably not.
To me, the most interesting thing about this post is the fact that PowerPivot is meant to promote the self-service BI concept, which in most people's minds is the complete and utter opposite of what Analysis Services delivers, namely a heavy, IT-centric business intelligence solution.
If this is true, Microsoft is basically admitting on their own blog that Analysis Services has failed to provide a viable solution for mid-sized companies and departments (where self-service BI is widely used) and that their new BI marketing strategy is based on Office, not SQL Server.
This fact is well known to people who are familiar with the trends and nuances of the BI space, but Microsoft saying this on their blog is, to me, a very big deal.
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog
Thursday, August 5, 2010
The Google AdWords Broad Match Modifier
For the record, I have always disliked the Broad matching option, especially since Google introduced Expanded Broad matching, where Google shows your ad for every phrase it deems close enough to the actual keyword you bid on, not just the exact keyword.
Ever since they did that, it's become very difficult to control AdWords campaigns because the search phrases you bid on become much less targeted and cost you many irrelevant click$.
The new Broad match modifier is a step in the right direction as it allows you to better control the phrases that bring up your ad.
Let's say you bid on the phrase: Banana Cake Recipe
Setting this phrase to Broad matching will show your ad to people searching for Chocolate Cake Store or Growing Bananas in Brazil, which may be irrelevant to what you're advertising.
If you use the new modifier like so: +Banana +Cake +Recipe
Your ad will show only for search phrases that contain some variation of all three words. That means the search phrase must contain all three words, or whichever words Google decides are synonymous with each of the words you specified. They may, for example, show your ad to someone searching for Banana Muffin Recipe. You may like that or you may not.
You can also choose to apply this modifier to only a subset of the words in the phrase, but then you need to determine whether it's worth your while to split the phrase into two separate ones.
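For the technically inclined, the matching rule described above boils down to something like the following simplified check (ignoring Google's synonym and close-variant logic, which is entirely their own and not modeled here).

```python
# Simplified sketch: with "+Banana +Cake +Recipe", the ad is eligible only when
# every modified word appears somewhere in the search query. Google's actual
# variant/synonym matching is not reproduced here.
def matches_modified_broad(keyword: str, query: str) -> bool:
    required = {w.lstrip("+").lower() for w in keyword.split() if w.startswith("+")}
    words = set(query.lower().split())
    return required.issubset(words)

print(matches_modified_broad("+Banana +Cake +Recipe", "easy banana cake recipe"))  # True
print(matches_modified_broad("+Banana +Cake +Recipe", "chocolate cake store"))     # False
```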
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog
The Catch 22 of Traditional Business Intelligence
So much has already been said about how much of a pain business intelligence is. The complexity, the constant IT bottlenecks, the crazy cost of hardware, software, consultants and whatnot. Gil Dibner of Gemini Venture Funds (formerly of Genesis Partners) described it very eloquently and in great detail in his blog post about the SiSense investment round.
Since business intelligence imposes so many limitations and challenges, every existing BI vendor picks its favorite ones and positions itself as the best at addressing those specific ones. Some focus on providing easy-to-use front-end tools for the business user, some on handling complex ETL scenarios and large data sets, others on open source software to remove software licensing costs and so on.
Business intelligence vendors have been constantly rolling out new functionality and technology through the years. But still, it seems like business intelligence has been standing still. No progress has been made in expanding it to the wider market that can't afford long and costly development/customization cycles. In fact, most BI vendors other than the enterprise-class players (e.g., SAP, IBM and Microsoft) haven't been able to grow much and remain focused on niche markets.
Well, my friends, it's time somebody told it to you straight.
Business intelligence can deliver on its promise, but the entire idea needs a complete overhaul. As long as vendors keep improving specific functions within the traditional BI paradigm, no progress will be made. The traditional business intelligence paradigm needs to be scrapped and replaced by something that is much easier, much faster, much cheaper and much more manageable.
What is the core of the traditional BI problems? The traditional paradigm contains an inherent flaw that prevents it from taking BI to the next level, where ROI is indisputable and business users get another powerful tool added to their arsenal - in companies of all (or most) sizes.
The Inherent Flaw in the Traditional BI Paradigm
If you search "why business intelligence projects fail" in Google you will find an abundance of white papers and articles (mostly written by BI vendors themselves) giving their two cents' worth. When BI vendors pick their top reasons, they usually pick issues dealt with by their own offerings and not by the competition's. Marketing 101. Fair enough.
But one top reason they all seem to agree on for a BI project's failure is the lack of up-front planning. That is to say, in order for a business intelligence project to succeed, you must compile your requirements ahead of time, coordinate with all the relevant parties (IT, business departments and executives) and plan the project in accordance with those requirements. Otherwise, you are destined to fail.
In other words, they blame you - the consumer - for a failed BI project. Had you planned ahead, the project would have been a success and you wouldn't have flushed hundreds of thousands of dollars worth of software licenses, hardware and personnel time down the toilet.
Sadly, they have a point. Since traditional BI solutions aren't very sympathetic to unplanned changes from an architectural point of view, anything you don't think of in advance is difficult and expensive to introduce later. So you better think long and hard about what you need, because otherwise any requirements you miss could mean the difference between a successful project and a complete mess.
But herein lies the catch.
It doesn't matter who you are or how much experience you have, it is utterly impossible to know in advance what your requirements are when it comes to BI. BI is highly dynamic and requirements change all the time because the business changes all the time and because business users come up with new ideas all the time. A report you need now is not the report you need later, an analysis you do now may only be relevant for a short period of time and meaningless shortly thereafter.
Most importantly - if you are a department or company seeking a BI solution but without any BI development experience, you have no way of knowing how a particular requirement will affect the architecture of your solution. Thus, you could easily find yourself disregarding the immediate testing of some particular capability because it seems trivial to you, just to discover later that the entire solution comes tumbling down when you actually try to use it, and that without it the system is useless.
You cannot imagine how often this happens, especially when a solution calls for OLAP cubes built over a data warehouse (blah).
It's the traditional BI vendors who made up the rules for this game over 10 years ago. They are the ones who've been aggressively promoting a paradigm where everything needs to be thought of in advance - otherwise you are sure to fail. It makes sense, because these vendors focus on enterprise-wide BI for Fortune 500 companies where the complexity of a BI project is masked by the complexity of the corporation's own business processes. These organizations are used to things taking years to reach perfection because pretty much every other process they have takes the same amount of time.
But trying to implement the same concepts on slightly smaller corporations is the exact reason why most BI projects fail.
Don't get me wrong. It's always good to plan ahead. But know this - business intelligence requirements are impossible to predict and nearly impossible to measure until the end users use it on real data - in real-life scenarios - over time.
You cannot do this with traditional BI without investing a TON beforehand, and even then you have no guarantees. When you go for BI as advocated by the traditional platform players, you are basically throwing hundred dollar bills down a wishing well and hoping for the best.
Learn from the thousands and thousands of companies who have already learned this harsh lesson with blood and tears. Don't do it. There are ways to change the rules of the game while still getting the same class of business intelligence, without compromising on functionality, speed or capacity. But you cannot expect to find them by turning to the traditional BI players, who have an oversized BI developer ecosystem for which they need to provide work. This can only be done by younger, innovative BI companies armed with new technologies, fresh ideas and sensible pricing models.
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog