Sunday, July 18, 2010

Would I Use Cloud Business Intelligence?

This post was inspired by the latest announcement made by Tibco, that they are providing SpotFire Silver - their cloud-BI offering - free for one year. Every few months, another BI player announces such a hosted BI service/product. This comes in addition to a number of smaller companies that focus on these types of hosted solutions, such as Gooddata, Birst and PivotLink.

To date, no one has proved that hosted (cloud) BI is a sustainable business. None of the startups doing this have skyrocketed (yet?), and more of the larger players (Tibco included, in my opinion) are joining the effort on marketing hype alone. I doubt they really know how they're going to make money out of it.

All that being said, would I use a cloud/hosted BI service? In spite of its promise in terms of cost of ownership and easy deployment, the answer to that is an emphatic no. There are several reasons for this and they all revolve around the following points:


One of the common problems typical business intelligence solutions suffer from is their heavy reliance on IT involvement - data warehousing, OLAP cubes, and even report creation and/or customization. The IT department quickly becomes a bottleneck, and just as quickly the effectiveness of the BI solution you paid so much for comes to depend on adding more (expensive) IT people to tend to requests. Otherwise you'd be frustrating the users and preventing the solution from expanding throughout the department or company.

Hosted BI solutions are simply hosted IT departments with an arsenal of home-made or 3rd party software. In theory, as long as your solution is not too complicated, you could save some money on recruiting for your IT department. But if you thought internal IT can be a bottleneck, you can only imagine how an IT department that is located in a different city or country will respond to your requests (and other customers' as well!).

As long as I have an option, IT-centric BI (hosted or not) is not a good idea, as it contradicts what BI is supposed to be: a fast and flexible tool for the business user. But if I need IT to support my BI efforts, I would rather they be close.

Privacy and Security

This one's a big issue for me. I'm not so much worried they'll get hacked as I am worried about the vendor itself (I have trust issues, I know). I am a heavy BI user, and wherever I work a lot of the secret sauce relies on how BI is used and what KPIs are tracked. Taking all this information and putting it on the server of a BI vendor, just to find it in the next version of their product, could turn out to be disastrous. BI gives a tangible form to a business strategy, and that is something I would want to protect without compromise.

Working with Data

You could call anything BI. But basic reporting aside, getting the real gems in BI always involves a lot of data, and it's usually not all in one place. These are the main hurdles you must face, before you can use some sort of a reporting/visualization tool (or even Excel) and extract the answers or insights you're looking for. The mere thought of doing all these ETL tasks over the WWW gives me the shivers. This process is gruesome enough without having to wait for data to be transferred over the internet.

Data Size vs. Cost

Hosted BI vendors charge you for the hardware they use. They have to in order to remain in business. It's commonly known that BI solutions typically require sturdy hardware, particularly multiple strong CPUs and dozens of GBs of RAM.

The CPU and RAM requirements for a solution are pretty closely bound to the amount of data being stored and queried. Because of this, with a hosted BI solution there is a very clear choice you need to make - pay a lot of money to perform direct queries of medium-to-large data sets hosted on the expensive cloud machine or limit the amount of data you store thus damaging the scope of business intelligence you will be doing.

This is a choice I prefer not to make.

By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog

Tuesday, July 13, 2010

SiSense Secures $4 Million Investment

Big news today.

I can finally share with you guys that SiSense has closed a $4M series A investment and that we can finally push forward towards what we've been working on so hard -- a fresh new approach to enabling high-quality business intelligence, in any company. Not just Fortune 1000s.

The two venture capital firms we are excited to partner with are Genesis Partners (Israel) and Opus Capital (USA). Our private investor, Eli Farkash, who has been financing and helping the company since its founding, also joined this round. Gary Gannot (Genesis) and Dan Avida (Opus) have joined our board of directors. They are both exceptionally talented people with amazing track records and extensive knowledge. On behalf of the SiSense founders and team, we welcome them aboard and look forward to working with them.

The press release can be found here:

It's going to be fun!

Sunday, July 11, 2010

Comparing BI Vendors Based on Technology

I've recently come across an interesting online discussion where several posters discuss working with large amounts of data and its implications on business intelligence implementations. I wouldn't have noticed it if one of the posters had not referred to SiSense in one of the comments.

The main reason for the post was purely technological, putting on display the internals of QlikView's in-memory database technology. This lasted for about 5 posts, after which it turned into a bashing match between QlikView supporters and what you could call QlikView non-sympathizers over whether it would even make sense to use in-memory database technology for large (1TB and greater) BI implementations.

As part of this discussion, several vendors were mentioned including: SiSense, QlikView, Lyza, Vertica and Microsoft. Some of these vendors do not even directly compete with each other. There were also several types of technologies mentioned, from in-memory databases (IMDB), to columnar databases (CDBMS), and even compression.

Apart from this discussion being interesting and even entertaining (for some), it is indicative of a common mistake that people sometimes make when they compare business intelligence vendors and products based on the technology they use.

Technology is important as it is the foundation on which everything is based, but every vendor takes its technology down different paths, and in many cases comparing two BI vendors is like comparing a Boeing airplane to a Toyota family car. I could easily say that a plane's engine is more powerful than a car's, right? Does that mean you, the consumer, would want an airplane engine stuffed under your car's hood? Your car would theoretically drive faster, that's for sure. But in practice, most civilized areas impose speed limits that would prevent you from gaining any benefit from your automobile's super-fast engine. Not to mention the ridiculous amounts of money you'd be spending on maintaining and refueling your car.

Wanna take the kids out to McDonalds? Better notify the FAA. ;-)

There are significant differences between the above-mentioned vendors which are important to understand. These differences may come from the particular strengths and weaknesses the internal data technology in use has, but it usually goes way beyond that.

QlikView targets departments with reasonable amounts of data that is centralized and accessed by multiple users. QlikView is a developer tool for creating canned BI solutions based on a design made in advance, not as much for ad-hoc analysis. QlikView utilizes an in-memory database to address performance. It is a good solution for small-medium implementations, not as good for larger ones (tons of data and/or too many users). QlikView competes with the giants, such as Oracle, Microsoft, SAP and IBM for end-to-end BI implementations.

Microsoft PowerPivot is a pivoting add-in to Excel 2010. Because it comes with an in-memory database, it removes the 1M row limit imposed by Excel 2007, assuming you have a 64-bit machine with adequate RAM. It targets power analysts, like Excel's advanced features always have. PowerPivot is really single-user BI and is not applicable to multiple users, unless you include SQL Server and SharePoint in the package.

Lyza targets individual power analysts as well, but they assume an abundance of disk space rather than an abundance of RAM. They have created a tool that lets you perform ETL-based filters and analysis over large amounts of data, even on a 32-bit computer (similar in concept to SSIS). They do this by using a columnar database. Lyza is also BI without a centralized data repository, which doesn't make it very effective for multi-user scenarios. It will be interesting to see how Lyza is impacted by Microsoft PowerPivot.

Vertica is a data warehouse software vendor. Their technology is based on an open source project called C-Store, which is also a columnar database. Vertica competes with other data warehouse vendors such as Greenplum and InfoBright. They do not currently provide a BI front end for reporting or analysis.

SiSense targets departments and businesses looking for centralized business intelligence accessed by multiple users. SiSense uses both a columnar database for storage and in-memory query processing, so that it can scale without requiring infinite amounts of RAM and still provide viable query performance without having to go down the OLAP path. SiSense also provides its own reporting/analysis front end and competes with the BI giants, as well as QlikView.

As a BI consumer, you are buying a BI solution, not BI technology. Don't get confused by marketing people throwing technological buzzwords at you because most likely you won't be able to identify which of the marketing blather is actually relevant to you. Make sure you get what you need, functionality-wise, and that the solution will still hold water a year from now as your data grows and more users use it.

By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog

Friday, July 9, 2010

Web Marketing Analytics: Hosted or In-House?

When we first began marketing and selling our software over the Internet, we used Google Analytics extensively. SiSense was a young and small company then, with a tiny marketing budget relying on cost-effective online marketing to acquire customers, and therefore any free tool that gave us more information than we already had on how our marketing dollars were being spent was a crucial weapon for surviving this tough economic climate.

Using paid services, such as Omniture, wasn't an option strictly for pricing reasons. We used what we could afford, and at that point we couldn't afford anything commercial (expensive!). So we ended up using Google Analytics for simple traffic reporting and in-house applications built over our own Prism software for everything else.

Not long after, we started acquiring customers at a better rate than we had hoped and the marketing budget started growing. It still wasn't huge, but it was enough to generate quite a lot of marketing data to work with for optimization of existing campaigns and developing new marketing initiatives.

We had a lot of ideas, but we didn't know where to start. So we decided to do what every marketing person should do: we analyzed the marketing data. At least this way, we would be betting our budget on educated guesses instead of clueless luck. Sounds easy enough, but life is never that easy. An innocent desire to make informed marketing decisions turned out to be an R&D nightmare.

I'll give you an example.

The first thing we wanted to determine was which traffic sources generated the quickest ROI. We wanted to understand where customers who take the least time from registration to payment come from. Do they come from particular search keywords? From particular referring sites? Do they even come from the website? This way we could at least have a better risk assessment of our marketing channels.

Well, we knew we had a lot of data in Google Analytics. We could tell which keywords and sites were driving traffic to our website and we also knew how many of them were registering or purchasing online (each defined as separate Google Analytics 'conversions').

But one thing we quickly noticed was that Google Analytics reports were showing us visitor counts, but not the visitors themselves. If we had 3 'registration' conversions on one day and 2 'purchase online' conversions on another, we had no way of telling whether these two groups contained the same people. Stuck.

A quick brainstorming session revealed that we had the customer sales data we needed in our billing system's database. But this would turn this marketing initiative into a cooperative effort by both marketing and R&D.

After getting shouted at by the DBA on the mere suggestion of touching the operational billing system, we compromised that the DBA would put a few days into creating an automated process to get the data for us every day in the form of a flat file. Not a great solution, but better than nothing.

A few days later we got the customer data we needed, which was basically the customers' personal information and first acquisition date (the billing system did not store the date of registration). All we needed to do now was match this information to the traffic sources data in Google Analytics.

Since Google Analytics was only providing us access to visitor counts and not individual visitors, our only other option of cross-referencing between Google and the in-house billing data was by purchase date. The problem was that multiple customers purchase on the same date making this data link not very usable. We also couldn't calculate the time span from registration to purchase, because the billing data didn't have it. Stuck again.

This experience, and similar heartbreaking marketing initiatives which never got off the ground due to data integration fiascoes, are the main reason why we do not use Google Analytics anymore. But the interesting thing is that the scenario I described is not limited to Google. It's the same with all the big hosted analytics vendors. It's expensive to store the required amount of data for as many users as Google or Omniture have. Since Google is free, they just don't store it. For some paid services like Omniture, you have to pay extra to access this information.

It's important to understand that if you use any type of commercial hosted analytics solution, you will either have no access to user-level data or you will pay a hefty sum to get it on-demand. Most of the companies I encountered, including my own, moved away from these types of services to internal data collection for this reason, as well as to have the traffic data physically closer to the rest of the company's data, making it easier to use and access.
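To make the difference concrete, here is a minimal sketch of the registration-to-purchase question once visitor-level records are available. Everything in it is an assumption made for illustration: the record layout, the visitor IDs, the traffic source names and the dates are hypothetical, and this is not how Google Analytics, Omniture or our own software stores data.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical visitor-level records: (visitor_id, traffic_source, event, event_date).
# Hosted reports that expose only daily counts cannot produce rows like these.
events = [
    ("v1", "search: bi software", "registration", date(2010, 6, 1)),
    ("v1", "search: bi software", "purchase", date(2010, 6, 4)),
    ("v2", "referral: example-blog", "registration", date(2010, 6, 1)),
    ("v2", "referral: example-blog", "purchase", date(2010, 6, 21)),
    ("v3", "search: bi software", "registration", date(2010, 6, 2)),
    ("v3", "search: bi software", "purchase", date(2010, 6, 7)),
    ("v4", "referral: example-blog", "registration", date(2010, 6, 3)),
]


def days_to_purchase_by_source(events):
    """Median days from registration to first purchase, per traffic source."""
    registered = {}  # visitor_id -> registration date
    purchased = {}   # visitor_id -> first purchase date
    source_of = {}   # visitor_id -> traffic source seen at registration
    for visitor_id, source, event, when in events:
        if event == "registration":
            registered[visitor_id] = when
            source_of[visitor_id] = source
        elif event == "purchase":
            # Keep only the earliest purchase for each visitor.
            if visitor_id not in purchased or when < purchased[visitor_id]:
                purchased[visitor_id] = when

    spans = defaultdict(list)
    for visitor_id, bought in purchased.items():
        # Visitors who registered but never purchased (v4 here) are skipped.
        if visitor_id in registered:
            days = (bought - registered[visitor_id]).days
            spans[source_of[visitor_id]].append(days)
    return {source: median(days) for source, days in spans.items()}


result = days_to_purchase_by_source(events)
for source, days in sorted(result.items()):
    print(source, days)
```

The whole calculation hinges on the visitor ID that ties a registration to a purchase. With only per-day conversion counts there is nothing to join on, which is exactly where we got stuck.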

By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog

Avoid these Pitfalls when Choosing a Business Intelligence Solution

Business intelligence software and services have gone mainstream. In the past, only large enterprises typically invested in BI solutions. These days, however, more and more small and medium-sized businesses are actively looking for ways to tap into their information and use it to increase revenues, reduce expenses and realize competitive advantage.

Even though the concept of business intelligence has been around for over 20 years, a substantial percentage of BI implementations are still considered failures. Given the large number of vendors selling BI products and solutions, one would think that at least one of them would be able to get it right. Why are successful BI solutions so elusive?

The answer lies in how success is measured. The only way to establish the success of a BI project is to determine the return on investment (ROI) realized by the company as a result of deploying the system. The problem is that BI deployments usually require very large up-front investments in software, hardware, analyst services and developer time. After the system goes live, there are also high ongoing costs of customization and maintenance. By the time the company realizes whether or not the effort is actually improving its business metrics (months or years later), so much time and money has been spent that discovering a positive ROI is difficult. A key factor in achieving positive ROI is fast, hands-on business user success – when they are able to actually use and benefit from the software within a short period of time.

The best way to avoid this common trap – and to make sure that you quickly and indisputably demonstrate the value of the business intelligence solution you choose – is to select an option with very low up-front costs and fast implementation times.

- Don't Pay ANYTHING before your Users are Actually Using the Solution on Real Data!

This piece of advice may sound unrealistic – until you become familiar with BI solutions based on modern technology. With traditional BI solutions, which use software based on older technologies, it is inevitable that you will have to commit substantial time and money to your BI project long before your users ever get their hands on it. This is a risky, outdated approach and one best avoided.

The value of a BI system can only be known once the business users are actually using it to answer important questions and solve real-life business problems. In almost all cases, this requires a process of revisions and improvements during which users want to incrementally change how the system works based on how they are using it. If they cannot quickly add new questions and analytic processes to the software, it may prove only marginally valuable. At worst, they simply won’t use it and the entire investment will be for naught.

Unfortunately, most BI vendors require the customer to invest in serious hardware and data integration/preparation work before the end users even get to play with the data and attempt to adapt it to their own business processes. In typical systems, most changes will require modifications to the back-end data structures and/or report definitions – adding customization costs, taking time and frustrating your users.

The goal has to be to get your users gaining real-world benefits from a BI solution with minimal up-front investments of time and money. Using today's BI technologies, you can deliver an initial working solution using the standard PC hardware you already have, with no more than a day or two of data preparation (sometimes much less). A standard desktop PC with 12GB of RAM can handle huge amounts of data (even 500 million rows) with reasonable response times. That should be enough to get started. Get a more powerful server machine only once you understand exactly what your users will be doing. (Maybe you won't need one at all.) For your users, there are several user-friendly, drag-and-drop reporting/analytic tools. This should get them started quite nicely, and most of those have free trial versions!

- Beware of Data Warehouses and OLAP Cubes

A data warehouse is a centralized repository of data, usually organized in specific ways designed to make it efficient for an OLAP cube, which in turn speeds up query response times to previously-defined queries. If this sounds too technical, that’s because it IS too technical. And like anything very technical, it requires technical people and several weeks or months of work to set up. Thereafter, maintenance and improvements are also time-consuming, constrained and expensive.

Modern BI technologies (e.g., column-based storage and in-memory query processing) do not require a data warehouse/OLAP architecture in order to provide excellent query response times, simply by better utilizing hardware resources. Going down the OLAP path, especially before the system has gone live, is a sure way of punching a big hole in your IT department's budget long before you even know if the system is worth anything to the actual business users it is intended to serve! Data warehouse and OLAP implementations are complex, risky, time-consuming affairs and difficult to maintain. These days, there are better alternatives! Try them first. The implementation time is significantly shorter and easier, so your up-front investment will be much smaller.
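As a rough illustration of why column-based storage can answer a query with less work, here is a toy sketch. It assumes a tiny made-up orders table and plain Python lists; it is not how any particular BI product stores data, only the general idea of keeping each field in its own array so that a query reads just the columns it needs.

```python
# Row-oriented layout: every record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Column-oriented layout: each field is stored as its own array.
# The table and its values are hypothetical.
columns = {
    "order_id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [120.0, 80.0, 50.0],
}


def total_amount_rows(rows):
    # Has to visit every full record, even though only one field is used.
    return sum(record["amount"] for record in rows)


def total_amount_columns(columns):
    # Reads a single array; the other columns are never touched.
    return sum(columns["amount"])


def total_by_region(columns):
    # A grouped total needs exactly two columns: region and amount.
    totals = {}
    for region, amount in zip(columns["region"], columns["amount"]):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


print(total_amount_rows(rows))
print(total_amount_columns(columns))
print(total_by_region(columns))
```

On three rows the difference is invisible. On a wide table with hundreds of millions of rows, reading two columns instead of every field is a large part of what lets such a system skip pre-built cubes.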

- Consider the Scalability of the BI Solution you Choose

Successful BI systems almost always face three ongoing pressures for expansion: (1) the quantity of data with which they must function is ever-growing, (2) the demands from users for new and ever-more-complex queries/reports/analytics keep coming and (3) growing numbers of users in the organization will want access to the system.

These demands often prove extremely challenging to those who maintain a BI system. Scalability issues are usually addressed by either throwing more hardware at the problem or making architectural changes to the system. Both approaches result in system down-time, IT management overhead and significant expense. OLAP cube-based solutions are particularly vulnerable to scalability challenges.

When you are evaluating BI solutions, consider at what rate you expect your data to grow, how user demands might tax the system down the road and how many users might potentially need access to the system in the future. Ask BI vendors what kind of hardware would be required to handle your future expected demands and what limitations the system might introduce. For example, you will never want to be forced, for reasons of system architecture limitations, to divide your data into separate silos or to significantly limit the amount of historical data available. These are both common vendor-suggested approaches which introduce extensive data management headaches and business-level reporting constraints. If you are going to compromise on this, you had better know you're not doing it because the solution is using outdated technology.

Consider carefully the costs and limitations you may face down the line. You don't want to invest in a BI solution with a foreseeable expiration date.

- Outsourcing vs. In-house BI

If your business does not have people with the in-house skills and knowledge required to implement a BI solution, outsourcing is clearly the way to go. Even in this case, it is important to be very involved every step of the way, making sure that any choices made by your consultants are in line with your own strategic and tactical priorities. The downside of using outside people to implement and maintain your BI system is that your access to the system is very limited. When you need new reports, for example, your requests may get delayed because your service provider is simply too busy doing other things.

It is always a good idea for an organization to manage its own BI, whenever possible. Since BI is a core strategic asset and source of competitive advantage, updates and improvements to a BI system need to be flexible and swift. With traditional BI solutions, your lack of in-house expertise may entirely rule out this option. However, BI technologies have evolved tremendously over the past few years, and you might be surprised to find that the expertise you do have is more than enough to deploy and maintain a powerful BI solution. In most cases, a sophisticated business user with solid Excel skills will be able to integrate large amounts of data from multiple sources and create the interactive dashboards and reports your business needs with no need for consultants or IT professionals.

- Think Twice before you Run your Data in the Cloud

Despite the hype, cloud infrastructure is expensive. For light processing of modest quantities of data, it's great. However, BI solutions are resource-intensive computing applications, requiring extensive amounts of memory and CPU horsepower. Using a cloud-based BI solution operationally will siphon off a lot of your budget to your cloud provider or force you to significantly limit the amount of data available for business reporting.

If you have compelling reasons to run your BI in the cloud, it is very important that you choose a BI solution with cloud-friendly technology. Look for a system optimized for efficient use of hardware resources under cloud conditions, e.g., to utilize more hard disk space (cheaper) and less RAM and CPU (expensive). Otherwise, you will be limiting yourself to small data sets or expending huge budgets. Either way, that may be fine in the short-term, but down the road you might not be so happy with your BI choices.
By: Elad Israeli | The ElastiCube Chronicles - Business Intelligence Blog