The importance of pi when doing BI projects

You’ve probably heard the shortest joke in the IT industry. It goes like this: “I’m almost done”.
When doing BI and Performance Management projects with our clients, we occasionally have the same problem. Projects take longer than we (or our clients) anticipated, and there are a variety of possible reasons: technical problems with the software or installation, data quality problems on the ETL side, or incomplete requirement specifications. Another is a differing perception of roles: we see ourselves as coaches for the Palo BI Suite and expect the customer to do part of the work, especially in later phases of the project, while the customer expects to be handed a 100% completed project.

20 years ago, to cope with this and similar issues, my brother Peter (former CEO and founder of MIS AG, who unfortunately died in a car accident in 2004) came up with a surprisingly easy formula for calculating a realistic project duration (for internal software development projects as well as consulting projects with clients). Simply take the gut feeling of the developer or consultant and multiply the number of days or weeks that he thinks are realistic by pi (3.1415). In other words, if somebody thinks 10 days are realistic, the project will most probably take about 31.4 days.

20 years after Peter told me about this rule, I finally had a look at the theory behind it. In my opinion it goes like this: if a person makes an educated guess about how long something will take, he automatically sees only the work needed to get 80% or 90% of the desired functionality done. He simply neglects the 10% to 20% that is needed to make the product or project 100% perfect. But how long will it take to complete this last 10% to 20%? Very long, as you can see in the following drawing, which shows a typical curve based on the Pareto principle (80/20 rule). Doing 80% to 90% of the work will only take you around a third of the total effort (“MD gf” stands for man-days estimated using gut feeling).

[Image]

So if you only look at 80% to 90% of the total functionality, you will conclude that the project will take only about a third of the time that the 100% result actually requires. That is where pi comes into play. Multiply the number of days a developer or consultant estimates for the project by the magic 3.1415 and you most probably have a realistic amount of time for getting it 100% done. The following drawing explains this.

[Image]
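
For readers who prefer code to drawings, the rule can be written as a minimal Python sketch. The function name `realistic_duration` is made up for this illustration, and the sketch uses the full value of pi rather than the rounded 3.1415.

```python
import math


def realistic_duration(gut_feeling: float) -> float:
    """Scale a gut-feeling estimate (days or weeks) by pi, as the rule suggests."""
    return gut_feeling * math.pi


# A 10-day gut feeling turns into roughly a month of real effort.
print(f"{realistic_duration(10):.1f} days")

# Read the other way round: the gut feeling covers about a third of the total.
print(f"gut feeling as a share of the total: {1 / math.pi:.0%}")
```

The second line of output is simply the inverse of pi, which is where the "around a third of the total effort" in the text comes from.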

PS: At Jedox we decided to multiply the “gut feeling” estimate by 2. This is not pi, but we also have a clause in our general terms of contract that allows us to charge up to an additional 60% above our original estimate. And 2 x 160% = 3.2, which roughly equals pi.
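
A quick check of that arithmetic, as a small Python sketch. The constant names are chosen for this example only; the two figures are the ones given above.

```python
import math

GUT_FEELING_FACTOR = 2.0  # multiplier applied to the gut-feeling estimate
CONTRACT_MARGIN = 0.60    # additional share chargeable above the estimate

# Worst case: the doubled estimate plus the full contractual margin.
effective_factor = GUT_FEELING_FACTOR * (1 + CONTRACT_MARGIN)

print(f"effective factor: {effective_factor:.2f}")
print(f"pi:               {math.pi:.2f}")
print(f"difference:       {(effective_factor - math.pi) / math.pi:.1%}")
```

The combined factor lands a little under 2% above pi, so the two rules agree closely.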

SPSS: an informed decision?

According to IBM, every third decision made by managers is not based on access to the right information. Let’s hope that the planned takeover of SPSS belongs to the 66 per cent in the know!

It’s certainly remarkable to see how quickly specialist providers are disappearing from the market. Speaking from experience, it’s exactly these companies which offer users constant innovation and improvement. Instead, Cognos, Business Objects and, now, SPSS clients are finding themselves under the “Big Player” banner and their growth, product and price strategies have to toe the line accordingly.

The gulf between the agendas of providers and end users is widening, and the only reasonable future alternative is in the Open-Source arena. For it’s here that we see the chance for a real relationship in the software industry – and a model based on fairness and innovation.

Catamaran High-Speed Business Intelligence

Inspired by summer and vacation time, I figured it might be interesting to do a post about catamarans for a change. You may have seen on the Jedox website that we are sponsoring a catamaran sailing team. Palo is about speed and so are catamarans. This and the fact that our town, Freiburg, is home to a world and European champion seemed reason enough to support this very successful team.

Let me introduce Sebastian Moser and Thomas Posch, here in their element on Lake Como in northern Italy:

[Image]

Sebastian and Thomas, together with Sebastian’s father Alexander, make up Team Moser. Sebastian and Thomas sail the Topcat catamaran as well as the Tornado catamaran, which is even faster.

Sebastian maintains an interesting website at www.team-moser.com that is well worth a visit. You can learn a lot about catamaran sailing there, and it also provides interesting links. The YouTube video that I came across on their page gives a good impression of what catamaran sailing is all about.

You might also want to check out this video (if you have never seen a sailing boat capsize forwards).

After winning the Topcat World and European championships in previous years, Sebastian and Thomas have had a very good start to the season and are already leading the German Tornado Ranking. They are now preparing for the “German Open Tornado”, which will take place in Hamburg from the 13th to the 16th of August.

Talking about “Open” – this is a good opportunity to remind you of the Palo Open (previously known as Palo User Conference).

Palo Open - The Palo Conference

We offer an early bird special until August 31st, and you can find additional information about this conference by following this link.

[Image]

PS: It was very nice of Sebastian and Thomas to take me along to one of their training sessions in spring 2009, as you can see in the following picture. Sebastian is holding the tiller … and yes, I do like sailing a catamaran.

[Image]

Get ready for the next BI generation!

Business intelligence software has been with us for some time and we’ve seen many new technologies enter the BI stage. There’s no disputing that BI can offer tremendous benefits as an IT investment, but the market is changing fast and providers (note the BI Giants) need to gear up for the next generation – which goes far beyond the latest macro trend.

Driven by the need to cut costs and make room in the BI budget for new investments, Generation-Why, the ‘query generation’, will ask tough questions and have an insatiable need not just for data or insight, but for fast answers.

More than ever, the focus will be on response times: a recent survey from BARC revealed that 38% of users are unhappy with the performance of their SAP BI modules. Thanks to “In-Memory Computing”, SAP may be aiming to improve BI response times by up to 100%, but others, particularly the smaller niche players, can still win the race with an option that is both speedier and more affordable.

I see the future for speedy business application answers not in the central processing unit (CPU) as SAP does, but in the graphics processing unit (GPU). Modern graphics cards have hundreds of special processors that can be used for parallel computing: an approach that the Universities of Freiburg and Western Australia are helping us research. Intel and AMD also need to think anew.

Alongside In-Memory-Analysis, many new technologies have entered the BI arena. Many call this BI 2.0, but what does this really mean? At the end of the day, to most of the new BI vendors it means more fancy charts, graphs and dashboards – with some larger fonts, brighter colours and rounded corners thrown in for “interactivity”. All this is well and good, but it is pure marketing if it doesn’t help the user make better decisions. If this BI approach is to be seen as the “next generation”, then it needs to go further.

BI applications and tools need to be rooted in the workflow – and should be cognisant of the types of decisions that need supporting. BI needs to be aware of the domain context – i.e. which industry, which department. Because without this, the best BI can do is to provide pretty visuals and hope and pray that the user knows how to translate them into intelligence. This is not the Jedox way, as usability and business value need to take first priority. New technologies alone, with little added value, will not satisfy customers.

Very often, it’s the smaller BI operations and niche players who are the innovators here. Most of the BI-g Boys haven’t kept up with the market.

Looking at the BI tools and software available, one has to ask the question, what’s the next generation really looking for and are we ready? And we don’t need expensive answers here, but super fast ones. In an ideal world, Open Source answers.

Excel, Excel plus Palo and beyond Excel

An estimated 500 million people worldwide use Microsoft Excel. Of these 500 million users, about 5% to 10% perform some kind of analysis, reporting or planning task with Excel. While Excel is very flexible and widely accepted in all organisations and enterprises, people will – as soon as the company has more than a few employees – end up in something commonly referred to as spreadsheet hell.

With this in mind, I recently posted an article about the Beauty of Palo. Palo’s beauty lies in the introduction of centralized data cells and in a multidimensional (i.e. more than 2-dimensional), writeback-enabled database which is centrally hosted on a secure server within the organisation. With Palo, many users can simultaneously work with consistent data in Excel. Data entries in one sheet (or even data imported from CRM and ERP systems) will automatically be applied to all worksheets in the organisation. Therefore, “Excel plus Palo” is a better solution than Excel alone. “Excel plus Palo” provides all the advantages of a centralized Business Intelligence solution without the cost and the time it usually takes to introduce a dedicated Business Intelligence solution.

[Image]

Now, what do I mean by “beyond Excel”? While “Excel plus Palo” is already a great advance in terms of data integrity and process efficiency, could there be even more? The answer is yes. For example, one important feature that is missing in today’s spreadsheets is “structure dynamics”. When you change a value or filter criterion in a spreadsheet, all dependent cells are updated automatically. This is “data dynamics”: the data is recalculated, but the row and column structure of the spreadsheet remains untouched. With structure dynamics, the spreadsheet would adjust not only the data but also the row and column structure necessary to display it (for example, an ABC analysis may require 5 or 10 or 100 rows to display all A products, depending on the query parameters).

While traditional spreadsheets are structure static, the new online spreadsheet available with the upcoming Palo BI Suite will be structure dynamic by introducing a new technology called “Dynaranges”. While in design mode, Dynaranges are displayed as a rectangle, covering a range of cells within the spreadsheet. These cells are actually linked to a data query (for now restricted to Palo queries, but MS Analysis Server and other data sources to follow within the next few months).

[Image]

At run time the Dynaranges unfold (horizontally and/or vertically) and create as many rows and columns as the underlying data (metadata such as dimensions or attribute tables) requires. When the underlying data or the query parameters change, the Dynaranges will automatically adjust the row and column layout of the resulting view.

[Image]
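
To make the difference between data dynamics and structure dynamics a little more concrete, here is a minimal Python sketch. It is only an analogy and not how Dynaranges are implemented: the product figures are invented, and `a_products` and `render` are hypothetical helper names. The point is that the number of output rows follows from the query parameter instead of being fixed in the layout.

```python
# Invented turnover figures per product; none of these come from Palo.
TURNOVER = {"P1": 480, "P2": 290, "P3": 120, "P4": 60, "P5": 30, "P6": 20}


def a_products(turnover, a_share):
    """Return the top products that together reach `a_share` of total turnover."""
    threshold = a_share * sum(turnover.values())
    rows, running = [], 0.0
    ranked = sorted(turnover.items(), key=lambda item: item[1], reverse=True)
    for name, value in ranked:
        if running >= threshold:
            break
        rows.append((name, value))
        running += value
    return rows


def render(rows):
    """Produce one output line per result row, like a range unfolding vertically."""
    return [f"{name:<4}{value:>8}" for name, value in rows]


# The same report definition yields a different number of rows per parameter.
for share in (0.5, 0.8, 0.9):
    lines = render(a_products(TURNOVER, share))
    print(f"A share {share:.0%}: {len(lines)} rows")
```

A structure-static spreadsheet would have to reserve a fixed block of rows for this report; here the layout simply grows or shrinks with the result.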

Dynaranges are only one example of the advanced spreadsheet technology that we built into the Palo BI Suite. Other examples are server-based scrolling, dynamic report repositories and the new chart types that we are adding to the product. I will talk more about these in a future post.

Writeback enabled OLAP Cubes in Excel Pivot Tables

We have some exciting developments to announce regarding our free-of-charge Open-Source OLAP product, Palo for Excel. We have decided to include an ODBO driver (MDX) in the free download of Palo for Excel. The new ODBO connectivity lets you access Palo OLAP cubes via Pivot Tables in Excel. In about two weeks from now, everybody will be able to run advanced OLAP-based Pivot Table queries in Excel without having to buy expensive licenses for Microsoft SQL Server Analysis Services.

While Pivot Tables in Excel are read-only, Palo users will have the option to write values back from Excel directly to the OLAP cubes. This is done using the Palo Excel Add-In, which is also included in the free download of Palo for Excel. By adding write-back capability to OLAP cubes, Palo becomes an interesting choice for a server- and Excel-based planning application in the mid-market space. With Microsoft recently announcing the end of PerformancePoint Planning, this is a huge opportunity for people looking for an inexpensive but powerful centralized planning application under the familiar Excel user interface.

The following screenshot shows a Palo cube in an Excel 2007 Pivot Table (it works in Excel 2003 as well). You can also see the ribbon menu bars of the Palo Excel Add-In, which lets you write back values, change the dimensions and cube structures of the Palo cubes, and even access a powerful data import wizard to fill Palo cubes with external data.

[Image]

One final note: when we talk about the free download of Palo for Excel, this is not only the Palo Excel client. You can actually download a complete client-server installation for free, with as many server and client installations as you need.

The curse of the spreadsheet: what happens when Excel gives you (false) security

Excel excels at giving users a strong sense of security. How many times have you witnessed this scene played out in organisations: armed with a ‘dashboard report’ the likes of which the company has never seen, your controller struts into that board meeting, looking every inch the fiscal superstar. Excel is often seen as the magic formula for making a whole host of business decisions. But is Excel in the midst of its own mid-life crisis?

With the 30th anniversary of the first spreadsheet upon us, the original “killer app” and vehicle on which the PC first rode to fame doesn’t quite add up. The existence of “rogue” spreadsheets spreading through an organisation can turn logic on its head. The problem is tracing the errors in such a manual process. Is it down to Microsoft’s “interoperability” or just your colleague having a bad keystroke day?

[Image]

Are spreadsheets responsible for the downturn? When spreadsheets, not people, become the decision makers then certainly, chaos can ensue. Consider the following: The European Spreadsheet Risk Interest Group analyses and quantifies the cost of spreadsheet errors worldwide. In the last six months alone, they have reported various situations, including:

A well-known medical and consumer imaging company had to amend its third-quarter loss by $9 million, announcing that the adjustment was needed because too many zeros were added to an employee’s accrued severance on a spreadsheet; the company’s CFO characterised the situation as “an internal control deficiency”.

The many add-on tools that have appeared on the market to cushion Excel’s weakness in error protection and auditing are the strongest witness to the case. Excel is seen as ‘BI Tool No. 1’, but financial controllers must ensure that a company’s numbers aren’t spread across numerous desktops in isolation, and apply enduring standards, for example by connecting Excel, OpenOffice Calc and web-based online spreadsheets through one common, commercial-strength Open Source database.

Palo takes the risk out of Excel and delivers a single version of the truth: safety in numbers, down the Chinese walls.

The Beauty of Palo

There are many ways to explain how Palo enhances spreadsheets like Microsoft Excel or OpenOffice.org Calc. Today I will try a very generic approach. Let’s have a look at the following spreadsheet that only uses 6 cells, whereby cell C6 displays the actual number of units sold in 2007 for Germany. It is 919,665 units.

[Image]

Now let’s change the row title in cell B6 from “Units” to “Turnover”.

[Image]

You see that cell C6 now displays the value 6,196,660, which is the actual turnover for Germany in 2007. How is it done? Obviously there is a formula in cell C6, but I assure you that the spreadsheet does not contain lookup tables or links to other spreadsheets.

To make it more mysterious, let’s add two more row titles in cells B7 and B8, labeled “Cost of Sales” and “Gross Profit”. Using the copy fill command, we copy the formula in cell C6 to cells C7 and C8. As a result, cells C7 and C8 now display reasonable values for Cost of Sales and Gross Profit.

[Image]

The same miracle works if we add France and Belgium as additional column titles. After copying the formulas in cells C6:C8 to D6:E8, the values for France and Belgium appear automatically.

[Image]

How does the magic work? Very simple: cells C6:E8 contain a Palo function, called the PALO.DATA function. The PALO.DATA function uses a special syntax to identify the cell coordinates of data, which is very different from the well-known A1 style that you know from Excel.

For example, let’s take a look at the Palo data function in cell C6. This function retrieves the actual year total turnover for all products in the region Germany in 2007. Expressed in the Palo syntax this is PALO.DATA(…,…,"All Products","Germany","Year","2007","Actual","Turnover").

[Image]

But where does the Palo function retrieve this value from? It retrieves it from Palo, more precisely from a Palo cube in a Palo database on some Palo server. The “…,…,” expression in the previously cited Palo function specifies the name of the Palo server and the Palo database (“localhost/Demo”) and the name of the Palo cube (“Sales”). So the complete syntax of this Palo data function in cell C6 is PALO.DATA("localhost/Demo", "Sales", "All Products", "Germany", "Year", "2007", "Actual", "Turnover").

With the Palo function it is possible to work with numbers in Excel that are actually not stored in the spreadsheet but in an external database (i.e. Palo). So if two or more people at different workplaces are looking at the actual year total turnover for all products in the region Germany in 2007, even in different spreadsheets, they can be sure they all see exactly the same value. This is the end of the Excel chaos in which people have multiple versions of spreadsheets with outdated data, leading to multiple versions of the truth.

[Image]
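
The idea of addressing a value by its coordinates instead of by a cell position can be sketched in a few lines of Python. This is not the Palo API: `CUBES` is a made-up in-memory stand-in for a Palo server, `palo_data` is a hypothetical function that only mimics the argument order described above, and returning 0 for an empty cell is an assumption of this sketch. The two values are the ones from the example, read as whole numbers.

```python
# Hypothetical stand-in for a Palo server: (server/database, cube) -> cells.
# Each cell is addressed by one element name per dimension.
CUBES = {
    ("localhost/Demo", "Sales"): {
        ("All Products", "Germany", "Year", "2007", "Actual", "Units"): 919665,
        ("All Products", "Germany", "Year", "2007", "Actual", "Turnover"): 6196660,
    }
}


def palo_data(server_db, cube, *coordinates):
    """Look up one cell by its coordinates; empty cells read as 0 here."""
    cells = CUBES[(server_db, cube)]
    return cells.get(coordinates, 0)


base = ("localhost/Demo", "Sales", "All Products", "Germany", "Year", "2007", "Actual")

# Changing the last coordinate is like changing the row title in cell B6.
print(palo_data(*base, "Units"))
print(palo_data(*base, "Turnover"))

# Two different "worksheets" asking for the same coordinates get the same value.
sheet_a = palo_data(*base, "Turnover")
sheet_b = palo_data(*base, "Turnover")
print(sheet_a == sheet_b)
```

Because nothing is stored in the sheet itself, there is only one place where the number can change.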

It is also important to know that you can enter values directly into cells containing PALO.DATA functions. The values do not overwrite the function; instead they are “beamed” to the corresponding cell in the Palo cube and the function stays intact, displaying the newly entered value. To the end user it looks as if he entered a value in Excel, but in the background Palo makes sure that the value is actually stored in the Palo database, making it centrally available to all Excel users connected to the specified Palo cube.
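
As a rough illustration of this write-back behaviour, here is a small Python sketch. Again, this is only a model and not Palo code: `CUBE` is an invented central store with simplified three-part coordinates, and `PaloDataCell` is a hypothetical class standing in for a spreadsheet cell that holds a PALO.DATA function.

```python
# Invented central store with simplified coordinates (region, year, measure).
CUBE = {("Germany", "2007", "Turnover"): 6196660}


class PaloDataCell:
    """A spreadsheet cell that holds a lookup formula, not a stored number."""

    def __init__(self, *coordinates):
        self.coordinates = coordinates

    @property
    def value(self):
        # Displaying the cell always reads from the central store.
        return CUBE.get(self.coordinates, 0)

    def enter(self, new_value):
        # Typing into the cell sends the value to the cube;
        # the cell's coordinates (its "formula") stay untouched.
        CUBE[self.coordinates] = new_value


# Two users with the same cell in different workbooks.
first_user = PaloDataCell("Germany", "2007", "Turnover")
second_user = PaloDataCell("Germany", "2007", "Turnover")

first_user.enter(6500000)
print(second_user.value)
print(first_user.coordinates)
```

The second user sees the new value without anyone sending a spreadsheet around, and the first user's cell still contains its coordinates rather than a hard-coded number.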

I should mention that you can assign read and write access rights to every user that is connected to the Palo server. The access rights go as far as the element level, so for example one user could edit values for France, but have only read access for Germany and no access at all for Belgium.
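
The France/Germany/Belgium example can be modelled with a simple rights table. The following Python sketch is an assumption-laden illustration, not Palo's actual rights model: the user name `anna`, the `Access` levels, the sample values and the helper functions are all invented for this example.

```python
from enum import IntEnum


class Access(IntEnum):
    NONE = 0
    READ = 1
    WRITE = 2


# Hypothetical rights table: user -> element -> access level.
RIGHTS = {
    "anna": {"France": Access.WRITE, "Germany": Access.READ, "Belgium": Access.NONE},
}

# Invented sample values per region element.
VALUES = {"France": 100, "Germany": 200, "Belgium": 300}


def access_for(user, element):
    """Unknown users or elements get no access in this sketch."""
    return RIGHTS.get(user, {}).get(element, Access.NONE)


def read_value(user, element):
    if access_for(user, element) < Access.READ:
        raise PermissionError(f"{user} may not read {element}")
    return VALUES[element]


def write_value(user, element, value):
    if access_for(user, element) < Access.WRITE:
        raise PermissionError(f"{user} may not write {element}")
    VALUES[element] = value


write_value("anna", "France", 150)   # allowed: write access
print(read_value("anna", "Germany"))  # allowed: read access

for action in (lambda: write_value("anna", "Germany", 1),
               lambda: read_value("anna", "Belgium")):
    try:
        action()
    except PermissionError as error:
        print("denied:", error)
```

Checking the rights per element, rather than per sheet or per file, is what lets several people share one cube while each sees and edits only their own part.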

So what is the beauty of Palo? It is the ease with which Palo adds consistency, multidimensionality and centrality to Excel. By the way, Palo is not limited to Excel, but that is another story and will be covered in a future post. Another beauty of Palo? It is free and Open-Source, both client and server. You can download it from the Jedox website at www.jedox.com.

Introduction

Hello everybody. My name is Kristian Raue and I am the CEO and Founder of Jedox AG. Jedox is the leading Open-Source Performance Management vendor in Europe. Jedox sponsors the development of the Palo OLAP Server for Excel and the web-based Palo Business Intelligence Suite.

I live in Freiburg, Germany, where Jedox was founded in 2002. Since then, Jedox has opened offices in Germany, France and the UK, with additional development resources in Bosnia, Romania and Austria. 60 people currently work for Jedox.

The idea of this blog is to express the vision behind Jedox and Palo directly from the CEO perspective, to give additional insights and to talk about issues and ideas that go beyond what we could communicate over our website at www.jedox.com.

Expect new posts on this blog about once a week.


Excel, Microsoft, Microsoft Excel are registered trademarks of Microsoft Corporation