Archive for the 'Palo for Excel' Category

Parallel algorithms for Palo Cube Rules

Over the past few weeks, several people have asked me why Jedox is so far the only BI company investing in GPU technology. GPUs make sense when the speed of execution matters. And speed does matter for Palo users, especially when it comes to financial planning and simulation.

Whenever planning data or planning assumptions are changed at the base level, all aggregations have to be recalculated as quickly as possible to get new consolidated results for a new planning scenario. To deliver this speed, the Palo developers decided back in 2005 to use in-memory technology for Palo, which by itself is faster than a disk-based or relational approach.

Choosing in-memory was a wise decision and a lucky one as well, because GPU acceleration is actually only effective in an in-memory architecture (which includes the graphics memory of a GPU). GPUs do not help much when the data sits on a hard disk or inside a relational database.

Recently I had an interesting conversation with Dr. Tobias Lauer from the Institute of Computer Science at the University of Freiburg. Tobias is one of the research geniuses behind Palo GPU, and he explained how Palo benefits from the parallel algorithms that run on today’s GPUs. This is what I understood from him:

A parallel algorithm utilizes hardware architectures with multiple processing units (processors or processor cores) by executing individual steps of a program simultaneously (= in parallel) that would otherwise be computed sequentially. Depending on the number of available processors, one can distinguish moderate multi-core parallelism (e.g., 2-16 cores) and massive parallelism (hundreds or more processors).

The latter category includes modern GPUs, each consisting of several hundred processing units. Since all the individual processors of a GPU usually execute the same code at the same time, this architecture is suitable for data-parallel applications (the same operation on many different data items) rather than task-parallel applications (different things executed simultaneously).

A very simple example from the business intelligence context would be the function

turnover(P) = quantity(P) x price(P)

for a product P. Instead of storing all three figures in the OLAP database, it is sufficient (and for reasons of memory requirement and data consistency even desirable) to save only the quantity and price for a product P and to calculate the turnover dynamically (by a Cube Rule) from those.

For the calculation of the total turnover of a whole group W of products, the equation turnover(W) = quantity(W) x price(W) will lead to a wrong result if quantity(W) is the cumulative total number of all goods and price(W) is the aggregated price. Hence, the individual turnover for each product in the group W must be calculated first, before the results can finally be summed up (or, using Palo terminology: we have to use an N-rule). Sequential programs need to run each of the calculations one after another, roughly like this:

1. For each product P in W do (sequentially):
a. Find the quantity and price of P
b. Multiply the two values
c. Add that product to the result
2. Return result
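The sequential steps above can be sketched in a few lines of Python. This is only an illustration, not Palo's actual rule engine: the product data and the function names are made up, and "aggregated price" is assumed here to mean the sum of the individual prices. The second function shows why multiplying the aggregates gives the wrong total.

```python
# Hypothetical sample data: product -> (quantity, price)
products = {
    "A": (10, 2.0),
    "B": (5, 4.0),
    "C": (1, 100.0),
}


def turnover_sequential(group):
    """Steps 1a-1c and 2: multiply per product, then add up, one at a time."""
    result = 0.0
    for quantity, price in group.values():
        result += quantity * price
    return result


def turnover_from_aggregates(group):
    """The wrong shortcut: aggregated quantity times aggregated price."""
    total_quantity = sum(quantity for quantity, _ in group.values())
    total_price = sum(price for _, price in group.values())
    return total_quantity * total_price


correct = turnover_sequential(products)      # 10*2 + 5*4 + 1*100
wrong = turnover_from_aggregates(products)   # (10+5+1) * (2+4+100)
print(correct, wrong)
```

With this sample data the per-product calculation yields 140, while the shortcut on the aggregates yields 1696.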

Our new approach is to do these individual calculations in parallel, i.e. to calculate simultaneously. Graphics processors (GPUs) are ideal architectures for this: the same operation (here: multiplication) is executed on many different data sets (here: quantities and prices of all products). A bit over-simplified, our algorithm performs the following steps:

1. Find quantities and prices for all the products P in W simultaneously.
2. Match these records so that the quantity and price of the same product are placed next to each other (very quick with a parallel sort).
3. Multiply all related pairs (quantity, price) simultaneously and store the results in an array.
4. Add up the array to get the overall result (very quick with a parallel reduce).
5. Return result
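The data flow of these five steps can be sketched as follows. Note the assumptions: this is plain Python that runs sequentially and only mimics the structure of the parallel version, the record layout and function names are invented for illustration, and every product is assumed to have exactly one quantity and one price record. On a GPU, the sort, the multiplications and each pass of the reduction would be executed by many processing units at once.

```python
# Hypothetical records as step 1 might deliver them, in arbitrary order.
quantities = [("C", 1), ("A", 10), ("B", 5)]
prices = [("B", 4.0), ("C", 100.0), ("A", 2.0)]


def match_by_product(quantity_records, price_records):
    """Step 2: sort both record sets by product key so that pairs line up."""
    pairs = []
    for (q_key, quantity), (p_key, price) in zip(sorted(quantity_records),
                                                 sorted(price_records)):
        if q_key != p_key:
            raise ValueError(f"unmatched records: {q_key} / {p_key}")
        pairs.append((quantity, price))
    return pairs


def multiply_pairs(pairs):
    """Step 3: one independent multiplication per pair."""
    return [quantity * price for quantity, price in pairs]


def tree_reduce(values):
    """Step 4: pairwise summation, the access pattern of a parallel reduce."""
    values = list(values)
    if not values:
        return 0.0
    while len(values) > 1:
        # All additions within one pass are independent of each other.
        summed = [values[i] + values[i + 1]
                  for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            summed.append(values[-1])  # odd element carries over to next pass
        values = summed
    return values[0]


total = tree_reduce(multiply_pairs(match_by_product(quantities, prices)))
print(total)
```

The pairwise reduction needs only about log2(n) passes for n values, which is why the final summation is fast when each pass runs in parallel.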

Unlike the above sequential algorithm, our parallel approach can perform two steps – finding data and multiplication – for all data sets almost simultaneously. The sorting and the final summation are accomplished by standard algorithms of parallel computing which are also very fast.

In initial tests we have seen very promising results, where our parallel approach has achieved significant speedups compared to the sequential algorithm currently used in Palo.

Palo OS, SOS and Premium

Don’t worry. Jedox is not sending an SOS signal. Jedox is doing fine. Downloads and sales are on the rise, products are advancing, new people on board and a new website as of today.

SOS stands for “Supported Open Source” and is a new offering by Jedox starting today. In 2009, Palo users still had to choose between using the Open Source Edition of Palo (without professional support) or buying the Enterprise Edition with full support and maintenance. With Supported Open Source we are now introducing a third option which fills the gap between those two extremes.

Supported Open Source is based on the Open-Source Edition of Palo (both Palo for Excel and the Palo Suite). You download an Open-Source Edition of Palo and then you can buy a support subscription for a low monthly fee (starting at 187 € per month) directly from our website. The SOS subscription is tied to one specific Palo installation but with no limit in terms of number of users or number of CPUs. It includes 10 support tickets per year (additional support tickets are available).

Also included is a Jedox Software Assurance which basically means that we safeguard you from any intellectual property issue with Palo Open-Source software components delivered by Jedox. These assurances include (a) substituting the infringing portion of the software, (b) changing the software so that its use becomes non-infringing, or (c) acquiring the rights necessary for a customer to continue its use of the software without interruption.

We also decided to change the pricing and licensing model of the Premium Edition of Palo. It is now available as a monthly subscription starting at 19 € per named user per month (Palo for Excel Premium). Looking at a minimum of 10 users, the monthly subscription comes out at 195 € per month (Palo for Excel Premium) and 390 € (Palo Suite Premium). A feature comparison and a price calculator on our new website make it easy to learn about the different subscription options. And as always promised, Palo for Excel and Palo Suite are available as free-of-charge Open-Source Editions as well.

Marathon disciplines for software companies, speed and heavyweights

Business life involves sprints and marathons. Product development, above all, resembles a marathon, especially in the software industry. The most important thing is finding the right people for your team. It takes good consultants and salespeople, people who understand the customers and know the market, and strong developers.

We have been lucky to hire some new specialists with in-depth knowledge of business intelligence. The most important addition is our Senior Product Manager. Matthias Krämer has many years of experience in product development and product management of Business Intelligence software. Matthias was previously employed by Infor as a Product Manager with worldwide responsibility for BI products and their integration into the Infor ERP systems. From now on he will be responsible for the development of the Palo Suite, working at the interface between customers, consultants and the development process. Even though Matthias knows the market and the job perfectly well, being a product manager always remains a challenge. Product development inherently involves conflict. Apart from dealing with a founder like me, who has de facto been doing the product manager’s work and all the decision-making up until now, there might be conflict between purchasers and end users, sales and marketing, development and consulting. Addressing these conflicts productively is part of the product management marathon.

Development is another marathon discipline for software companies, and we are also expanding our Development Department. In Prague, Jedox acquired an entire development team with a combined 40+ years of experience in the development of multidimensional databases. In recent years the same team had been responsible for the development of MIS Alea, now Infor OLAP. With their expertise in OLAP databases, these specialists will help to increase the pace of development of our OLAP engine. Our R&D team is now about 30 people strong, while the company as a whole counts around 60 employees and 15 nearshore developers.

The new OLAP specialists will also work on integrating our GPU research into the development of our in-memory OLAP software. So we are speeding up our software, by using massively parallel GPU hardware to achieve large performance gains for number crunching in BI, and at the same time our development work, by adding state-of-the-art OLAP knowledge. Performance is a key point in BI and CPM, and the speed of development is a key factor for an Open Source software company competing against heavyweights. Speed and heavyweights, on the other hand, don’t go together, which is even more true in a marathon race. As no other BI vendor is developing GPU-based BI and CPM software, and as we’ve gained further momentum in development, I’m confident that we are in a good position to win a medal in the BI race.

Finance people will demand Gaming Cards in their PCs

Today’s computer games deliver 3D video sequences in photorealistic quality. To do this in real time, the hardware industry developed high-end graphics processing units, also called GPUs. A GPU has unbelievable processing power. Instead of the 2, 4 or 8 processor cores known from traditional Intel/AMD CPUs, a GPU uses an array of hundreds of parallel floating-point processors to compute images in its internal graphics memory.

What value does this bring to finance people? The answer is simple: analysis, planning, budgeting, forecasting, scenarios and reporting all involve a lot of number crunching, especially if you are looking at aggregated and multidimensional OLAP data models as we usually do in Business Intelligence or Corporate Performance Management. Number crunching consumes enormous processing power. The number one complaint about BI and CPM software is slow query performance, as BI and OLAP analyst Nigel Pendse points out.

So our Palo researchers had a look at the GPU hardware architecture and discovered that GPUs are the perfect hardware accelerator for in-memory OLAP servers like Palo. They expect a performance increase by a factor of at least 20 (not 20%). This would be a performance breakthrough that has never been seen before in the BI industry. The reason why this works so well is that Palo uses in-memory technology. Since today’s GPUs have 4 GB of graphics memory, it is possible to load the entire cube directly into the GPU RAM. So there is no bottleneck such as disk I/O that would throttle the GPU.

And it gets even better: We just had the GPU Technology Conference in San Jose. There NVIDIA announced the Fermi architecture. This new GPU technology is due in 2010 and will again increase the processing power by a factor of 5 (compared with today’s Tesla technology).

And by the way: Did I tell you that you can combine GPUs? Here at Jedox we run Tesla hardware with 4 parallel GPUs and 16 GB RAM in one server, and it still scales almost linearly. So this makes 20 x 4 x 5 = 400. A query that took 40 seconds to calculate on a CPU will be done in 0.1 seconds with GPUs. Theoretically, of course. Results in practice will be seen at CeBIT 2010.
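The back-of-the-envelope estimate can be written out as a short calculation. The variable names are mine, and the three factors are simply the expectations quoted in this post, assumed to multiply cleanly; this is the theoretical best case, not a measurement.

```python
per_gpu_speedup = 20   # expected factor of one GPU versus a CPU
gpu_count = 4          # GPUs in one server, assuming near-linear scaling
fermi_factor = 5       # announced gain of Fermi over today's Tesla

combined_speedup = per_gpu_speedup * gpu_count * fermi_factor

cpu_seconds = 40.0
gpu_seconds = cpu_seconds / combined_speedup

print(combined_speedup, gpu_seconds)
```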

How many CFOs are still relying on Excel only?

In today’s fiercely competitive economy, business leaders are turning to the finance group to do a lot more than accounting. Yet despite the world of finance (and the role of the CFO) being turned on its head, spreadsheet functionality has remained basically the same for years. Today’s version of Excel would be instantly familiar to someone who hadn’t used it for a decade.

Whilst CFOs are being increasingly relied upon to turn companies around, how many are still only relying on a static, relatively unchanged formula – the good old electronic spreadsheet? Microsoft gauges the number of Excel users worldwide at more than 400 million, and Forrester Research estimates 50 to 80 percent of enterprises still rely on stand-alone spreadsheets for critical applications like financial reporting or budget forecasting.

Yet increasingly, a common lament among CFOs is that piecing together cash flow, profitability and operations data with Excel spreadsheets and static reports is inefficient and inadequate. Even within companies that have BI tools in-house and have spent significant money on them, when it comes to ad-hoc reports or shared forecast processes, the “last mile” is still done by many CFOs with an Excel spreadsheet.

This tug-of-war between data consistency and flexibility will plague users of stand-alone spreadsheets for the foreseeable future. CFOs need to take aim squarely – or at least obliquely – at Excel. The answer lies in CPM and BI tools, like Palo, that are absolutely essential to optimize spreadsheets: online modules that are low or no cost, and truly Web-based allowing for file sharing on a global basis. Because, more than ever, companies compete based on the decisions they make.

Excel, Excel plus Palo and beyond Excel

An estimated 500 million people worldwide use Microsoft Excel. Out of these 500 million users, about 5%-10% perform some kind of analysis, reporting or planning tasks with Excel. While Excel is very flexible and highly accepted in all organisations and enterprises, people will, as soon as the company has more than a few employees, very quickly end up in something commonly referred to as spreadsheet hell.

With this in mind, I recently posted an article about the Beauty of Palo. Palo’s beauty lies in the introduction of centralized data cells and in a more than 2-dimensional (i.e. multidimensional) writeback-enabled database which is centrally hosted on a secure server within the organisation. With Palo, many users can simultaneously work with consistent data in Excel. Data entries in one sheet (or even data imported from CRM and ERP systems) will automatically be applied to all worksheets in the organisation. Therefore, “Excel plus Palo” is a better solution than Excel alone. “Excel plus Palo” provides all advantages of a centralized Business Intelligence solution without the cost and the time it usually takes to introduce a dedicated Business Intelligence solution.

Now, what do I mean by “beyond Excel”? While “Excel plus Palo” is already a great advance in terms of data integrity and process efficiency, could there be even more? The answer is yes. For example, one important feature that is missing in today’s spreadsheets is “structure dynamics”. When you change a value or filter criterion in a spreadsheet, all dependent cells are updated automatically. This is “data dynamics”. The data is recalculated, but the row and column structure of the spreadsheet remains untouched. With structure dynamics, the spreadsheet would adjust not only the data but also the row and column structure necessary to display the data (for example, an ABC analysis may require 5 or 10 or 100 rows to display all A products, depending on the query parameters).

While traditional spreadsheets are structure static, the new online spreadsheet available with the upcoming Palo BI Suite will be structure dynamic by introducing a new technology called “Dynaranges”. While in design mode, Dynaranges are displayed as a rectangle, covering a range of cells within the spreadsheet. These cells are actually linked to a data query (for now restricted to Palo queries, but MS Analysis Server and other data sources to follow within the next few months).

At run time the Dynaranges unfold (horizontally and/or vertically) and create as many rows and columns as the underlying data (metadata like dimensions or attribute tables) requires. When the underlying data or the query parameters change, the Dynaranges will automatically adjust the row and column layout of the resulting view.
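To make the idea of structure dynamics more concrete, here is a small Python sketch. It is not the Dynarange implementation: the function names, the sample turnover figures and the simplified ABC query are all invented for illustration. The point is only that the number of rows in the result is determined by the query result, not fixed at design time.

```python
def unfold_range(header, elements, value_for):
    """Unfold vertically: one header row plus one row per returned element."""
    rows = [header]
    for element in elements:
        rows.append([element, value_for(element)])
    return rows


# Hypothetical turnover per product
turnover = {"Product 1": 500, "Product 2": 300, "Product 3": 150, "Product 4": 50}


def a_products(data, share):
    """Simplified ABC query: top sellers that together reach `share` of the total."""
    threshold = share * sum(data.values())
    selected, running = [], 0
    for name, value in sorted(data.items(), key=lambda item: item[1], reverse=True):
        if running >= threshold:
            break
        selected.append(name)
        running += value
    return selected


header = ["Product", "Turnover"]
small_view = unfold_range(header, a_products(turnover, 0.5), turnover.get)
large_view = unfold_range(header, a_products(turnover, 0.9), turnover.get)
print(len(small_view), len(large_view))
```

Changing only the query parameter (here the share) changes the number of rows in the view, which is what a structure-static spreadsheet cannot do on its own.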

Dynaranges are only one example of the advanced spreadsheet technology that we built into the Palo BI Suite. Other examples are server-based scrolling, dynamic report repositories and the new chart types that we are adding to the product. I will talk more about these in a future post.

Writeback enabled OLAP Cubes in Excel Pivot Tables

We have some exciting developments to announce regarding our free-of-charge Open-Source OLAP product Palo for Excel. We decided to include an ODBO driver (MDX) in the free download of Palo for Excel. The new ODBO connectivity allows you to access Palo OLAP cubes via Pivot Tables in Excel. In about two weeks, everybody will be able to run advanced OLAP-based Pivot Table queries in Excel without having to buy expensive licenses for Microsoft SQL Server Analysis Services.

While Pivot Tables in Excel are read-only, Palo users will have the option to write back values from Excel directly to the OLAP cubes. This is done using the Palo Excel Add-In, which is also included in the free download of Palo for Excel. By adding the writeback ability to OLAP cubes, Palo becomes an interesting choice for a server- and Excel-based planning application in the mid-market space. With Microsoft recently announcing the end of PerformancePoint Planning, this is a huge opportunity for people looking for an inexpensive but powerful centralized planning application under the familiar Excel user interface.

The following screenshot shows a Palo cube in an Excel 2007 Pivot Table (it works in Excel 2003 as well). You can also see the ribbon menu bars of the Palo Excel Add-In, which allows you to write back values, change the dimensions and the cube structures of the Palo cubes, and even includes access to a powerful data import wizard to fill Palo cubes with external data.

[Screenshot: Palo cube in an Excel 2007 Pivot Table with the ribbon of the Palo Excel Add-In]

One final note: when we talk about the free download of Palo for Excel, this is not only the Palo Excel Client. You can actually download a complete client-server installation for free, with as many server and client installations as you need.

The curse of the spreadsheet: what happens when Excel gives you (false) security

Excel excels at giving users a strong sense of security. How many times have you witnessed this scene played out in organisations: armed with a ‘dashboard report’ the likes of which the company has never seen, your controller struts into that board meeting, looking every inch the fiscal superstar. Excel is often seen as the magic formula for making a whole host of business decisions. But is Excel in the midst of its own mid-life crisis?

With the 30th anniversary of the first spreadsheet upon us, the original “killer app” and vehicle on which the PC first rode to fame doesn’t quite add up. The existence of “rogue” spreadsheets spreading through an organisation can turn logic on its head. The problem is tracing the errors in such a manual process. Is it down to Microsoft’s “interoperability” or just your colleague having a bad keystroke day?

Are spreadsheets responsible for the downturn? When spreadsheets, not people, become the decision makers then certainly, chaos can ensue. Consider the following: The European Spreadsheet Risk Interest Group analyses and quantifies the cost of spreadsheet errors worldwide. In the last six months alone, they have reported various situations, including:

A well-known medical and consumer imaging company had to amend its third-quarter loss by $9 million, announcing that the adjustment was needed because too many zeros were added to an employee’s accrued severance on a spreadsheet; the company’s CFO characterised the situation as “an internal control deficiency”

The many add-on tools that have appeared on the market to cushion Excel’s weakness in error protection and auditing are the strongest witnesses to the case. Excel is seen as “BI Tool No. 1”, but financial controllers must ensure that a company’s numbers aren’t spread across numerous desktops in isolation, and must apply enduring standards, for example by connecting Excel, OpenOffice Calc and web-based online spreadsheets through one common, commercial-strength Open Source database.

Palo takes the risk out of Excel and delivers a single version of the truth: safety in numbers, down the Chinese walls.

The Beauty of Palo

There are many ways to explain how Palo enhances spreadsheets like Microsoft Excel or OpenOffice.org Calc. Today I will try a very generic approach. Let’s have a look at the following spreadsheet that uses only 6 cells, in which cell C6 displays the actual number of units sold in 2007 for Germany. It is 919,665 units.

[Screenshot: spreadsheet with 6 cells; cell C6 shows the units sold in Germany in 2007]

Now let’s change the row title in cell B6 from “Units” to “Turnover”.

You see that cell C6 now displays the value 6,196,660, which is the actual turnover for Germany in 2007. How is it done? Obviously there is a formula in cell C6, but I assure you that the spreadsheet does not contain lookup tables or links to other spreadsheets.

To make it more mysterious, let’s add two more row titles in cells B7 and B8, labeled “Cost of Sales” and “Gross Profit”. Using the copy fill command, we copy the cell formula in cell C6 to cells C7 and C8. As a result, cells C7 and C8 now display reasonable values for Cost of Sales and Gross Profit.

The same miracle works if we add France and Belgium as additional column titles. After copying the formulas in cells C6:C8 to D6:E8, the values for France and Belgium appear automatically.

How does the magic work? Very simple: Cells C6:E6 contain a Palo function, called the PALO.DATA function. The PALO.DATA function uses a special syntax to identify the cell coordinates of data, which is very different from the well-known A1 style that you know from Excel.

For example, let’s take a look at the Palo data function in cell C6. This function retrieves the actual year total turnover for all products in the region Germany in 2007. Expressed in the Palo syntax this is PALO.DATA(…,…,"All Products","Germany","Year","2007","Actual","Turnover").

But where does the Palo function retrieve this value from? It retrieves it from Palo, more precisely from a Palo cube in a Palo database on some Palo server. The “…,…,” expression in the previously cited Palo function specifies the name of the Palo server and the Palo database ("localhost/Demo") and the name of the Palo cube ("Sales"). So the complete syntax of this Palo data function in cell C6 is PALO.DATA("localhost/Demo", "Sales", "All Products", "Germany", "Year", "2007", "Actual", "Turnover").
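The addressing scheme behind this function can be illustrated with a toy model in Python. This is not the Palo API: `palo_data` and the `SERVER` dictionary are hypothetical stand-ins, and returning 0 for an empty cell is an assumption. The two stored values are the ones shown in the example above. The sketch only demonstrates that a cell is identified by one element name per dimension rather than by a sheet position such as C6.

```python
# Hypothetical in-memory stand-in for a Palo server:
# (server/database, cube) -> {coordinates: value}
SERVER = {
    ("localhost/Demo", "Sales"): {
        ("All Products", "Germany", "Year", "2007", "Actual", "Units"): 919665,
        ("All Products", "Germany", "Year", "2007", "Actual", "Turnover"): 6196660,
    }
}


def palo_data(database, cube, *coordinates):
    """Toy lookup: one element name per dimension identifies exactly one cell."""
    cells = SERVER[(database, cube)]
    return cells.get(tuple(coordinates), 0)  # assumption: empty cells read as 0


turnover = palo_data("localhost/Demo", "Sales",
                     "All Products", "Germany", "Year", "2007", "Actual", "Turnover")
units = palo_data("localhost/Demo", "Sales",
                  "All Products", "Germany", "Year", "2007", "Actual", "Units")
print(units, turnover)
```

Changing a single coordinate, for example "Turnover" to "Units", addresses a different cell of the same cube, which is what happens in the spreadsheet when the row title in B6 is changed.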

With the Palo function it is possible to work with numbers in Excel which are actually not stored in the spreadsheet, but in an external database (i.e. Palo). So if two or more people at different workplaces are looking at the actual year total turnover for all products in the region Germany in 2007, even in different spreadsheets, they can be sure they all see exactly the same value. This is the end of the Excel chaos where people have multiple versions of spreadsheets with outdated data, leading to multiple versions of the truth.

It is also important to know that you can enter values directly into cells containing PALO.DATA functions. The values do not overwrite the function; instead, the values are “beamed” to the corresponding cell in the Palo cube and the function stays intact, displaying the newly entered value. To the end user it looks as if the value was entered in Excel, but in the background Palo makes sure that the value is actually stored in the Palo database, making it centrally available to all Excel users who are connected to the specified Palo cube.
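The writeback behaviour can be pictured with another toy model. The classes `ToyCube` and `DataCell` are hypothetical and have nothing to do with Palo's real implementation; they only show the principle that a cell keeps its coordinates (the "function") while the entered value lands in the shared store, where every other sheet pointing at the same coordinates sees it.

```python
class ToyCube:
    """Hypothetical shared store: every sheet reads from and writes to it."""

    def __init__(self):
        self._cells = {}

    def read(self, *coordinates):
        return self._cells.get(coordinates, 0)

    def write(self, value, *coordinates):
        self._cells[coordinates] = value


class DataCell:
    """Spreadsheet cell that holds coordinates into the cube, not a value."""

    def __init__(self, cube, *coordinates):
        self.cube = cube
        self.coordinates = coordinates

    @property
    def value(self):
        # Displaying the cell always means looking the value up in the cube.
        return self.cube.read(*self.coordinates)

    def enter(self, value):
        # The entry goes to the cube; the cell's coordinates stay untouched.
        self.cube.write(value, *self.coordinates)


cube = ToyCube()
coords = ("Germany", "2007", "Actual", "Turnover")
sheet_a = DataCell(cube, *coords)  # cell in one user's workbook
sheet_b = DataCell(cube, *coords)  # cell in another user's workbook

sheet_a.enter(1000)
print(sheet_a.value, sheet_b.value)
```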

I should mention that you can assign read and write access rights to every user who is connected to the Palo server. The access rights go as far as the element level, so for example one user could edit values for France, but would only have read access for Germany and would not have any access for Belgium.
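As a rough sketch of what element-level rights mean, consider the following Python model. The three access levels, the user name and the `RIGHTS` table are assumptions made for this example and do not reproduce Palo's actual rights model; the example merely mirrors the France/Germany/Belgium case described above.

```python
from enum import IntEnum


class Access(IntEnum):
    NONE = 0
    READ = 1
    WRITE = 2


# Hypothetical rights table: user -> element -> access level
RIGHTS = {
    "planner_fr": {
        "France": Access.WRITE,
        "Germany": Access.READ,
        "Belgium": Access.NONE,
    }
}


def access_for(user, element):
    """Unknown users or elements get no access in this sketch."""
    return RIGHTS.get(user, {}).get(element, Access.NONE)


def can_read(user, element):
    return access_for(user, element) >= Access.READ


def can_write(user, element):
    return access_for(user, element) >= Access.WRITE


print(can_write("planner_fr", "France"),
      can_read("planner_fr", "Germany"),
      can_read("planner_fr", "Belgium"))
```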

So what is the beauty of Palo? It is the ease with which Palo adds consistency, multidimensionality and centrality to Excel. By the way, Palo is not limited to Excel, but that is another story and will be covered in a future post. Another beauty of Palo for Excel? It is free and Open-Source. Both client and server. You can download it from the Jedox website at www.jedox.com.