Archive for November, 2009



A Peek Inside #PowerPivot: The client architecture

Well, I have a fun one today. We get asked all of the time about how the PowerPivot add-in works. Particularly how it relates to the SSAS components that BI Pros are used to, e.g. making connections to SSAS. As I am getting ready to write an architecture overview blog posting for the PowerPivot team blog, I decided to take the client architecture out for a spin and attempt a more detailed drilldown here . . .

Here is a block diagram of the PowerPivot client component architecture

Read More…

SharePoint Prerequisite Lab Installation Tip (#PowerPivot CTP3 Install)

When you install SharePoint Beta 2 to go with your PowerPivot for SharePoint CTP3 install, some will install SharePoint on a set of servers (or VMs) that do not have internet connectivity.

The first step to installing SharePoint Beta 2 is to run the SharePoint Prerequisite Installer.  If you are in a lab environment that does not have internet connectivity, this step will fail because it is trying to download the required files.

Within the installer dialog, it will provide you a link to the installation log.  Search for “download” within the file and you can then identify the component and URL it was trying to download from.   Copy the URL to a box that has internet connectivity to download that file, place it in a location your lab machine can get access to, run the component, and then you can then proceed with the next step of your installation.

The operational issue here is that there are potentially seven different components that you need to download so going through the step of downloading, running the Prerequisite Installer, running into an error, identifying the component and URL, downloading it, running the component, continuing… is a tad daunting.  To help things out, below are the links to the files required for the SharePoint Prerequisite Installer for Windows Server 2008 (they are slightly different for Windows Server 2008 R2 – mainly the first step).

Read More…

“PowerPivot Twins” is itself a double-take!

On a more off the wall note, I just realized that we have a bit of a double take (ha!) here concerning the idea and name “PowerPivot Twins”.  The code name for PowerPivot was “Project Gemini”.  The name Gemini had been chosen for this project because of its meaning:

Gemini – Gemini (pronounced gɛmɪnaɪ/, Latin: twins, symbol ♊) is one of the constellations of the zodiac known as “the twins

Within the BI world, the “twins” are the traditionally conflicting ideas of the Users who are trying to model, analyze, and share the data and the IT Administrators who are trying to provision, administer, and secure the data.  With PowerPivot, we are able to unite these “twins” because we are:

  • Giving Users the ability to understand the data by giving them more power within the tool they trust: Excel
  • Giving IT Administrators the ability to manage all of this data by using the information management tool of choice: SharePoint.

But even more interesting than the fact that the “PowerPivot Twins” allows us to pay homage to our “twins” roots within Gemini…even we, the PowerPivot Twins ourselves, pay homage to the idea of these two “twins”.

I, Denny Lee, work for the SQL Customer Advisory Team; in the case of PowerPivot I typically represent the customer – i.e. the Users.  Dave Wickert works for the Analysis Services team; in his previous role as an Analysis Services guru and current role as the PowerPivot SharePoint Integration PM, he typically represents the IT Administrator.

So we, the “PowerPivot Twins”, pay homage to Project Gemini not just in reference to our twins roots, but also that we ourselves represent the two equal parts of the BI twins.

Just food for thought (or perhaps I just need more sleep) ;-)

“BI strategy? We don’t need no stink’en BI strategy!”

Want to know more about Dave Wickert‘s thoughts on the Forrester report “10 strong hints your enterprise may not have a BI strategy“; I think you can figure out the direction of his thoughts just with the title of his blog post below :-)

“BI strategy? We don’t need no stink’en BI strategy!”.

As for the PowerPivot perspective in all of this, I was all ready to start chiming in about it and realized it would be significantly easier for me – and much easier for you (to read) to read Rob Collie‘s blog:

Forrester Top 10 – The View From PowerPivot

Enjoy!

Understanding why an Excel PowerPivot workbook is so large

During my and Dave Wickert’s SQLPASS session (SQLCAT: A Preview to PowerPivot Best Practices), we had shown to the audience how to view and better understand the PowerPivot database file structure within the Excel workbook.  First of all, I’d like to give credit where credit is due – and the thanks really should go to Ashvini Sharma – Power Pivot, Analysis Services, and Integration Services Dev MasterMind.  So while the presentation piece is ours – the actual knowledge and know how is his :-) .

Saying this, what do we mean by all of this?

For starters, recall that an Excel utilizing the PowerPivot add-in actually contains the database within the Excel workbook.  For more information on this, please refer to For Excel PowerPivot, the database is IN the workbook.

But now that you know that the database is IN the workbook, how can you find out why the workbook is so large as my esteemed PowerPivot twitter @VidasM recently inquired?

Well, if you recall from the presentation or blog post, when Excel utilizes PowerPivot, it makes use of the VertiPaq database.  While the database is primarily in-memory, it will have an on-disk structure for it cache against before finally saving itself into the workbook itself (when saved and no longer in use).  To find this temporary cache folder,

  • Open up an Excel PowerPivot workbook
  • Go to your %TEMP% folder (e.g. c:\Users\MyName\AppData\Local\Temp) and find the IMBI_<GUID> VertiPaq_<GUID> folder.  If there are multiple IMBI_<GUID> VertiPaq_<GUID> folders, find the most recent one
  • When you open up this folder, for those of you who work regularly with Analysis Services, this file structure will be very familiar as it is very similar to your native Analysis Services data folder structure.

image 

  • Within the [Search Box] (upper right hand corner of your Windows Explorer for those using Windows 7) and type *.* and sort by size.

image 

  • By doing so, you can now identify which IDF and dictionary files exist and which files are larger which correspond to the tables and columns you have imported.

While we have yet to develop best practices, the basic principal here is by doing this, you will be able to find the tables and columns that are taking up alot of space within your Excel PowerPivot workbook and can potentially remove any columns that are not required.

For Excel PowerPivot, the database is IN the workbook

Concerning Excel PowerPivot, the question is often asked how is it possible for Excel to now handle tens or hundreds of millions of rows – the way native Excel handles tens or hundreds of thousands of rows. 

How you ask? 

Well, the reason why is because Excel (with the PowerPivot add-in) is using the new In-Memory BI (IMBI) database.  It utilizes a column-based store which has been seen academically and empirically to have some interesting characteristics within BI scenarios…namely, it can calculate many (but not all) BI aggregate very quickly.  Because it is in-memory, it does not utilize disk I/O and performs all of its scans in-memory – i.e. much faster because you do not have disk contention.  As well, it has a great compression algorithm in order to significantly reduce the size of the overall data stored.  By the way, this new database architecture is called VertiPaq for all those interested in the trivia!

So while Vertipaq is really cool, what does this have to do with Excel PowerPivot?  Well, the reason is because the PowerPivot database is actually in the workbook itself.  To prove it, here’s what you can do:

  • Take your Excel PowerPivot workbook, which you’ll notice is typically much larger than your typical native Excel workbooks
  • Copy the Excel file and rename it using a .zip extension instead of a .xlsx.  Note, the reason why is because .xlsx is actually XML

image

  • When you open up the .zip file, you’ll notice a lot of various folders within it (as noted, it’s all XML).

image 

  • Open up the xl folder, and then you’ll notice the customData folder.  When you navigate to it, you’ll notice the item1.data file which contains the bulk of the size of the Excel file

image

This item1.data file is your PowerPivot database stored right IN the Excel file.

Welcome to PowerPivot Twins!

We are the PowerPivot Twins!  And who are the PowerPivot Twins you ask?  We are Denny Lee from the SQL Customer Advisory Team and Dave Wickert from the Analysis Services Team.  This site is dedicated to all things PowerPivot, eh?!

Over the next few days, we’ll be aggregating posts from PowerPivotGeek.com – Dave Wickert’s excellent blog and PowerPivot materials from SQLCAT.com- the SQL Customer Advisory Team site that Denny Lee is part of.  We’ll of course add new material here as well!

Oh, and we’ll aggregate and/or link to great PowerPivot blogs  as well such as Rob Collie’s PowerPivotPro.com. 

PowerPivot Twins Powers… Activate!!

Hello world!

Welcome to WordPress.com. This is your first post. Edit or delete it and start blogging!

« Previous Page



Follow

Get every new post delivered to your Inbox.