PerformancePoint Blog

A Blog about PerformancePoint and Microsoft BI technologies. Your host is Russell Christopher

A Microsoft guy does QlikView

Posted by Russell on January - 2 - 2011 with 20 Comments

Over the holidays, I decided to experiment with some business intelligence technology that I normally don’t have the opportunity work with – QlikTech’s QlikView. In a nutshell, I took Microsoft’s Contoso Retail sample DW, brought portions of it into QlikView, and created several dashboards.

Might as well blog about the experience, eh?

I plan to split this effort up into three posts. This entry will focus on the data model. In part two, I’ll address dashboard building, and the final post will consist of the kitchen sink.

QlikView has a lot going for it as a product, and I’ve really enjoyed working with it. Of course, I also see some things I don’t like Smile

Having not worked with the product very long, I apologize in advance to our friends at QlikTech for any blind spots I have or mistakes that may worm their way into this series. I tried hard to make sure everything below  is technically accurate, but you know how it goes – keep me honest!

Anyhow…Data.

As I mentioned previously, I chose to work with the Contoso Retail sample data warehouse. Specifically, I focused on the Online Sales fact table (about 12.6M rows) and related dimensions:

ContosoRetailDW

The Basics

You’re going to load data into QlikView via scripts. The scripting environment is simple and (sort of) the equivalent of a SQL Server Analysis Services DSV, or PowerPivot’s “PowerPivot Window”. QlikView gives you some basic query building tools and a wizard which will assemble SELECT, LOAD or BINARY statements from table/field selections that the user makes. As I mentioned the query builder is rudimentary compared to the tools I normally use – I didn’t see any way to do builder-assisted JOINs, aggregations, etc. 

In your scripts, you’ll use standard SQL SELECT statements to get at data via ODBC and OLEDB. While writing these select statements, you can do the same sort of things that one might attempt within a DSV’s Named Query. You can rename and duplicate columns, add calculated columns, and do any pre-grouping or aggregation that might be appropriate.

SELECT Statement

The LOAD statement is pretty much the same thing – except you’ll point it at text-based files, Excel, and QlikView files. From the examples I looked at, you can include DISTINCT, WHERE, GROUP BY, ORDER BY clauses, too:

LOAD Statement

QlikView includes multiple qualifiers which can be used to re-shape data during the import. Here’s a sampling of what I feel would be the most commonly used qualifiers:

  • Concatenate: Like a UNION without the requirement of two tables you’re slapping together having to have the same fields in the same order
  • Crosstable: Like SQL’s UNPIVOT statement
  • Generic: Splits the table being loaded into several other logical tables)
  • Join: Joins the current table being imported to the previous table via common fields

I found the complexity around the QlikView data loading processes to be somewhere in between PowerPivot’s user-focused simplicity and SSAS’s developer-centric rich feature set.

Loading a real-world dataset into QlikView is not a point-and-click affair as it is with PowerPivot – I doubt a mere mortal (business user) could handle making necessary adjustments to an existing database schema without IT’s assistance. I was kind of surprised by this, because everything I’ve read and heard about QlikView focuses on self-service and rapid time-to-publish. In my opinion, only the top 5 -10% of power users might be able to single-handedly pull off a data load.

Out of the box, QlikView can import the following:

  • ODBC and OLEDB data sources
  • Excel
  • CSV
  • XML
  • HTML
  • QVD – QlikView Data file: proprietary file format – think of SSIS’s RAW format file
  • QVX – QlikView Data Exchange: open file/stream format

Just like SQL Integration Services, scripts can be utilized to do things like grab data from a web service. QlikView also has an SDK which you can use to build custom data sources like one might build a custom data processing extension for SQL Reporting Services. I wasn’t interested in trying to do any “hard core” ETL so I won’t attempt to compare it to SSIS in any other way. I suspect that QlikTech is shooting for “good enough” data loading with QlikView and then leaning on their partners Informatica and/or Kalido for more difficult ETL and MDM scenarios.

The Big Lesson

Here are the two primary lessons I took away from my data model building experience in QlikView:

  • Field names matter. A lot. We’re talking “holy writ” time.
  • If QlikView modifies your data model for you by loosening tables and/or (sometimes) adding synthetic keys, you may want to modify your scripts – what you have is likely less-than-optimum.

 

QlikView builds relationships (which are referred to as “associations”) and synthetic keys (we’ll talk about these later) as the result of matching column names across tables. This is the only way relationships are created. As a developer, you can’t manually create a relationship between Table1.Field1 and Table2.Field3 where appropriate like you can in SSAS or PowerPivot. Instead, you have to alias Table2.Field3 to Table2.Field1, or create a copy of the field and rename it, etc. Coming from a Microsoft background I found this rather odd and restrictive.

The by-product of QlikView’s field-name-centric relationship building behavior is that it will often create a bunch of associations between tables that you simply don’t want. Imagine you have an arbitrary “Status” field in your Employee and Product dimensions…well, congratulations, you’re the proud parent of an association between the tables you probably don’t need. I found this behavior quite clumsy. To make life more interesting, unintended relationships lead to looping scenarios across your tables which cause QlikView to further modify the model by “loosening tables”.

You’ll need to be very careful to do away with generic field names (Start Date, End Date, Load Date) or at the very least make them more specific to the table they reside in. This lesson was a tough one to learn since I generally consider the existing field names I import to be largely irrelevant in Microsoft BI tools. (Why? Because I can manually create relationships regardless of field names across all MSFT BI solutions. With a tip of the hat to Kimball, field names should be meaningful to users, of course).

Here, for example is the script I started off with to import tables from SQL. Note how I carefully avoid a proper SELECT list Smile

OLEDB CONNECT TO [Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=ContosoRetailDW;Data Source=.;Use Procedure for Prepare=1;Auto Translate=True;Packet Size=4096;Workstation ID=W1;Use Encryption for Data=False;Tag with column collation when possible=False];

/************** Dimensions *************/

SELECT * FROM [ContosoRetailDW].[dbo].[DimDate];
SELECT * FROM [ContosoRetailDW].[dbo].[DimProduct];
//Snowflake tables off Product
SELECT * FROM [ContosoRetailDW].[dbo].[DimProductSubcategory];
SELECT * FROM [ContosoRetailDW].[dbo].[DimProductCategory];

SELECT * FROM [ContosoRetailDW].[dbo].[DimCustomer];
//Snowflakes off DimCustomer
SELECT * FROM [ContosoRetailDW].[dbo].[DimGeography];
SELECT * FROM [ContosoRetailDW].[dbo].[DimSalesTerritory];

SELECT * FROM [ContosoRetailDW].[dbo].[DimPromotion];

/************** Facts *************/
SELECT * FROM [ContosoRetailDW].[dbo].[FactOnlineSales];

 

Blowing up your OS with QlikView (a how-to guide)

The script above turned out to be so clumsy I was able to repeatedly bring down my RDP connection to the 5 GB Virtual Machine I’m running QlikView on when I ran it.

WLoad Records with Syn Field 1

After QlikView initially loaded my tables (above – click for a bigger view)), it began creating synthetic keys due to the matching field names for ETL lineage columns across my tables:

WLoad Records with Syn Field 2

After the 9th synthetic key, QV.EXE spiked to 99% CPU and started consuming memory like crazy:

WLoad Records with Syn Field 3

After QV.EXE had consumed most of the memory on the box, the process dropped down to about 30-50% CPU utilization, and just churned. I saw qv.exe hard faulting like nuts – we’d expect this, I guess.

Finally, I noted the task manager in the client OS stopped updating in my RDP window. The whole RDP session became unstable due to constricted memory on the client OS, and it finally failed. I couldn’t reconnect. I bounced the Virtual Machine, raised the amount of RAM on the image to 12 GB, and tried again. Same result!

So that’s what happens when you write crummy scripts.

Fixing my script

I re-wrote my SELECT statements the way they should have been created in the first place. I removed all the lineage and SCD-related columns which were causing most of the unnecessary synthetic keys to be created. I also added a temporary TOP qualifier to statement which pulled rows in from the fact table so I didn’t have to wait for 12.6M rows to come across from SQL:

OLEDB CONNECT TO [Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=ContosoRetailDW;Data Source=.;Use Procedure for Prepare=1;Auto Translate=True;Packet Size=4096;Workstation ID=W1;Use Encryption for Data=False;Tag with column collation when possible=False];

SELECT [DateKey]
      ,[FullDateLabel]
      ,[DateDescription]
      ,[CalendarYear]
      ,[CalendarYearLabel]
      ,[CalendarHalfYear]
      ,[CalendarHalfYearLabel]
      ,[CalendarQuarter]
      ,[CalendarQuarterLabel]
      ,[CalendarMonth]
      ,[CalendarMonthLabel]
      ,[CalendarWeek]
      ,[CalendarWeekLabel]
      ,[CalendarDayOfWeek]
      ,[CalendarDayOfWeekLabel]
      ,[FiscalYear]
      ,[FiscalYearLabel]
      ,[FiscalHalfYear]
      ,[FiscalHalfYearLabel]
      ,[FiscalQuarter]
      ,[FiscalQuarterLabel]
      ,[FiscalMonth]
      ,[FiscalMonthLabel]
      ,[IsWorkDay]
      ,[IsHoliday]
      ,[HolidayName]
      ,[EuropeSeason]
      ,[NorthAmericaSeason]
      ,[AsiaSeason]
  FROM [ContosoRetailDW].[dbo].[DimDate];

SELECT [ProductKey]
      ,[ProductLabel]
      ,[ProductName]
      ,[ProductDescription]
      ,[ProductSubcategoryKey]
      ,[Manufacturer]
      ,[BrandName]
      ,[ClassID]
      ,[ClassName]
      ,[StyleID]
      ,[StyleName]
      ,[ColorID]
      ,[ColorName]
      ,[Size]
      ,[SizeRange]
      ,[SizeUnitMeasureID]
      ,[Weight]
      ,[WeightUnitMeasureID]
      ,[UnitOfMeasureID]
      ,[UnitOfMeasureName]
      ,[StockTypeID]
      ,[StockTypeName]
      ,[UnitCost]
      ,[UnitPrice]
      ,[AvailableForSaleDate]
      ,[StopSaleDate]
      ,[Status]
      ,[ImageURL]
      ,[ProductURL]
  FROM [ContosoRetailDW].[dbo].[DimProduct];

SELECT [ProductSubcategoryKey]
      ,[ProductSubcategoryLabel]
      ,[ProductSubcategoryName]
      ,[ProductSubcategoryDescription]
      ,[ProductCategoryKey]
  FROM [ContosoRetailDW].[dbo].[DimProductSubcategory];

SELECT [ProductCategoryKey]
      ,[ProductCategoryLabel]
      ,[ProductCategoryName]
      ,[ProductCategoryDescription]
  FROM [ContosoRetailDW].[dbo].[DimProductCategory];

SELECT [CustomerKey]
      ,[GeographyKey]
      ,[CustomerLabel]
      ,[Title]
      ,[FirstName]
      ,[MiddleName]
      ,[LastName]
      ,[NameStyle]
      ,[BirthDate]
      ,[MaritalStatus]
      ,[Suffix]
      ,[Gender]
      ,[EmailAddress]
      ,[YearlyIncome]
      ,[TotalChildren]
      ,[NumberChildrenAtHome]
      ,[Education]
      ,[Occupation]
      ,[HouseOwnerFlag]
      ,[NumberCarsOwned]
      ,[AddressLine1]
      ,[AddressLine2]
      ,[Phone]
      ,[DateFirstPurchase]
      ,[CustomerType]
      ,[CompanyName]
  FROM [ContosoRetailDW].[dbo].[DimCustomer];

DimGeography:
SELECT [GeographyKey]
      ,[GeographyType]
      ,[ContinentName]
      ,[CityName]
      ,[StateProvinceName]
      ,[RegionCountryName]
  FROM [ContosoRetailDW].[dbo].[DimGeography];

SELECT  [SalesTerritoryKey]
      ,[GeographyKey]
      ,[SalesTerritoryLabel]
      ,[SalesTerritoryName]
      ,[SalesTerritoryRegion]
      ,[SalesTerritoryCountry]
      ,[SalesTerritoryGroup]
      ,[SalesTerritoryLevel]
      ,[SalesTerritoryManager]
      ,[StartDate]
      ,[EndDate]
      ,[Status]
  FROM [ContosoRetailDW].[dbo].[DimSalesTerritory];

SELECT [PromotionKey]
      ,[PromotionLabel]
      ,[PromotionName]
      ,[PromotionDescription]
      ,[DiscountPercent]
      ,[PromotionType]
      ,[PromotionCategory]
      ,[StartDate]
      ,[EndDate]
      ,[MinQuantity]
      ,[MaxQuantity]
  FROM [ContosoRetailDW].[dbo].[DimPromotion];

SELECT TOP 1000000 [OnlineSalesKey]
      ,[DateKey]
      ,[StoreKey]
      ,[ProductKey]
      ,[PromotionKey]
      ,[CurrencyKey]
      ,[CustomerKey]
      ,[SalesOrderNumber]
      ,[SalesOrderLineNumber]
      ,[SalesQuantity]
      ,[SalesAmount]
      ,[ReturnQuantity]
      ,[ReturnAmount]
      ,[DiscountQuantity]
      ,[DiscountAmount]
      ,[TotalCost]
      ,[UnitCost]
      ,[UnitPrice]
  FROM [ContosoRetailDW].[dbo].[FactOnlineSales];

This time I made it through the loading process quite quickly, creating only two synthetic keys… but then I got the following warning:

Loop Error

Nuts.

QlikView includes a Table Viewer, similar to SQL Server’s database diagram except that everything is read-only – no editing. I used that tool to investigate what was going on, and here’s was what I saw (I suggest you pop this open in it’s own window):

Table view sith lots of loosened tables and Syn Fields

The model as it stands now has problems with circular references and a more mundane relationship issue, too. See if you can spot them when comparing the QlikView model to what we had coming out SQL Server’s database diagram.

Before solving the loop issue, I needed to understand what a synthetic key was…So, what is it? Here’s what the QlikView Reference Manual has to say (bold is mine throughout the section below):

When two or more input tables have two or more fields in common, this implies a composite key relationship. QlikView handles this through synthetic keys. These keys are anonymous fields that represent all occurring combinations of the composite key. When the number of composite keys increases, depending on data amounts, table structure and other factors, QlikView may or may not handle them gracefully. QlikView may end up using an excessive amount of time and/or memory. Unfortunately, the actual limitations are virtually impossible to predict, which leaves only trial and error as a practical method to determine them.

Therefore we recommend an overall analysis of the intended table structure by the application designer. Typical tricks include:

  • Forming your own non-composite keys, typically using string concatenation inside an Autonumber script function
  • Making sure only the necessary fields connect. If you for example use a date as a key, make sure you do not load eg. Year, Month, or day_of_month from more than one input table.

Everything became a lot more clear about the earlier experience with QV.exe using all the system’s RAM: Unnecessary matching field names across tables = synthetic keys getting created = excessive amount of memory being used. My fault!

First, let’s look at the synthetic keys that got created:

See Ssyn 1Table, at the bottom left? It got created because both the DimPromotion and DimSalesTerritory happened to have StartDate and EndDate fields – so QlikView assumes a relationship via a composite key which really doesn’t exist in the real world.

Note how when I hover over StartDate or EndDate inside DimSalesTerritory, the fields are being identified not only as keys ($key) , but as synthetic ($synthetic).

DimSalesTerritorySyntheticKey

That assumed relationship, combined with what is going on in Ssyn 2 Table and a third relationship between DimProduct and DimSalesTerritory caused my looping issues.

What is looping? Again, from the reference manual:

If there are circular references (“loops”) in a data structure, the tables are associated in such a way that there is more than one path of associations between two fields.

This type of data structure should normally be avoided as much as possible since it might lead to ambiguities in the interpretation of data. Unfortunately, circular data structures are quite common in the real world. In some cases they are a result of poor databases design, but in certain cases they may be unavoidable.

QlikView solves the problem of circular references by breaking the loop with a loosely coupled table. When QlikView finds circular data structures while executing the load script, a warning dialog will be shown and one or more tables will be set to loosely coupled. QlikView will typically attempt to loosen the longest table in the loop, as this is often a transaction table, which normally will be the one to loosen.

Essentially, all the dotted lines you see above and in the larger Table View screenshot were the result of QlikView attempting to remedy this issue by re-jiggering relationships to “loosely-coupled”.

The real problem around loosely coupled tables as I see it is that they no longer respond to filters. Selections in one field will also not propagate through to other fields in the table. For example, note how DimDate is loosely coupled to the fact table. When I try to filter a chart showing metrics from the fact table with a column in DimDate (Calendar Year = 2007), my charts begs off with “no data to display”. There is data for 2007, btw:

ChartNotWorking

In fact, if you look closely, ALL relationships against my single fact table have become loose, meaning I’m not going to be able to slice and dice my measures with my dimensions at all! That won’t do!

So, here’s the final script. I’m renaming all those problem-child fields:

  • In DimPromotion and DimSalesTerritory, the Start Date and End Date fields are prefaced with a table name (for example, “Promotion Start Date”) to deal with synthetic key #1 (Ssyn1)
  • In DimProduct, Unit Cost and Unit Price become Product Unit Cost and Product Unit Price to avoid synthetic key #2 (Ssyn2)
  • DimProduct and DimSalesTerritory have a matching Status column. I rename the one in DimSalesTerritory so I don’t get a bogus relationship
  • DimGeography, DimCustomer, and DimSalesTerritory all contain a GeographyKey field (see the SQL Server database diagram at the beginning of this post). There is no direct relationship between DimCustomer and DimSalesTerritory, but QlikView creates one anyway because of matching field names. I added customer and sales territory specific geography key fields to DimGeography, and then renamed the existing GeographyKey fields in DimCustomer and DimSalesTerritory

OLEDB CONNECT TO [Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=ContosoRetailDW;Data Source=.;Use Procedure for Prepare=1;Auto Translate=True;Packet Size=4096;Workstation ID=W1;Use Encryption for Data=False;Tag with column collation when possible=False];

SELECT [DateKey]
      ,[FullDateLabel]
      ,[DateDescription]
      ,[CalendarYear]
      ,[CalendarYearLabel]
      ,[CalendarHalfYear]
      ,[CalendarHalfYearLabel]
      ,[CalendarQuarter]
      ,[CalendarQuarterLabel]
      ,[CalendarMonth]
      ,[CalendarMonthLabel]
      ,[CalendarWeek]
      ,[CalendarWeekLabel]
      ,[CalendarDayOfWeek]
      ,[CalendarDayOfWeekLabel]
      ,[FiscalYear]
      ,[FiscalYearLabel]
      ,[FiscalHalfYear]
      ,[FiscalHalfYearLabel]
      ,[FiscalQuarter]
      ,[FiscalQuarterLabel]
      ,[FiscalMonth]
      ,[FiscalMonthLabel]
      ,[IsWorkDay]
      ,[IsHoliday]
      ,[HolidayName]
      ,[EuropeSeason]
      ,[NorthAmericaSeason]
      ,[AsiaSeason]
  FROM [ContosoRetailDW].[dbo].[DimDate];

SELECT [ProductKey]
      ,[ProductLabel]
      ,[ProductName]
      ,[ProductDescription]
      ,[ProductSubcategoryKey]
      ,[Manufacturer]
      ,[BrandName]
      ,[ClassID]
      ,[ClassName]
      ,[StyleID]
      ,[StyleName]
      ,[ColorID]
      ,[ColorName]
      ,[Size]
      ,[SizeRange]
      ,[SizeUnitMeasureID]
      ,[Weight]
      ,[WeightUnitMeasureID]
      ,[UnitOfMeasureID]
      ,[UnitOfMeasureName]
      ,[StockTypeID]
      ,[StockTypeName]
      ,[UnitCost] as [Product Unit Cost]
      ,[UnitPrice] as [Product Unit Price]
      ,[AvailableForSaleDate]
      ,[StopSaleDate]
      ,[Status]
      ,[ImageURL]
      ,[ProductURL]
  FROM [ContosoRetailDW].[dbo].[DimProduct];

SELECT [ProductSubcategoryKey]
      ,[ProductSubcategoryLabel]
      ,[ProductSubcategoryName]
      ,[ProductSubcategoryDescription]
      ,[ProductCategoryKey]
  FROM [ContosoRetailDW].[dbo].[DimProductSubcategory];

SELECT [ProductCategoryKey]
      ,[ProductCategoryLabel]
      ,[ProductCategoryName]
      ,[ProductCategoryDescription]
  FROM [ContosoRetailDW].[dbo].[DimProductCategory];

SELECT [CustomerKey]
      ,[GeographyKey] as [GeographyCustomerKey]
      ,[CustomerLabel]
      ,[Title]
      ,[FirstName]
      ,[MiddleName]
      ,[LastName]
      ,[NameStyle]
      ,[BirthDate]
      ,[MaritalStatus]
      ,[Suffix]
      ,[Gender]
      ,[EmailAddress]
      ,[YearlyIncome]
      ,[TotalChildren]
      ,[NumberChildrenAtHome]
      ,[Education]
      ,[Occupation]
      ,[HouseOwnerFlag]
      ,[NumberCarsOwned]
      ,[AddressLine1]
      ,[AddressLine2]
      ,[Phone]
      ,[DateFirstPurchase]
      ,[CustomerType]
      ,[CompanyName]
  FROM [ContosoRetailDW].[dbo].[DimCustomer];

SELECT [GeographyKey]
      ,[GeographyKey] as [GeographyCustomerKey]
      ,[GeographyKey] as [GeographySalesTerritoryKey]
      ,[GeographyType]
      ,[ContinentName]
      ,[CityName]
      ,[StateProvinceName]
      ,[RegionCountryName]
  FROM [ContosoRetailDW].[dbo].[DimGeography];

SELECT  [SalesTerritoryKey]
      ,[GeographyKey] as [GeographySalesTerritoryKey]
      ,[SalesTerritoryLabel]
      ,[SalesTerritoryName]
      ,[SalesTerritoryRegion]
      ,[SalesTerritoryCountry]
      ,[SalesTerritoryGroup]
      ,[SalesTerritoryLevel]
      ,[SalesTerritoryManager]
      ,[StartDate] as [Sales Territory Start Date]
      ,[EndDate] as [Sales Territory End Date]
      ,[Status] as [Sales Territory Status]
  FROM [ContosoRetailDW].[dbo].[DimSalesTerritory];

SELECT [PromotionKey]
      ,[PromotionLabel]
      ,[PromotionName]
      ,[PromotionDescription]
      ,[DiscountPercent]
      ,[PromotionType]
      ,[PromotionCategory]
      ,[StartDate] as [Promotion Start Date]
      ,[EndDate] as [Promotion State Date]
      ,[MinQuantity]
      ,[MaxQuantity]
  FROM [ContosoRetailDW].[dbo].[DimPromotion];

SELECT TOP 1000000 [OnlineSalesKey]
      ,[DateKey]
      ,[StoreKey]
      ,[ProductKey]
      ,[PromotionKey]
      ,[CurrencyKey]
      ,[CustomerKey]
      ,[SalesOrderNumber]
      ,[SalesOrderLineNumber]
      ,[SalesQuantity]
      ,[SalesAmount]
      ,[ReturnQuantity]
      ,[ReturnAmount]
      ,[DiscountQuantity]
      ,[DiscountAmount]
      ,[TotalCost]
      ,[UnitCost]
      ,[UnitPrice]
  FROM [ContosoRetailDW].[dbo].[FactOnlineSales];

After reloading the script above I saw no errors. My final data model looked like this:

Final Table Design

Note all the beautiful, solid lines!

And the proof is in the pudding. Remember the chart I tried to filter with CalendarYear= 2007 when I had my loose tables? It’s working now!:

ChartWorking

This is a convenient place to stop working, even though we haven’t touched on hierarchies yet – I’ll do that in the “kitchen sink” entry instead.

I’ll close by revisiting our two maxims:

  • Field names matter. A lot. We’re talking “holy writ” time.
  • If QlikView modifies your data model for you by loosening tables and/or (sometimes) adding synthetic keys, you may want to modify your scripts – what you have is likely less-than-optimum.

You’ve seen how important the first commandment is by now – not too much more I can say about it.

The second commandment is a condensed version of the “group wisdom” I found reading the QlikView discussion forums. At the risk of over-simplifying, the feedback I got and saw was:

  • Loosened tables are evil. Rename your fields or de-normalize/reshape your tables (using JOINS) as necessary to avoid them.
  • Synthetic keys which represent real, useful composite keys (I didn’t have any) and relationships may not necessarily be bad – test and decide on your own. Here’s a good thread discussing same: http://community.qlikview.com/forums/t/31028.aspx

Next time: We’ll get into dashboard building itself. I honestly wasn’t overwhelmed with the data modeling and ETL-ish capabilities of QlikView. Building a dashboard was another matter, though!

20 Responses so far.

  1. Peter says:

    Hi!
    great first try with qlikview. next time you can try with the qualify function. It puts the table name in front of the field name in the qv-table. this means that you can load a table from a database that you know will have problems with i.e. Start Date and End Date fields. This asures that you dont get the problem of syntetic keys and the cpu problem that you encountered.

    qualify *;
    Unqualify CustomerID;
    Opportunities:
    LOAD Id,
    IsDeleted,
    AccountId,
    AccountId as CustomerID,
    CreatedDate,
    CreatedById;
    SQL SELECT Id,
    IsDeleted,
    AccountId,
    CreatedDate,
    CreatedById
    FROM Opportunity;
    unqualify *;

    There is a lot of ways to modify data in script that you have not covered in this post but this was a god try!

    Best of luck!
    /Peter

  2. Peter says:

    Hi again….

    the unqualify function tells qv that all fields but the named field or fields should have the table name in front of filed name (AccountId = Opportunity.AccountId) except CustomerID. This field can be the key field in the table. This gives you the ability to tinker with your tables.

    /Peter

  3. Russell says:

    Thanks, Peter! Unqualify would have been very helpful! Live and learn!

  4. Thanks for a really interesting post. We’re big users of Tableau at my company. However, QlikView always lurks around in the background as a competitive alternative. When I evaluated Tableau v Qlikview, the complexity of the data connection was what put me off Qlikview. With Tableau, you can pretty much just point and click to connect and you’re off. This enables “normal” business users (ie the people who think a script is something found in Hollywood) to use data v quickly.

    Have you checked out Tableau? I’d be really interested in your thoughts. Having used Tableau for so long, I’m now blinded to its problems – I work around them without thinking. Seeing some fresh perspective on the product would be really beneficial.

    Cheers
    Andy

  5. Russell says:

    Thanks, Andy! So, Tableau – I’ve seen the end results, and of course, they’re beautiful. I think Tableau visualizations have been what all the mega-and-niche vendors have been aspiring to, frankly. MSFT’s Project Crescent will get Microsoft into the neighborhood, however. How well does Tableau scale? That is arguably one of QlikView’s problems.

    I don’t see a lot of Tableau with the Media & Entertainment customers I deal with, so I don’t have an excuse to try it out (yet!). As I recall, their trial version expires pretty fast…but I may be thinking of something else.

  6. [...] my previous post, I touched on building out a data model in QlikView. In this post, we’ll cover using it by [...]

  7. John says:

    I have been surprised at how difficult it is to import and relate data into Qlikview – a product that I had much higher hopes for!

    The issues on name matching etc highlighted here are on top of a nicely formatted schema with VERY clean dimension and fact tables.

    The idea that one can easily get meaningful analytics from Qlikview by pointing it at a real world source database is less than accurate.

    Any sizable enterprise system is likely to have hundreds of tables that either force you to :-

    [1] Build a data warehouse
    [2] Implement funky queries to present the data dimensionally without an actual physical layer.

    If some of the challenges here are a bit of a headache on top of probably the most sanitised demo database in the world, can one imagine the nightmare of simply pointing qlikview at the source tables of a large OLTP ERP/CRM system and expecting an end user to make sense of it? No denormalisation, and likely no consistency in column names that relate.

    My point being that a lot of the work is actually in data transformation – qlikview has not changed that.

    While I’ve seen several tools (ie wherescape red) attempt to simplify the process of denormalisation/warehousing, the reality is that they are often more cumbersome, and perform worse, than ETL tools currently available from the big vendors.

    I consider QlikView to be just a presentation layer, albeit a very nice one, and I can see people having a lot of nightmares in the ‘Edit Script’ pane of QlikView. Effectively you are attempting to present a sanitized layer on top of a database using only queries. Fine for an already nicely formatted database like AdventureWorksDW – a bit of a nightmare for anything else.

    Anyone can produce a flashy demo/video on top of a silly little database like Adventure Works or Northwind (heck even MS do that) but seasoned developers out there need to see real world stuff to be impressed.

    Qlikview is certainly one of the best/prettiest user interfaces on the market today and it is a valuable product. But it cannot hope to meaningfully make sense of enterprise database schemas via its ‘Edit Script’ window.

    • Amitav says:

      Sorry to say, but only thing i can accept that is Alias Table not supported, really found interesting to see bunch of jokers are still testing this tool to create a comparative study, whiteout knowing how to use it…. but buddy try it and have at least 6 months experience then you able to realize, what is future.

  8. Taxaw says:

    WITH QLIKVIEW NO NEED TO CREATE OLAP CUBES !
    That’s the big difference with other BI Tools.

    • Russell says:

      Er, nice use of caps! :)

      “Cube-less BI” sounds like marketing weasel speak to me. Why is it an advantage? You’re still going to need a star schema to maximize performance in QlikView, and building that data model (if it doesn’t already exists in the DW) is what takes all the time.

    • Grif says:

      Yea till you try and work with more that two fact tables. whole thing turns into three ring circus very quickly.

  9. rb says:

    Hello Russell,

    I have been a Microsoft data tier consultant. Your work here with Qlikview has helped me immensely in my attempts to compare the different BI stacks in the market today.

    I am comfortable with Report Builder (good for server deployable multi page flat file reporting with some charts, and grids with adhoc query capabilities), kimball style datamarts, SSAS cubes, and Tableau.

    I am just starting to scratch at Qlikview. The data script setup seems to be approachable but cumbersome for large enough models. Every product has its strengths. This is not meant to be critical review of Qlikview.

    Tableau has done a great job of evolving the desktop, and server products with intuitive data setup stage, and fantastic visualizations. Connecting to practically all the sources of data in the market, with its no code interactive dashboards, it is worth a look for any BI developer. Tableau generally sends all the query requests as pass through to the appropriate data source behind the scenes. It’s data extract feature serves the need for in memory like performance benefits.

    Thanks.

    • Since you mention Tableau, which I think is a great tool on so many levels, I thought I’d share an article I wrote comparing it with Qlikview – which is also a great tool and perhaps even better than Tableau at reporting – although harder to learn so not so good for those not so technically strong. http://reports4u.co.uk/2011/06/12/tableau-vs-qlikview/

      The Tableau data engine (i.e. extracts) uses columnar data principles so as you can imagine is able to handle millions of rows of data very quickly – far quicker than Qlikview.

  10. Kerem Pekçabuk says:

    With qualify and unqualify functions it’s very easy to manage connections in qlikview.

    Managin Data and Data model is very easy and flexible in QlikView.I can do anything i imagine…

    Only i need is what i want to see…then it’s easy to execute on qlikview.

    I used several professional Data Services and Integrator tools.(very expensive ones).But Qlikview(it’s not only a data integrator or ETL tool) is the best i used.

    • Binga says:

      If you think Qlikview is a professional ETL tool you are showing your glaring lack of experience with any reasonable BI products.

      As an ETL tool, it sucks.

      • Russell says:

        LOL! Yeah, Binga is being a little harsh but I gotta agree – no one buys Qlik because it is a shining example of ETL goodness. I’d describe it’s functionality as “just about good enough to get data into QlikView”. It certainly is not a true ETL tool.

        • Kerem Pekçabuk says:

          i did’nt say it’s a professional ETL tool ! i only said that it’s the best and easy tool for showing data to users…(easy to execute)

          But you guyss… you only want to take data one side to another. i think you don’t care about what user see.

  11. Kasim Ikram says:

    I am a certified SQL 2008 BI developer and have been working with SSAS and MDX for a while now. I have done extensive research for my current employer some time ago in the search of a good analytical tool. I evaluated Strategy Analyzer to XLCubed. The 3 finalist were Tableau, Spotfire and Qlikview. Out of these three we decided to go with Tableau and since then we never looked back.

  12. yusuf says:

    Hi there
    I am writing an app in c# reading bo report and writing it in qvx format, here the problem i am having is is the sample code which i have has hardcoded field name (because he knew the table name) but in my case I am able to read the field name but not able to write it , please find attach code
    private QvxTableHeaderQvxFieldHeader[] enterFieldInfo()
    {
    QvxTableHeaderQvxFieldHeader[] Fields = new QvxTableHeaderQvxFieldHeader[]
    {

    new QvxTableHeaderQvxFieldHeader
    {
    FieldName = “Year”,
    Type = QvxFieldType.QVX_TEXT,
    Extent = QvxFieldExtent.QVX_COUNTED,
    NullRepresentation = QvxNullRepresentation.QVX_NULL_NEVER,
    CodePage = StdEncoding.CodePage.ToString(),
    ByteWidth = “4″
    },
    new QvxTableHeaderQvxFieldHeader
    {
    FieldName = “Quarter”,
    Type = QvxFieldType.QVX_TEXT,
    Extent = QvxFieldExtent.QVX_COUNTED,
    NullRepresentation = QvxNullRepresentation.QVX_NULL_NEVER,
    CodePage = StdEncoding.CodePage.ToString(),
    ByteWidth = “4″
    },
    new QvxTableHeaderQvxFieldHeader
    {
    FieldName = “Month”,
    Type = QvxFieldType.QVX_TEXT,
    Extent = QvxFieldExtent.QVX_COUNTED,
    NullRepresentation = QvxNullRepresentation.QVX_NULL_NEVER,
    CodePage = StdEncoding.CodePage.ToString(),
    ByteWidth = “4″
    }, }

    i need to write it dynamically read the name and write in the array like

    private QvxTableHeaderQvxFieldHeader[] enterFieldInfo()
    {
    QvxTableHeaderQvxFieldHeader[] Fields = new QvxTableHeaderQvxFieldHeader[]
    {

    for (intCountColumns = 0; intCountColumns <= recIRecordset.ColumnCount – 1; intCountColumns++)
    {
    new QvxTableHeaderQvxFieldHeader
    {
    FieldName = recIRecordset.ColumnName,
    Type = QvxFieldType.QVX_TEXT,
    Extent = QvxFieldExtent.QVX_COUNTED,
    NullRepresentation = QvxNullRepresentation.QVX_NULL_NEVER,
    CodePage = StdEncoding.CodePage.ToString(),
    ByteWidth = "4"
    },

    }
    }

    but it is not allowing me to do that

  13. markp3rry says:

    Great post! I’ve been working with QlikView for a few days and have found it quite frustrating – especially around handling dates and my data looping back on itself. This article has given me a lot of help – and creating a seperate table with a full list of dates in for the last x years is such a good idea. Thanks.


  • RSS
  • Delicious
  • Digg
  • Facebook
  • Twitter
  • Linkedin
  • Russell's About.Me Profile