In today’s digital world, data is the lifeblood of business. Whether you’re a small eCommerce retailer or a multinational corporation, data analytics and visualisation give you a competitive advantage by driving smarter decision-making. But for any data to work within an analytics or visualisation platform, you need to get the foundations right. That means effective data modelling.
In this article, we’ll look at some of the best practices for data modelling in Qlik — a popular analytics platform that provides powerful real-time business intelligence and data visualisation. Qlik’s two main solutions, both of which can be used for data modelling, are:
- QlikView: A data analytics, visualisation, and reporting tool that helps businesses make sense of their data using charts and dashboards.
- Qlik Sense: Launched in 2014, Qlik Sense is a modern, self-service data exploration tool that allows users to build custom dashboards via drag-and-drop functionality.
Why is good data modelling important?
Businesses today collect a vast amount of data from multiple sources. But the usefulness of raw data is limited; it becomes useful when it’s transferred into an understandable and actionable format.
Data modelling is the visualisation and blueprint for how the data will be used. Without effective data models, platforms like QlikView and Qlik Sense can’t perform at their best, resulting in sluggish performance. To get the most out of your data, you need to design and implement a data model that:
- Reduces your system’s memory storage by freeing up access data
- Creates high-quality visualisations in real-time
- Run platforms, like Qlik, efficiently.
Qlik data model best practices
Data modelling can be a complex process. In this section, we’ll break down some of the data model best practices for QlikView and Qlik Sense, helping you get the most out of your data. Let’s dive in.
#1 Working with crosstables
A crosstable is a table consisting of columns and rows in a grid-like format. The top row contains one field, and the left-hand column contains another, with data populating the grid accordingly. See the example below.
While this may look appealing, it’s not the ideal format for data modelling in Qlik. If you load data this way, it would display a field for the year plus additional fields for every month, whereas you most likely need just three fields: the year, the month, and the respective values.
You can fix this problem by adding the crosstable prefix to the SELECT or LOAD script. Here’s an example:
Crosstable (Month, Sales) LOAD * from ex1.xlsx
What you get is this:
This process enables efficient data structuring and is the same whether you are using QlikView or Qlik Sense.
#2 Star schema vs Snowflake schema
Using a star schema in both QlikView and Qlik sense is the most efficient schema technique. Using a central fact table containing the relevant fields and keys, surrounded by dimensional tables that contain the attributes of the fields located in the central table, is the easiest to understand schema for data modelling.
Snowflake schemas, though useful for more complex fields and data, are less efficient due to the additional, intermediary tables through which information needs to travel.
Pro Tip: Circular references or loops — tables with more than one path of association between two fields — should be eliminated to improve efficiency. Qlik Sense uses loosely coupled tables to break circular references.
#3 Join and keep
You can combine two data tables in Qlik using the join and keep prefixes in your script. Join is used to fully combine two tables, creating all possible combinations of values from the tables. As a result, joined tables can be huge and slow to process in Qlik.
This is where the keep functionality comes in. Instead of joining tables to create one large table, keep allows you to link the two tables together, reducing repeated or identical data from the two, while continuing to store them as separate tables. This reduces the table size, ensuring faster processing times while freeing up memory.
The process here is the same for both QlikView and Qlik Sense.
#4 Incremental load
Incremental load allows you to load only new or updated data, as opposed to loading the entire data set each time. The best and fastest way to go about an incremental load is by using QVD files.
Here’s how the basic process works in both QlikView and Qlik Sense:
- New or updated data is loaded from the data source table. While this can be a slow process, only a limited number of records are actually loaded.
- Existing/old data is loaded from the QVD file. This involves loading a lot of records but at a much faster speed.
- You then create a new QVD file, containing both the old and new data, which you’ll use the next time you want to do an incremental load.
- Repeat this for each table you want to load.
Pro Tip: Using an ‘As-Of calendar’ prevents users from loading data multiple times to get previous-period calculations. An As-Of calendar prevents multiplication of data volumes.
#5 Generic databases
To display attributes of different objects, you can store data in generic databases. These are essentially tables where field names are stored as values in one column, with field values stored in a second column. See the example below:
As you can see, this table contains two objects: a ball and a box. While they share some common attributes, e.g. colour and weight, other attributes are specific to one or the other, e.g. diameter or length/width.
If you load this table as a generic database in Qlik Sense or QlikView, the attributes in the second column become tables of their own, allowing the data to be stored in a more compact way. See the examples below.
Pro Tip: Giving tables easy and intuitive names helps users easily filter data and fields using table names.
#6 Matching intervals to discrete data
By adding the intervalmatch prefix to a LOAD or SELECT statement in Qlik Sense or QlikView, you can link discrete numeric values from one table to different numeric intervals in another table.
This allows you to show, for example, how certain events actually took place compared to how they were expected to take place. It is particularly powerful in manufacturing, where production lines are scheduled to run at certain times, but due to breakdowns, delays, or other errors, they may run at different times.
There are a few important points to consider when using interval matching:
- The discrete data points must already have been read in Qlik before using intervalmatch.
- The table you want to be matched must always contain two fields, typically start and end.
- Intervals are always closed, with endpoints included in the interval.
#7 Using and loading hierarchy data
Hierarchy data can be displayed in Qlik Sense and QlikView in several ways, including adjacent nodes tables, expanded nodes tables, and ancestors tables. Let’s take a look at what each one offers.
Adjacent nodes tables: each node in the hierarchy is stored once and is linked to the node’s parent (see the examples below). Adjacent nodes tables are the simplest way to present hierarchy data. While good for maintaining unbalanced hierarchies, they aren’t suitable for detailed analysis.
Expanded nodes tables: In this type of table, each level of the hierarchy is presented in its own separate field, making it easier to use in a tree structure (see example below).
Expanded nodes tables are more suitable for querying and analysis than adjacent nodes tables, but aren’t best suited for searches or selections as you need prior knowledge of each level you want to search for or select.
|3||2||Senior Manager||CEO||Director||Senior Manager||–|
Ancestors table: This table solves the search/selection issues that come with expanded nodes tables, presenting hierarchy data in even greater detail. Ancestors tables show a unique record for each child-ancestor relation in the data, including keys and names for each child as well as for each ancestor.
#8 Data cleansing
Sometimes, field values that represent the same thing may be written differently. For example, you could find the following common field values in different tables: UK, U.K., United Kingdom.
All three field values clearly mean the same thing, but the lack of consistency in their formatting means they could be interpreted as different values, leading to messy, inaccurate, or redundant data. This is why data cleansing is so important.
You can cleanse such data in Qlik Sense and QlikView using a mapping table, which maps the column values between different tables. This ensures that values that are written in different ways will consistently be recognised as the same value, not different ones.
#9 Mapping instead of joining
As we discussed in point #2, the join prefix is a powerful way to combine multiple tables in Qlik Sense and QlikView, but it often results in very large tables that can be a drag on performance. You can get around this problem by using mapping instead.
Let’s look at an example. The first table below presents a business’s order book. Imagine you needed to know which countries your customers are from, which is stored in the second table below.
To look up the country of a customer, you’d need to create a mapping table, like the one below:
By applying the mapping table to the order table, you create a clear table, like this:
#10 Creating date intervals from single dates
In some cases, time intervals are not stored with a beginning and an end time, but rather a single field representing when something changed. Take this table below, for example, which shows different rates for two different currencies:
In this instance, the change date field is equivalent to the beginning date of an interval, and the end date is defined by the beginning of the next interval. The two empty rows in the change date column show the initial currency conversion rate, prior to the first change being made.
Additionally, there’s no end date column. To create a new table that has an end date column, you’ll need to follow the steps outlined in this article for Qlik Sense and this article for QlikView. Once that’s done, you will produce a table like this:
Pro Tip: When using multiple dates, using a master calendar with canonical dates helps reduce multiple calendars, each of which contain date fields.
Making best practice normal practice
Data modelling is a complicated process. But to make the most of your data and powerful platforms like Qlik, effective data modelling is critical. Without a solid understanding of Qlik data model best practices, however, you could put unnecessary strain on the platform — and never truly unlock the insights in your data.
This can affect the speed and efficiency of your data processing, which in turn can impact the speed of your decision-making, the value of your data, and the ROI of your investment in the tool itself.
By working with a trusted data partner like Jarmany, you can sidestep these issues altogether, ensuring that you get the most out of Qlik and, as a result, your data. Whether it’s supplementing your in-house team or providing a fully outsourced service, our experts are here to help you implement data modelling best practices with minimum hassle and maximum benefit.
If you’d like to find out more about how Jarmany could help you unlock the power of Qlik, get in touch today and talk to one of our experts.