Search This Blog

Friday, February 15, 2013

SQL Server Analysis Services Interview Questions on Dimensions, Hierarchies and Properties


Problem

Dimensions are a very important component of an OLAP Solution/Cube in SQL Server Analysis Services. Hence it is good to have a fair amount of knowledge/understanding on the Dimensions, its properties, and its sub components like Attributes, Hierarchies etc.

Solution

In the first tip of this series, I covered some of the Questions on Basic Concepts, Data Sources, and Data Source Views in SQL Server Analysis Services. And in the second tip, I covered some of the Questions on General Dimension Concepts and Types of Dimensions. In this third tip, I will be covering some of the questions on Dimensions, its properties, and its components like Attributes, Hierarchies etc. within SQL Server Analysis Services.

What are Database Dimension and Cube Dimension? What is the difference between them?

Database Dimension is a Dimension which exists independent of a Cube. You can see a Database Dimension as a separate entity/file in the Solution Explorer/Project Folder. An analogy, this is pretty similar to a Class in a Programming Language.
A Cube Dimension is an Instance of a Database Dimension. An analogy, this is pretty similar to an Object (Instance of a Class) in a Programming Language.
Here are some of the highlights/differences of Database and Cube Dimensions:
  • A Database Dimension is independent of a Cube, but a Cube Dimension is always a part of a Cube.
  • One Database Dimension can be added to a Cube more than once (Role-Playing Dimension) as different instances. In this scenario, each Instance is treated as a separate Cube Dimension.
  • A Database Dimension can be added to more than one Cube, whereas a Cube Dimension can belong to a Single Cube at any given point.

What is a Linked Dimension? In what scenarios do you use it?

Linked Dimension is a Dimension which is based on (Linked To) another Database Dimension which might be either located on the same Analysis Services server as the Linked Dimension or on a different Analysis Services server.
Linked Dimensions can be used when the exact same dimension can be used across multiple Cubes within an Organization like a Time Dimension, Geography Dimension etc.
Here are some of the highlights of a Linked Dimension:
  • More than one Linked Dimension can be created from a Single Database Dimension.
  • These can be used to implement the concept of Conformed Dimensions.
  • For an end user, a Linked Dimension appears like any other Dimension.

What are the different ways to create a Time Dimension in Analysis Services?

Time Dimension is one of the most important and most common type of dimensions as pretty much every metric is analyzed over time. Analysis Services offers following different ways to create a Time Dimension:
  • Create Using an Existing Table: This is one of the common approaches for building a Time Dimension. In this approach, a table is created in the underlying data source and pre-populated with data and it is then used to create the Time Dimension in SSAS.
  • Generate a Time Table in the Data Source: In this approach there will be no underlying table and at the time of creation of a Time Dimension in SSAS, Analysis Services creates a table in the underlying data source (relational database) and populates it with the required data using the date range, attributes, and calendars etc. which are specified at the time of creation of Time Dimension in Business Intelligence Development Studio. This option requires permissions to create a table in the underlying data source.
  • Generate a Time Table on the Server: In this approach there will be no underlying table and at the time of creation of a Time Dimension in SSAS, Analysis Services creates a table on the server and this table is used to generate the Time Dimension by the wizard. The Dimension created using this approach is called a Server Time Dimension.

What is Type property of a Dimension? What is the purpose of setting this property?

Type property of a Dimension is used to specify the type of information that the Analysis Services Dimension contains. Like a Time Dimension (Contains Years, Quarters, Months, Dates, and so on), Geography Dimension (Contains Geographical Information), and Accounts Dimension (Contains Accounts related information) etc. This property is set to Regular by default.
There are basically two important uses of setting this property:
  • This property is used by the Business Intelligence Wizard to assign/generate standard classification/MDXexpressions.
  • This property is used by the Analysis Services Client Applications like Excel etc. to adjust the user interface/rendering format appropriately. For example, when a dimension is added to a Pivot Table in Excel, certain types of dimensions are automatically added across the columns axis (like Time Dimension) and a few other types of dimensions are added to the rows axis (Like Product Dimension) automatically.

What is a Storage Mode? What are the different storage modes applicable to Dimensions?

A Storage Mode defines the location in which the Dimension data will be stored and the format (Relational or Multidimensional) in which the data will be stored.
Following are the two Storage Modes supported by Dimensions:
  • ROLAP (Relational On-Line Analytical Processing):
    • When the Storage Mode is set to ROLAP for a Dimension, then the Dimension Data will be stored in the relational database tables.
    • This storage Mode offers effective memory usage by avoiding duplication of data compared to MOLAP Storage Mode.
    • Using this Storage Mode will result in a slowdown in the query performance compared to MOLAP Storage Mode.
  • MOLAP (Multidimensional On-Line Analytical Processing):
    • When the Storage Mode is set to MOLAP for a Dimension, then the Dimension Data will be stored in a multidimensional format in the Analysis Services/OLAP Server.
    • This Storage Mode offers poor memory usage as it involves duplication of data (first copy is the data in the underlying dimensional tables and the second copy is the data in the Dimension in the OLAP Server).
    • Using this Storage Mode will result in best query performance compared to any other Storage Modes available in SSAS.

What is the difference between Attribute Hierarchy and User Defined Hierarchy?

An Attribute Hierarchy is a Hierarchy created by SQL Server Analysis Services for every Attribute in a Dimension by default. An Attribute by default contains only two levels - An "All" level and a "Detail" level which is nothing but the Dimension Members.
A User Defined Hierarchy is a Hierarchy defined explicitly by the user/developer and often contains multiple levels. For example, a Calendar Hierarchy contains Year, Quarter, Month, and Date as its levels.
Here are some of the highlights/differences of Attribute and User Defined Hierarchies:
  • Attribute Hierarchies are always Two-Level (Unless All Level is suppressed) whereas User Defined Hierarchies are often Multi-Level.
  • By default, Every Attribute in a Dimension has an Attribute Hierarchy whereas User Defined Hierarchies have to be explicitly defined by the user/developer.
  • Every Dimension has at least one Attribute Hierarchy by default whereas every Dimension does not necessarily contain a User Defined Hierarchy. In essence, a Dimension can contain zero, one, or more User Defined Hierarchies.
  • Attribute Hierarchies can be enabled or disabled. Disable the Attribute Hierarchy for those attributes which are commonly not used to slice and dice the data during analysis, like Address, Phone Number, and Unit Price etc. Doing this will improve the cube processing performance and also reduces the size of the cube as those attributes are not considered for performing aggregations.
  • Attribute Hierarchies can be made visible or hidden. When an Attribute Hierarchy is hidden, it will not be visible to the client application while browsing the Dimension/Cube. Attribute Hierarchies for those attributes which are part of the User Defined Hierarchies, like Day, Month, Quarter, and Year, which are part of the Calendar Hierarchy, can be hidden, since the attribute is available to the end users through the User Defined Hierarchy and helps eliminate the confusion/redundancy for end users.

What is an Attribute Relationship? What are the different types of Attribute Relationships?

An Attribute Relationship is a relationship between various attributes within a Dimension. By default, every Attribute in a Dimension is related to the Key Attribute. Quite often these default Attribute Relationships need to be modified to suit the User Defined Hierarchies and other end user requirements.
There are basically two types of Attribute Relationships:
  • Rigid: Attribute Relationship should be set to Rigid when the relationship between those attributes is not going to change over time. For example, relationship between a Month and a Date is Rigid since a particular Date always belongs to a particular Month like 1st Feb 2012 always belongs to Feb Month of 2012. Try to set the relationship to Rigid wherever possible.
  • Flexible: Attribute Relationship should be set to Flexible when the relationship between those attributes is going to change over time. For example, relationship between an Employee and a Manager is Flexible since a particular Employee might work under one manager during this year (time period) and under a different manager during next year (another time period).

What are KeyColumns and NameColumn properties of an Attribute? What is the different between them?

KeyColumns is a property of an SSAS Dimension Attribute and it forms the Key (Unique) for the attribute. It can be bound to one or more columns in the underlying database table. When User Defined Hierarchies are created in the dimension (Attribute Relationships defined), setting this property becomes very critical and often requires setting this to a combination of more than one column from the Data Source View. For Example, say you have a Date Dimension and a hierarchy called Calendar Hierarchy (Year -> Quarter -> Month). Now what happens is that, Month gets repeated across different quarters and quarters get repeated across different years making the attribute as non-unique (like January can belong to Q1 of any year and similar Q1 can belong to any year). So to make the attribute unique, KeyColumns for Month should be set to something like Year and Month and similarly for Quarter should be set to Year and Quarter.
A NameColumn is a property of an SSAS Dimension Attribute and it is used to identify the column from the underlying Data Source View which provides the name of the attribute which is displayed to the end user by making it more user friendly instead of displaying the Key Column value. For Example, you might have ProductCategoryKey as 1, 2, 3, & 4, and ProductCategoryName as Bikes, Components, Clothing, & Accessories respectively. Now, NameColumn will be set to ProductCategoryName so that user sees them as Bikes, Components etc. even though the data in the background is processed/retrieved using the Key Column values as 1, 2 etc.
Here are some of the highlights/differences of KeyColumns and NameColumn properties:
  • KeyColumns property is defaulted to the Attribute itself, and the NameColumn property is defaulted to Key Column (when the KeyColumns is set to only one column).
  • Column(s) provided in the KeyColumns should be able to uniquely identify all the values of the respective attribute, whereas NameColumn need not be unique.
  • KeyColumns can contain one or more columns whereas NameColumn can contain only one column.

What is an Unknown Member? What is its significance?

An Unknown Member is a built-in member provided by SQL Server Analysis Services. It represents a Missing or Null value. Basically when a Dimension is Processed, Analysis Services populates each of the attributes with distinct values from the underlying data source and in this process, if it encounters Null value then it converts them appropriately (to 0 in case of numeric columns and to empty string in case of string columns) and marks them as Unknown Member for easy interpretation by the end user.
One of the important uses of an Unknown Member is to handle Early Arriving Facts. This is a scenario, in which the transactional record (Fact) would have come into the system but the corresponding Dimension/Contextual data is yet to come in which could be due to various reasons like Improperly Designed Data Load Process, Failure in the ETL Process, and a Technical Glitch in the transactional system causing delay in pushing the Dimension Data.
Unknown Member can be enabled or disabled (set to None) or set to Visible or Hidden based on the end user requirements.

What are Dimension Translations? In what scenarios do you use them?

Translation in SSAS is a mechanism to support Localization, in which the labels, names, and captions associated with any SSAS Object (pretty much every SSAS Object supports Translations) are translated from one language to another language based on the country/language of the user accessing the data. A Dimension Translation is same as Translation in SSAS, but in this case the labels, names, and captions associated with Dimensions, Attributes, Hierarchies, and/or Dimension Members are translated from one language to another language.
Translations are very useful in achieving higher level of adoption of the BI/Analytics system (SSAS). This will eliminate the language barriers among users from different locations/languages and presents the same information in different languages making single version of truth available to users across different geographical locations.
Here are some of the highlights of Dimension Translations:
  • There can be multiple Translations associated with a single Dimension, Attribute, Hierarchy, and Dimension Member etc.
  • The collation and language settings/information from the client application/computer is used to determine and provide the analysis services metadata/data to the client application.

Next Steps



Last Update: 5/11/2012 

About the author

Datta has over 5 years of experience in the IT industry working with SQL Server, SSIS, SSAS, SSRS.

View all my tips

2 comments: