Using PROC SQL to compute distinct counts for columns used as levels in an OLAP cube


When working with a SAS OLAP cube, both in analyzing cube design or for debugging a memory-related issue, you often need to determine what the distinct counts are for categorical columns in the input table that will be used as levels within cube dimensions.

To calculate distinct counts for levels within your cube, you can use PROC SQL with COUNT(DISTINCT varname) to determine the cardinalities, or number of unique values, for columns that will be used as dimension levels in your cube.

The following PROC SQL code computes the distinct counts for each categorical column of the sample table SASHELP.PRDSALE.

   PROC SQL;   
      SELECT count(distinct country) AS CNT_Country, 
             count(distinct region) as CNT_Region,
             count(distinct division) as CNT_Division,
             count(distinct prodtype) as CNT_Prodtype,
             count(distinct product) as CNT_Product,
             count(distinct year) as CNT_Year,
             count(distinct quarter) as CNT_Quarter,
             count(distinct month) as CNT_Month
      FROM SASHELP.PRDSALE;
   QUIT; 

The following distinct count values are returned for each of the requested classification columns.

   CNT_Country=3 
   CNT_Region=2 
   CNT_Division=2 
   CNT_Prodtype=2 
   CNT_Product=5 
   CNT_Year=2
   CNT_Quarter=4  
   CNT_Month=24 

This approach enables you to utilize the efficiency of SQL to compute these values.