Choosing your degree of numeric precision


Many factors affect numeric precision, which is a common concern for anyone who works with data, including SAS® users. Computers and software can help, but they still limit how precisely you can calculate, compare, and represent data. Therefore, only the people who generate and use the data can determine the degree of precision that meets their enterprise needs.

Factors That Can Cause Calculation Differences

As you decide what degree of precision you want, you need to consider the following system factors, which can cause calculation differences:

The following factors can also cause differences:

You also need to consider how conversions are performed on, between, or across any of these system or calculation items.

Simple Examples of Specific Problems That Result in Numeric Imprecision

Depending on the degree of precision that you want, calculating the value of r can result in a tiny residual in a floating-point unit. When you compare the value of r to 0.0, you might find that r≠0.0—the numbers are very close but not equal. This type of discrepancy in results can stem from problems in representing, rounding, displaying, and selectively extracting data.
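
As a minimal sketch of such a residual, the following DATA step computes a hypothetical r and compares it to 0.0. The expression for r is an assumption chosen for illustration, not necessarily the calculation that the text has in mind. On platforms that use IEEE 754 doubles, the residual is typically tiny but nonzero.

```sas
data _null_;
   /* Hypothetical r: algebraically zero, but 0.1, 0.2, and 0.3 */
   /* have no exact binary representation.                      */
   r=(0.1+0.2)-0.3;
   put r= e12. r= hex16.;
   if r=0.0 then put 'r is exactly equal to 0.0';
   else put 'r is close to 0.0 but not equal';
run;
```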

 

Representing Data

Some numbers can be represented exactly, but others cannot. As shown in the following example, the number 10.25, which terminates in binary, can be represented exactly.

   data x; 
      x=10.25; 
      put x hex16.;
   run;

The output from this DATA step is an exact number: 4024800000000000.

However, the number 10.1 cannot be represented exactly, as shown in this example:

   data x;
      x=10.1; 
      put x hex16.;
   run;
 

The output from this DATA step is an inexact number: 4024333333333333.
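
The two bit patterns above can also be read back with the HEX16. informat, which is one way to check what is actually stored. This is a sketch that assumes an IEEE 754 platform such as Windows or Linux. The variable names are illustrative, and the 20.17 format is used only to display more digits than the default format does.

```sas
data _null_;
   /* Rebuild each value from its stored bit pattern. */
   exact  =input('4024800000000000', hex16.);
   inexact=input('4024333333333333', hex16.);
   /* Compare the rebuilt values with the decimal literals. */
   if exact=10.25  then put 'bit pattern matches the literal 10.25';
   if inexact=10.1 then put 'bit pattern matches the literal 10.1';
   /* Display more decimal places than the default format shows. */
   put exact= 20.17 inexact= 20.17;
run;
```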

Rounding Data

Rounding errors, as illustrated in the following example, can result from platform-specific differences for which there is no solution.

   data x;
      x=10.1; 
      put x hex16.;
      y=100000;
      newx=(x+y)-y; 
      put newx hex16.;
   run;

In the Windows and Linux environments, the value of NEWX that this DATA step writes is 4024333333333333 (8-byte doubles, with intermediate results held in 10-byte hardware registers). In the Solaris x64 environment, the value of NEWX is 4024333333334000 (8-byte doubles, with 8-byte intermediate results).

Displaying Data

For certain numbers (such as x.5), the displayed value depends on whether the format rounds up or down. Low-precision formatting (rounding down) can produce different results on different platforms. In the following example, the 8.3, 8.5, and HEX16. formats display the same result for X on every platform. However, the 8.1 format displays a different result because it does not provide the same level of precision.

   data;
      x=input('C047DFFFFFFFFFFF', hex16.);
      put x= 8.1 x= 8.3 x= 8.5 x= hex16.;
   run;


The output under Windows or Linux (high-precision formatting) is as follows:

   x=-47.8 
   x=-47.750 x=-47.7500
   x=C047DFFFFFFFFFFF


The output under Solaris x64 (low-precision formatting) is as follows:

   x=-47.7 
   x=-47.750 x=-47.7500
   x=C047DFFFFFFFFFFF

To fix the problem illustrated by this example, select a format that yields the next precision level; in this case, the 8.2 format.
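
The following sketch writes the same value with both the 8.1 and 8.2 formats so that the two results can be compared on a given platform. It assumes an IEEE 754 platform where the HEX16. informat reads the bit pattern as in the example above.

```sas
data _null_;
   x=input('C047DFFFFFFFFFFF', hex16.);
   /* 8.1 is the format whose result differs by platform; 8.2 is */
   /* the next precision level that the text recommends.         */
   put x= 8.1;
   put x= 8.2;
run;
```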

Extracting Data Selectively

Results can also vary when you access data that is stored on one system by using a client on a different system. The following example illustrates running a DATA step from a Windows client to access SAS data in the MVS environment.

   data z(keep=x);
      x=5.2;
      output;
      y=1000;
      x=(x+y)-y;   /*almost 5.2 */
      output;
   run; 
   
   proc print data=z;
   run;


This DATA step produces the following output:

   Obs     x
   1      5.2
   2      5.2


The next example illustrates the output that you get when you execute the DATA step interactively under Windows or under MVS:

   data z1;
      set z(where=(x=5.2));
   run;


The output under Windows is as follows:

   NOTE: There were 2 observations read from the data set WORK.Z.


The output under MVS is as follows:

   NOTE: There were 1 observations read from the data set WORK.Z. 
   WHERE x=5.2; 
   NOTE: The data set WORK.Z1 has 1 observations and 1 variables. 
   The DATA statement used 0.00 CPU seconds and 14476K. 

In the previous example, MVS did not return the expected count because the equality test does not account for imperfect data and finite precision. A test for exact equality cannot produce the correct count because it excludes the "almost 5.2" cases. To obtain the correct results under MVS, you need to run the following DATA step:

   data z1;
      set z(where=(compfuzz(x,5.2,1e-10)=0));
   run;


Under MVS, the output from this DATA step is as follows:

   NOTE: There were 2 observations read from the data set WORK.Z.
   WHERE COMPFUZZ(x, 5.2, 1E-10)=0;
   NOTE: The data set WORK.Z1 has 2 observations and 1 variables.
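
Where COMPFUZZ is not available, a similar filter can be sketched with an explicit absolute tolerance. This is an assumption-based alternative, not the method shown above: the 1e-10 tolerance is simply borrowed from the COMPFUZZ call, and an absolute test does not necessarily behave identically to COMPFUZZ for very large or very small values.

```sas
data z1;
   /* Keep rows whose x lies within 1e-10 of 5.2. */
   set z(where=(abs(x-5.2)<1e-10));
run;
```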

Your Options When Choosing the Degree That You Need

Once you determine the degree of precision that your enterprise needs, you can refine your software. You can use macros, sensitivity analyses, or fuzzy comparisons (for example, in extractions or filters) when you extract data from databases or from different versions of SAS.

If you are running SAS® 9.2 or later, use the COMPFUZZ (fuzzy comparison) function. Otherwise, use the following macro:

   /**************************************************************************/
   /* This macro defines an EQFUZZ operator. The subsequent DATA step shows  */
   /* how to use this operator to test for equality within a certain         */
   /* tolerance.                                                             */
   /**************************************************************************/

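   %macro eqfuzz(var1, var2, fuzz=1e-12);
   /* Relative test scaled by the larger magnitude, so that a zero */
   /* argument does not cause division by zero.                    */
   (abs((&var1) - (&var2)) <= &fuzz * max(abs(&var1), abs(&var2)))
   %mend eqfuzz;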

   data _null_;
      x=0; 
      y=1;
      do i=1 to 10;
         x+0.1;
      end;
      if x=y then put 'x exactly equal to y';
      else if %eqfuzz(x,y) then put 'x close to y';
      else put 'x nowhere close to y';
   run;
 

When you read numbers in from an external DBMS that supports precision beyond 15 digits, you can lose that precision. You cannot do anything about this for existing databases. However, when you design new databases, you can set constraints to limit precision to about 15 digits or you can select a numeric DBMS data type to match the numeric SAS data type. For example, select type BINARY_DOUBLE in Oracle (precise up to 15 digits) instead of type NUMBER (precise up to 38 digits).
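
The loss of digits can be checked with a round trip from text to a SAS numeric and back. The following is a minimal sketch: the 18-digit value and the variable names are made up for illustration, and the exact digits that come back can depend on the platform.

```sas
data _null_;
   length c back $20;
   c='123456789012345678';      /* 18 significant digits (made-up value) */
   n=input(c, 20.);             /* convert to an 8-byte SAS numeric      */
   back=left(put(n, 20.));      /* convert back to text                  */
   if back=c then put 'all 18 digits survived';
   else put 'digits were lost: ' c= back=;
run;
```

If the two strings differ, the value needs more precision than an 8-byte SAS numeric can hold, and it is a candidate for being read as a character string.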

When you read numbers in from an external DBMS for noncomputational purposes, use the DBSASTYPE= data set option, as shown in this example:


   libname ora oracle user=scott password=tiger path=path;

   data sasdata;
      set ora.catalina2( dbsastype= ( c1='char(20)') ) ;
   run;

This option retrieves numbers as character strings and preserves precision beyond 15 digits. For details about the DBSASTYPE= option, see "Data Set Options for Relational Databases" in SAS/ACCESS 9.1.3 for Relational Databases: Reference.

Resources

Refer to the following resources for more detail about numeric precision, including variables that can affect precision.