Tips for addressing unresponsive SAS® 9.4 Stored Process Servers, Part 1


Part 1:  Restoring Unresponsive SAS® Stored Process Servers with SAS® 9.4

SAS Technical Support has received reports about previously working SAS Stored Process Servers becoming unresponsive over time for unknown reasons. Unresponsive means that the SAS Stored Process Servers are running, but no requests from client applications are getting through to the server. These servers might also be referred to as hung or orphaned SAS processes.

You might have encountered this problem in one or more ways. You might be the end user working with one of the SAS® Business Intelligence client applications, such as SAS® Enterprise Guide®, SAS® Web Report Studio, SAS® Add-in for Microsoft Office, SAS® Information Delivery Portal, SAS® Stored Processes Web Application, or another web-based application. In this scenario, you click a button, expecting a report to be returned, but instead you receive a generic error or a Java dump. Or, you might be the systems administrator who gets a call from the end user. Then, you determine that there is no SAS Stored Process Server responding, or at least discover that the problem involves a SAS server rather than a client.

What do you do?

1. Initially, with all customers, troubleshooting starts as a "Put out the Fire" situation, in which Technical Support offers suggestions to confirm that the servers are down or unresponsive, and to clean up and recover from the problem immediately. This document provides tips to evaluate and restore your SAS Stored Process Servers.

2. After the "fire" is out, a long-term strategy is needed to gather information and determine why the problem occurs. See SAS Note 60306, "Tips for addressing unresponsive SAS® 9.4 Stored Process Servers, Part 2" for a suggested approach.

Conducting Short-Term Troubleshooting to "Put Out the Fire"

Use the following five-step approach to evaluate the status of and restore your SAS Stored Process Servers:

1. Test the Connection to the SAS Stored Process Server from SAS® Management Console

2. Check the Status of the Stored Process Server Ports by Running the NETSTAT System Command

3. Evaluate SAS Processes after Stopping the SAS Object Spawner

4. Review Applicable Log Files

5. Terminate Orphaned SAS Stored Process Servers and Restart the SAS Object Spawner

Step 1: Test the Connection to the SAS Stored Process Server from the SAS® Management Console

A basic test of stored process server functionality is available in the SAS® Management Console. To conduct this test, complete the following steps:

  1. Open SAS Management Console. Click the Plug-ins tab. The hierarchy appears in the left pane.
  2. Click + to expand the Server Manager hierarchy.
  3. Click + to expand the SASApp server group hierarchy.
  4. Highlight the SASApp - Logical Stored Process Server object.
  5. Right-click and select Validate.

If the test succeeds, a pop-up window displays the message Test Connection Successful!

You usually receive the following error in SAS Management Console when your stored process servers are unresponsive:

The application could not log on to the server
"host.name.com:8601".
The user ID "smcloginID@!*(generatedpassworddomain)*!"
or the password is incorrect.
 
Clicking the Details button in the error dialog box might display messages similar to the following:
 
[5/5/11 11:28 AM] INFO: Starting extended validation for Stored
Process server (level 1) - Making a connection
[5/5/11 11:28 AM] SEVERE: Access Denied.
[5/5/11 11:28 AM] SEVERE: Access denied.
[5/5/11 11:28 AM] SEVERE: The application could not log on to
the server "host.name.com:8601".
The user ID "smcloginid@!*(generatedpassworddomain)*!"
or the password is incorrect.
 
Step 2: Check the Status of the Stored Process Server Ports

The command-line tool NETSTAT (network statistics) displays incoming and outgoing network connections, routing tables, and various network interface statistics. This tool is available on UNIX and Windows operating systems. Use this tool to provide information about the status of the ports on which a particular stored process server runs, which are ports 8611, 8621, and 8631 by default.

Complete the following steps:

  1. Execute the following command in a command prompt:
    • Windows
      prompt> netstat -ano | find "8611"
      Note: Substitute 8621 or 8631 in order to check the other ports.

    • UNIX
      prompt> netstat -an | grep "8611"
      Note: Substitute 8621 or 8631 in order to check the other ports.

  2. Evaluate the port status.
    • LISTENING (shown as LISTEN on UNIX) and ESTABLISHED are normal states.
    • TIME_WAIT and CLOSE_WAIT are normal states if the connection is shutting down.

      Note: A state of CLOSE_WAIT that persists for longer than two to five minutes might indicate that the server is unresponsive.

See RFC 793 (pages 20 and 21) for details about the progression of states for a TCP/IP connection.
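One way to check whether a CLOSE_WAIT state persists is to capture the NETSTAT output twice, a few minutes apart, and compare the two captures. The following is a minimal sketch for UNIX, not a SAS-supplied tool. It assumes the column layout of the UNIX sample output in this note (local address in the fourth column, foreign address in the fifth, state in the last column). The function names and the snapshot file names are hypothetical.

```shell
# Print the "local foreign" address pairs that are in CLOSE_WAIT
# in a saved `netstat -an` listing (file given as $1).
close_wait_pairs() {
    awk '$NF == "CLOSE_WAIT" { print $4, $5 }' "$1" | sort -u
}

# Print the connections that are in CLOSE_WAIT in BOTH snapshot files.
# Example use (file names are placeholders):
#   netstat -an > /tmp/netstat_first.txt
#   sleep 300
#   netstat -an > /tmp/netstat_second.txt
#   persistent_close_wait /tmp/netstat_first.txt /tmp/netstat_second.txt
persistent_close_wait() {
    first_sorted=$(mktemp) && second_sorted=$(mktemp) || return 1
    close_wait_pairs "$1" > "$first_sorted"
    close_wait_pairs "$2" > "$second_sorted"
    # comm -12 keeps only the lines common to both sorted inputs.
    comm -12 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}
```

Any connection that this sketch prints was in CLOSE_WAIT in both captures, which makes it a candidate for the persistent condition described in the note above.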

When port 8611 is unresponsive, you see NETSTAT output similar to the following:

tcp4    1312      0  myserver.na.sas.8611  otherserver.na.sas.56480 CLOSE_WAIT
tcp4    1254      0  myserver.na.sas.8611  otherserver.na.sas.56487 CLOSE_WAIT
tcp4    1300      0  myserver.na.sas.8611  otherserver.na.sas.56498 CLOSE_WAIT
tcp4       0      0  *.8611                 *.*                    LISTEN
tcp4    1280      0  myserver.na.sas.8611  otherserver.na.sas.56805 CLOSE_WAIT
tcp4    1011      0  myserver.na.sas.8611  otherserver.na.sas.56816 CLOSE_WAIT
tcp4    1267      0  myserver.na.sas.8611  otherserver.na.sas.56822 CLOSE_WAIT
tcp4    1234      0  myserver.na.sas.8611  otherserver.na.sas.56825 CLOSE_WAIT
tcp4    1260      0  myserver.na.sas.8611  otherserver.na.sas.56828 CLOSE_WAIT
tcp4       0      0  *.8621                 *.*                    LISTEN
tcp4       0      0  *.8631                 *.*                    LISTEN
tcp4    1299      0  myserver.na.sas.8611  otherserver.na.sas.56850 CLOSE_WAIT
tcp4    1280      0  myserver.na.sas.8611  otherserver.na.sas.56854 CLOSE_WAIT
tcp4    1311      0  myserver.na.sas.8611  otherserver.na.sas.56865 CLOSE_WAIT
tcp4    1269      0  myserver.na.sas.8611  otherserver.na.sas.32775 ESTABLISHED
tcp4    1273      0  myserver.na.sas.8611  otherserver.na.sas.32781 ESTABLISHED
tcp4    1281      0  myserver.na.sas.8611  otherserver.na.sas.57282 CLOSE_WAIT
tcp4    1272      0  myserver.na.sas.8611  otherserver.na.sas.32786 ESTABLISHED
tcp4    1322      0  myserver.na.sas.8611  otherserver.na.sas.57288 CLOSE_WAIT
tcp4    1284      0  myserver.na.sas.8611  otherserver.na.sas.32795 ESTABLISHED
tcp4    1302      0  myserver.na.sas.8611  otherserver.na.sas.57298 CLOSE_WAIT
tcp4    1296      0  myserver.na.sas.8611  otherserver.na.sas.32804 ESTABLISHED               
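For long listings such as the one above, it can help to count the connections in each state for a single port. The following sketch is one way to do that on UNIX. It assumes that the local address is the fourth column and the state is the last column, as in the sample output; other NETSTAT layouts (for example, Windows or Solaris) would need different column numbers. The function name port_states is hypothetical.

```shell
# Summarize TCP states for one local port from `netstat -an` output on stdin.
# Usage: netstat -an | port_states 8611
port_states() {
    awk -v port="$1" '
        # Match local addresses that end in ".PORT" (as in the sample
        # output in this note) or ":PORT" (as on Linux).
        $4 ~ ("[.:]" port "$") { count[$NF]++ }
        END { for (state in count) print state, count[state] }
    ' | sort
}
```

Applied to the sample output above with port 8611, this sketch would report 14 connections in CLOSE_WAIT, 5 in ESTABLISHED, and 1 in LISTEN.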
 

Step 3: Evaluate SAS Processes after Stopping the SAS Object Spawner

By default, the SAS Stored Process Server is configured to execute using the sassrv user account.

If executing the NETSTAT command reveals unresponsive servers, you should stop the object spawner and search for remaining SAS processes that are owned by the sassrv user account. Any SAS processes owned by the sassrv account that persist after the SAS Object Spawner shuts down are likely to be unresponsive SAS Stored Process Servers (although other possible explanations exist).

Complete the following steps:

  1. Stop the object spawner.
    • Windows
      Stop the object spawner service through the Windows Services Manager.
    • UNIX
      Locate the ObjectSpawner.sh file in your SAS Configuration directory. For example:
      SAS-configuration-directory/Lev1/ObjectSpawner/ObjectSpawner.sh

      From a system prompt, change directories to where the ObjectSpawner.sh script is located and submit the following:
      prompt> ./ObjectSpawner.sh stop

  2. Search for SAS processes that persist after you stop the object spawner and that are owned by the sassrv user account (or the equivalent account at your site):
    • Windows
      Use the Windows Task Manager.

      From the Processes tab, sort the process list by the User Name column and look for processes with an associated user name of sassrv.

    • UNIX
      Use the ps command.
      prompt> ps -ef | grep "sassrv"

    • AIX
      prompt> ps -ef | grep "sassrv" | grep "8611"
      prompt> ps -ef | grep "sassrv" | grep "8621"
      prompt> ps -ef | grep "sassrv" | grep "8631"

    • HP-UX or Solaris
      prompt> ps -ef | grep "sassrv" | grep "sasexe/sas"
      prompt> ps -ef | grep "sassrv" 
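Because grep "sassrv" matches the string anywhere on the line, the commands above can also list the grep command itself or processes that merely mention sassrv in their arguments. The following sketch filters on the owner column instead. It assumes that the owner is the first column of the ps -ef output, and the function name procs_owned_by is hypothetical.

```shell
# Print the lines of `ps -ef` output (stdin) whose owner column
# is exactly the given account name.
# Usage: ps -ef | procs_owned_by sassrv
procs_owned_by() {
    awk -v owner="$1" '$1 == owner'
}
```

The second column of each line that is printed is the process ID, which you need in Step 5.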

Step 4: Review Applicable Log Files

When stored process servers become unresponsive, error and warning messages might appear in log files for the object spawner, stored process servers, metadata servers, and (for Windows systems only) the Windows Event Viewer.

You should review all the log file types noted below.

SAS Object Spawner Logs

Go to the appropriate folder and view the object spawner log file.

Stored Process Server Logs

Go to the appropriate folder and view the stored process server log file.

It is possible that the log from an unresponsive stored process server will contain no errors. In this case, you should note the last step or program that executed successfully and the last entry that was written in the log.

Often the last entry in an unresponsive stored process server log file notes that a request has started executing, similar to the example below:

INFO  [00000417] 4:sasdemo - STP: 1: Executing c:\mycode stp_report.sas 
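To find the last entry quickly on UNIX, you can display the final line of the most recently modified log file. This is a minimal sketch that assumes the log files use a .log extension and that you pass the log folder for your site as the argument; the function name last_log_entry is hypothetical.

```shell
# Print the name of the newest *.log file in a directory ($1),
# followed by the last line written to that file.
last_log_entry() {
    # ls -t lists the most recently modified file first.
    newest=$(ls -t "$1"/*.log 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1    # no log files found
    echo "$newest"
    tail -n 1 "$newest"
}
```

If the last line reports that a request started executing and nothing follows it, that request is a reasonable place to begin investigating.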

 

If servers are unresponsive and you restart the object spawner, log entries similar to the following will appear in the stored process server log file when the stored process server attempts to restart:

ERROR [00000007] :sassrv - The TCP/IP tcpSockBind support routine failed with error 10048   
      (The specified address is already in use.).
ERROR [00000007] :sassrv - Bridge Protocol Engine Socket Access Method was unable to bind 
      the listen socket to port 8611.
ERROR [00000007] :sassrv - The Bridge Protocol Engine Socket Access Method listen thread 
      failed during the definition of the server listen.
INFO  [00000007] :sassrv - Bridge protocol engine is unloading.  

 

Note: These errors occur because the orphaned stored process server is still bound to the designated port for the server that is trying to start. If the stored process server log contains the statements above, you must terminate the existing stored process servers before you restart the object spawner.

For instructions, see the section Step 5: Terminate Orphaned SAS Stored Process Servers and Restart the SAS Object Spawner.
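On UNIX, a quick way to look for the bind errors described above is to search the stored process server log for the message text. The sketch below matches on the wording of the sample messages rather than on the error number, because the number 10048 in the sample is specific to Windows. The function name bind_failures is hypothetical, and the log file path is whatever applies at your site.

```shell
# Print (with line numbers) the log lines in file $1 that report a failure
# to bind the listen socket. The exit status is nonzero if none are found.
bind_failures() {
    grep -n -E 'tcpSockBind support routine failed|unable to bind' "$1"
}
```

If this sketch prints any lines, an orphaned stored process server probably still holds the port.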

Metadata Server Log

Go to the appropriate folder and view the metadata server log file.

Windows Event Viewer (Windows operating system only)

The Windows Event Viewer is a Windows System Tool found in the Microsoft Management Console. Complete these tasks to check the Windows Event Viewer for errors:

  1. From the Windows Control Panel, open Administrative Tools ► Computer Management.
  2. Expand System Tools, and then expand the Event Viewer tool.
  3. Expand Windows Logs, and then click each log file, looking in the right pane of Event Viewer for any SAS event or other activity that occurred during the same date and time as the unresponsive condition of the stored process servers.

Step 5: Terminate Orphaned SAS Stored Process Servers and Restart the SAS Object Spawner

  1. Retrieve the listing of process IDs for unresponsive SAS Stored Process Servers that you noted after stopping the SAS Object Spawner.
  2. Terminate the unresponsive processes.
    • Windows
      Use Task Manager or the taskkill command to terminate the unresponsive SAS Stored Process Servers.

    • UNIX
      Use the kill command. For example:
      prompt> kill processID#
      
  3. Restart the object spawner.
    • Windows
      Start the object spawner service through the Windows Services Manager.
    • UNIX
      Locate the ObjectSpawner.sh file in your SAS configuration directory. For example:
      SAS-configuration-directory/Lev1/ObjectSpawner/ObjectSpawner.sh

      From a system prompt, change directories to where the ObjectSpawner.sh script is located and submit the following:
      prompt> ./ObjectSpawner.sh start
 

  4. Using SAS Management Console, validate the SAS Stored Process Server to ensure that the servers were successfully restored. See Step 1 for the procedure.
  5. Clean up leftover WORK library files. These files accumulate as a result of abnormal termination of the stored process server sessions.
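On UNIX, the termination in item 2 of the list above can be followed by a check that the process actually ended before you restart the object spawner. The following is a minimal sketch under that assumption. The function name kill_and_verify and the default wait of 10 seconds are hypothetical choices, and the sketch should be run as the process owner or as root so that the existence check is reliable.

```shell
# Send the default termination signal to a process ID ($1), then wait up to
# $2 seconds (default 10) for the process to disappear.
# Returns 0 if the process is gone, 1 if it is still present after the wait.
kill_and_verify() {
    pid="$1"
    remaining="${2:-10}"
    kill "$pid" 2>/dev/null
    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

A return value of 1 means that the process did not end within the wait period, in which case restarting the object spawner is likely to produce the bind errors described in Step 4.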

If the above steps do not restore the SAS Stored Process Servers to normal functionality, compile results from the tests and the log files referenced in these steps and contact SAS Technical Support for further assistance.

Additional Information

If you are experiencing unresponsive stored process servers using SAS® 9.3, refer to the following note:

If you are experiencing unresponsive stored process servers using SAS® 9.2, refer to the following notes:

If you are experiencing unresponsive stored process servers using SAS® 9.1.3, refer to the following notes: