3.6 The System Error Log

Once you have all the devices configured in your system and your system is in production, you may encounter errors related to hardware during your normal day-to-day operations. AIX provides the error logging facility for recording hardware and software failures in an error log. This error log can be used for information purposes or for fault detection and corrective actions.

The error logging process begins when an operating system module detects an error. The error-detecting segment of code then sends error information to either the errsave or errlast kernel service, or to the errlog application subroutine, and the information is in turn written to the /dev/error special file. This process adds a time stamp to the collected data. You can use the errpt command to retrieve error records from the error log.

3.6.1 Using the errdemon Command

The errdemon process constantly checks the /dev/error file for new entries. When new data matches an item in the Error Record Template Repository, the daemon collects additional information from other system components.

The errdemon command is normally started automatically during system start-up. However, if it has been terminated for any reason and you need to restart it, enter:

/usr/lib/errdemon

In order to determine the path to your system's error log file, run the following command:

# /usr/lib/errdemon -l
Error Log Attributes
--------------------------------------------
Log File                /var/adm/ras/errlog
Log Size                1048576 bytes
Memory Buffer Size      8192 bytes
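
If a script needs the log file path, it can be picked out of the listing shown above. The following is a minimal sketch that assumes the output keeps the "Log File" label and column layout of this example; errlog_path is a hypothetical helper name, not an AIX command.

```shell
# Hypothetical helper: read "errdemon -l" output on standard input and
# print the value of the "Log File" row.
# Assumes the three-column layout shown in the example above.
errlog_path() {
    awk '$1 == "Log" && $2 == "File" { print $3 }'
}
```

On a system where errdemon is installed, the helper would be used as /usr/lib/errdemon -l | errlog_path.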

In order to change the maximum size of the error log file, enter:

/usr/lib/errdemon -s 2000000

In order to change the size of the error log device driver's internal buffer, enter:

/usr/lib/errdemon -B 16384

A message similar to the following is displayed:

0315-175 The error log memory buffer size you supplied will be rounded up to a multiple of 4096 bytes.
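
To see in advance what buffer size a given request would round up to, the arithmetic described in the message can be sketched in the shell. This only illustrates rounding up to a multiple of 4096 bytes as the message states; it is not taken from errdemon itself, and round_up_4096 is a hypothetical helper name.

```shell
# Hypothetical helper: round a byte count up to the next multiple of 4096,
# mirroring the rounding described in message 0315-175.
round_up_4096() {
    echo $(( ($1 + 4095) / 4096 * 4096 ))
}
```

For example, round_up_4096 10000 prints 12288, while a value that is already a multiple of 4096, such as 16384, is returned unchanged.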

3.6.2 Using the errpt Command

To retrieve the entries in the error log, use the errpt command. The errpt command generates an error report from entries in an error log and includes flags for selecting errors that match specific criteria. By default, it displays error log entries in reverse chronological order, that is, the most recently recorded entry first.

Note

The errpt command does not perform error log analysis; for analysis, use the diag command.

The general syntax of the errpt command is as follows:

errpt [ -a ] [ -c ] [ -d ErrorClassList ] [ -e EndDate ] [ -g ] [ -i File ] [ -j ErrorID [ ,ErrorID ] ] | [ -k ErrorID [ ,ErrorID ]] [ -J ErrorLabel [ ,ErrorLabel ] ] | [ -K ErrorLabel [ ,ErrorLabel ] ] [ -l SequenceNumber ] [ -m Machine ] [ -n Node ] [-s StartDate ] [ -F FlagList ] [ -N ResourceNameList ] [ -R ResourceTypeList ] [ -S ResourceClassList ] [ -T ErrorTypeList ] [ -y File ] [ -z File ]

The most commonly used flags of the errpt command are given in Table 14.


Table 14: errpt Command Flags

The following sections show a few examples of using the errpt command.

3.6.2.1 Displaying Errors Summary

To display a summary report of all errors logged so far, one line per entry, run the errpt command with no flags:

# errpt
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
2BFA76F6   1025181998 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   1025182198 T O errdemon       ERROR LOGGING TURNED ON
2BFA76F6   1025175998 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   1025180298 T O errdemon       ERROR LOGGING TURNED ON
2BFA76F6   1025174098 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   1025174398 T O errdemon       ERROR LOGGING TURNED ON
.......... (Lines Removed)
2BFA76F6   1021134298 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   1021135098 T O errdemon       ERROR LOGGING TURNED ON
2BFA76F6   1021120198 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   1021130298 T O errdemon       ERROR LOGGING TURNED ON
9DBCFDEE   1018210898 T O errdemon       ERROR LOGGING TURNED ON
9DBCFDEE   0808123837 T O errdemon       ERROR LOGGING TURNED ON
9DBCFDEE   0918153137 T O errdemon       ERROR LOGGING TURNED ON
9DBCFDEE   0918145637 T O errdemon       ERROR LOGGING TURNED ON
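
When the summary is long, it can help to see how often each error identifier occurs. The following is a minimal sketch that assumes the column layout shown above, with an 8-character hexadecimal identifier in the first column; count_by_id is a hypothetical helper name.

```shell
# Hypothetical helper: read an errpt summary on standard input and print
# each error identifier with the number of entries that carry it.
# Lines whose first field is not an 8-character hexadecimal identifier
# (such as the header line) are ignored.
count_by_id() {
    awk 'length($1) == 8 && $1 ~ /^[0-9A-F]+$/ { n[$1]++ }
         END { for (id in n) print id, n[id] }' | sort
}
```

On an AIX system this would be used as errpt | count_by_id.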

3.6.2.2 Displaying Error Details

To display a detailed report of all the errors encountered on the system, use the errpt command as follows:

# errpt -a
-----------------------------------------------------------------------
LABEL:          REBOOT_ID
IDENTIFIER:     2BFA76F6

Date/Time:       Sun Oct 25 18:19:04
Sequence Number: 60
Machine Id:      006151474C00
Node Id:         sv1051c
Class:           S
Type:            TEMP
Resource Name:   SYSPROC

Description
SYSTEM SHUTDOWN BY USER

Probable Causes
SYSTEM SHUTDOWN

Detail Data
USER ID
           0
0=SOFT IPL 1=HALT 2=TIME REBOOT
           0
TIME TO REBOOT (FOR TIMED REBOOT ONLY)
.......... (Lines Removed)
-----------------------------------------------------------------------
LABEL:          DISK_ERR3
IDENTIFIER:     35BFC499

Date/Time:       Thu Oct 22 08:11:12
Sequence Number: 36
Machine Id:      006151474C00
Node Id:         sv1051c
Class:           H
Type:            PERM
Resource Name:   hdisk0
Resource Class:  disk
Resource Type:   scsd
Location:        04-B0-00-6,0
VPD:
        Manufacturer................IBM
        Machine Type and Model......DORS-32160    !#
        FRU Number..................
        ROS Level and ID............57413345
        Serial Number...............5U5W6388
        EC Level....................85G3685
        Part Number.................07H1132
        Device Specific.(Z0)........000002028F00001A
        Device Specific.(Z1)........39H2916
        Device Specific.(Z2)........0933
        Device Specific.(Z3)........1296
        Device Specific.(Z4)........0001
        Device Specific.(Z5)........16

Description
DISK OPERATION ERROR

Probable Causes
DASD DEVICE
STORAGE DEVICE CABLE

Failure Causes
DISK DRIVE
DISK DRIVE ELECTRONICS
STORAGE DEVICE CABLE

        Recommended Actions
        PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A06 0000 2800 0088 0002 0000 0000 0200 0200 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 0001 2FC0
.......... (Lines Removed)
-----------------------------------------------------------------------
LABEL:          ERRLOG_ON
IDENTIFIER:     9DBCFDEE

Date/Time:       Fri Sep 18 14:56:55
Sequence Number: 14
Machine Id:      006151474C00
Node Id:         sv1051c
Class:           O
Type:            TEMP
Resource Name:   errdemon

Description
ERROR LOGGING TURNED ON

Probable Causes
ERRDEMON STARTED AUTOMATICALLY

User Causes
/USR/LIB/ERRDEMON COMMAND

        Recommended Actions
        NONE
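
A detailed report can run to many pages, so it is often useful to reduce it to the entries that need attention. The following is a minimal sketch that lists the label and resource name of every entry with Class H and Type PERM, such as the DISK_ERR3 entry above. It assumes the field names and dashed separator lines shown in this example; perm_hw_errors is a hypothetical helper name. The -d ErrorClassList and -T ErrorTypeList flags in the syntax shown earlier may be a more direct way to make the same selection.

```shell
# Hypothetical helper: read "errpt -a" output on standard input and print
# "LABEL RESOURCE" for each entry whose Class is H and Type is PERM.
# Entries are assumed to be separated by lines made up only of dashes.
perm_hw_errors() {
    awk '
        function emit() {
            if (label != "" && cls == "H" && typ == "PERM")
                print label, res
            label = ""; cls = ""; typ = ""; res = ""
        }
        /^-+$/            { emit(); next }
        /^LABEL:/         { label = $2 }
        /^Class:/         { cls = $2 }
        /^Type:/          { typ = $2 }
        /^Resource Name:/ { res = $3 }
        END               { emit() }
    '
}
```

On an AIX system this would be used as errpt -a | perm_hw_errors.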

3.6.2.3 Displaying Errors by Time Reference

If you suspect that the errors occurred during the last day, you can display a detailed report of all errors logged in the past 24 hours. Pass the -s flag a start date string made up of the month, day, hour, minute, and year (in that order, two digits each) of the current time minus 24 hours. To do so, use the errpt command as follows:

# date
Fri Oct 30 08:24:00 CST 1998
# errpt -a -s 1029082498
-----------------------------------------------------------------------
LABEL:          ERRLOG_ON
IDENTIFIER:     9DBCFDEE

Date/Time:       Sat Aug  8 12:38:35
Sequence Number: 16
Machine Id:      006151474C00
Node Id:         sv1051c
Class:           O
Type:            TEMP
Resource Name:   errdemon

Description
ERROR LOGGING TURNED ON

Probable Causes
ERRDEMON STARTED AUTOMATICALLY

User Causes
/USR/LIB/ERRDEMON COMMAND

        Recommended Actions
        NONE
.......... (Lines Removed)
-----------------------------------------------------------------------
LABEL:          ERRLOG_ON
IDENTIFIER:     9DBCFDEE

Date/Time:       Fri Sep 18 14:56:55
Sequence Number: 14
Machine Id:      006151474C00
Node Id:         sv1051c
Class:           O
Type:            TEMP
Resource Name:   errdemon

Description
ERROR LOGGING TURNED ON

Probable Causes
ERRDEMON STARTED AUTOMATICALLY

User Causes
/USR/LIB/ERRDEMON COMMAND

        Recommended Actions
        NONE
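
The start date string used above is easy to mistype, so it can be built from a more familiar date notation. The following is a minimal sketch that assumes the month, day, hour, minute, year order seen in the example; errpt_stamp is a hypothetical helper name, and the caller still has to work out the date and time 24 hours earlier.

```shell
# Hypothetical helper: convert "YYYY-MM-DD HH:MM" into the ten-digit
# month, day, hour, minute, year string passed to errpt -s above.
# Prints nothing if the arguments do not match the expected pattern.
errpt_stamp() {
    printf '%s %s\n' "$1" "$2" |
        sed -n 's/^[0-9][0-9]\([0-9][0-9]\)-\([0-9][0-9]\)-\([0-9][0-9]\) \([0-9][0-9]\):\([0-9][0-9]\)$/\2\3\4\5\1/p'
}
```

For the example above, the current time is 08:24 on October 30, 1998, so the start of the 24-hour window is 08:24 on October 29, and errpt_stamp 1998-10-29 08:24 prints 1029082498.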

3.6.3 Other Error Handling Commands

In addition to the errpt command, the following commands can be used to find hardware errors and take corrective measures for problems reported by the error logging facility:

diag
Performs hardware problem determination.
errclear
Deletes entries from the error log.
errinstall
Installs messages in the error logging message sets.
errupdate
Updates the Error Record Template Repository.
