
Thursday, March 20, 2014

What is a crashinfo file?

What is a crashinfo file? How do you find and pull the crashinfo files from a 6500/7600 after a crash or unscheduled reload?

A 6500/7600 generates two crashinfo files per crash: one for the RP and one for the SP. Cisco TAC needs BOTH crashinfo files in order to best determine the root cause of the crash.
To recover the crashinfo files, log the output of:
  RP:  more bootflash:crashinfo_YourTimestamp
  SP:  more sup-bootdisk:crashinfo_YourTimestamp
If the device has redundant supervisors, the crashinfo files may instead be stored at paths like:
  RP:  more slavebootflash:crashinfo_YourTimestamp
  SP:  more slavesup-bootdisk:crashinfo_YourTimestamp
Finally, sometimes the SP path is sup-bootflash: instead of sup-bootdisk:
Details follow below.
Introduction:
  To troubleshoot the root cause of a crash or unexpected reload on a 6500 or 7600, you generally need to open a case with Cisco TAC. TAC will need some files known as "crashinfo files". These are text files that detail what was happening in the device at the time of the crash. TAC uses proprietary tools to decode these crashinfo files and attempt to find the root cause of a reload. (A customer or third party would therefore not be able to decode a crash and find the root cause.)
  Therefore, before TAC can begin troubleshooting a crash case, the customer needs to provide the crashinfo files. On the 6500 (or 7600), two crashinfo files are normally generated: one for the Route Processor (RP) and one for the Switch Processor (SP). TAC will generally need both of these files in order to find the root cause of a crash on the 6500/7600 platform. This document shows where to find these files and how to get them off the device that crashed.
What is a crashinfo file?
  When a Cisco device running IOS crashes, it should create a special text file known as a "crashinfo file". This file provides diagnostic information and logs that are an important part of the root cause analysis of a crash. Unless configured otherwise, crashinfo files are generally stored on the flash of the device on which the crash occurred. For a normal router this would be flash: or bootflash:. On the 6500/7600, which has two processors (the SP and the RP), the locations are generally sup-bootflash: or sup-bootdisk: (for the SP) and bootflash: or bootdisk: (for the RP).
Crashinfo files can be recognized in the file system because the file name consists of "crashinfo" and a time stamp from when the crash occurred.
For example:
crashinfo_20110807-094111-PDT
Translates to:
crashinfo on August 7th, 2011 at 09:41:11 (AM) PDT
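The naming convention above can be decoded mechanically. The following is a minimal Python sketch, assuming the file name follows exactly the crashinfo_YYYYMMDD-HHMMSS-TZ pattern shown in the example; the function name parse_crashinfo_name and the regular expression are illustrative and not part of any Cisco tool. Names in any other layout simply return None.

```python
import re
from datetime import datetime

# Pattern assumed from the example name in this post: crashinfo_YYYYMMDD-HHMMSS-TZ
CRASHINFO_RE = re.compile(r"^crashinfo_(\d{8})-(\d{6})-([A-Za-z]+)$")


def parse_crashinfo_name(name):
    """Return (naive datetime, timezone abbreviation), or None if the name does not match."""
    match = CRASHINFO_RE.match(name)
    if match is None:
        return None
    date_part, time_part, tz_abbrev = match.groups()
    # The timezone is kept as plain text; abbreviations such as PDT are ambiguous.
    stamp = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    return stamp, tz_abbrev


print(parse_crashinfo_name("crashinfo_20110807-094111-PDT"))
```

Comparing the parsed timestamps of two files is one way to check that an RP file and an SP file belong to the same crash.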
Default locations for crashinfo files  (6500/7600)
Route Processor (RP):
Usually the RP crashinfo is stored on bootflash: or bootdisk:. On a switch with redundant supervisors, the crashinfo can also be stored on slavebootflash: or slavebootdisk:
  show bootflash: all
  -#- ED ----type------crc---- *snip* --------date/time---------name
  1   .. crashinfo   10248492  *snip* 09:41:05 -07:00 crashinfo_20110807-094105-PDT
Therefore the path for the RP Crashinfo is:
bootflash:crashinfo_20110807-094105-PDT
Switch Processor (SP):
Usually the SP crashinfo is stored on sup-bootflash: or sup-bootdisk:. On a switch with redundant supervisors, the crashinfo can also be stored on slavesup-bootflash: or slavesup-bootdisk:
  show sup-bootdisk: all
  -#---length-- -----date/time------ path
  1  33554432 Dec 21 2010 17:07:46 -08:00 sea_console.dat
  2  33554432 Dec 21 2010 17:09:46 -08:00 sea_log.dat
  3 133150980 Jan 6 2011 10:49:30 -08:00 c7600s72033-advipservicesk9-mz.122-33.SRD3.bin
  4 155056204 Jan 5 2011 14:27:34 -08:00 c7600s72033-advipservicesk9-mz.122-33.SRE1.bin
  5    419973 Aug 7 2011 09:41:16 -07:00 crashinfo_20110807-094111-PDT
Therefore the path for the SP Crashinfo is:
sup-bootdisk:crashinfo_20110807-094111-PDT
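Since the files can sit on several filesystems, it can help to work through them as a checklist. The following is a minimal Python sketch, assuming you want one "dir" command per filesystem named in this post; the list and function names are hypothetical, and not every filesystem exists on every supervisor.

```python
# Filesystems named in this post; which ones exist depends on the supervisor.
RP_FILESYSTEMS = ["bootflash:", "bootdisk:", "slavebootflash:", "slavebootdisk:"]
SP_FILESYSTEMS = ["sup-bootflash:", "sup-bootdisk:", "slavesup-bootflash:", "slavesup-bootdisk:"]


def dir_commands():
    """Return (processor, command) pairs, one 'dir' command per candidate filesystem."""
    commands = []
    for processor, filesystems in (("RP", RP_FILESYSTEMS), ("SP", SP_FILESYSTEMS)):
        for filesystem in filesystems:
            commands.append((processor, "dir " + filesystem))
    return commands


for processor, command in dir_commands():
    print(processor, command)
```

A filesystem that does not exist on the device will just return an error at the prompt, so the list can be pasted in as-is.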
Capturing the Output of the Crashinfo File:
  Now that you know the path to the crashinfo files, you need to actually capture the files. The easiest way to do this is to enable logging in your terminal emulator (e.g. PuTTY or SecureCRT) and save the output to a text file. Please remember to give the file an intelligible name, such as:
"slavesup-bootdisk-crashinfo_20110407-182546-UTC.txt" 
Enable logging:
  Putty:
  SecureCRT:
  File -> Log Session -> Enter File Name ->  Click Save
Once logging has been enabled, simply run:
  term length 0
  more path:filename
Example:
  more slavesup-bootdisk:crashinfo_20110407-182546-UTC
Then disable logging and the file should be captured.
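The capture steps can be prepared ahead of time for each file. The following is a minimal Python sketch, assuming the log file is named after the filesystem and crashinfo name in the style of the example above; the function name capture_plan is hypothetical, and the sketch only builds strings, it does not connect to the device.

```python
def capture_plan(path):
    """Return (log file name, commands to type) for a path such as 'bootflash:crashinfo_...'."""
    filesystem, separator, filename = path.partition(":")
    if not separator or not filesystem or not filename:
        raise ValueError("expected filesystem:filename, got %r" % path)
    # The filesystem goes into the log name so RP and SP captures stay distinguishable.
    log_name = "%s-%s.txt" % (filesystem, filename)
    commands = ["term length 0", "more " + path]
    return log_name, commands


log_name, commands = capture_plan("slavesup-bootdisk:crashinfo_20110407-182546-UTC")
print(log_name)
for command in commands:
    print(command)
```

Keeping the filesystem in the log name matters because the RP and SP files can otherwise carry identical names.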
If you have a TFTP server you can also TFTP the files off the device, or use any of the other available means of file transfer.
For example:
  copy slavesup-bootdisk:crashinfo_20110407-182546-UTC tftp:
and then follow the prompts.
Caveat: Duplicate Crashinfo Names
  Because the file name of a crashinfo file consists of "crashinfo" plus a timestamp, the crashinfo file on the RP can have exactly the same file name as the crashinfo file on the SP. This can cause confusion, so when logging a crashinfo file please be sure to indicate whether the file is from the RP or the SP. This issue is resolved* in the most recent versions of IOS for the 6500/7600.

* Crashinfo file names have _RP and _SP added to them via two enhancement bug IDs:
For 6500:    12.2(33)SXJ and later   CSCte76841  Adding SP and RP in the middle of crashinfofiles for cat6000
For 7600:    15.1(2)S and later      CSCtj17344  Add SP and RP in the crashinfo file name to differentiate the crashes.
Upon Completion:
  Please open a case with Cisco TAC. Once the case is open please upload the two crashinfo files in .txt format, unzipped.  (This will help avoid any extra delays.)  Usually it also helps to include a "show tech" in .txt format, if it is possible. Frequently an engineer armed with the crashinfo files from both the SP and RP, as well as a show_tech.txt, will be able to find the root cause of a crash. For more complex crashes more data may need to be gathered but the three files above will provide a solid foundation to begin troubleshooting.

Caveats

What if I don't find two crashinfo files for each crash?
  While in general the IOS for 6500s and 7600s is very good about generating a crashinfo file upon a crash, this doesn't always happen. There are bugs and certain scenarios in which one or both processors won't generate a crashinfo file. Sometimes instead of a crashinfo file you will find a "debuginfo" file with similar formatting. If this is the case, please also include the debuginfo file by capturing it just like a crashinfo file.
Then please open a case with Cisco TAC and upload any crashinfo files that you can find, as well as a show tech and any syslogs that you may have from around the time of the reload. If you were running commands at the time of the crash this will also help your TAC Engineer determine the root cause of the crash.
Silent Reload: If there are no crashinfo files

  If the switch or router reset but there are no crashinfo files then please do a "show version" and look for the Reload Reason. If the reason is "reload" then someone likely issued the "reload" command and you should try to use AAA accounting, if available, to determine what commands were run before the event.
  If the reason is "power on" then please check the power feeding the switch's power supplies.  If there are other devices on the circuit please check to see if they also rebooted.  If so then there was an issue with the environmental power feeding the switch.  It could also be caused by someone performing routine maintenance.
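The two cases above amount to a small decision table. The following is a minimal Python sketch, assuming the reload reason has already been read from "show version" by hand and is passed in as a string; the function name triage_reload_reason is hypothetical, and the exact wording of the reason on a given IOS release may differ from the two values quoted in this post.

```python
def triage_reload_reason(reason):
    """Map a reload reason string to the follow-up suggested in this post."""
    # Normalization is an assumption: ignore case and treat 'power-on' like 'power on'.
    normalized = reason.strip().lower().replace("-", " ")
    if normalized == "power on":
        return "check the power feed and whether other devices on the circuit rebooted"
    if normalized == "reload":
        return "check AAA accounting for a manually issued reload command"
    return "no match: gather show tech and syslogs and open a TAC case"


for reason in ("reload", "power on", "something else"):
    print(reason, "->", triage_reload_reason(reason))
```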

In the event of an errored file
  Rarely, the crashinfo file gets corrupted when it is generated. Usually this issue manifests itself as the crashinfo file showing in a "dir" output but the "more path:crashinfo" command failing. If you do a "dir" and see an "E" flag, the file has been corrupted. Fortunately we may still be able to get this data. In this scenario please add "/error" to the more command when printing out the crashinfo file.
For example:
  show bootflash: all
  -#- ED ----type------crc---- *snip* --------date/time---------name
  1   E. crashinfo   10248492  *snip* 09:41:05 -07:00 crashinfo_20110807-094105-PDT
      ^
Therefore the command to retrieve the errored crashinfo is:
  more /error bootflash:crashinfo_20110807-094105-PDT
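The check for the "E" flag can be sketched in code. The following is a minimal Python sketch, assuming a listing line laid out like the example above (index, two-character ED flags, then other columns, with the file name last); the function name more_command is hypothetical, and real output has more columns than the snipped example shows.

```python
def more_command(filesystem, listing_line):
    """Build the 'more' command for one listing line, adding /error if the E flag is set."""
    fields = listing_line.split()
    if len(fields) < 3:
        raise ValueError("unexpected listing line: %r" % listing_line)
    # Assumed layout: second column holds the ED flags, last column is the file name.
    flags, name = fields[1], fields[-1]
    option = "/error " if "E" in flags else ""
    return "more %s%s%s" % (option, filesystem, name)


line = "1   E. crashinfo   10248492  09:41:05 -07:00 crashinfo_20110807-094105-PDT"
print(more_command("bootflash:", line))
```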

Citation - This blog post does not reflect original content from the author. Rather, it summarizes content relevant to the topic from different sources on the web. The sources might include online discussion boards, forums, websites and others.
