Using Multiple Grok Statements to Parse a Java Stack Trace

Introduction

With the increasing deployment of Big Data applications, and to manage scalability and availability, enterprises are deploying large numbers of servers (either as physical or virtual instances). Many of these deployments host Java applications, running on enterprise application servers like IBM WebSphere, Oracle Fusion Middleware, etc. Given the nature of application development and the way users interact with applications, there is a good chance of these applications crashing at some point during their operation. If such applications handle critical enterprise functionality, it is imperative that failures are detected at the earliest and responded to with suitable corrective action, followed by re-deployment of the failed services.

To monitor applications, most enterprises follow well-defined logging guidelines, which not only log server performance parameters like CPU usage, memory consumption and disk consumption, but also make it mandatory for applications to log their own performance and progress parameters. For example, a retail web application may log information about all the items in a user's shopping cart. Such information provides a basis for detailed analysis and also allows the application to recover its state, in case it crashes mid-way through a transaction. In case of an error, a Java application generates an exception report, also known as a Java stack trace, which provides information that development teams can use to identify and resolve the error.

For any analysis or recovery, parsing the log information is of prime importance. In this document, we cover the most common methods of parsing a Java stack trace using the Logstash tool. We also share some issues faced when using these methods and propose an alternative approach to parsing the Java stack trace.

Java Stack Trace

Compared with other machine log traces generated by applications, a Java stack trace is quite peculiar: it spans multiple lines, whereas most log entries span only one line. For effective analytics and reporting, all information from the log data needs to be parsed, irrespective of whether it occupies a single line or spans multiple lines.

For the scope of this document, we will refer to the sample log shown in Table 1. It should be noted that a Java stack trace does not exist in isolation; it is usually intermingled with other log information generated by the application. Thus, it is important for any parsing method to not only parse the stack trace correctly, but also to parse the other log information.

Jan 12, 2015 9:51:08 AM org.apache.catalina.loader.WebappClassLoader validateJarFile
INFO: validateJarFile(C:\Apps\tomcat-6.0.41\webapps\Recommendation\WEB-INF\lib\servlet-api.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class
Jan 12, 2015 9:51:18 AM org.apache.catalina.core.StandardContext start
SEVERE: Error listenerStart
Jan 12, 2015 9:51:18 AM org.apache.catalina.core.StandardContext start
SEVERE: Context [/Recommendation] startup failed due to previous errors
Jan 12, 2015 9:51:19 AM org.apache.catalina.loader.WebappClassLoader clearReferencesJdbc
WARNING: JDBC driver de-registration failed for web application [/Recommendation]
java.lang.reflect.InvocationTargetException
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
   at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: PermGen space
   at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
   at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1647)
   at java.lang.Class.forName(Class.java:274)
   at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779)
... 1 more

Jan 12, 2015 9:51:19 AM org.apache.catalina.core.StandardContext start
SEVERE: Exception during cleanup after start failed
java.lang.OutOfMemoryError: PermGen space
   at sun.misc.Unsafe.defineClass(Native Method)
org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1642)
   at java.lang.Thread.run(Thread.java:745)


Table 1: Sample log

ELK Stack of Tools

With machines generating a lot of information, parsing the generated data is important. For a long time, enterprises have deployed custom parsers to read such data sets and enable processing on them. Nowadays, many tools are available to parse and process data: commercial tools like Splunk, and open source tools like Graylog2, Logstash and Elasticsearch. Of these tools, we have selected the ELK stack, which is short for a trio of tools, namely Elasticsearch, Logstash and Kibana. In this document, we will only cover Logstash in depth, though we provide a brief overview of all three components in the stack.

Logstash

Logstash is an open source tool that provides an integrated framework for log collection, centralization, parsing, storage and search. It has a wide variety of input mechanisms: it can take inputs from TCP/UDP, files, Syslog, Microsoft Windows EventLogs, STDIN and a variety of other sources. The Logstash server supports a large collection of filters that allow us to modify, manipulate and transform those events and give them context. Doing so makes it easier to draw conclusions and make good decisions using the log data. On the output side, Logstash supports a huge range of destinations, including TCP/UDP, email, files, HTTP, Nagios and a wide variety of network and online services.
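Every Logstash script shown in this document follows the same three-section skeleton. A minimal sketch is given below; the stdin/stdout plugins here are placeholders, not taken from the scripts discussed later.

input {
  # where events come from: TCP/UDP, files, syslog, stdin, etc.
  stdin { }
}

filter {
  # how events are parsed, enriched and transformed (grok, date, mutate, ...)
}

output {
  # where events go: Elasticsearch, files, stdout, etc.
  stdout {
    codec => "rubydebug"
  }
}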

Elasticsearch

Elasticsearch is an open-source search engine built on top of Apache Lucene, a full-text search-engine library. Lucene is a complex, advanced, high-performance, and fully featured search engine library. Elasticsearch uses Lucene internally for all of its indexing and searching, but aims to make full-text search easy by hiding Lucene's complexities behind a simple, coherent, RESTful API. In addition to full-text search, Elasticsearch also supports structured search, analytics and aggregations over the indexed data.

Kibana

Kibana is an open source analytics and visualization platform designed to work with Elasticsearch. Kibana can be used to search, view, and interact with data stored in Elasticsearch indices. Advanced data analysis and visualization of data is possible using a variety of charts, tables, and maps.

Using ELK

Typically, log information is collected and parsed from the various systems by Logstash. If the system is remote, Logstash forwarder agents have to be used to send data to the Logstash instance. Log information is parsed using Logstash scripts and inserted into an Elasticsearch index. Typically, Logstash scripts convert the plain text log information into JSON objects, which are then stored by Elasticsearch. Elasticsearch allows the business logic to search for information using suitable queries, with results generated in JSON format. Kibana, the visualization tool, understands Elasticsearch, and we can start creating visualizations simply by pointing the Kibana instance to an Elasticsearch index. Kibana fetches the index definitions from Elasticsearch, which can be used to define visualizations like column charts, pie charts and the like.
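For instance, to store parsed events in Elasticsearch rather than printing them, the output section of the scripts shown later can be pointed at an Elasticsearch instance. A minimal sketch follows; the host and index name are illustrative assumptions, not taken from the original scripts.

output {
  # Write each parsed event into a daily Elasticsearch index;
  # host and index name here are placeholders.
  elasticsearch {
    host => "localhost"
    index => "javastacktrace-%{+YYYY.MM.dd}"
  }
}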

Approach 1: using multiline codec in input

One of the most common solutions suggested for parsing a Java stack trace is to use the 'multiline' codec in the input section of the Logstash script. The pattern used appends every line that begins with whitespace to the previous line. In other words, when Logstash reads a line of input that begins with whitespace (a space or a tab), that line is merged with the previously read input. Thus, multiple lines of the trace are merged into one entry. A sample script using the 'multiline' codec is shown in Table 2. The pattern '^\s*' is intended to match any line beginning with whitespace.

input {
  tcp {
    type => "javaStackTraceMultilineCodec"
    port => 9999
    codec => multiline {
      pattern => "(^\s*)"
      what => "previous"
    }
  }
}

filter {
  if [type] == "javaStackTraceMultilineCodec" {
    grok {
      patterns_dir => "./patterns"
      match => [ "message", "%{TOMCATLOG}", "message", "%{CATALINALOG}" ]
    }
    date {
      match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS Z", "MMM dd, yyyy HH:mm:ss a" ]
    }
    mutate {
      gsub => ['message', "\n", " ",
               'message', "\t", " "]
    }
  }
}

output {
  stdout {
    codec => "rubydebug"
  }
}


Table 2: Multiline codec for parsing

When this script is executed on the sample log, it generates JSON records that can be inserted into an Elasticsearch index. The JSON records generated are shown in Table 3. As shown, the complete log information has been read by Logstash. In most cases, the original message is retained in the 'message' field, so as to provide an easy reference to the original data.

{
   "@timestamp" => "2015-01-12T04:21:08.000Z",
   "message" => "Jan 12, 2015 9:51:08 AM org.apache.catalina.loader.WebappClassLoader validateJarFile INFO: validateJarFile(C:\\Apps\\tomcat-6.0.41\\webapps\\Recommendation\\WEB-INF\\lib\\servlet-api.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class Jan 12, 2015 9:51:18 AM org.apach e.catalina.co re.  St an d ar dC on t ex t   s ta rt   S EV ER E :   Er r or   l i st e ne rS ta r t   J an   1 2,   2 0 15   9 :5 1 :1 8   A M   or g. ap ac he .catalina.core.StandardCo ntext   s tart   S EVERE: Context [/Recom men d at i o n]   s t ar t up   f a il e  d   du e   t o   p r ev i ou s   e r r o r s   J a n   1 2 ,   2 0 1 5   9 :5 1 : 1 9   A M   or g . a p a c h e . c at a l i n a . l o a d e r . W e b a p p C l a s s L o a d e r   c l e a r R e f e r e n c e s J d b c   W A R N I N G :   J D B C   d r i v e r   d e - r e g i s t r a t i o n   fa il e d   f o r   we b   a p p l i c a t i o n   [ / R e c om  m e n d a t i o n ]   j a v a.lang.reflect.Invocatio n Tar g e t E x c e p t i o n   a t   s u n . r e f l e c t . De l e g a t i n g M e t h o d A c c e s s o r I m p l . i n v o k e ( D e l e g a t i n g M e t h o d A c c e s s o r Im p l . j a v a : 4 3 )   a t   j a v a . l a n g . r e f l e c t . M e t h o d . i n v o k e ( M e t h o d . j a v a: 6 0 6 )   a t   o r g . a p a c h e . c a t a l i n a . u t i l . L i f e c y c l e S u p p o r t . f i r e L i f ec y c l e E v e n t ( L i f e c y c l e S u p p o r t . j a v a : 1 4 2 )   a t   j a v a . l a n g . T h r e a d . ru n ( T h r e a d . j a v a : 7 4 5 )   C a u s e d   b y :   j a v a . l a n g . O u t O f M e m o r y E r r o r :   Pe r m G e n   s p a c e   a t   j a v a . l a n g . C l a s s L o a d e r . d e f i n e C l a s s ( C l a s s L o a de r . j a v a : 8 0 0 )   a t   j a v a . l a n g . C l a s s L o a d e r . d e f i n e C l a s s ( C l a s s L o a de r . j a v a : 8 0 0 )   a t   o r g . a p a c h e . c a t a l i n a . l o a d e r . W e b a p p C l a s s L o a d er . l o a d C l a s s ( W e b a p p C l a s s L o a d e r . j a v a : 1 6 4 7 )   a t   j a v a . l a n g . C l a s s. f o r N a m e ( C l a s s . j a v a : 2 7 4 )   a t   o r g . a p a c h e . c a t a l i n a . c o r e . C onta inerBase.addChild(C ontainerBase.java:7 79) ...   1 more   Jan 12 , 2015 9:5 1:19 AM org .apache.ca talina.co re.Standar dContext   start   SEVERE:   Exception   during cl eanup afte r start f ailed   java.lang .OutOfMem oryError:   PermGen s pace   at  sun.mi sc.Unsafe.def ineClass(Nati ve Method)   org.ap ache.catalin a.core.Contai nerBase$Conta inerBackgroun dProcessor.ru n(ContainerBa se.java:1642)   at   java.lang.Th read.run(Thre ad.java:745) ",
   "@version" => "1",
   "tags" => [
   [0] "multiline"
   ],
   "host" => "127.0.0.1:56123",
   "type" => "javaStackTraceMultiline",
   "timestamp" => "Jan 12, 2015 9:51:08 AM",
   "class" => "org.apache.catalina.loader.WebappClassLoader",
   "logmessage" => "validateJarFile"
}


Table 3: Records created when using multiline codec in input section

Issues with the script

While this is the most common suggestion, there are a few issues with this solution. As shown in the output, the script has created a single JSON object containing all lines from the input file. Ideally, each line that begins with a timestamp should become a separate record, and a separate record should be created for the Java stack trace information. The problem is that the pattern '^\s*' reads extra information: because '*' allows zero occurrences, lines with no leading whitespace also match, so every line is appended to the previous one.

Another issue with the script relates to the buffer used for storing log data. Logstash has a limited buffer (whose size can be changed) for storing the input data. If the Java stack trace spans many lines, the concatenation of multiple lines into one can cause this size to be exceeded, splitting the merged event. Such splitting is not consistent, and data continuity is lost to the extent that it is not possible to extract meaningful information from the generated output. As an example, if Logstash exceeds its buffer size while parsing 'at java.lang.Class.forName(Class.java:274)', the strings inserted into Elasticsearch (as records) could be 'at', 'java.', 'lang.Class.forName' and '(Class.java:274)'. It should be noted that the behaviour is not consistent, in that the first record contains two characters, the second contains five characters, and so on. The result depends on the way the data is processed in the 'filter' section of the Logstash script.

Approach 2: using multiline in filter

Another common suggestion for parsing a Java stack trace is to use the 'multiline' filter in the 'filter' section of the script. As in approach 1, 'multiline' reads multiple lines of input as one block of text. The Logstash script using 'multiline' in the 'filter' section is shown in Table 4.

input {
  tcp {
    type => "javaStackTraceMultiline"
    port => 9999
  }
}

filter {
  if [type] == "javaStackTraceMultiline" {
    multiline {
      patterns_dir => "./patterns"
      pattern => "(^%{TOMCAT_DATESTAMP})|(^%{CATALINA_DATESTAMP})"
      negate => true
      what => "previous"
    }

    grok {
      patterns_dir => "./patterns"
      match => [ "message", "%{TOMCATLOG}", "message", "%{CATALINALOG}" ]
    }

    date {
      match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS Z", "MMM dd, yyyy HH:mm:ss a" ]
    }

    mutate {
      gsub => ['message', "\n", " ",
               'message', "\t", " "]
    }
  }
}

output {
  stdout {
    codec => "rubydebug"
  }
}


Table 4: Using multiline in filter

The records generated when the script is executed on the sample data are shown in Table 5. Using this approach, multiple records are created, as expected.

{
   "message" => "Jan 12, 2015 9:51:08 AM org.apache.catalina.loader.WebappClassLoader validateJarFile\nINFO: validateJarFile(C:\\Apps\\tomcat-6.0.41\\webapps\\Recommendation\\WEB-INF\\lib\\servlet-api.jar) - jar not loaded. See ServletSpec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class",
   "@version" => "1",
   "@timestamp" => "2015-01-12T04:21:08.000Z",
   "host" => "127.0.0.1:56171",
   "type" => "javaStackTraceMultiline",
   "tags" => [
   [0] "multiline"
   ],
   "timestamp" => "Jan 12, 2015 9:51:08 AM",
   "class" => "org.apache.catalina.loader.WebappClassLoader",
   "logmessage" => "validateJarFile"
}

{
   "message" => "Jan 12, 2015 9:51:18 AM org.apache.catalina.core.StandardContext start\nSEVERE: Error listenerStart",
   "@version" => "1",
   "@timestamp" => "2015-01-12T04:21:18.000Z",
   "host" => "127.0.0.1:56171",
   "type" => "javaStackTraceMultiline",
   "tags" => [
   [0] "multiline"
   ],
   "timestamp" => "Jan 12, 2015 9:51:18 AM",
   "class" => "org.apache.catalina.core.StandardContext",
   "logmessage" => "start"
}

{
   "message" => "Jan 12, 2015 9:51:18 AM org.apache.catalina.core.StandardContext start\nSEVERE: Context [/Recommendation] startup failed due to previous errors",
   "@version" => "1",
   "@timestamp" => "2015-01-12T04:21:18.000Z",
   "host" => "127.0.0.1:56171",
   "type" => "javaStackTraceMultiline",
   "tags" => [
   [0] "multiline"
   ],
   "timestamp" => "Jan 12, 2015 9:51:18 AM",
   "class" => "org.apache.catalina.core.StandardContext",
   "logmessage" => "start"
}

{
   "message" => "Jan 12, 2015 9:51:19 AM org.apache.catalina.loader.WebappClassLoader clearReferencesJdbc\nWARNING:JDBC driver de-registration failed for web application [/Recommendation]\njava.lang.reflect.InvocationTargetException\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:606)\n\tat org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)\n\tat java.lang.Thread.run(Thread.java:745)\nCaused by: java.lang.OutOfMemoryError: PermGen space\n\tat java.lang.ClassLoader.defineClass(ClassLoader.java:800)\n\tat org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1647)\n\tat java.lang.Class.forName(Class.java:274)\n\tat org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779)\n... 1 more\n",

   "@version" => "1",
   "@timestamp" => "2015-01-12T04:21:19.000Z",
   "host" => "127.0.0.1:56171",
   "type" => "javaStackTraceMultiline",
   "tags" => [
   [0] "multiline"
   ],
   "timestamp" => "Jan 12, 2015 9:51:19 AM",
   "class" => "org.apache.catalina.loader.WebappClassLoader",
   "logmessage" => "clearReferencesJdbc"
}


Table 5: Records created when using multiline in filter section

Issues with the script

This script also has issues. One issue with this method is that it skips log information if an empty line does not separate the previous stack trace from a new line containing log information. For example, for the data shown in Table 6, the data following the line '... 1 more' is not parsed properly. Thus, this script is also not without drawbacks, as we would typically not want incomplete information when performing analysis.

. . . (omitted)
   at java.lang.Class.forName(Class.java:274)
   at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779)
... 1 more
Jan 12, 2015 9:51:19 AM org.apache.catalina.core.StandardContext start
SEVERE: Exception during cleanup after start failed
java.lang.OutOfMemoryError: PermGen space
   at sun.misc.Unsafe.defineClass(Native Method)
. . . (omitted)


Table 6: Empty line not present in log

Approach 3: using multiple grok statements

To overcome the problems described in the previous sections, we propose a method that treats each line of the Java stack trace individually, without merging the lines into one. The script uses multiple grok statements to parse the trace information. As the script stores each line as a separate record in Elasticsearch, we use one pattern per type of input line. For this script, we have defined a few custom patterns, as shown in Table 7.

Input line | Pattern used
Jan 12, 2015 9:51:08 AM org.apache.catalina.loader.WebappClassLoader validateJarFile | CUSTOM_CATALINALOG
SEVERE: Error listenerStart | CUSTOM_WARNINGLEVEL
WARNING: JDBC driver de-registration failed for web application [/Recommendation] | CUSTOM_WARNINGLEVEL
Caused by: java.lang.OutOfMemoryError: PermGen space | CUSTOM_TRACE_CAUSED_BY
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) | CUSTOM_TRACE_OTHERS
java.lang.reflect.InvocationTargetException | CUSTOM_TRACE_EXCEPTION
... 1 more | CUSTOM_TRACE_OTHERS
Any other text | GREEDYDATA

Table 7: Patterns used

The script used to parse the log data is shown in Table 8. As depicted, we use multiple 'grok' statements, one for each type of input data. In addition to the parsed log data, each 'grok' statement adds an extra field, namely 'subType', that indicates the kind of record. As seen in the records in Table 9, the values for the 'subType' field include 'timestamp', 'warning', 'exception', 'cause', 'continuation' and 'unrecognizedText'.
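A minimal sketch of such a filter chain is given below, reconstructed from the patterns in Table 10 and the 'subType' values in Table 9; the exact field handling in the original script may differ. Each 'grok' statement is tried only if the previous one failed, which Logstash signals with the '_grokparsefailure' tag.

input {
  tcp {
    type => "javaStackTrace"
    port => 9999
  }
}

filter {
  if [type] == "javaStackTrace" {
    # Try the timestamped Catalina log line first.
    grok {
      patterns_dir => "./patterns"
      match => [ "message", "%{CUSTOM_CATALINALOG:catalinaLog}" ]
      add_field => [ "subType", "timestamp" ]
    }
    # On failure, try a warning-level line (INFO:, SEVERE:, WARNING:, ...).
    if "_grokparsefailure" in [tags] {
      mutate { remove_tag => [ "_grokparsefailure" ] }
      grok {
        patterns_dir => "./patterns"
        match => [ "message", "%{CUSTOM_WARNINGLEVEL:warningLevel}%{JAVALOGMESSAGE:messageText}" ]
        add_field => [ "subType", "warning" ]
      }
    }
    # Stages for 'exception', 'cause' and 'continuation' lines follow the
    # same shape, using CUSTOM_TRACE_EXCEPTION, CUSTOM_TRACE_CAUSED_BY and
    # CUSTOM_TRACE_OTHERS respectively (omitted here for brevity).
    # Finally, catch anything that no other pattern recognized.
    if "_grokparsefailure" in [tags] {
      mutate { remove_tag => [ "_grokparsefailure" ] }
      grok {
        match => [ "message", "%{GREEDYDATA:unrecognizedText}" ]
        add_field => [ "subType", "unrecognizedText" ]
      }
    }
  }
}

output {
  stdout {
    codec => "rubydebug"
  }
}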

Table 8: Parsing input into separate records

The records generated by the script for the sample log data are shown in Table 9.

{
   "message" => "Jan 12, 2015 9:51:08 AM org.apache.catalina.loader.WebappClassLoader validateJarFile",
   "@version" => "1",
   "@timestamp" => "2015-04-06T05:26:31.165Z",
   "host" => "127.0.0.1:56016",
   "type" => "javaStackTrace",
   "catalinaLog" => "Jan 12, 2015 9:51:08 AM org.apache.catalina.loader.WebappClassLoader validateJarFile",
   "timestamp" => "Jan 12, 2015 9:51:08 AM",
   "class" => "org.apache.catalina.loader.WebappClassLoader",
   "messageText" => "validateJarFile",
   "subType" => "timestamp",
   "uploadTimestamp" => "2015-04-06T05:26:31.165Z",
   "uploadTimestampString" => "2015-04-06 05:26:31 UTC"
}

{
   "message" => "INFO: validateJarFile(C:\\Apps\\tomcat-6.0.41\\webapps\\Recommendation\\WEB-INF\\lib\\servlet-api.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class",
   "@version" => "1",
   "@timestamp" => "2015-04-06T05:26:31.194Z",
   "host" => "127.0.0.1:56016",
   "type" => "javaStackTrace",
   "tags" => [],
   "warningLevel" => "INFO:",
   "messageText" => "validateJarFile(C:\\Apps\\tomcat-6.0.41\\webapps\\Recommendation\\WEB-INF\\lib\\servlet-api.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class",
   "subType" => "warning",
   "uploadTimestamp" => "2015-04-06T05:26:31.194Z",
   "uploadTimestampString" => "2015-04-06 05:26:31 UTC"
}

{
   "message" => "java.lang.reflect.InvocationTargetException",
   "@version" => "1",
   "@timestamp" => "2015-04-06T05:26:31.382Z",
   "host" => "127.0.0.1:56016",
   "type" => "javaStackTrace",
   "tags" => [],
   "exceptionText" => "java.lang.reflect.InvocationTargetException",
   "subType" => "exception",
   "uploadTimestamp" => "2015-04-06T05:26:31.382Z",
   "uploadTimestampString" => "2015-04-06 05:26:31 UTC"
}

{
   "message" => " at java.lang.Thread.run(Thread.java:745)",
   "@version" => "1",
   "@timestamp" => "2015-04-06T05:26:31.598Z",
   "host" => "127.0.0.1:56016",
   "type" => "javaStackTrace",
   "tags" => [],
   "messageText" => " at java.lang.Thread.run(Thread.java:745)",
   "subType" => "continuation",
   "uploadTimestamp" => "2015-04-06T05:26:31.598Z",
   "uploadTimestampString" => "2015-04-06 05:26:31 UTC"
}

{
   "message" => "Caused by: java.lang.OutOfMemoryError: PermGen space",
   "@version" => "1",
   "@timestamp" => "2015-04-06T05:26:31.642Z",
   "host" => "127.0.0.1:56016",
   "type" => "javaStackTrace",
   "tags" => [],
   "causedbyText" => "Caused by: java.lang.OutOfMemoryError: PermGen",
   "messageText" => "space",
   "subType" => "cause",
   "uploadTimestamp" => "2015-04-06T05:26:31.642Z",
   "uploadTimestampString" => "2015-04-06 05:26:31 UTC"
}

{
   "message" => "... 1 more",
   "@version" => "1",
   "@timestamp" => "2015-04-06T05:26:31.846Z",
   "host" => "127.0.0.1:56016",
   "type" => "javaStackTrace",
   "tags" => [],
   "unrecognizedText" => "... 1 more",
   "subType" => "unrecognizedText",
   "uploadTimestamp" => "2015-04-06T05:26:31.846Z",
   "uploadTimestampString" => "2015-04-06 05:26:31 UTC"
}


Table 9: Separate records created from log data

Patterns

The Logstash script used to parse the Java stack trace uses some standard patterns. Standard patterns are supplied with the Logstash installation and are stored in the 'grok-patterns' file (in the 'patterns' directory under the Logstash installation). We have also defined a few custom patterns, which have been appended to the standard pattern file. The custom pattern definitions are given in Table 10.

JAVACLASS (?:[a-zA-Z0-9-]+\.)+[A-Za-z0-9$]+
JAVALOGMESSAGE (.*)
# MMM dd, yyyy HH:mm:ss eg: Jan 9, 2014 7:13:13 AM
CATALINA_DATESTAMP %{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) (?:AM|PM)
CUSTOM_WARNINGLEVEL (^SEVERE:)|(^INFO:)|(^WARNING:)|(^ERROR:)|(^DEBUG:)|(^Error:)
CUSTOM_TRACE_ERROR (^.+Error:)
CUSTOM_TRACE_EXCEPTION (^java.+Exception)|(^.+Exception.+)|(^.+Exception: .+)
CUSTOM_TRACE_CAUSED_BY (^\s*Caused by:.+)
CUSTOM_TEXT (?:[A-Za-z_.-]+)
CUSTOM_TRACE_OTHERS (^\s+at .+)|(^\s+... \d+ more)
CUSTOM_CATALINALOG %{CATALINA_DATESTAMP:timestamp} %{JAVACLASS:class} %{JAVALOGMESSAGE:messageText}


Table 10: Pattern definitions

Observations

While the script described in approach 3 does not leave any data unparsed, it also has a few quirks.

Order of 'grok' statements

The first quirk relates to the use of 'grok' statements. When using multiple 'grok' statements, the order in which they are specified is very important. If a line of data can be matched by more than one pattern, we need to determine which pattern best describes the log data currently being parsed. For example, a line in the log file may contain the word 'Exception' either as standalone text or as part of a named Java exception. If we have two patterns that match the text 'Exception', we need to decide which pattern takes precedence. In the most likely case, we would want 'Exception' occurring as part of a named Java exception to take precedence over 'Exception' as a word by itself.
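To illustrate the ordering, the sketch below (field names are illustrative) tries the more specific exception pattern before falling back to the catch-all:

filter {
  # More specific first: a named Java exception such as
  # 'java.lang.reflect.InvocationTargetException'.
  grok {
    patterns_dir => "./patterns"
    match => [ "message", "%{CUSTOM_TRACE_EXCEPTION:exceptionText}" ]
    add_field => [ "subType", "exception" ]
  }
  # Only if that fails, fall back to the catch-all, which would also
  # match a line that merely contains the word 'Exception'.
  if "_grokparsefailure" in [tags] {
    mutate { remove_tag => [ "_grokparsefailure" ] }
    grok {
      match => [ "message", "%{GREEDYDATA:unrecognizedText}" ]
      add_field => [ "subType", "unrecognizedText" ]
    }
  }
}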

'Related' records

The second quirk concerns 'relatedness'. As part of machine log analytics and related reporting, it is important to display all information in a coherent manner. Using the script described in approach 3, we split the information from the stack trace into multiple records that are not connected to each other.

Retrieving related records

Once all the data has been inserted into Elasticsearch, the only way to identify all the records that belong to one stack trace is to make use of the '@timestamp' field. To get all the records, we need to take the timestamp of the first record of the trace and fetch all records whose timestamp value is close to it. To make that task easier, we have added a helper field, namely 'subType', to each record. Thus, using a combination of the '@timestamp' and 'subType' fields, we are able to fetch the necessary records. It should be noted that '@timestamp' is a special field that is difficult to map to a Java class from its JSON equivalent. Hence we have added two fields representing the same value, namely 'uploadTimestamp' and 'uploadTimestampString' (see Table 9). The value of both fields is the same as that of '@timestamp', indicating the time when the record was parsed and uploaded to Elasticsearch. The field 'uploadTimestamp' is of type 'Date', while 'uploadTimestampString' is of type 'String'.
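One way such helper fields might be populated is with a 'ruby' filter. This is a sketch, not the original script; the Logstash 1.x event API and the strftime format are assumptions chosen to match the output in Table 9.

filter {
  # Copy the event time into plain fields that are easier to map to a
  # Java class than the special '@timestamp' field (illustrative only).
  ruby {
    code => "
      event['uploadTimestamp'] = event['@timestamp']
      event['uploadTimestampString'] = event['@timestamp'].time.strftime('%Y-%m-%d %H:%M:%S UTC')
    "
  }
}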

Conclusion

In this article, we have presented a few of the approaches commonly used to parse Java stack trace information. Most of these approaches try to merge all the lines that belong to the stack trace into one block of information. Due to the problems associated with those approaches, we suggest that it is better to treat the stack trace as individual pieces of information and to parse each line as a separate record. However, with one record per line, our business logic and search criteria have to become 'intelligent' when extracting all the records that match our analysis and reporting criteria.
