LogAnalysis
> BTW, here is a patent for log management, which (among other things)
> "explains" how to "parse" unknown logs, apparently with no manually
> written regexes in sight...

http://www.freshpatents.com/System-and-method-for-analysis-and-management-of-logs-and-events-dt20060817ptan20060184529.php?type=description

> "[0031] Another preferred embodiment of the present invention
> describes a method for parsing log data with undefined grammar. The
> method comprises the following steps: a) storing more than one pattern
> object record of different grammar types, b) receiving at least a
> portion of raw log data input from at least one computerized system,
> c) identifying the delimiter of the portion of raw log data's grammar,
> d) using the delimiter for generating a new pattern object
> representing the grammar type of the log data, the new pattern object
> comprising a list of terms, and e) storing the new pattern object."

Sounds like a standard tokenization methodology. Other network vendors have
implemented similar methods using dynamic token dictionaries of byte
streams. The same approach can be applied to log messages: you identify log
messages not by a regex but by the token values within the message, the
"grammar" of the tokens, etc. You can have dictionary tokens, grammar
tokens, tokens-of-tokens, etc.
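
To make that concrete, here is a rough Python sketch of the idea, not the patent's method or any vendor's implementation. It guesses the delimiter of a line, splits the line into tokens, keeps known "dictionary" tokens literally, maps everything else to a coarse class, and uses the resulting tuple as the message's pattern. The candidate delimiter list, the class names (<NUM>, <IP>, <VAR>), the PatternStore class and the sample sshd-style lines are all assumptions made up for illustration.

```python
# Hypothetical candidate delimiters; a real system would likely consider more.
CANDIDATE_DELIMITERS = [" ", ",", "|", "\t", ";"]


def guess_delimiter(line):
    """Pick the candidate delimiter that occurs most often in the line."""
    best = max(CANDIDATE_DELIMITERS, key=line.count)
    # Fall back to a space if none of the candidates appears at all.
    return best if line.count(best) > 0 else " "


def classify(token, dictionary):
    """Map a token to itself (dictionary token) or to a coarse class."""
    if token in dictionary:
        return token  # known keyword: part of the message's "grammar"
    if token.isdigit():
        return "<NUM>"
    parts = token.split(".")
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        return "<IP>"  # looks like a dotted quad; no range checking
    return "<VAR>"  # anything else is treated as a variable field


def signature(line, dictionary):
    """Return (delimiter, token classes) as the pattern of one log line."""
    delim = guess_delimiter(line)
    tokens = [t for t in line.split(delim) if t]
    return delim, tuple(classify(t, dictionary) for t in tokens)


class PatternStore:
    """Assigns an id to each distinct signature, storing new ones as seen."""

    def __init__(self, dictionary):
        self.dictionary = set(dictionary)
        self.patterns = {}  # signature -> pattern id

    def identify(self, line):
        sig = signature(line, self.dictionary)
        if sig not in self.patterns:
            self.patterns[sig] = len(self.patterns) + 1  # new pattern object
        return self.patterns[sig]


# Made-up sample lines, for illustration only.
store = PatternStore({"sshd", "Accepted", "Failed", "password",
                      "for", "from", "port"})
a = store.identify("sshd Accepted password for alice from 10.0.0.5 port 52011")
b = store.identify("sshd Accepted password for bob from 192.168.1.9 port 40022")
c = store.identify("sshd Failed password for bob from 192.168.1.9 port 40022")
print(a, b, c)
```

The first two lines differ only in their variable fields, so they end up with the same pattern id; the third has a different dictionary token ("Failed") and gets a new one. No regex is written by hand, but note that someone still has to supply or learn the dictionary.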
_______________________________________________
LogAnalysis mailing list
LogAnalysis (at) loganalysis (dot) org
http://www.loganalysis.org/mailman/listinfo/loganalysis