LogAnalysis
Re: [logs] regexless parsing, again? Sep 13 2007 10:51PM
Marcus J. Ranum (mjr ranum com) (1 replies)
Re: [logs] regexless parsing, again? Sep 14 2007 12:12AM
E G (bronc94583 yahoo com) (1 replies)
Re: [logs] regexless parsing, again? Sep 14 2007 07:30PM
Christina Noren (cfrln cfrln com) (1 replies)
I think you need to distinguish between the two different goals of
parsing in order to have a productive discussion of this:

1) Classifying log messages by whether they match a specific
pattern. The approach below does this better than regexes, and
Splunk's automatic event type classification feature works somewhat
similarly.
2) Pulling out and naming fields within the log data. Here too there
are possible approaches other than regexes. However, unless the log
format is self-describing (XML, name/value pairs, a CSV header,
etc.), any approach necessarily relies on pattern matching of some
sort. (Splunk also guesses at fields for such well-defined formats.)
But the pattern matching could be less cryptic and smarter about the
patterns that actually occur in logs, as Marcus mentions in his last
post.
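
For the self-describing case in (2), here is a minimal sketch of
pulling name/value pairs out of a line without any regex. The function
name, the sample line, and the choice of shlex for quote handling are
assumptions for illustration only; this is not how Splunk or any other
product does it.

```python
import shlex

def extract_fields(line):
    """Split a log line into name=value fields plus leftover tokens.

    Hypothetical helper: assumes whitespace-separated tokens and
    shell-style quoting for values that contain spaces. shlex.split
    raises ValueError on unbalanced quotes, so real log data would
    need a fallback path.
    """
    fields = {}
    free_text = []
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value       # looks like name=value
        else:
            free_text.append(token)   # anything else stays unparsed
    return fields, free_text

# Made-up sample line, not taken from a real log source.
fields, rest = extract_fields(
    'Sep 14 sshd: action=deny src=10.0.0.5 user="bob smith"'
)
```

Anything that is not a name=value token is handed back untouched, which
is where some other kind of pattern matching would still be needed.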

Repeating Raffy's disclaimer, I also work at Splunk.

Christina

On Sep 13, 2007, at 5:12 PM, E G wrote:

> When I worked at "another company" a few years
> back, I did a lot of research into this area.
>
> We looked at an approach that grouped logs together
> based upon what we already knew about that type of log
> source and how they are similar, rather than
> "guessing" what each line was as it came in.
>
> This came out of quite a bit of statistical analysis
> on raw log data. I noted quite a bit of correlation
> from source to source (which in itself isn't news),
> and that correlation would allow us to classify
> unknown data in some semi-intelligent way and dump
> known entities into known "buckets".
>
> Working with some people who were much smarter than I,
> I was able to create a reverse Patricia trie-like
> structure. Think of it like typing on your BlackBerry:
> it attempts to predict the next letter and tries to
> complete the word you're typing for you. The same
> logic can basically work in reverse, where you use
> this trie structure to disassemble a word, or a string
> in our case. Once you reach an end point on the trie,
> you are left with what the data is, however you have
> decided to classify it.
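
The trie walk described above might look roughly like the sketch below.
It is an illustration only, not Erik's code: it uses a plain
(uncompressed) token-level trie rather than a Patricia trie, and the
names TrieNode, add_pattern, classify, and the "*" wildcard token are
all assumptions.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # token -> TrieNode
        self.label = None    # set where a known pattern ends

WILDCARD = "*"  # hypothetical marker meaning "any single token"

def add_pattern(root, pattern, label):
    """Insert a whitespace-separated token pattern ending in `label`."""
    node = root
    for token in pattern.split():
        node = node.children.setdefault(token, TrieNode())
    node.label = label

def classify(root, line):
    """Walk the trie one token at a time.

    Returns the label of the deepest pattern end point reached, or
    None if no known pattern matches the start of the line. The walk
    is greedy (exact token first, then wildcard) and never backtracks.
    """
    node = root
    best = None
    for token in line.split():
        nxt = node.children.get(token) or node.children.get(WILDCARD)
        if nxt is None:
            break
        node = nxt
        if node.label is not None:
            best = node.label
    return best

# Made-up patterns and bucket names for illustration.
root = TrieNode()
add_pattern(root, "Failed password for * from *", "auth_failure")
add_pattern(root, "Accepted password for * from *", "auth_success")

known = classify(root, "Failed password for bob from 10.0.0.5 port 22 ssh2")
unknown = classify(root, "Connection closed by 10.0.0.5")
```

Lines that fall off the trie come back as None, which corresponds to
the "unknown data" case that still needs some other treatment.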
>
> I hope that's understandable; I didn't want to write
> out a book.
>
> Anyhow, my ideas didn't end up going anywhere. They
> chose to stay with the regex "guessing" method, as
> is the standard. I had a lot of the code I developed
> after I left up on SourceForge for a while, but real
> life took me away from it. I might be able to dig it
> up if anyone is interested.
>
>
> - Erik
>
>
> --- "Marcus J. Ranum" <mjr (at) ranum (dot) com> wrote:
>
>> Anton Chuvakin wrote:
>>> Anybody care to restart the discussion and see what the collective
>>> wisdom of loganalysis can produce?
>>
>> I am coding on something regarding regexless parsing as we
>> speak. ETA is unknown but certainly before Xmas. It will be
>> open source but not GPL.
>>
>> mjr.
>> _______________________________________________
>> LogAnalysis mailing list
>> LogAnalysis (at) loganalysis (dot) org
>> http://www.loganalysis.org/mailman/listinfo/loganalysis
