LogAnalysis
Re: [logs] ugliest application logs ever? Jan 25 2008 07:09AM
Fredrik Bengtsson (fredrik bengtsson fortego se)
Personally, I dislike the regex approach for all but the most
simplistic formats and instead use customized parsers that can deal with
any quirks of the format in question easily, through well-documented
code. They usually don't end up being that many lines of code anyway
(and they run a whole lot faster), especially with a decent library of
support routines that grows with time.

Take a good NVP (name-value pair) parser, for example - I added the option
bits MULTIWORD_KEY and MULTIWORD_VALUE to ours so it can deal with both
(Fortigate)

[...] Original Address=172.10.10.1 [...]

and (Clavister?)

[...] flags=SYN ACK [...]

(but not both at the same time, obviously). Plus quoted keys and values
of course. Once you've written this, parsers for name-value formats
usually just contain a long list of lines like

VALIDATE_RESULT(Parser::FetchStringParameter("recvif", &ptr, &rint));
VALIDATE_RESULT(Parser::FetchIPv4Parameter("srcip", &ptr, &cip));
VALIDATE_RESULT(Parser::FetchIPv4Parameter("destip", &ptr, &sip));

where these helpers use the underlying ParseSingleKeyValuePair
iterator-like call and do additional value validation where necessary.
That way, you can also get error results like "Expected 'recvif'
parameter at character position 24, found 'gengis khan'" which is
usually infinitely more useful than most regex parser error messages.
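
To make the option bits and the positional error message concrete, here
is a minimal sketch of how such a parser could work. It is not the parser
described above: ParseNvp, NvpError and the NVP_* constants are
hypothetical names invented for this example, and the rule that a bare
word either extends the next key or the previous value is an assumption
about what MULTIWORD_KEY and MULTIWORD_VALUE might mean.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical option bits, modelled on the ones named in the post.
enum NvpOptions : unsigned {
    NVP_NONE            = 0,
    NVP_MULTIWORD_KEY   = 1u << 0,  // "Original Address=172.10.10.1"
    NVP_MULTIWORD_VALUE = 1u << 1   // "flags=SYN ACK"
};

// Hypothetical error type carrying a human-readable, positional message.
struct NvpError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

using NvpPair = std::pair<std::string, std::string>;

// Splits a line into key/value pairs. Double quotes protect spaces and are
// stripped from the result. A bare word (no '=') is treated as part of the
// next key (MULTIWORD_KEY), part of the previous value (MULTIWORD_VALUE),
// or reported as an error with its character position.
std::vector<NvpPair> ParseNvp(const std::string& line, unsigned options) {
    const bool multiKey = (options & NVP_MULTIWORD_KEY) != 0;
    const bool multiValue = (options & NVP_MULTIWORD_VALUE) != 0;
    if (multiKey && multiValue)
        throw std::invalid_argument("multiword keys and values are mutually exclusive");

    std::vector<NvpPair> pairs;
    std::string pendingKey;  // bare words waiting to become a key prefix
    std::size_t pos = 0;
    while (pos < line.size()) {
        if (line[pos] == ' ') { ++pos; continue; }
        // Read one token; spaces inside double quotes do not end it.
        const std::size_t start = pos;
        std::string token;
        bool inQuotes = false;
        for (; pos < line.size() && (inQuotes || line[pos] != ' '); ++pos) {
            if (line[pos] == '"') inQuotes = !inQuotes;
            else token += line[pos];
        }
        const std::size_t eq = token.find('=');
        if (eq != std::string::npos) {
            std::string key = token.substr(0, eq);
            if (!pendingKey.empty()) {
                key = pendingKey + " " + key;
                pendingKey.clear();
            }
            pairs.emplace_back(key, token.substr(eq + 1));
        } else if (multiKey) {
            if (!pendingKey.empty()) pendingKey += ' ';
            pendingKey += token;
        } else if (multiValue && !pairs.empty()) {
            pairs.back().second += " " + token;
        } else {
            throw NvpError("Expected key=value at character position " +
                           std::to_string(start) + ", found '" + token + "'");
        }
    }
    assert(pos == line.size());
    if (!pendingKey.empty())
        throw NvpError("Dangling key words at end of line: '" + pendingKey + "'");
    return pairs;
}
```

With NVP_MULTIWORD_KEY, a line such as "proto=6 src zone=Trust" yields the
key "src zone"; with NVP_NONE the same line raises an NvpError that names
the offending word and where it starts.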

Makes sense?

/Fredrik

David Corlette wrote:
> We just do a replace on those before we do the NVP parse. E.g.:
>
> src zone --> src_zone
> dst zone --> dst_zone
>
> Then we can run our standard NVP parser routine and it works like a charm...
>
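
The replace-then-parse approach quoted above could look roughly like the
sketch below. The helper names (ReplaceAll, NormalizeKeys, SplitPairs) and
the rewrite table are hypothetical, and anchoring each rewrite on the
trailing '=' is an assumption added here so that only keys are touched.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, std::string>;

// Replaces every occurrence of `from` in `text` with `to`.
std::string ReplaceAll(std::string text, const std::string& from, const std::string& to) {
    assert(!from.empty());
    for (std::size_t pos = text.find(from); pos != std::string::npos;
         pos = text.find(from, pos + to.size())) {
        text.replace(pos, from.size(), to);
    }
    return text;
}

// Rewrites the known multi-word keys into single-word keys.
// The table is illustrative; a real one would list every such key per device.
std::string NormalizeKeys(std::string line) {
    static const std::vector<KeyValue> rewrites = {
        {"src zone=", "src_zone="},
        {"dst zone=", "dst_zone="},
    };
    for (const auto& r : rewrites)
        line = ReplaceAll(std::move(line), r.first, r.second);
    return line;
}

// Plain whitespace-delimited name=value split; tokens without '=' are skipped.
std::vector<KeyValue> SplitPairs(const std::string& line) {
    std::vector<KeyValue> pairs;
    std::istringstream in(line);
    std::string token;
    while (in >> token) {
        const std::size_t eq = token.find('=');
        if (eq != std::string::npos)
            pairs.emplace_back(token.substr(0, eq), token.substr(eq + 1));
    }
    return pairs;
}
```

Calling SplitPairs(NormalizeKeys(line)) on the Netscreen fragment quoted
below then gives ordinary single-word keys such as src_zone and dst_zone.
The trade-off is that every multi-word key has to be known in advance.
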
>>>> On Thu, Jan 24, 2008 at 5:20 PM, in message
> <47990F41.2040603 (at) packetnexus (dot) com [email concealed]>, Jason Lewis <jlewis (at) packetnexus (dot) com [email concealed]> wrote:
>
>> Except they didn't standardize the keys....
>>
>> proto=6 src zone=Trust dst zone=Untrust action=Permit
>>
>> There is a space before zone that hoses things up.
>>
>> Dilley, Ron wrote:
>>> Jas,
>>>
>>> This does not look too bad as long as you don't use regex to parse it.
>>>
>>> Key=value all the way . . .
>>>
>>> Ron
>>>
>>>
>>>
>>> On 1/24/08 11:52 AM, "Jason Lewis" <jlewis (at) packetnexus (dot) com [email concealed]> wrote:
>>>
>>> I don't know about ugly, but logs that are difficult to parse suck.
>>>
>>> Netscreen:
>>> messages:Dec 17 09:35:27 10.14.93.7 ns5xp: NetScreen device_id=ns5xp
>>> system-notification-00257(traffic): start_time="2002-12-17 09:40:18"
>>> duration=4 policy_id=0 service=tcp/port:8000 proto
>>> =6 src zone=Trust dst zone=Untrust action=Permit sent=715 rcvd=6561
>>> src=10.14.94.221 dst=10.14.90.217 src_port=1039 dst_port=8000
>>> translated
>>> ip=10.14.93.7 port=1217
>>> messages:Dec 17 09:35:27 10.14.93.7 ns5xp: NetScreen device_id=ns5xp
>>> system-notification-00257(traffic): start_time="2002-12-17 09:40:18"
>>> duration=4 policy_id=0 service=tcp/port:8000 proto
>>> =6 src zone=Trust dst zone=Untrust action=Permit sent=651 rcvd=2782
>>> src=10.14.94.221 dst=10.14.90.217 src_port=1040 dst_port=8000
>>> translated
>>> ip=10.14.93.7 port=1218
>>>
>>> There isn't a good delimiter to break the log up, so it requires a
>>> custom regex. Trying to use a space is a nightmare. Give me something
>>> so I can quickly grab only what I need. I like pipe delimited.
>>>
>>> jas
>>>
>>>
>>> Anton Chuvakin wrote:
>>> > All,
>>> >
>>> > Ah, long time - no post! :-)
>>> >
>>> > I wanted to turn this into a formal contest but figured I'd poll the
>>> > list first: what are the ugliest, most useless application logs that
>>> > you've seen? Logs that defy log analysis, that are full of numeric
>>> > codes not explained anywhere? Logs that don't say what they mean (and
>>> > vice versa)? Logs that omit the most critical piece of info?
>>> >
>>> > Here is my example:
>>> >
>>> > |22:22:32|BTC| 7|000|DDIC | |R49|Communication error, CPIC
>>> > return code 020, <application> return code 456
>>> >
>>> > Why it sux: numeric codes (twice), ambiguous language, no sense of
>>> > priority, etc.
>>> >
>>> > More?
>>> >
>>> > Best,
_______________________________________________
LogAnalysis mailing list
LogAnalysis (at) loganalysis (dot) org [email concealed]
http://www.loganalysis.org/mailman/listinfo/loganalysis

Copyright 2010, SecurityFocus