LogAnalysis
Re: Re: [logs] regexless parsing, again? Sep 15 2007 08:40PM
Marcus J. Ranum (mjr ranum com) (1 replies)
Re: Re: [logs] regexless parsing, again? Sep 16 2007 05:42AM
Tom Le (dottom gmail com) (1 replies)
RE: Re: [logs] regexless parsing, again? Sep 18 2007 03:06AM
Desai, Ashish (Ashish Desai fmr com) (1 replies)
There have been some amazing advances in hardware for PCRE matching.
Once you max out a regular CPU, you can consider offloading this to a
card.
Check out Tarari http://www.tarari.com/PDF/Tarari-T9000-CP_PB.pdf
They have an API that:
1. Allows you to load up the chip with all the regexes you desire.
2. Then lets you blast in the content you want to test, and out comes a list
of all the matches.
The speeds are pretty incredible (at least on paper), so that even
writing the crappiest regexes you
would have a hard time hitting the system maximum.
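
The two steps above (load every rule once, then push content through and collect all matches) can be sketched in software. This is only an illustration of the workflow, not the Tarari API: the `PatternSet` class, its `load` and `scan` methods, the rule names, and the sample log line are all hypothetical, and Python's `re` module stands in for the hardware matcher.

```python
import re


class PatternSet:
    """Hypothetical software stand-in for a 'load rules, then scan' matcher."""

    def __init__(self):
        # Each entry is (rule_id, compiled_pattern).
        self._rules = []

    def load(self, rule_id, pattern):
        # Step 1: compile the regex once, up front, and keep it loaded.
        self._rules.append((rule_id, re.compile(pattern)))

    def scan(self, content):
        # Step 2: run the content past every loaded rule and return
        # (rule_id, start, end) for each match, ordered by start offset.
        hits = []
        for rule_id, compiled in self._rules:
            for m in compiled.finditer(content):
                hits.append((rule_id, m.start(), m.end()))
        hits.sort(key=lambda h: (h[1], h[0]))
        return hits


rules = PatternSet()
rules.load("ssh_fail", r"Failed password for \w+")
rules.load("ipv4", r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

line = "sshd[411]: Failed password for root from 10.0.0.5 port 22"
matches = rules.scan(line)
for rule_id, start, end in matches:
    print(rule_id, repr(line[start:end]))
```

Note that this loop tries each rule in turn, so its cost grows with the number of rules; the appeal of an offload card as described above is that all loaded rules are evaluated against the content in one pass.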

Ashish

_____

From: loganalysis-bounces (at) loganalysis (dot) org [email concealed]
[mailto:loganalysis-bounces (at) loganalysis (dot) org [email concealed]] On Behalf Of Tom Le

Heh. Just making sure we didn't trivialize the fact that one
can still maintain the more traditional ways of building regex rules and
still achieve significant performance gains. Scale should be mentioned
here. If you go from parsing 1000 msgs/sec => 10,000 msgs/sec that
might be great for you, but insignificant for others. YMMV.

More like: "Marcus, you should separate discussion of regexes
vs. other parsing approaches into separate categories: performance,
initial ruleset development cost, and on-going maintenance."

Each discussion has its pros and cons with different cost(x) *
complexity(y) functions depending on what you're doing and the size of
your rulesets. I was just trying to explore a deeper level of
discussion than the usual 'regexes suck' or 'PCRE performance sucks' or
'maintaining 100,000 rules is ugly' type discussions.

Note, however, that I will reserve the right to use parts of
your above quote in the future. :)

_______________________________________________
LogAnalysis mailing list
LogAnalysis (at) loganalysis (dot) org [email concealed]
http://www.loganalysis.org/mailman/listinfo/loganalysis

Re: Re: [logs] regexless parsing, again? Sep 18 2007 10:24AM
Andrew Hay (andrewsmhay gmail com)


 

Copyright 2010, SecurityFocus