Focus on Virus
Extracting signature snippets from AV databases May 08 2006 08:37PM
Bill Stout (bill stout greenborder com) (2 replies)
Re: Extracting signature snippets from AV databases May 09 2006 03:15PM
Nick FitzGerald (nick virus-l demon co uk)
Bill Stout wrote:

> I'd like to create a set of test files containing (harmless) virus (and
> spyware) signatures. Can I extract the signatures from AV databases
> (every PC has one)? I'm thinking open source AV database may be easier
> to extract signatures from than a commercial AV database. If I can
> automate the extraction and file creation, files won't become stale
> because of lag time due to fluctuating interest of the maintainer (me).

This is both doable (to some degree) and problematic (to a very large
degree).

The short answer to your question is that you probably will not be able
to devise a suitably broad set of test cases (though your exact
intentions and requirements are not clear, so...).

> Has this been done already? ...

You've never heard of the Rosenthal Virus Simulator?

Well, all the reasons that was an incredibly dumb and stupid thing back
then largely apply to what you are asking now, with the moderating
effect that you seem to be planning on using this for in-house purposes
only. The wrath that Doren Rosenthal felt against his Virus Simulator
was (largely) due to the fact he made a lot of very mis-representative
claims about it, suggesting it could be used for sensible real-world
product testing and such. That, and the fact that in the later
versions he actually distributed newly written viruses with it...

> ... Are specific signatures a 'secret sauce'?

Generally. Open source products are the obvious exception, but parts
of some commercial products' "signature databases" are fairly well
understood due to reverse engineering, etc...

Note that if your view of modern virus detection is that it is a
somewhat glorified binary grep, you are not ready to start thinking
about approaching this. The term "virus signature" has been a very
poor term since very early in the development of virus scanners.
Depending on the type and nature of file to be scanned, it will pass
through one to several format interpreters, parsers, emulators and so
on, and then some derived string or strings are compared with the "scan
string" database of the scanning product. This is not to say that some
kinds of malware are not (in at least some products) fairly simply
detected via hash-like calculations of parts of the suspect file(s),
but even then, you will find that various blocks of code may have to be
found at various offsets from the file or program head or tail, entry
point, etc, etc.
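
As a rough illustration of that difference, here is a minimal Python
sketch. It is not how any particular product works: the function names,
the harmless demo bytes, the SHA-256 choice and the "entry point" index
are all assumptions made up for the example. It contrasts a plain
"binary grep" with a check that hashes a block at a fixed offset from
an anchor.

```python
import hashlib


def naive_grep(data: bytes, scan_string: bytes) -> bool:
    """'Binary grep': report a hit if the bytes occur anywhere in the file."""
    return scan_string in data


def anchored_hash_match(data: bytes, anchor: int, offset: int,
                        length: int, expected_sha256: str) -> bool:
    """Hash a fixed-length block at `offset` bytes from `anchor`.

    `anchor` stands in for something like an entry point; a real scanner
    would get it from a format parser, which is not modelled here.
    """
    start = anchor + offset
    if start < 0 or start + length > len(data):
        return False
    block = data[start:start + length]
    return hashlib.sha256(block).hexdigest() == expected_sha256


# Hypothetical, harmless stand-in for an extracted "signature snippet".
snippet = b"HARMLESS-DEMO-BYTES"
expected = hashlib.sha256(snippet).hexdigest()

# file_a: the snippet sits 4 bytes past the (made-up) entry point at index 16.
entry_point = 16
file_a = b"\x00" * entry_point + b"\x90" * 4 + snippet + b"\x00" * 8

# file_b: the same snippet merely glued onto the end of unrelated bytes.
file_b = b"\x00" * 64 + snippet

for name, data in (("file_a", file_a), ("file_b", file_b)):
    grep_hit = naive_grep(data, snippet)
    anchored_hit = anchored_hash_match(data, entry_point, 4,
                                       len(snippet), expected)
    print(name, "grep:", grep_hit, "anchored:", anchored_hit)
```

Both files satisfy the grep, but only file_a satisfies the anchored
check, so a test file built by pasting extracted bytes somewhere
convenient may match nothing in a scanner that works along these lines.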

> The primary purpose is to create a test that safely verifies that our
> browser protection product absolutely protects a computer from
> intentional infection.

The problem here is much the same problem of the Rosenthal product.

Rosenthal's "simulated viruses" were not, of course, real viruses.

True, they (mostly) comprised snippets of real virus code glued
together at the end of a simple DOS stub program, but as the "simulated
virus" code never ran (execution ran through the stub only and exited)
they were not and could not be viruses.

Thus, they were no good for testing real virus scanners (or other forms
of antivirus product). By definition, a virus scanner should NOT
detect a non-virus as a virus (with the generally agreed exception of
the EICAR test file) and any scanner that DID detect a Rosenthal
"simulated virus" _as a virus_ was clearly making a false positive...

Nick FitzGerald
Computer Virus Consulting Ltd.
Ph/FAX: +64 3 3267092

Re: Extracting signature snippets from AV databases May 08 2006 09:42PM
Jose Nazario (jose monkey org)

