davidribyrne (at) yahoo (dot) com wrote:
> Can anyone recommend a tool or library for measuring data entropy? Pass it a string, it returns a score.
>
Possibly you want the Levenshtein algorithm, but that's not accurate on
input of varying length. If you want to measure each
rand()-equivalent result, just treat each resulting unsigned int as a
hash value and see how many collisions you get. Since this is so trivial
to write, I don't think anyone has made a tool available. This should
(sort of) work (fix spelling errors yourself; it's Friday and I'm drunk
and headed for the pub).
---8<---8<---8<---
#include <stdio.h>

int main(int argc, char **argv)
{
    unsigned x, coll[1024], c = 0;
    double biggest;

    /* one collision counter per bucket */
    for (x = 0; x < 1024; x++)
        coll[x] = 0;
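    /* Read whitespace-separated unsigned ints from stdin and count
       how many land in each of the 1024 buckets. */
    while (scanf("%u", &x) == 1) {
        c++;
        coll[x & 1023]++;
    }
    if (c == 0) {
        fprintf(stderr, "No input.\n");
        return 1;
    }
    /* Find the fullest bucket as a fraction of all values read. */
    biggest = 0.0;
    for (x = 0; x < 1024; x++) {
        if ((double)coll[x] / (double)c > biggest)
            biggest = (double)coll[x] / (double)c;
    }
    if (biggest > 0.55)
        printf("Bad entropy, you foolsome git!\n");
    else
        printf("Nicely done. Entropy is acceptable\n");
    return 0;
}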
---8<---8<---8<---
Use as such:
prng --lots-of-numbers | whatever-you-compile-the-above-to
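
If you'd rather have the "pass it a string, it returns a score" shape from the question, the same bucket-counting idea can probably be wrapped in a function. This is only a rough sketch: the name fullest_bucket_fraction, the choice of one bucket per byte value, and the meaning of the score are my assumptions, not an existing library. It is a crude skew check, not a real entropy estimate.

```c
#include <stddef.h>

/* Fraction of the input bytes that fall into the fullest of 256
   byte-value buckets: 1.0 means every byte is identical, values near
   1/256 mean the bytes are spread evenly. Returns 0.0 for empty input.
   fullest_bucket_fraction is a hypothetical name for illustration. */
double fullest_bucket_fraction(const unsigned char *buf, size_t len)
{
    size_t counts[256] = {0};
    size_t i, max = 0;

    if (len == 0)
        return 0.0;

    /* Count how often each byte value occurs. */
    for (i = 0; i < len; i++)
        counts[buf[i]]++;

    /* Pick the fullest bucket. */
    for (i = 0; i < 256; i++)
        if (counts[i] > max)
            max = counts[i];

    return (double)max / (double)len;
}
```

For example, fullest_bucket_fraction((const unsigned char *)"aabc", 4) gives 0.5, since 'a' fills half the input. Lower is better; where you draw the "bad" line is up to you.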
/exon