Package org.xlattice.crypto.filters
Class BloomSHA1
java.lang.Object
org.xlattice.crypto.filters.BloomSHA1
public class BloomSHA1 extends Object
A Bloom filter for sets of SHA1 digests. A Bloom filter uses a set
of k hash functions to determine set membership. Each hash function
produces a value in the range 0..M-1. The filter is of size M. To
add a member to the set, apply each function to the new member and
set the corresponding bit in the filter. For M very large relative
to k, this will normally set k bits in the filter. To check whether
x is a member of the set, apply each of the k hash functions to x
and check whether the corresponding bits are set in the filter. If
any are not set, x is definitely not a member. If all are set, x
may be a member. The probability of error (the false positive rate)
is f = (1 - e^(-kN/M))^k, where N is the number of set members.
This class takes advantage of the fact that SHA1 digests are good-
quality pseudo-random numbers. The k hash functions are the values
of distinct sets of bits taken from the 20-byte SHA1 hash. The
number of bits in the filter, M, is constrained to be a power of
2; M == 2^m. The number of bits in each hash function may not
exceed floor(m/k).
This class is designed to be thread-safe, but this has not been
exhaustively tested.
- Author:
- Jim Dixon BloomSHA1.java and KeySelector.java are BSD licensed from the xlattice app - http://xlattice.sourceforge.net/ minor tweaks by jrandom, exposing unsynchronized access and allowing larger M and K. changes released into the public domain. Note that this is used only by DecayingBloomFilter, which uses only the unsynchronized locked_foo() methods. Deprecated for use outside of the router; to be moved to router.jar. As of 0.8.11, the locked_foo() methods are thread-safe, in that they work, but there is a minor risk of false-negatives if two threads are accessing the same bloom filter integer.
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
BloomSHA1.FilterKey
Store the (opaque) bloom filter offsets for reuse. -
Constructor Summary
Constructors Constructor Description BloomSHA1()
Creates a filter of 2^20 bits with k defaulting to 8.BloomSHA1(int m)
Creates a filter of 2^m bits, with the number of 'hash functions" k defaulting to 8.BloomSHA1(int m, int k)
Creates a filter with 2^m bits and k 'hash functions', where each hash function is portion of the 160-bit SHA1 hash. -
Method Summary
Modifier and Type Method Description int
capacity()
void
clear()
Synchronized versiondouble
falsePositives()
double
falsePositives(int n)
BloomSHA1.FilterKey
getFilterKey(byte[] b, int offset, int len)
Get the bloom filter offsets for reuse.void
insert(byte[] b)
Add a key to the set represented by the filter.void
insert(byte[] b, int offset, int len)
void
locked_insert(byte[] b)
void
locked_insert(byte[] b, int offset, int len)
void
locked_insert(BloomSHA1.FilterKey fk)
Add the key to the filter.boolean
locked_member(byte[] b)
boolean
locked_member(byte[] b, int offset, int len)
boolean
locked_member(BloomSHA1.FilterKey fk)
Is the key in the filter.boolean
member(byte[] b)
Is a key in the filter.boolean
member(byte[] b, int offset, int len)
void
release(BloomSHA1.FilterKey fk)
int
size()
Returns the number of keys which have been inserted.
-
Constructor Details
-
BloomSHA1
public BloomSHA1(int m, int k)Creates a filter with 2^m bits and k 'hash functions', where each hash function is portion of the 160-bit SHA1 hash.- Parameters:
m
- determines number of bits in filterk
- number of hash functionsx See KeySelector for important restriction on max m and k
-
BloomSHA1
public BloomSHA1(int m)Creates a filter of 2^m bits, with the number of 'hash functions" k defaulting to 8.- Parameters:
m
- determines size of filter
-
BloomSHA1
public BloomSHA1()Creates a filter of 2^20 bits with k defaulting to 8.
-
-
Method Details
-
clear
public void clear()Synchronized version -
size
public final int size()Returns the number of keys which have been inserted. This class (BloomSHA1) does not guarantee uniqueness in any sense; if the same key is added N times, the number of set members reported will increase by N.- Returns:
- number of set members
-
capacity
public final int capacity()- Returns:
- number of bits in filter
-
insert
public void insert(byte[] b)Add a key to the set represented by the filter. XXX This version does not maintain 4-bit counters, it is not a counting Bloom filter.- Parameters:
b
- byte array representing a key (SHA1 digest)
-
insert
public void insert(byte[] b, int offset, int len) -
locked_insert
public final void locked_insert(byte[] b) -
locked_insert
public final void locked_insert(byte[] b, int offset, int len) -
locked_member
public final boolean locked_member(byte[] b) -
locked_member
public final boolean locked_member(byte[] b, int offset, int len) -
member
public final boolean member(byte[] b)Is a key in the filter. External interface, internally synchronized.- Parameters:
b
- byte array representing a key (SHA1 digest)- Returns:
- true if b is in the filter
-
member
public final boolean member(byte[] b, int offset, int len) -
getFilterKey
Get the bloom filter offsets for reuse. Caller should call release(rv) when done with it.- Since:
- 0.8.11
-
locked_insert
Add the key to the filter.- Since:
- 0.8.11
-
locked_member
Is the key in the filter.- Since:
- 0.8.11
-
release
- Since:
- 0.8.11
-
falsePositives
public final double falsePositives(int n)- Parameters:
n
- number of set members- Returns:
- approximate false positive rate
-
falsePositives
public final double falsePositives()
-