Bloom filter simple but powerful data structure that can check
membership to a static set.
The trade-off to use Bloom filter is a certain configurable risk
of false positives. The odds of a false positive can be made very
low if the hash bitmap is sufficiently large.
e.g. Spam email is an irrelevant or inappropriate message sent on
the web to a large number of newsgroups and users. A spam word is a
list of well-known words that often appear in spam mails.
It is space-efficient probabilistic data structure that is used
to test whether an element is a member of a set.
Bloom filters never generate false negative result, i.e. telling
you that a username does not exist when it actually exists.
Unlike a standard hash table, a Bloom filter of a fixed size can
represent a set with an arbitrarily large number of elements.
Deleting elements from filter is not possible because, if we
delete a single element by clearing bits at indices generated by k
hash functions, it might cause deletion of few other elements.
bloom provides an index access method based on Bloom filters.
Application uses Bloom filter
Postgresql use Bloom filters to reduce the disk lookups for
An HBase and Google BigTable; Bloom Filter is an efficient
mechanism to test whether a StoreFile contains a specific row or
row-col cell(Bloom filters do not work with Scans).
uses bloom filters to test if any of the SSTables is likely to
contain the requested partition key or not, without actually having
to read their contents (and thus avoiding expensive IO operations).
The Google Chrome web browser used to use a Bloom filter to
identify malicious URLs
Quora implemented a shared bloom filter in the feed backend to
filter out stories that people have seen before.