Popularity

6.7

Stable

Activity

2.0

Stars 34

Watchers 5

Forks 5

Last Commit 8 months ago

Programming language: Crystal

License: MIT License

Tags: Caching

Latest version: v0.1.1

bloom_filter alternatives and similar shards

Based on the "Caching" category.
Alternatively, view bloom_filter alternatives based on common mentions on social networks and blogs.

bojack

8.8 0.0 bloom_filter VS bojack

DISCONTINUED. A non-reliable in-memory key-value store
crystal-memcached

6.8 0.0 bloom_filter VS crystal-memcached

Memcached client for Crystal language

WorkOS - The modern identity platform for B2B SaaS

The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

Promo workos.com

Nuummite

5.1 0.0 bloom_filter VS Nuummite

Nuummite is a tiny persistent embedded key-value store for Crystal!
cache-hash

5.1 0.0 bloom_filter VS cache-hash

A simple key/value store where pairs can expire after a specified interval
Bloom Filter

1.3 0.0 bloom_filter VS Bloom Filter

Bloom filter implementation in Crystal lang
crystal-kiwi

0.8 0.0 bloom_filter VS crystal-kiwi

A unified Crystal interface for key-value stores.

Do you think we are missing an alternative of bloom_filter or a related project?

Add another 'Caching' Shard

Popular Comparisons

README

Bloom Filter

Implementation of Bloom Filter in Crystal lang.

Installation
Usage
Benchmark
Contributors

Installation

Add this to your application's shard.yml:

dependencies:
  bloom_filter:
    github: crystal-community/bloom_filter

Usage

Basic

require "bloom_filter"

# Create filter with bitmap size of 32 bytes and 3 hash functions.
filter = BloomFilter.new(bytesize = 32, hash_num = 3)

# Insert elements
filter.insert("Esperanto")
filter.insert("Toki Pona")

# Check elements presence
filter.has?("Esperanto")  # => true
filter.has?("Toki Pona")  # => true
filter.has?("Englsh")     # => false

Creating a filter with optimal parameters

Based on your needs(expected number of items and desired probability of false positives), your can create an optimal bloom filter:

# Create a filter, that with one million inserted items, gives 2% of false positives for #has? method
filter = BloomFilter.new_optimal(1_000_000, 0.02)
filter.bytesize # => 1017796 (993Kb)
filter.hash_num # => 6

Dumping into a file and loading

It's possible to save existing bloom filter as a binary file and then load it back.

filter = BloomFilter.new_optimal(2, 0.01)
filter.insert("Esperanto")
filter.dump_file("/tmp/bloom_languages")

loaded_filter = BloomFilter.load_file("/tmp/bloom_languages")
loaded_filter.has?("Esperanto") # => true
loaded_filter.has?("English")   # => false

Union and intersection

Having two filters of the same size and number of hash functions, it's possible to perform union and intersection operations:

f1 = BloomFilter.new(32, 3)
f1.insert("Esperanto")
f1.insert("Spanish")

f2 = BloomFilter.new(32, 3)
f2.insert("Esperanto")
f2.insert("English")

# Union
f3 = f1 | f2
f3.has?("Esperanto") # => true
f3.has?("Spanish")   # => true
f3.has?("English")   # => true

# Intersection
f4 = f1 & f2
f4.has?("Esperanto") # => true
f4.has?("Spanish")   # => false
f4.has?("English")   # => false

Visualization

If you want to see how your filter looks like, you can visualize it:

f1 = BloomFilter.new(16, 2)
f1.insert("Esperanto")
puts "f1 = (Esperanto)"
puts f1.visualize

f2 = BloomFilter.new(16, 2)
f2.insert("Spanish")
puts "f2 = (Spanish)"
puts f2.visualize

f3 = f1 | f2
puts "f3 = f1 | f2 = (Esperanto, Spanish)"
puts f3.visualize

Output:

f1 = (Esperanto)
░░░░░░░░ ░░░░░░█░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░
░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░█ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░

f2 = (Spanish)
░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░
░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░█░ ░█░░░░░░

f3 = f1 | f2 = (Esperanto, Spanish)
░░░░░░░░ ░░░░░░█░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░
░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░░ ░░░░░░░█ ░░░░░░░░ ░░░░░░█░ ░█░░░░░░

In this way, you can actually see which bits are set:)

Benchmark

Performance of Bloom filter depends on the following parameters:

Size of the filter
Number of hash functions
Length of the input string

To run benchmark from ./samples/benchmark.cr, simply run make task:

$ make benchmark

Number of items: 100000000
Filter size: 117005Kb
Hash functions: 7
String size: 13

                     user     system      total        real
insert           0.004227   0.000000   0.004227 (  2.769349)
has? (present)   0.007980   0.000000   0.007980 (  5.223778)
has? (missing)   0.004318   0.000000   0.004318 (  2.829521)

Contributors

greyblake Potapov Sergey - creator, maintainer
funny-falcon Sokolov Yura - better hash algorithms

bloom_filter

Bloom filter implementation in Crystal lang

bloom_filter alternatives and similar shards

bojack

crystal-memcached

WorkOS - The modern identity platform for B2B SaaS

Nuummite

cache-hash

Bloom Filter

crystal-kiwi

Popular Comparisons

README

Bloom Filter

Installation

Usage

Basic

Creating a filter with optimal parameters

Dumping into a file and loading

Union and intersection

Visualization

Benchmark

Contributors