ReservoirSampler

Mojo struct

@memory_only
struct ReservoirSampler[T: Copyable & Movable]

Sample N items from a stream of unknown length.

Sample all the elements. The input order is retained because the first N elements are always taken as they arrive.

from random import seed
from testing import assert_equal

from extramojo.stats.sampling import ReservoirSampler

# Set the global random seed
seed(42)

var sampler = ReservoirSampler[Int](10)

var items = List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
for item in items:
    sampler.insert(item)
assert_equal(sampler.reservoir, items)

Sample only a subset of the input list.

var sampler = ReservoirSampler[Int](5)

for item in items:
    sampler.insert(item)
assert_equal(len(sampler.reservoir), 5)
assert_equal(sampler.reservoir, List(0, 9, 2, 3, 7))

Sample only a single element.

var sampler = ReservoirSampler[Int](1)

for item in items:
    sampler.insert(item)
assert_equal(len(sampler.reservoir), 1)
assert_equal(sampler.reservoir, List(6))

Sample more elements than are in the input stream.

var sampler = ReservoirSampler[Int](11)

for item in items:
    sampler.insert(item)
assert_equal(len(sampler.reservoir), 10)
assert_equal(sampler.reservoir, items)

Sample zero elements.

var sampler = ReservoirSampler[Int](0)

for item in items:
    sampler.insert(item)
assert_equal(len(sampler.reservoir), 0)
assert_equal(sampler.reservoir, List[Int]())

Parameters

  • T (Copyable & Movable)

Fields

  • reservoir (List[T])
  • values_to_collect (Int)
  • seen_values (Int)

Implemented traits

AnyType, Copyable, ExplicitlyCopyable, Movable, UnknownDestructibility

Methods

__init__

fn __init__(out self, values_to_collect: Int)

Args:

  • values_to_collect (Int)
  • self (Self)

Returns:

Self

insert

fn insert(mut self, item: T)

Offer an element to the sampler. The element is tested for inclusion in the reservoir and is not necessarily retained.

Args:

  • self (Self)
  • item (T)
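
This page does not spell out how insert decides whether an element enters the reservoir. One common way to build such a sampler is Algorithm R, and the examples above (the first N items kept in order, later items overwriting slots in place) are consistent with it. The following is a minimal Python sketch of that technique, assuming the struct behaves similarly; it is not the extramojo source. The class name ReservoirSketch and the rng argument are hypothetical, and only the field names mirror the ones listed above.

```python
import random


class ReservoirSketch:
    """Illustrative Algorithm R sampler (assumption: not the extramojo code)."""

    def __init__(self, values_to_collect, rng=None):
        self.reservoir = []                          # items currently kept
        self.values_to_collect = values_to_collect   # capacity N
        self.seen_values = 0                         # items offered so far
        self.rng = rng or random.Random()            # hypothetical: injectable RNG

    def insert(self, item):
        self.seen_values += 1
        # Fill phase: the first N items are always taken, in arrival order.
        if len(self.reservoir) < self.values_to_collect:
            self.reservoir.append(item)
            return
        # Replacement phase: draw a slot uniformly from [0, seen_values).
        # The item is kept only if the slot falls inside the reservoir,
        # which happens with probability N / seen_values.
        slot = self.rng.randrange(self.seen_values)
        if slot < self.values_to_collect:
            self.reservoir[slot] = item


sampler = ReservoirSketch(5, random.Random(42))
for item in range(10):
    sampler.insert(item)
print(len(sampler.reservoir))  # prints 5
```

Because a retained item overwrites a random slot rather than being appended, untouched slots keep their original positions. The sampled values depend on the random number generator, so this sketch is not expected to reproduce the exact lists shown in the examples above.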