Skip to content

Sup2point0/weighted-list

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

396 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

weighted-list

List data structures for weighted randomisation, implemented (eventually) in every programming language I’ve learnt.

Rust
let descriptors = wlist![
    (10, "cool".to_owned()),
    (5,  "awesome".to_owned()),
    (2,  "elegant".to_owned()),
    (1,  "beautiful".to_owned()),
];

let words = descriptors.select_random_values_unique()
    .rng(&mut rand::rng())
    .count(2)
    .call();

println!("Rust is {} and {}", words[0], words[1]);
// => Rust is awesome and elegant
Python
greetings = WeightedList((20, "sup"), (2, "salutations"))

print(greetings.select())
# => sup
C#
WeightedList<string, int> greetings = new((20, "sup"), (2, "salutations"));

Console.WriteLine(greetings.GetRandomValue());
// => salutations
TypeScript (under development)
let greetings = new FrozenWeightedList([20, "sup"], [2, "salutations"]);

console.log(greetings.sample_value());
// => sup
Haskell (under development)
greetings :: WeightedList String Int
greetings = newWeightedList [(20, "sup"), (2, "salutations")]

main :: IO ()
main = print (randomValue greetings)
-- => salutations
Ruby (awaiting development)
(working on it!)

Features

  • Weighted randomised selection with constraints such as no-replacement, multi-take, unique-only
  • Ergonomic interface targeted for each language
  • Utility methods for manipulating values and weights
  • In-place and out-of-place variants of methods
  • Conversions to and from other data types

Future

  • Slice indexing1

Purposes

Tip

For the full rationale behind this project, see rationale.

I made this class for weighted randomisation, where each element in a collection has a different chance of being selected – the greater an item’s weight, the higher the chance it is selected. This is super common in games for reward systems, displaying messages, etc.


Usage

Rust

crates.io · documentation

> cargo add weighted-list
use weighted_list::{ WeightedList, FrozenWeightedList };

TypeScript

npm

npm install @sup2.0/weighted-list
import { FrozenWeightedList } from "@sup2.0/weighted-list";

Python

All you need is the weightedlist.py file, which contains the WeightedList class with all the functionality. Simply import it, and you’re ready to go!

from weightedlist import WeightedList

See walkthrough for a tutorial, or examples for examples.

C#

All the code is contained within the WeightedList.cs file. You might also need the weighted-list.csproj file. If you want the entire solution, you can download the repo and extract the c-sharp/ folder.

For a tutorial, see walkthrough.

Haskell

import WeightedList qualified as WL
import WeightedList (WeightedList)

Compatibility

Language Version Status Dependencies Notes
Rust 2024 Maintaining rand, num_traits, itertools, bon
Python >= 3.11 Awaiting rewrite None
C# 12.0 Awaiting maintenance None Supports LINQ querying
TypeScript Under development None Currently only supports FrozenWeightedList
Haskell GHC2021 Under development random
Ruby Awaiting development

Implementation

A WeightedList stores ‘weighted items’ instead of raw values. Each item has a weight and value attribute.

WeightedList using weighted indexing. To illustrate with an example:

wl = WeightedList((1, "qi"), (2, "sup"), (5, "shard"))

assert wl[0].value == "qi"
assert wl[1].value == "sup"
assert wl[2].value == "sup"
assert wl[3].value == "shard"
assert wl[4].value == "shard"
assert wl[5].value == "shard"
assert wl[6].value == "shard"
assert wl[7].value == "shard"

wl[8]  # IndexError

In essence, you can think of each item as being repeated a number of times equal to its weight. The ‘length’ of the list is the sum of the weights.

Hence, selecting a random value using weighted randomisation is as simple as picking a random index between $0$ and the length of the list!


Questions

Why did you create this?

Back when I was picking up the ropes of Python, I was working on a project which featured randomisation, and, like any game developer, I thought it’d be cool to give each outcomes different probabilities of occurring. At first, I achieved this behaviour by duplicating items, but I quickly realised the numerous issues with this.

And so, I set out to write my own class, which I’d never really needed to do up until that point. I thought it’d be a great exercise in learning Python – and it very much was, teaching me tons about object-oriented programming, dunder methods, generators, etc. It was also my first experience of conscientiously writing code that wasn’t exclusively for myself, which helped me understand the importance of consistency and clarity, and above all, documentation.

A couple years later, I’ve come back to do the same in C#, this time also adding several features I always intended to add but never did – especially non-integer weights, which allows the class to truly embrace its usage as representing probabilities. Trying to translate Python into C# was an interesting experience,2 and helped highlight some important differences between the languages that I would otherwise not have found out.

A few more years later, I’m back to do the same in Haskell and Rust (and also finish off the TypeScript and Ruby implementations that I started but never finished). Damn I love this project. Seriously, it never fails to raise so many questions about a language’s mechanics and quirks that I would never encounter otherwise.

Is this even useful?

I mean yeah, a whole several-hundred-lines class to handle one thing is probably overkill... it’s more an exercise and proof-of-concept.

Regardless, I’ve used my own code3 in at least 2 major projects (PENGUIN and Algorhythm), so I can definitely say it’s been useful to me!

Why are the source files several hundred lines long?

  1. documentation
  2. line breaks
  3. utility

Particularly documentation. That stuff just eats the line count. Also, implementing something as complex as an enumerable container requires a lot of methods, operators, interfaces and delegation. And in C# you've even got overloading to account for as well.

How fast is it?

In all honesty, I don’t know. I’m slowly adding benchmarks to test different approaches.

Why is your Python code not compliant to PEP 8?

I have my own particular preferences when it comes to coding in Python, which I explain fully here.

Why do you start {} on a new line?

I’m a C# programmer, what can I say :P

Why do you use snake_case in TypeScript?

I’m a Python programmer, what can I say :P

Why are you okay with camelCase in Haskell then?

I used snake_case before, and ngl, in Haskell you kinda need the camelCase to keep things readable without parentheses...


Contribute

Any feedback, suggestions or improvements are definitely welcome!


Footnotes

  1. Really quite difficult with non-integer weights.

  2. This was not exactly the way I created the project in C#, but the Python implementation certainly laid out a general framework and was influential in some design decisions.

  3. To my own surprise, somewhat.