Skip to content

Replace std::uniform_int_distribution with platform-agnostic version#585

Open
cmolder wants to merge 5 commits intoChampSim:developfrom
cmolder:develop/rng-fix
Open

Replace std::uniform_int_distribution with platform-agnostic version#585
cmolder wants to merge 5 commits intoChampSim:developfrom
cmolder:develop/rng-fix

Conversation

@cmolder
Copy link
Copy Markdown
Contributor

@cmolder cmolder commented Jan 31, 2025

While building ChampSim on two different machines, I noticed that I was getting different results for the exact same configuration. After some investigation, I found out it was due to the machines (Ubuntu 20.04 and Arch Linux) having different libstdc++ versions with different implementations of std::uniform_int_distribution. This led to the shuffling of vmem pages inside std::shuffle to differ between the two machines. I'm happy to provide more details on the exact issue I observed and the two systems in question.

As a quick fix for this, I replaced all uses of the random header (which thankfully weren't many) with Boost.random, since Boost's implementation is platform-agnostic. This appears to have resolved the issue.

To avoid this in the future, it might be a good idea to add a CI test which ensures that the outputs of the simulations match for every compiler.

- <random> can be very fickle and produce different results on
  different platforms, in particular for std::*_distribution.
- This replaces all uses of <random> with Boost.random, which
  is platform-agnostic. This should reduce the odds that
  ChampSim mysteriously yields different results on different
  compilers and platforms.
- Since std::shuffle may use std::uniform_int_distribution,
  a new method, champsim::shuffle, is provided which uses
  boost::random::uniform_int_distribution instead.
- Any future uses of std methods which interact with RNGs
  should instead use custom implementations like these.
@coveralls
Copy link
Copy Markdown

coveralls commented Jan 31, 2025

Coverage Status

coverage: 68.322% (+0.2%) from 68.154%
when pulling ce35824 on cmolder:develop/rng-fix
into 0bfb855 on ChampSim:develop.

Comment thread src/vmem.cc Outdated
{
if (randomization_seed.has_value())
std::shuffle(ppage_free_list.begin(), ppage_free_list.end(), std::mt19937_64{randomization_seed.value()});
champsim::shuffle(ppage_free_list.begin(), ppage_free_list.end(), boost::mt19937_64{randomization_seed.value()});
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to reimplement shuffle() rather than simply pass the boost engine?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell, yes. Simply changing the boost engine modified the results but there was still a discrepancy. I think it's because libstdc++'s version of shuffle uses std::uniform_int_distribution, which can differ between versions.

This blog post I found explains many issues of <random> in detail. Most relevant here is the part about how "Distributions are nonportable".

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, mersenne twister is invariant across different platforms. It is a strictly defined algorithm.
I would assume that this failure is because of std::shuffle itself, and swapping out shuffle for champsim::shuffle should be sufficient by itself while still using mersenne twister and not using boost::random.
Could we confirm this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maccoymerrell Yup, it looks like this is the case. Putting back std::mt19937_64 but keeping champsim::shuffle yields identical results on both g++-9 and 14.

@cmolder
Copy link
Copy Markdown
Contributor Author

cmolder commented Feb 1, 2025

I know adding boost to a project can be a big change, so I went ahead and limited the usage of it to a single header file (util/random.h). So, if we write our own implementation for champsim::uniform_int_distribution, we can get away with not using boost at all.

@ngober
Copy link
Copy Markdown
Collaborator

ngober commented Feb 1, 2025

Is it as simple as something like

template <typename T>
class uniform_int_distribution {
  T a;
  T b;
public:
  uniform_int_distribution(T a_param, T b_param) : a(a_param), b(b_param) {}

  template <typename Gen>
  T operator()(Gen& generator) const {
    const auto width = b-a;
    constexpr auto genwidth = Gen::max() - Gen::min();
    const auto upperbound = Gen::min() + width*(genwidth / width);

    T retval = generator();
    while (retval >= upperbound) {
      retval = generator();
    }

    return ((retval - Gen::min()) % b) + a;
  }
};

Of course, that's a very, very loose sketch and doesn't include the full interface, but is it more complicated than just discarding out-of-range values from the generator? I'd want to test this to the moon and back, but I think it's conceptually pretty simple.

On the topic of adding boost, because we manage dependencies with vcpkg, I'm not too troubled by adding an additional package dependency. It will require all users to run vcpkg install again, but if it's justified, it's justified. My thinking is this: external libraries are (ideally) tested and maintained by someone else, so it's generally worth our brain bandwidth to rely on them, unless the thing is so simple we can just write it ourselves.

If we can find justifiable uses of Boost::random somewhere else, then I think we should add it. Example: our dependency on nlohmann::json, currently used only for stats output, but hopefully we can use it to parse the configuration soon.

@maccoymerrell
Copy link
Copy Markdown
Contributor

maccoymerrell commented Feb 1, 2025

I don't think you need to discard anything actually.
I would assume that a random number generator of any kind produces an apparently random uniform distribution, so wouldn't it be sufficient to just modulus the generated value by the width of the desired uniform distribution? Why the while loop and upper bound?

template <typename T>
class uniform_int_distribution {
  T a;
  T b;
public:
  uniform_int_distribution(T a_param, T b_param) : a(a_param), b(b_param) {}

  template <typename Gen>
  T operator()(Gen& generator) const {
    const auto width = b-a;

    T retval = generator() % width;
    return retval + a;
  }
};

@ngober
Copy link
Copy Markdown
Collaborator

ngober commented Feb 1, 2025

If you just slap a modulus on it, it biases the distribution towards lower numbers. There's an explanation in the blog post that Carson linked above. So, I set the upper bound at an integer number of widths over Gen::min(), then dropped things that fell between that and Gen::max().

Furthermore, Gen::min() is not guaranteed to be zero, so that's important.

@maccoymerrell
Copy link
Copy Markdown
Contributor

maccoymerrell commented Feb 1, 2025

I don't believe this has a meaningful bias in distribution towards lower numbers. Modulus operations are the backbone of integer-based cryptography, so I highly doubt that this is an unsolvable issue.

Here is a short test I did in python with 100M samples of 8 random bytes modulo 100. Python uses mersenne twister for random().
Figure 2025-02-01 143702

For a small generator, I see how this could be a problem. Consider a generator that only generates a value between 0 and 255. Then a modulus operation to generate a proportionally large range (say 100), we will see a bias, since 255 % 100 = 55, so we will see a bimodal uniform distribution, with the bias difference beginning at 55.
See here.

rand_2

@ngober
Copy link
Copy Markdown
Collaborator

ngober commented Feb 1, 2025

That's not the right way to think about this or to evaluate the claim.

Consider the following scenario:

  • The generator yields integer values in [0,1,2] with equal probability
  • The distribution maps to [0,1] with equal probability

Before proceeding, I note that this scenario is entirely within the specification. The values are smaller than we might consider common, but they are legal values.

If we perform a modulo 2 on the generated values, we will map { [0,2] => 0, [1] => 1 }. This is far from a uniform distribution. If instead we discard all 2s, we will map { [0] => 0, [1] => 1 }. This will hold true for any instance where the range of the generator is not an integer multiple of the range of the distribution. Since the values a and b are given at runtime, we cannot assume anything about them (beyond the specification).

We also cannot evaluate this claim with your python script. What you have presented is but one realization of the probability distribution. Your modulus will produce the values of 0-15 with a probability 16/184467440737095516 ~= 8.67e-17 higher than values 16-99. Your run of 100million evaluations simply cannot illustrate this bias above the noise. You would need at least 10 orders of magnitude more evaluations to illustrate what I'm describing. None of this, however, denies that the bias is in fact present.

Edit: your edit does recognize the problem. We have to recognize that this bias does not go away with scale.

@maccoymerrell
Copy link
Copy Markdown
Contributor

maccoymerrell commented Feb 1, 2025

What kind of performance overhead to the randomness would we experience? Mainly, how many resamples?
I guess performance is worst when the bias is worst, mainly when the width is just over half of the genwidth? Something like a 2x overhead on the average worst case? I guess that is acceptable, as long as our generators remain over 2x the desired field width then I don't see this being a major problem.
I guess this is also technically a hard-to-predict branch, which may have additional performance considerations.

Here is data showing as much.
I am sure there's a whole relation here with the distribution of relative prime numbers, factors, etc...

Figure 2025-02-01 162414

@maccoymerrell
Copy link
Copy Markdown
Contributor

Figure 2025-02-01 162746

@cmolder
Copy link
Copy Markdown
Contributor Author

cmolder commented Feb 1, 2025

After looking at a bunch of sample implementations, I've written a custom implementation of uniform_int_distribution that should be a drop-in replacement to the one in <random> while being platform-agnostic.

Comment thread inc/util/random.h
// Use rejection sampling to avoid modulo bias
const u_result_type reject_limit{range_g % range_d};
u_result_type sample{g() - URBG::min()};
while (sample <= reject_limit) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: the branch prediction issue, I think this branch should be easy to predict. We should rarely enter the while loop, unless the first call to g() happens to be an extremely small number. For std::mt19937_64, the odds of that happening are super low since (I think) it uniformly samples from 0 to max(uint64_t).

Copy link
Copy Markdown
Contributor Author

@cmolder cmolder Feb 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, while I was playing with this in compiler explorer, it looks like the compiler optimizes out the - URBG::min() when it's zero, which is the case most of the time (a nice little bonus).

@cmolder cmolder changed the title Replace uses of <random> with Boost.random Replace std::uniform_int_distribution with platform-agnostic version Feb 1, 2025
@ngober
Copy link
Copy Markdown
Collaborator

ngober commented Feb 2, 2025

I like tossing out low values rather than high values. That's clever. We should put plenty of tests on this. Let's test at least the degenerate [0,1,2] => [0,1] case the blog above describes. I might also be curious to see benchmarks for various ratios of (range_g % range_d)/range_g, to address the concerns Maccoy raises. He's probably right that there are performance pessimizations in this implementation if that ratio gets too close to 1/2. But, if benchmarks show that processors can parallelize the wrong-path speculation, that would be good.

- Swap static_assert for checking if IntType is integral with SFINAE,
  this works better when using std::is_constructible.
- Template the OStr and IStr friend methods and fix issues
@cmolder
Copy link
Copy Markdown
Contributor Author

cmolder commented Feb 2, 2025

Okay, I pushed some tests for champsim::uniform_int_distribution. Next step is to test champsim::shuffle and expand the test suite further if we think it's necessary. I'd also like to test VirtualMemory::shuffle_pages, but that may be tricky.

Comment on lines +162 to +183
TEST_CASE("The uniform int distribution is constructed with the right type")
{
// Make sure integral types are constructible
SECTION("char") { test_uniform_int_distribution_type<char>(); }
SECTION("unsigned char") { test_uniform_int_distribution_type<unsigned char>(); }
SECTION("short") { test_uniform_int_distribution_type<short>(); }
SECTION("unsigned short") { test_uniform_int_distribution_type<unsigned short>(); }
SECTION("int") { test_uniform_int_distribution_type<int>(); }
SECTION("unsigned int") { test_uniform_int_distribution_type<unsigned int>(); }
SECTION("long") { test_uniform_int_distribution_type<long>(); }
SECTION("unsigned long") { test_uniform_int_distribution_type<unsigned long>(); }
SECTION("long long") { test_uniform_int_distribution_type<long long>(); }
SECTION("unsigned long long") { test_uniform_int_distribution_type<unsigned long long>(); }

// Make sure non-integral types are not constructible
SECTION("float") { test_uniform_int_distribution_type_fails<float>(); }
SECTION("double") { test_uniform_int_distribution_type_fails<double>(); }
SECTION("long double") { test_uniform_int_distribution_type_fails<long double>(); }
SECTION("std::string") { test_uniform_int_distribution_type_fails<std::string>(); }
SECTION("std::pair<int, int>") { test_uniform_int_distribution_type_fails<std::pair<int, int>>(); }
SECTION("std::vector<int>") { test_uniform_int_distribution_type_fails<std::vector<int>>(); }
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TEST_CASE("The uniform int distribution is constructed with the right type")
{
// Make sure integral types are constructible
SECTION("char") { test_uniform_int_distribution_type<char>(); }
SECTION("unsigned char") { test_uniform_int_distribution_type<unsigned char>(); }
SECTION("short") { test_uniform_int_distribution_type<short>(); }
SECTION("unsigned short") { test_uniform_int_distribution_type<unsigned short>(); }
SECTION("int") { test_uniform_int_distribution_type<int>(); }
SECTION("unsigned int") { test_uniform_int_distribution_type<unsigned int>(); }
SECTION("long") { test_uniform_int_distribution_type<long>(); }
SECTION("unsigned long") { test_uniform_int_distribution_type<unsigned long>(); }
SECTION("long long") { test_uniform_int_distribution_type<long long>(); }
SECTION("unsigned long long") { test_uniform_int_distribution_type<unsigned long long>(); }
// Make sure non-integral types are not constructible
SECTION("float") { test_uniform_int_distribution_type_fails<float>(); }
SECTION("double") { test_uniform_int_distribution_type_fails<double>(); }
SECTION("long double") { test_uniform_int_distribution_type_fails<long double>(); }
SECTION("std::string") { test_uniform_int_distribution_type_fails<std::string>(); }
SECTION("std::pair<int, int>") { test_uniform_int_distribution_type_fails<std::pair<int, int>>(); }
SECTION("std::vector<int>") { test_uniform_int_distribution_type_fails<std::vector<int>>(); }
}
TEMPLATE_TEST_CASE("The uniform int distribution is constructed with the right type", "[util]", char, unsigned char, short, unsigned short, int, unsigned int, long, unsigned long, long long, unsigned long long)
{
// Make sure integral types are constructible
test_uniform_int_distribution_type<TestType>();
}
TEMPLATE_TEST_CASE("The uniform int distribution is not constructed with the wrong type", "[util]", float, double, long double, std::string, (std::pair<int, int>), std::vector<int>)
{
// Make sure non-integral types are not constructible
test_uniform_int_distribution_type_fails<TestType>();
}

Catch2 has a mechanism to generate test cases, which I think is useful here.

Comment on lines +36 to +42
std::map<dist_type, std::size_t> buckets{};
for (std::size_t sample_i = 0; sample_i < num_samples; sample_i++) {
buckets[uut(generator)]++;
}

// Check that the number of samples in each bucket is within the expected range
for (auto& [bucket, count] : buckets) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::map<dist_type, std::size_t> buckets{};
for (std::size_t sample_i = 0; sample_i < num_samples; sample_i++) {
buckets[uut(generator)]++;
}
// Check that the number of samples in each bucket is within the expected range
for (auto& [bucket, count] : buckets) {
std::vector<std::size_t> buckets(num_buckets);
for (std::size_t sample_i = 0; sample_i < num_samples; sample_i++) {
REQUIRE_NOTHROW(buckets.at(uut(generator))++);
}
// Check that the number of samples in each bucket is within the expected range
for (std::size_t bucket; bucket < std::size(buckets); ++bucket) {
count = buckets.at(bucket)

The use of at() and REQUIRE_NOTHROW have the benefits here of bounds checking the distribution.

}

TEST_CASE("The uniform int distribution is approximately uniform") { test_if_uniform_int_distribution_is_uniform<std::mt19937_64>(1'000'000, 3, 0.01, 0); }

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*
* Satisfies the requirements for UniformRandomBitGenerator, but only barely.
*/
struct VeryVeryBadURBG
{
using result_type = int;
[[nodiscard]] constexpr static result_type min() { return 0; }
[[nodiscard]] constexpr static result_type max() { return 2; }
explicit VeryVeryBadURBG(int /*ignored seed*/) : state(min()) {}
[[nodiscard]] result_type operator()() { return state++ % (max() + 1); }
private:
result_type state;
};
TEST_CASE("The uniform int distribution is unbiased under bad generators") { test_if_uniform_int_distribution_is_uniform<VeryVeryBadURBG>(1'000'000, 2, 0.01, 0); }

Let's make sure we test really bad cases as well as good ones.

Comment on lines +44 to +50
FAIL(fmt::format("Bucket {bucket} has {actual} samples, expected {expected} (tolerance: {min} to {max})",
"bucket"_a = bucket, // Value tested
"actual"_a = count, // Actual count for this value
"expected"_a = expected_samples, // Expected count for this value
"min"_a = min_samples, // Minimum acceptable count for this value the test will accept
"max"_a = max_samples // Maximum acceptable count for this value
));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
FAIL(fmt::format("Bucket {bucket} has {actual} samples, expected {expected} (tolerance: {min} to {max})",
"bucket"_a = bucket, // Value tested
"actual"_a = count, // Actual count for this value
"expected"_a = expected_samples, // Expected count for this value
"min"_a = min_samples, // Minimum acceptable count for this value the test will accept
"max"_a = max_samples // Maximum acceptable count for this value
));
FAIL("Bucket " << bucket // Value tested
<< " has " << count << " samples," // Actual count for this value
<< " expected " << expected_samples // Expected count for this value
<< " (tolerance: " << min_samples // Minimum acceptable count for this value the test will accept
<< " to " << max_samples << ")" // Maximum acceptable count for this value the test will accept
);

FAIL does accept streaming input. It would remove the need for the fmt header, but at the cost of admittedly gross syntax. I could go either way on this suggestion, but I thought it's worth pointing out.

@ngober ngober mentioned this pull request Feb 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants