Replace std::uniform_int_distribution with platform-agnostic version #585
cmolder wants to merge 5 commits into ChampSim:develop from
Conversation
- `<random>` can be very fickle and produce different results on different platforms, in particular for std::*_distribution.
- This replaces all uses of `<random>` with Boost.random, which is platform-agnostic. This should reduce the odds that ChampSim mysteriously yields different results on different compilers and platforms.
- Since std::shuffle may use std::uniform_int_distribution, a new method, champsim::shuffle, is provided which uses boost::random::uniform_int_distribution instead.
- Any future uses of std methods which interact with RNGs should instead use custom implementations like these.
```diff
  {
    if (randomization_seed.has_value())
-     std::shuffle(ppage_free_list.begin(), ppage_free_list.end(), std::mt19937_64{randomization_seed.value()});
+     champsim::shuffle(ppage_free_list.begin(), ppage_free_list.end(), boost::mt19937_64{randomization_seed.value()});
```
Is it necessary to reimplement shuffle() rather than simply pass the boost engine?
From what I can tell, yes. Simply swapping in the boost engine changed the results, but there was still a discrepancy. I think it's because libstdc++'s version of shuffle uses std::uniform_int_distribution, which can differ between versions.
This blog post I found explains many issues of <random> in detail. Most relevant here is the part about how "Distributions are nonportable".
From what I understand, the Mersenne Twister is invariant across different platforms. It is a strictly defined algorithm.
I would assume that this failure is because of std::shuffle itself, and swapping out std::shuffle for champsim::shuffle should be sufficient by itself, while still using the Mersenne Twister and not using boost::random.
Could we confirm this?
There was a problem hiding this comment.
@maccoymerrell Yup, it looks like this is the case. Putting back std::mt19937_64 but keeping champsim::shuffle yields identical results on both g++-9 and 14.
I know adding boost to a project can be a big change, so I went ahead and limited the usage of it to a single header file.
Is it as simple as discarding out-of-range values from the generator? Of course, that's a very, very loose sketch and doesn't include the full interface, but is it more complicated than that? I'd want to test this to the moon and back, but I think it's conceptually pretty simple.

On the topic of adding boost: because we manage dependencies with vcpkg, I'm not too troubled by adding an additional package dependency, although it will require all users to re-run the dependency installation.

If we can find justifiable uses of Boost::random somewhere else, then I think we should add it. Example: our dependency on nlohmann::json, currently used only for stats output, but hopefully we can use it to parse the configuration soon.
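The inline sketch was stripped from this comment; as a hedged reconstruction of the idea (hypothetical names, and assuming a generator whose min() is zero, like std::mt19937_64), discarding out-of-range values might look like:

```cpp
#include <cstdint>
#include <random>

// Draw a uniform integer in [0, n): reject raw draws at or above the largest
// multiple of n, so every residue is backed by the same number of raw values.
template <typename URBG>
std::uint64_t bounded(URBG& g, std::uint64_t n)
{
  const std::uint64_t limit = URBG::max() - (URBG::max() % n); // a multiple of n
  std::uint64_t raw;
  do {
    raw = g();
  } while (raw >= limit); // discard the biased tail and resample
  return raw % n;
}
```

Note this variant tosses out high values; the implementation the PR eventually landed on rejects low values instead.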
I don't think you need to discard anything, actually.
If you just slap a modulus on it, it biases the distribution towards lower numbers. There's an explanation in the blog post that Carson linked above. So, I set the upper bound at an integer number of range widths over the minimum.
I don't believe this has a meaningful bias in the distribution towards lower numbers. Modulus operations are the backbone of integer-based cryptography, so I highly doubt that this is an unsolvable issue. Here is a short test I did in Python with 100M samples of 8 random bytes modulo 100. (Python uses the Mersenne Twister for random().)

For a small generator, I see how this could be a problem. Consider a generator that only produces values between 0 and 255. Then, with a modulus operation to generate a proportionally large range (say 100), we will see a bias: since 255 % 100 = 55, we will see a bimodal uniform distribution, with the bias boundary at 55.
That's not the right way to think about this or to evaluate the claim. Consider the following scenario:
Before proceeding, I note that this scenario is entirely within the specification. The values are smaller than we might consider common, but they are legal values. If we perform a modulo 2 on the generated values, an unequal number of generator outputs will map to each result.

We also cannot evaluate this claim with your Python script. What you have presented is but one realization of the probability distribution. Your modulus will produce the values 0-15 with unequal probabilities.

Edit: your edit does recognize the problem. We have to recognize that this bias does not go away with scale.
What kind of performance overhead to the randomness would we experience? Mainly, how many resamples? Here is data showing as much.
After looking at a bunch of sample implementations, I've written a custom implementation of the distribution.
```cpp
// Use rejection sampling to avoid modulo bias
const u_result_type reject_limit{range_g % range_d};
u_result_type sample{g() - URBG::min()};
while (sample <= reject_limit) {
```
re: the branch prediction issue, I think this branch should be easy to predict. We should rarely enter the while loop, unless the first call to g() happens to be an extremely small number. For std::mt19937_64, the odds of that happening are super low since (I think) it uniformly samples from 0 to max(uint64_t).
Also, while I was playing with this in Compiler Explorer, it looks like the compiler optimizes out the `- URBG::min()` when it's zero, which is the case most of the time (a nice little bonus).
I like tossing out low values rather than high values. That's clever. We should put plenty of tests on this. Let's test at least the degenerate cases.
- Swap the static_assert for an SFINAE check that IntType is integral; this works better when using std::is_constructible.
- Template the OStr and IStr friend methods and fix issues.
Okay, I pushed some tests for the distribution.
```cpp
TEST_CASE("The uniform int distribution is constructed with the right type")
{
  // Make sure integral types are constructible
  SECTION("char") { test_uniform_int_distribution_type<char>(); }
  SECTION("unsigned char") { test_uniform_int_distribution_type<unsigned char>(); }
  SECTION("short") { test_uniform_int_distribution_type<short>(); }
  SECTION("unsigned short") { test_uniform_int_distribution_type<unsigned short>(); }
  SECTION("int") { test_uniform_int_distribution_type<int>(); }
  SECTION("unsigned int") { test_uniform_int_distribution_type<unsigned int>(); }
  SECTION("long") { test_uniform_int_distribution_type<long>(); }
  SECTION("unsigned long") { test_uniform_int_distribution_type<unsigned long>(); }
  SECTION("long long") { test_uniform_int_distribution_type<long long>(); }
  SECTION("unsigned long long") { test_uniform_int_distribution_type<unsigned long long>(); }

  // Make sure non-integral types are not constructible
  SECTION("float") { test_uniform_int_distribution_type_fails<float>(); }
  SECTION("double") { test_uniform_int_distribution_type_fails<double>(); }
  SECTION("long double") { test_uniform_int_distribution_type_fails<long double>(); }
  SECTION("std::string") { test_uniform_int_distribution_type_fails<std::string>(); }
  SECTION("std::pair<int, int>") { test_uniform_int_distribution_type_fails<std::pair<int, int>>(); }
  SECTION("std::vector<int>") { test_uniform_int_distribution_type_fails<std::vector<int>>(); }
}
```
Suggested change (replacing the SECTION-based test above):

```cpp
TEMPLATE_TEST_CASE("The uniform int distribution is constructed with the right type", "[util]", char, unsigned char, short, unsigned short, int, unsigned int, long, unsigned long, long long, unsigned long long)
{
  // Make sure integral types are constructible
  test_uniform_int_distribution_type<TestType>();
}

TEMPLATE_TEST_CASE("The uniform int distribution is not constructed with the wrong type", "[util]", float, double, long double, std::string, (std::pair<int, int>), std::vector<int>)
{
  // Make sure non-integral types are not constructible
  test_uniform_int_distribution_type_fails<TestType>();
}
```
Catch2 has a mechanism to generate test cases, which I think is useful here.
```cpp
std::map<dist_type, std::size_t> buckets{};
for (std::size_t sample_i = 0; sample_i < num_samples; sample_i++) {
  buckets[uut(generator)]++;
}

// Check that the number of samples in each bucket is within the expected range
for (auto& [bucket, count] : buckets) {
```
Suggested change:

```cpp
std::vector<std::size_t> buckets(num_buckets);
for (std::size_t sample_i = 0; sample_i < num_samples; sample_i++) {
  REQUIRE_NOTHROW(buckets.at(uut(generator))++);
}

// Check that the number of samples in each bucket is within the expected range
for (std::size_t bucket = 0; bucket < std::size(buckets); ++bucket) {
  auto count = buckets.at(bucket);
```
The use of at() and REQUIRE_NOTHROW has the benefit here of bounds-checking the distribution.
```cpp
}

TEST_CASE("The uniform int distribution is approximately uniform") { test_if_uniform_int_distribution_is_uniform<std::mt19937_64>(1'000'000, 3, 0.01, 0); }
```
Suggested change:

```cpp
/*
 * Satisfies the requirements for UniformRandomBitGenerator, but only barely.
 */
struct VeryVeryBadURBG
{
  using result_type = int;
  [[nodiscard]] constexpr static result_type min() { return 0; }
  [[nodiscard]] constexpr static result_type max() { return 2; }
  explicit VeryVeryBadURBG(int /*ignored seed*/) : state(min()) {}
  [[nodiscard]] result_type operator()() { return state++ % (max() + 1); }

private:
  result_type state;
};

TEST_CASE("The uniform int distribution is unbiased under bad generators") { test_if_uniform_int_distribution_is_uniform<VeryVeryBadURBG>(1'000'000, 2, 0.01, 0); }
```
Let's make sure we test really bad cases as well as good ones.
```cpp
FAIL(fmt::format("Bucket {bucket} has {actual} samples, expected {expected} (tolerance: {min} to {max})",
                 "bucket"_a = bucket,             // Value tested
                 "actual"_a = count,              // Actual count for this value
                 "expected"_a = expected_samples, // Expected count for this value
                 "min"_a = min_samples,           // Minimum acceptable count for this value
                 "max"_a = max_samples            // Maximum acceptable count for this value
));
```
Suggested change:

```cpp
FAIL("Bucket " << bucket                 // Value tested
     << " has " << count << " samples,"  // Actual count for this value
     << " expected " << expected_samples // Expected count for this value
     << " (tolerance: " << min_samples   // Minimum acceptable count for this value
     << " to " << max_samples << ")"     // Maximum acceptable count for this value
);
```
FAIL does accept streaming input. It would remove the need for the fmt header, but at the cost of admittedly gross syntax. I could go either way on this suggestion, but I thought it's worth pointing out.


While building ChampSim on two different machines, I noticed that I was getting different results for the exact same configuration. After some investigation, I found out it was due to the machines (Ubuntu 20.04 and Arch Linux) having different libstdc++ versions with different implementations of std::uniform_int_distribution. This led to the shuffling of vmem pages inside std::shuffle to differ between the two machines. I'm happy to provide more details on the exact issue I observed and the two systems in question.
As a quick fix for this, I replaced all uses of the `<random>` header (which thankfully weren't many) with Boost.random, since Boost's implementation is platform-agnostic. This appears to have resolved the issue.
To avoid this in the future, it might be a good idea to add a CI test which ensures that the outputs of the simulations match for every compiler.