Tags: UQ-RCC/rocPRIM
[rocm-libraries] ROCm/rocm-libraries#1401 (commit 609fac3)

Add cstring header include to texture_cache_iterator

Do not merge in this PR until given PM approval as it is targeting ROCm 7.0.1.

## Motivation

If a user includes the texture_cache_iterator header in some way without also including cstring, then compilation will fail.

## Technical Details

In https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocprim/rocprim/include/rocprim/iterator/texture_cache_iterator.hpp#L178 rocprim references memset, which requires the cstring header. It is not included in the texture_cache_iterator header, leading to the compilation failure.

## Test Plan

Even an empty program that includes the rocprim header will fail to compile as the main rocprim header includes texture_cache_iterator.hpp:

```
#include <rocprim/rocprim.hpp>

int main()
{
    return 0;
}
```

## Test Result

The above compiles with this fix.

## Submission Checklist

- [ ] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
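The underlying rule is that a header calling `memset` should include `<cstring>` itself rather than rely on some other header pulling it in. The following is a minimal sketch in plain C++, not the rocPRIM code: the `resource_desc` struct and the `make_zeroed_desc` helper are hypothetical names used only for illustration.

```cpp
#include <cstddef> // std::size_t
#include <cstring> // declares std::memset; omitting this is the kind of bug described above

// Hypothetical stand-in for a descriptor struct that is cleared before use.
struct resource_desc
{
    int         type;
    void*       ptr;
    std::size_t size_in_bytes;
};

// Returns a descriptor with every byte set to zero via std::memset.
inline resource_desc make_zeroed_desc()
{
    resource_desc desc;
    std::memset(&desc, 0, sizeof(desc));
    return desc;
}
```

Whether the code compiles without the `<cstring>` line depends on which other headers happen to include it transitively, which is why the failure can go unnoticed on some toolchains.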
[rocm-libraries] ROCm/rocm-libraries#1036 (commit 539ba71)

[rocPRIM][Code Coverage][Cherry Pick] Increase code coverage for rocPRIM (#1036)

In this commit we have added test coverage, made some bug fixes and added a wrapper function for a common pattern in the tests. This `test_kernel_wrapper` function reduces the amount of code used.

In ROCm/rocm-libraries@0318f42 we have added coverage of the `warp_scan_shuffle` by forcing dpp off in cmake on the `warp_scan` test.

In ROCm/rocm-libraries@30d54c1 we have added the missing test cases for the thread_algos in the thread directory (except for thread_operators; these functions are mostly duplicates and probably need to be cleanly moved or deprecated). We also found a bug in `thread_reduce` with the tests and fixed it.

In ROCm/rocm-libraries@48780ff ROCm/rocm-libraries@37aa542 ROCm/rocm-libraries@05a9e12 ROCm/rocm-libraries@f5f1dbe ROCm/rocm-libraries@343e275 ROCm/rocm-libraries@8007fe6 we have added unit tests for the files in the types directory.

In ROCm/rocm-libraries@40ca3e2 we have introduced the `test_kernel_wrapper` and added the missing `PartitionTwoWayFlag` test.

In ROCm/rocm-libraries@7a4e7ba unit tests for the rocprim::tuple type are added.

In ROCm/rocm-libraries@29eac9a we have added extra test coverage by introducing a new type that will go into the untested path. We have also added this type to the benchmark to show [this specialization](ROCm/rocm-libraries@29eac9a#diff-22a70b2ad081732e222004ded43ce0db7145ff196f7b723289425dcb5b6c732dR228) is still needed. We also changed the `std::is_integral` to `rocprim::is_integral` to not include `(u)int128_t` for this specialization; this does not impact performance. The specialization enable_if was also slightly changed to make it clearer which path is chosen (this does not make a difference in the actual executed code). A custom config and an iterator were also added to the tests of merge_sort.

In ROCm/rocm-libraries@f5614d5 test coverage was added for the device_scan_common.hpp file.

In ROCm/rocm-libraries@1d2a40a test coverage was increased by actually using `const fixed_array` and including a `Level` type which goes into the base path of the sample_to_bin_even struct. We also added an iterator as a type to the test, and the new test wrapper was used.

In ROCm/rocm-libraries@c518f42 test coverage was increased for the iterators; a lot of the operators were missing. Some cleanup of the tests was also done, including use of the wrapper where possible. A bug was also found in the comparator of `arg_index_iterator` and `texture_cache_iterator`, and it was fixed. The `->` operator is not tested for `test_texture_cache`, `arg_index_iterator`, `transform_iterator` and `zip_iterator`. They currently do not seem to work; I am working on a possible fix, which will be added in a later PR.
Port MI300 lookback_scan optimization to ROCm 6.4.x release (ROCm#753)

* fix(rocprim/device_partition): resolve deadlock when using device partition and multiple streams on gfx942
* Made the workaround optional with ROCPRIM_ENFORCE_BLOCK_ORDER flag
* fix copyright dates, clang-format and removed deprecation
* Using different API to call workaround

---------

Co-authored-by: Nick Breed <[email protected]>
Workaround memory allocation issues on Windows for large size test cases (ROCm#690)

Radix sort and merge sort test cases that use the largest data sizes hang or cause an SEH exception on Windows when using the latest version of the HIP SDK. Unfortunately, the HIP SDK issue interferes with our current strategies for skipping tests that require too much memory for a device (querying available device memory or examining the return value from hipMalloc), making them unreliable. While we wait for a fix, this change introduces a workaround by reducing the maximum size these test cases can use on Windows.
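One way such a platform-specific size cap could look is sketched below. This is an illustration under assumptions, not the code from the change: the helper names `clamp_test_size` and `get_large_test_sizes`, the cap value, and the list of sizes are all hypothetical, since the commit message does not state the actual limit.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical cap; the real limit used by the workaround is not given in the commit message.
constexpr std::size_t windows_max_test_size = std::size_t{1} << 27;

// Returns the requested size unchanged, except on Windows where it is capped.
inline std::size_t clamp_test_size(std::size_t requested)
{
#ifdef _WIN32
    return std::min(requested, windows_max_test_size);
#else
    return requested;
#endif
}

// Builds an ascending list of test sizes (illustrative values) with the cap applied.
inline std::vector<std::size_t> get_large_test_sizes()
{
    std::vector<std::size_t> sizes
        = {std::size_t{1} << 20, std::size_t{1} << 27, std::size_t{1} << 30};
    for(auto& size : sizes)
    {
        size = clamp_test_size(size);
    }
    // Clamping can make neighbouring entries equal; drop the repeats.
    sizes.erase(std::unique(sizes.begin(), sizes.end()), sizes.end());
    return sizes;
}
```

Capping at compile time avoids depending on a runtime memory query or on the `hipMalloc` return value, which the message above describes as unreliable on the affected HIP SDK version.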