Refactor LSet to use C++ std::multimap by BrentBaccala · Pull Request #1301 · Singular/Singular

BrentBaccala · 2025-10-23T17:04:45Z

Refactor LSet to use C++ std::multimap

This PR replaces the hand-coded array-based implementation of LSet (the lazy set data structure in Groebner basis computation) with a C++ std::multimap container wrapped in a custom writable_set class.

Key Changes

New data structure: Introduced writable_set<LObject, CompareLObject> in kernel/GBEngine/writable_set.h as a wrapper around std::multimap
Simplified API: Replaced manual array management (enterL, deleteInL, array indexing) with modern C++ container methods (push, pop, top, empty) and a few non-standard methods (would_be_top, pop_and_erase, reorder)
Improved comparator handling: Converted posInL* functions (position-finding functions) to compareL* functions that return -1/0/1 for use with the multimap's ordering
More abstract iteration: Use iterators instead of array indices to traverse LSet containers
Explicit initialization of strategy object: Required (instead of memset 0) since LSet can no longer be zero-initialized
FIFO/LIFO support: Added sequence numbering to maintain FIFO or LIFO ordering for equal elements depending on the comparator (in the older code, some comparators were FIFO and some were LIFO)

Files Modified

13 files changed with 1,829 insertions and 1,335 deletions
Core changes in kernel/GBEngine/kutil.{cc,h}, kernel/GBEngine/kInline.h
Updates throughout GB engine: kstd1.cc, kstd2.cc, kstdfac.cc, khstd.cc, gr_kstd2.cc, sca.cc, kverify.cc
Module updates: customstd.cc, std_wrapper.cc

Benefits

More maintainable code using standard library containers
Faster insertion operation on large LSets
Cleaner separation between ordering logic and container management

The PR is split into two commits to make the first (and primary) commit easier to read. git doesn't format the second commit as well, but all the second commit does is remove the unused posInL* functions. The primary commit is a squashed version of a much longer commit chain from https://github.com/BrentBaccala/Singular/tree/Lqueue

All core testsuite tests (Short/ok_s.lst, Long/ok_l.lst, Old/universal.lst, Plural/short.lst, Plural/dmod.lst, Plural/long.lst, Buch/buch.lst) run and pass without modification.

Some of the debugging messages have changed, due to the difficulty in finding an element's index in the LSet now that we use a tree instead of an array.

BrentBaccala · 2025-10-27T15:10:40Z

Hans Schönemann suggested some benchmarks, and I asked Claude to generate a script to run them. Here are some results, 5 runs each on an AMD Ryzen 5 2500U

Average Times Comparison

Test	V1-spielwiese	V2-pr1301	Comparison
cyclic6-ZZ-lp	0.3073s (±0.2127)	0.3379s (±0.2944)	V1 1.09x faster
cyclic6-ZZ-dp	0.0638s (±0.0086)	0.2124s (±0.1696)	V1 3.32x faster
cyclic6-hom-QQ-dp	0.1752s (±0.0246)	0.1485s (±0.0113)	V2 1.17x faster
newellp1	2523.0554s (±40.7906)	2487.6538s (±19.9567)	similar
cyclic6-QQ-lp	0.9288s (±0.4223)	0.4438s (±0.0138)	V2 2.09x faster
katsura9-QQ-dp	12.1965s (±4.7289)	12.6604s (±4.4944)	similar
katsura9-ZZ-dp	0.8516s (±0.1712)	0.8214s (±0.1108)	similar
cyclic6-hom-ZZ-dp	0.0613s (±0.0062)	0.0673s (±0.0050)	V1 1.09x faster
cyclic6-QQ-dp	0.2409s (±0.1053)	0.1844s (±0.0469)	V2 1.30x faster

The only runs where this PR shows a slowdown are very fast. One possible explanation is that the older array-based code is faster for small L sets, while the tree-based code in this PR becomes more performant for larger L sets.

BrentBaccala · 2025-10-27T15:19:04Z

The patches are much easier to read with the patience diff algorithm, like this:
git log --diff-algorithm=patience -p

BrentBaccala · 2025-11-05T16:07:45Z

Here's some timings from an improved benchmarking script that compares different algorithms:

Test	Singular-spielwiese	Singular-LSet
newellp1-modstd	316.6912s (±9.3126)	332.2769s (±5.5500)
newellp1-sba	69.5517s (±6.6773)	75.1145s (±6.2697)
newellp1-slimgb	39.3158s (±6.2254)	35.6399s (±4.8627)
newellp1-std	2571.6555s (±32.7144)	2577.7527s (±39.1309)

Each time is the average of 5 runs on my AMD Ryzen 5 2500U.

There's not much change, but the PR paves the way for future improvements.

hannes14 · 2025-11-17T15:37:36Z

Am Wed, Nov 05, 2025 at 08:13:45AM -0800 schrieb Brent Baccala:

BrentBaccala left a comment (Singular/Singular#1301) Here's some timings from an improved benchmarking script that compares different algorithms: | Test | Singular-spielwiese | Singular-LSet | |------|----------|----------| | newellp1-modstd | 316.6912s (±9.3126) | 332.2769s (±5.5500) | | newellp1-sba | 69.5517s (±6.6773) | 75.1145s (±6.2697) | | newellp1-slimgb | 39.3158s (±6.2254) | 35.6399s (±4.8627) | | newellp1-std | 2571.6555s (±32.7144) | 2577.7527s (±39.1309) | Each time is the average of 5 runs on my AMD Ryzen 5 2500U. There's not much change, but the PR paves the way for future improvements. -- Reply to this email directly or view it on GitHub: #1301 (comment) You are receiving this because you are subscribed to this thread. Message ID: ***@***.***>

Dear Brent Baccala, thanks for your work with the LSet. You are right about the main problem: the elements in the LSet are too large (they grew over time.. I was not aware that an entry has now around 140 bytes) respectively it takes too much time to move them around. But I will not merge that PR: - the tests Tst/New/ringlocal.tst Tst/New/stdZtests.tst fail (they do not finish) - no improvements in the timings (see above) - std::multiset does not work (well) with omalloc - it uses more memory: some (8 bytes for the pointer) can not be avoided, but there are also vtables etc. - it is difficult to accces more than just the top element II wiull try to make LSet a an array of pointers. Hans Schoenemann

BrentBaccala · 2025-11-18T18:10:36Z

Dear Hans, Thank you for your reply and feedback. Your decision is completely understandable, and I support it fully. Let me elaborate on a few points: - I didn't know some of the tests were failing. I'll look into them. - I didn't consider how std::multiset would interact with omalloc. I'll look into that. - I do think we should shift (if possible) to iterators (of some kind). Not being able to easily access more than the top element might end up being a common situation if part of the L set is either on disk or on another node in a cluster. - I think shifting from "posInL" to "compareL" is a real improvement that should be made (see below) I'm currently exploring "mathicgb". What's your opinion of this code? It looks like we've got an F4 implementation there, as well as a newer implementation of standard Buchberger and F5. I'm trying to organize the library code to be more independent of the choice of algorithm. It's unfortunate that we use the names "std" for the original algorithm and "groebner" for a heuristic selection of algorithm. I wonder even about changing "std" to be a synonym for "groebner" and adding "bba" as the new name for "std". That might break compatibility too much; what do you think? Failing that, I'm thinking that a lot of the places we use "std" should be changed to use "groebner". What are your thoughts of making that change throughout the LIB directory? I'm also working to develop a protocol-independent test suite, by making copies of a lot of our tests, removing "option(prot)", using "groebner" instead of "std", and just checking their return values. This will be useful in checking the correct operation of "mathicgb". I've got a nice benchmark script (1100 lines of bash) written by Claude. You can select different algorithms, different characteristics, different numbers of variables for the cyclic and katsura tests. It also does the newellp1 test. Should I put it in as a separate PR? As far as the L set PR is concerned, I've improved it somewhat, so I'll push those changes and flag it "draft" (which I just did). I'll look into the concerns you've got about it, and perhaps we can revisit it at a later date. I would especially recommend factoring out the LObject comparison code and keeping it separate from the binary search algorithm (i.e, introducing "compareL" as the switchable function). I think it makes the code a lot easier to understand and maintain. I think we should do the same thing for the T set, as well. What about a PR to just make that change, without shifting everything to standard containers? Thank you. agape brent

…

On Mon, Nov 17, 2025 at 10:37 AM Hans Schoenemann ***@***.***> wrote: *hannes14* left a comment (Singular/Singular#1301) <#1301 (comment)> Am Wed, Nov 05, 2025 at 08:13:45AM -0800 schrieb Brent Baccala: > BrentBaccala left a comment (Singular/Singular#1301) > > Here's some timings from an improved benchmarking script that compares different algorithms: > > | Test | Singular-spielwiese | Singular-LSet | > |------|----------|----------| > | newellp1-modstd | 316.6912s (±9.3126) | 332.2769s (±5.5500) | > | newellp1-sba | 69.5517s (±6.6773) | 75.1145s (±6.2697) | > | newellp1-slimgb | 39.3158s (±6.2254) | 35.6399s (±4.8627) | > | newellp1-std | 2571.6555s (±32.7144) | 2577.7527s (±39.1309) | > > Each time is the average of 5 runs on my AMD Ryzen 5 2500U. > > There's not much change, but the PR paves the way for future improvements. > > -- > Reply to this email directly or view it on GitHub: > #1301 (comment) > You are receiving this because you are subscribed to this thread. > > Message ID: ***@***.***> Dear Brent Baccala, thanks for your work with the LSet. You are right about the main problem: the elements in the LSet are too large (they grew over time.. I was not aware that an entry has now around 140 bytes) respectively it takes too much time to move them around. But I will not merge that PR: - the tests Tst/New/ringlocal.tst Tst/New/stdZtests.tst fail (they do not finish) - no improvements in the timings (see above) - std::multiset does not work (well) with omalloc - it uses more memory: some (8 bytes for the pointer) can not be avoided, but there are also vtables etc. - it is difficult to accces more than just the top element II wiull try to make LSet a an array of pointers. Hans Schoenemann — Reply to this email directly, view it on GitHub <#1301 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABLXJBZL4XVYB3ETRIW242D35HTVNAVCNFSM6AAAAACKA6BX4GVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTKNBSGUYDSMJZGI> . You are receiving this because you authored the thread.Message ID: ***@***.***>

jankoboehm · 2026-03-13T08:48:11Z

Dear Brent, as you may have heard, Hans passed away at the end of 2025.

I still think you have uncovered a relevant issue. However, this goes quite deep into the internals. So the change can only be justified if there is a clear performance benefit. Once the tests are passing, we will carry out a careful analysis of correctness and performance beyond the standard test suites.

Still, thanks a lot for your work — and we will see how it turns out!

jankoboehm · 2026-03-13T08:50:02Z

I also think it makes sense to focus on the original issue.

BrentBaccala · 2026-03-13T16:55:01Z

Dear Janko,

No, actually I did not know that Hans had passed away. I see the announcement now on the Singular website. I'm sorry to hear that. I don't know if there's anything I can do to assist you, please please reach out if you think that I can.

About the pull request...

There's several issues here. One is code cleanup - refactoring so that we have comparator functions that compare two LObjects without every one of them having a binary search mixed in with the function (like we do now). I've been thinking of opening a pull request just for that.

The other issue is actual performance improvement. I haven't achieved that yet. A big part of the reason I opened this PR originally was that I didn't want to wait until I've got a massive patch that touches a thousand lines of code, so I was thinking that converting to std::multimap was an incremental step. But no actual improvement, and some regression because sometimes the array operations are faster than the tree operations.

So, right now, I've got a side branch that just changes the LObject comparator functions, and I haven't looked at it for a while. Most of my focus is on trying to grapple with the performance issues. I've got a benchmark script and some tracing code that dumps all of the Gröbner basis calculations within a primdec calculation as SSI files. It's not totally disorganized, but it's not quite suitable as a PR just yet.

Key Changes - Introduced writable_set<LObject, CompareLObject> in kernel/GBEngine/writable_set.h as a wrapper around std::multimap - Replaced manual array management (enterL, deleteInL, array indexing) with C++ container methods (push, pop, top, empty) - Converted posInL* functions (position-finding functions) to compareL* functions that return -1/0/1 for use with the multimap's ordering - Use iterators instead of array indices to traverse LSet containers - Use explicit initialization of strategy object members (instead of memset 0) since LSet can no longer be zero-initialized

… blocks

Two bugs fixed: 1. updateL() used std::iter_swap to move pure-power elements to the front of the L set. With the multimap-based LSet, iter_swap corrupts the sorted invariant since the multiset expects elements to stay in order. Fix: remove both iter_swap calls and rely on the compareL10 comparator (which already gives pure-power elements priority). Add missing strat->L.reorder() call after updateL() in enterSMora. 2. In spielwiese, posInL11Ringls normalizes polynomials by negating those with negative leading coefficients (as a side effect of insertion). The LSet branch's compareL11Ringls comparator doesn't have this side effect. Without normalization, the nEqual() checks in redRiloc_Z and redRiloc make wrong lazy-reinsertion decisions, causing stdZtests.tst to hang on "Yue's Bug" (integer ring with mixed ws ordering). Fix: normalize leading coefficients in LSet::push() for non-global Z-ring orderings, and before comparison points in redRiloc/redRiloc_Z. Test results: 184/184 (ringlocal, stdZtests, Short/ok_s.lst all pass). Co-Authored-By: Claude Opus 4.6 <[email protected]>

BrentBaccala · 2026-04-10T05:28:31Z

An update on this PR:

It's still not ready to merge. I've progressed quite a bit past it and have gotten good performance gains (60-70%) on the large real-world polynomial systems that I'm trying to solve, but shifting LSet to use a tree introduces performance regressions on small Gröbner basis calculations, for which the original array implementation is faster.

I'm planning to handle this by further enhancing LSet so that it can use either data structure, and switch between them dynamically, but that's further refactoring. It will come. Right now, I'm focusing on refactoring S, which is starting to look like it's as challenging to handle as L, so that the whole bba algorithm can be parallelized.

BrentBaccala marked this pull request as draft November 18, 2025 18:01

BrentBaccala force-pushed the LSet branch from fabdb90 to ea60d7f Compare March 13, 2026 03:24

BrentBaccala and others added 10 commits March 23, 2026 22:06

Add build*/ to .gitignore

41cb178

task-runner: bbb-boot-test-network

27d99c1

Add install*/ to .gitignore

4f8e440

task-runner: investigate-test-regressions

f9b9b74

remove unused posInL* functions that were already disabled with #if 0…

21ce53e

… blocks

task-runner: singular-build-spielwiese

cd54ba5

task-runner: bbb-boot-test-network

6714ef3

task-runner: comprehensive-test-all

ef53af7

BrentBaccala force-pushed the LSet branch from ea60d7f to ef53af7 Compare March 24, 2026 03:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor LSet to use C++ std::multimap#1301

Refactor LSet to use C++ std::multimap#1301
BrentBaccala wants to merge 10 commits intoSingular:spielwiesefrom
BrentBaccala:LSet

BrentBaccala commented Oct 23, 2025

Uh oh!

BrentBaccala commented Oct 27, 2025

Uh oh!

BrentBaccala commented Oct 27, 2025

Uh oh!

BrentBaccala commented Nov 5, 2025 •

edited

Loading

Uh oh!

hannes14 commented Nov 17, 2025 via email

Uh oh!

BrentBaccala commented Nov 18, 2025 via email

Uh oh!

jankoboehm commented Mar 13, 2026

Uh oh!

jankoboehm commented Mar 13, 2026

Uh oh!

BrentBaccala commented Mar 13, 2026

Uh oh!

BrentBaccala commented Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

BrentBaccala commented Oct 23, 2025