Use contiguous Arrays more#1438
Conversation
- remove some occurences if #ifdef STIR_TOF - use newer constructors taking a Bin
use them in ProjDataInMemory
|
@markus-jehl I'm still hunting for an elusive speed-up. Maybe this one does help a bit... |
[ci skip]
|
This merge broke something in the reconstruction. Projections still work, but when I do the initial reconstruction it comes out full of NaN‘s in the FoV. Haven‘t had time to look into the changes yet, but maybe this is also what impacts you on the psmr branch (since I also got a test failure on the MacOS build on my fork after pulling this merge). |
|
ah. sorry. This PR is quite innocent really. Essentially the only change is in There are some other formatting changes as I got rid of I've added tests for all this, but I suppose none of our tests do a recon with |
|
@markus-jehl I cannot find a problem, nor reproduce it. Are you sure it is this PR? Suggested debugging steps:
thanks for your help! |
|
I'm fairly certain it is this one: I went back to each PR in my fork and this is the one that first displayed the issue. Will go through your suggested debug steps today! |
|
The problem is the first STIR/src/include/stir/Array.inl Line 283 in 31e730e What is the purpose of _full_pointer_access, actually? It seems to be never checked and since it's only used to read data, not modify them it shouldn't be required. Still, I don't understand why |
which one is the first? L60?
that's true.
For STIR/src/include/stir/VectorWithOffset.inl Lines 611 to 613 in 31e730e I put those in as a safety check really.
I have no clue. Note that this whole stuff is not thread-safe: if multiple processes modify an I'm not so sure how to proceed. Can you find out where the NaNs are created? Is the sensitivity image ok for instance? Can you please create an issue, just pointing to this discussion here? (I suggest we continue here for clarity) |
|
Yes, L60 is the problematic one. I'm currently running with as much debug output as possible to narrow down where it happens. Sure, will create an issue. |
|
ok. Narrowing it down. Relevant lines are By the way, do your |
|
All tests with ctest pass, but recon_test_pack I first need to look up how to run. Will also look into the distributable sensitivity computation, since that already sounds like it might be a good candidate :-) |
interesting, as that is using OSMAPOSL etc as well
the thanks for debugging! |
|
Interestingly it looks exactly the same in repeat runs and also suing just 1 thread. So my suspicion is that the |
|
it certainly is easier if it's an ordinary bug as opposed to a race-condition! I guess I'd recommend compiling without OpenMP then and running Other (more desperate) thing to try: use a more conservative approach for
Vital to run If that works, we could try to check consistency between the old and new |
|
I was actually thinking about this pessimistic approach! Because I agree, running valgrind or a sanitiser should highlight where the issue is. For this I first need to create a minimal working example of the issue (currently I just test it in python with an iterative reconstruction that would take forever in serial with a sanitiser attached to it...). The recon_test_pack passes as well, btw. |
|
ok. From Python, things are always trickier, as there's more variations possible. That'll be why I didn't see it yet. As it's the sensitivity image, you can stop after the |
|
Just an update: I've now ported the test across to C++ and I get this weird looking sensitivity map even when disabling the new copy. So the more I dig the less I understand... Will update you when I have more clarity. |
|
Ok. Any chance you can do this on vanilla master as opposed to your branch? |
|
Yes, I'm currently doing it on the official master. It looks like in C++ some additional problems are caused by reading ProjDataInMemory from file and using them - as if some of that memory is either not persistent for long enough or gives issues with copying. |
|
By the way, if you have a small test example, potentially we could add it as a test, as it's escaping all other tests somehow. Can also do a |
|
The address sanitiser actually already throws an error in the first line of my test, where I'm reading in ProjDataInMemory as such: This is the ASAN output: When I replace the first if statement with |
|
Ok, progress! Managed to debug into STIR to see where it's failing. All that is needed is to handle the case where an array has only one index. Currently it is not entering the for loop and therefore not checking whether the sub-array is contiguous (plus the last sub-array was never checked). Changing the for loop from |
|
great! That was a bug. Can you submit a PR? |
|
I suggest if (i == this->get_max_index()
return true;
mem += (*this)[i].size_all();
if (mem != &(*(*this)[i + 1].begin_all()))
return false;and remove the final |
|
By the way, while auto inputProjData = std::make_shared<stir::ProjDataInMemory>(*(stir::ProjDataInMemory::read_from_file("filename.hs")));is fine, it might be clearer to use std::shared<stir::ProjDataInMemory> inputProjData(stir::ProjDataInMemory::read_from_file("filename.hs")); |
|
Yes, I'll issue a PR. But first I also want to understand the memory leak that still exists. |
|
Ok. Doesn't TSAN tell you where it came from? Valgrind could. |
|
This is all it says: I suspect it is when the object goes out of scope at the end of the test that the memory is not correctly freed. Has something changed there recently? |
|
Seems unrelated to the rest. Nothing changed there. It's not so useful that it doesn't say what object. Clearly, it cannot be the Just for sanity, please try the above constructor instead. |
|
I actually did (must have also tried before and then given up), but it gives me the following error: It's as if it can't convert pointers to convertable objects, only the objects themselves. With my old approach I initialise a new pointer with the converted object, so it worked. My suspicion is that the leak might be in the interfile parsing somewhere, but I haven't yet found an obvious allocation somewhere that might cause this. |
|
Update: |
oops, sorry. it needs a The |
oh right. There might even be an issue on that. I'll keep ignoring that for a bit longer. sorry... |
Hmm... The move still gives me the same error: |
|
ok. Of course, |
I couldn't find an issue related to this. Should I create one? |
|
Neither can I. Please do so. Thanks a lot! |

copy_to/fill_fromspecialisations forArraythat check if they are contigous. This enhances the likelihood that the compiler will recognise that it can callmemmoveas opposed to doing a loop. (related to std::copy of VectorWithOffset<float> does not fall back to memmove #1259 )ProjDataInMemory