Skip to content

Lazy-load ImageServer from ImageData#1489

Merged
petebankhead merged 18 commits intoqupath:mainfrom
petebankhead:lazy-images
Aug 8, 2024
Merged

Lazy-load ImageServer from ImageData#1489
petebankhead merged 18 commits intoqupath:mainfrom
petebankhead:lazy-images

Conversation

@petebankhead
Copy link
Member

Rough draft at an alternative approach to address the issue raised in #1488

Rough draft at an alternative approach to address the issue raised in qupath#1488
@carlocastoldi
Copy link
Contributor

carlocastoldi commented Apr 2, 2024

This approach is much more elegant than #1488, i like it!
However I feel it doesn't exploit all the possible info that can be gathered without reading the image files. Specifically images' metadata for info such as the number of channels and their names or even as well as pixel calibration.
If available, they're available in the .qpproj and they're already read when opening the project.

Perhaps this could be done by exposing a getCurrentMetadata() method, so that it does not require to pass through calling getServer()?

We no longer permit an ImageData with a null server.
Support accessing the metadata for an ImageServer directly for the `ImageData`. This is a step towards making metadata accessible without having to load the full image.
@carlocastoldi
Copy link
Contributor

Thank you for the big work on the metadata as well! I will test this PR thoroughly as soon as i can.

In the mean time i have a spotted a couple of typos for which i hope i won't pass as annoying if i point them out

@petebankhead
Copy link
Member Author

Thanks, I've fixed two of the typos but not sure about the other. This is still a draft PR and I'll need to work on it some more. It is a slightly scary change to make since it is so core, and there are (at least) three things I haven't really thought through:

  1. When ImageServer metadata is written in a project and when it isn't (I see it missing sometimes, which has previously been irrelevant since it's generated when the ImageServer is built... but becomes much more important if the server isn't necessarily built)
  2. What happens when a script changes the metadata, but the server itself hasn't been read (e.g. setting channel names or pixel size)
  3. How exceptions are handled when lazy loading fails

At the moment, I doubt this PR does the right thing in those cases.

I'll mark it was ready to review when I think it's ok - that's probably the better time for thorough testing.

@carlocastoldi
Copy link
Contributor

It is a slightly scary change to make since it is so core

Yes, i get it. It's better being cautious here!

When ImageServer metadata is written in a project and when it isn't (I see it missing sometimes, which has previously been irrelevant since it's generated when the ImageServer is built... but becomes much more important if the server isn't necessarily built)

what I would say that in that case it would load the server, read the metadata and write them in the .qpproj file for future accesses. I wouldn't assure that ImageData.getServerMetadata() never loads the image server. It avoids it as long as it is possible, otherwise it will.

What happens when a script changes the metadata, but the server itself hasn't been read (e.g. setting channel names or pixel size)

Again, if ImageData.updateServerMetadata() i would actually load the server()+update qpproj file. Avoid doing it lazily, as that would easily lead to unexpected states for the users.

How exceptions are handled when lazy loading fails

What's wrong in behaving the same as when an exception occurs while creating an ImageServer?

Avoid deprecated `setLenient()` and add `getPrettyPrintInstance()` to avoid relying on the mysterious `getInstance(boolean)`.
Fix (hopefully) but whereby the wrong thumbnail image could sometimes be shown.
Serialize the `ServerBuilder` and not the `ImageServer`, which means we can serialize without needing to build the server itself.

Also support creating an `ImageServer` from a `ServerBuilder` JSON representation directly.
This means that an`ImageServer` can be created with `fromJson(json, ImageServer.class)` applied to the output of both `toJson(obj, ServerBuilder.class)` and `toJson(obj, ImageServer.class)`
This includes setting the server name when the project entry is being loaded. Usually, this should already have been set previously - but could be out of sync if the name was changed and the data not saved.
@petebankhead petebankhead marked this pull request as ready for review August 7, 2024 16:05
@petebankhead
Copy link
Member Author

I've flagged this as ready for review now. It definitely needs more testing.

As described by @carlocastoldi the server is loaded whenever any change to the metadata is made.

The server can be loaded more often might be expected, sometimes for subtle, hard-to-address reasons. Some relevant facts:

  • ImageServerMetadata isn't usually saved with an image when it is first added to a project. This only happens the first time the data file is saved.
  • The ImageServerMetadata is updated as soon as an image is opened in a project to ensure that the name specified in the project matches that in the server metadata. This triggers the server to be loaded if
    • The ImageServerMetadata isn't available, or
    • The ImageServerMetadata is available, but contains the wrong name (e.g. the name was changed in a project, but then the data file wasn't saved afterwards)
  • Run for project will always force the ImageServer to be loaded, because it will always save the data - and this act of saving requires the ImageServer. So you can only get advantages if you avoid Run for project, e.g. by using Run for project (without save) instead.
    • We can't rely on not saving if there have been no changes, because the script editor now automatically fires a hierarchy change after completion. This means QuPath always thinks that the script may have changed the ImageData, so ImageData.isChanged() returns true. We didn't used to do this, but then we had to keep telling users to add fireHierarchyUpdate() at the end of many otherwise simple-looking scripts, and that was a pain for everyone.

This basically means that lazy-loading only works if the data for an image has been saved at least once, and the user hasn't messed around too much with image names within their project.

The 'easy' way to trigger an image to be saved once is to do a 'Run for project' script - even if the script doesn't do anything. This should be enough to prompt the ImageServerMetadata to become embedded within the project. Although it should also usually happen in practice anyway, just through working with the images for some kind of annotation or analysis, so the other solution is... just don't worry about it. If this works properly, you should end up with lazy-loading a lot of the time - just not necessarily quite as often as you'd want.

@petebankhead
Copy link
Member Author

I'll merge this now to avoid having too many conflicts to resolve, and to have more time to try it in combination with all the other v0.6.0 changes.

I'd still be grateful for anyone interested to test this!

@petebankhead petebankhead added this to the v0.6.0 milestone Aug 8, 2024
@petebankhead petebankhead merged commit 3544e61 into qupath:main Aug 8, 2024
@petebankhead petebankhead deleted the lazy-images branch August 8, 2024 14:07
@carlocastoldi
Copy link
Contributor

Thank you, this was pretty fast!

Accordingly to my testing it works all okay and pretty safe. I think the idea to write as soon as something is being modified is worth it. The fact that now it requires that the images are opened & modified at least once is not a problem. I think we can put up with it for the time being and leave it for some future adjustment, if needed.

However i think you got something off: it should be possible to write the imagedata if the server was never loaded. The problem is not that the image name is always written to the ImageData's file, as that does not make the metadata different from the old ones when the name was kept as is.
The issue is that, when writing the .qpdata, it wants to know the ImageServer unique identifier and summary. Both of these cannot have changed without the server being loaded first (the unique identifier possibly yes if it uses the filepath, but in that case moving the image folder after the ImageData was saved would have the same result).

@carlocastoldi
Copy link
Contributor

One more thing: should we expose the builders in qupath.lib.images.servers.ImageServers.* so that you can test what kind of ImageServer is being used, if the builder is available?

@petebankhead
Copy link
Member Author

I'd rather not expose more than necessary. What would it solve?

(I seem to have introduced a big bug in PathIO, so currently working on this... v0.6.0-SNAPSHOT isn't in a very usable state right now)

@carlocastoldi
Copy link
Contributor

I am sorry, I must have missed this message!

In my case, I was thinking to help extensions I use to take full advantage of this new feature. Specifically, I was looking into qupath-extension-abba, but the only thing stopping me from being able to port it is that it checks whether the current image is rotated or not. If it is, it applies a transformation to the imported ROIs.

I guess interrogating specific image server is unfeasible (i.e. using rotated_server.getRotation()), however we could perhaps avoid requesting for the server if it can't be interrogated. Hence why I was thinking to expose the builders: extensions/scripts can decide whether to make the server concrete based on their implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants