Hello Jeroen,
Staged resources are currently not cleaned up automatically. There are a few reasons that have contributed to this decision and our estimation that this is not an issue:
- Most clients that use resource staging are Universal GUIs and the Universal GUI always cleans up resources when it’s done with them, except, as you stated, when it’s closed improperly, in which case it could leak one staged resource.
- Those clients that are not Universal GUIs almost always use the simpler APIs where the resource is staged and committed within a single request, and Indicium itself ensures that the resource is cleaned up. Clients that operate more like the Universal GUI (such as yours) should ensure that staged resources are always either committed or deleted.
- Staged resources almost never use a noteworthy amount of memory. The typical staged resource probably uses less than a kilobyte of memory. Blob and clob data types such as varbinary(max) and varchar(max) do allow resources to potentially use a lot of memory. We are looking into the possibility of allowing a maximum length setting on these data types as well. It's important to add here that files which are uploaded to a staged resource using /upload_<file_property_id> are never stored in memory in their entirety. Instead they are streamed to a file cache and then streamed to their final location upon committing the resource.
- We recommend recycling Indicium periodically as a policy. A good recycling policy is, for instance, once a week. Recycling Indicium will clean up all staged resources.
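To illustrate the "always either committed or deleted" advice for custom clients, here is a minimal sketch in Python. The `InMemoryStagingClient` class and its `stage`, `commit` and `delete` methods are hypothetical stand-ins made up for this example; they are not Indicium's actual API, so map them onto whatever calls your own client makes.

```python
from contextlib import contextmanager


class InMemoryStagingClient:
    """Hypothetical stand-in for an API client; not Indicium's real API."""

    def __init__(self):
        self.staged = {}      # resources that are currently staged
        self.committed = []   # resources that have been committed
        self._next_id = 1

    def stage(self, values):
        resource_id = self._next_id
        self._next_id += 1
        self.staged[resource_id] = dict(values)
        return resource_id

    def commit(self, resource_id):
        self.committed.append(self.staged.pop(resource_id))

    def delete(self, resource_id):
        self.staged.pop(resource_id, None)


@contextmanager
def staged_resource(client, values):
    """Stage a resource and guarantee it is committed or deleted on exit."""
    resource_id = client.stage(values)
    try:
        yield resource_id
        client.commit(resource_id)
    except BaseException:
        # Any failure (including a failed commit) discards the staged resource.
        client.delete(resource_id)
        raise
```

Wrapping the work in `with staged_resource(client, {...}) as resource_id:` means an exception anywhere in the block deletes the staged resource instead of leaving it behind. It does not help when the client process itself is killed, which is the same single-resource leak described for the Universal GUI above.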
There are also some more technical reasons, such as the fact that user sessions are not statefully managed by Indicium as such. Authentication cookies are self-contained tokens and Indicium therefore has no concept of sessions expiring. That said, we could in theory tie staged resources to the user sessions that are tracked by IAM and periodically clear them.
With that said, the only relevant risk to consider here is that someone is purposefully creating staged resources and not cleaning them up. First, it is important to realize that only authenticated users would be able to do this. Second, having staged resources expire with a user’s session would not mitigate this risk in any way, as this person could simply keep their session alive. A better solution would be some form of rate limiting, which we are looking into supporting natively, but it is also something that web application firewalls (WAFs) can take care of for you.
Another thing to mention here is that we are currently finalizing our implementation to make Indicium fully stateless with regard to application state (i.e. staged resources, process flows, etc.) as well. This is an important goal for us to improve Indicium’s ability to scale horizontally, which removes the need for sticky sessions when load balancing and also allows for seamless fail-over scenarios where users can continue working without noticing a thing when an instance of Indicium goes down. With this, all application state will be stored in a Redis cache which can be hosted anywhere (e.g. Azure, AWS, on-premises). Every Redis cache entry will in fact have an expiry time to ensure that the cache doesn’t grow indefinitely. By default Indicium will keep storing these things in memory, but it does provide an option to achieve what you want, if it’s a worry.
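To make the idea of per-entry expiry concrete, here is a small sketch in Python. It is an in-memory illustration of the concept only; it is not Redis and not how Indicium implements it, and the `ExpiringStore` class and its method names are invented for this example.

```python
import time


class ExpiringStore:
    """Illustrative key-value store in which every entry has an expiry time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the behaviour can be tested
        self._entries = {}       # key -> (expires_at, value)

    def put(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped on access.
            del self._entries[key]
            return None
        return value

    def purge_expired(self):
        """Remove all expired entries and return how many were removed."""
        now = self._clock()
        expired = [k for k, (exp, _) in self._entries.items() if now >= exp]
        for key in expired:
            del self._entries[key]
        return len(expired)

    def __len__(self):
        return len(self._entries)
```

Because every entry carries its own deadline, an abandoned staged resource disappears once its time is up, regardless of whether the client that created it ever comes back.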
So I would not worry about the memory filling up with staged resources by accident. I would simply recommend recycling Indicium once a week to mitigate this risk entirely. As for the memory filling up with staged resources through malicious means, I would recommend strong passwords and two-factor authentication and, if that feels insufficient, a competent WAF.
I hope this clarifies our position. Please let me know if you have additional questions.
Hello @Vincent Doppenberg,
Thank you for the swift reply!
"Staged resources almost never use a noteworthy amount of memory. The typical staged resource probably uses less than a kilobyte of memory. Blob and clob data types such as varbinary(max) and varchar(max) do allow resources to potentially use a lot of memory. We are looking into the possibility of allowing a maximum length setting on these data types as well. It's important to add here that files which are uploaded to a staged resource using /upload_<file_property_id> are never stored in memory in their entirety. Instead they are streamed to a file cache and then streamed to their final location upon committing the resource."
That is good to know. Our API limits the max number of incoming requests from a single IP per minute, so it would be very difficult for someone to (purposefully) overload Universal’s memory through it with such a low memory footprint per staged resource.
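For readers wondering what such a per-IP limit can look like, below is a rough sliding-window sketch in Python. It is not our actual implementation; the `PerIpRateLimiter` class, its parameters and the limit values are assumptions chosen purely for illustration.

```python
import time
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Allow at most max_requests per IP within a sliding time window."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock                  # injectable for testing
        self._hits = defaultdict(deque)      # ip -> timestamps of recent requests

    def allow(self, ip):
        now = self._clock()
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True
```

A request handler would call `allow(ip)` before staging anything and reject the request when it returns `False`, which caps how many staged resources a single address can create per minute.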
"We recommend recycling Indicium periodically as a policy. A good recycling policy is, for instance, once a week. Recycling Indicium will clean up all staged resources."
I checked this on our servers: we do periodically recycle our Indicium Universal instance, so idle staged resources do not result in permanently lost memory.
I see your point that connecting a staged resource's lifetime to a user session would not mitigate the risk of overloading server memory. Perhaps an idea for improving memory usage, if something like this does not already exist, is to check for staged resources connected to an account that has just logged out (as in, the IAM session is manually closed by a user clicking a “log out” button) and remove them from memory. That way we do not need to rely on our server configuration for efficient memory usage. If need be, this is something we could implement in our internal application, but it seems like it would do Universal GUIs some good as well.
"With this, all application state will be stored in a Redis cache which can be hosted anywhere (e.g. Azure, AWS, on-premises). Every Redis cache entry will in fact have an expiry time to ensure that the cache doesn’t grow indefinitely. By default Indicium will keep storing these things in memory, but it does provide an option to achieve what you want, if it’s a worry."
This is also very good to know. Even though we do not really have to worry about memory overload, it should still be a good addition to have that little extra security in our memory handling. I will definitely take this up with our engineers working in the Thinkwise platform.
Thanks again for the thorough and lengthy answer! Good to know that this is a minimal risk and can be easily mitigated entirely, as well as that there are more solutions on the way to further improve it.