
Every now and then I have a system flow that does not get abandoned when it should. For example, last weekend we had a flow that threw an error somewhere but kept on running. There seems to be no way to force-cancel it other than rebooting the server.

I did however find this documentation: https://docs.thinkwisesoftware.com/docs/indicium/process_flows 

But I couldn't get it to work. 

DELETE
/iam/appl/{process_flow_id}({id})/

I figured appl would be my app id or short name (in my case: 12 or pp) and {process_flow_id} would be, for example, 'flow_export_messages'.

What is the {id} that the documentation is referring to?

This API is only for process flows that are started as a result of a user action.
The id in this case is a GUID (the same as we have for staged resources). So it won't work for this use case.
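To make the shape of that request concrete, here is a minimal sketch that fills in the documented template `DELETE /iam/appl/{process_flow_id}({id})/`. The base URL, the helper name `build_cancel_request` and the way the GUID is written inside the parentheses are assumptions for illustration, not confirmed Indicium behaviour. As noted above, this only applies to flows started by a user action, not to scheduled system flows.

```python
import uuid


def build_cancel_request(base_url: str, application: str,
                         process_flow_id: str, flow_instance_id: str) -> tuple[str, str]:
    """Return the (HTTP method, URL) pair for cancelling a user-started process flow.

    base_url is a hypothetical Indicium root URL; application is the
    application id or alias (e.g. "12" or "pp").
    """
    # The {id} must be a GUID; uuid.UUID raises ValueError for anything else,
    # such as a numeric application id passed by mistake.
    guid = uuid.UUID(flow_instance_id)
    # Assumption: the GUID goes between the parentheses as a bare literal.
    url = f"{base_url.rstrip('/')}/iam/{application}/{process_flow_id}({guid})/"
    return "DELETE", url


if __name__ == "__main__":
    method, url = build_cancel_request(
        "https://example.com/indicium",          # hypothetical host
        "pp",
        "flow_export_messages",
        "e63d1d9f-70f1-43cc-bf35-9799803bedd7",  # placeholder GUID
    )
    print(method, url)
```

The resulting pair can be handed to any HTTP client together with the usual authentication.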

I've seen similar issues when the IAM database became unavailable, which made it impossible for Indicium to update the scheduled flow. Indicium should always update the scheduled system flow, even when it contains errors. Do you see any errors in the log around that time?

 - Are there errors in the Indicium log that start with something like "Error scheduling system flow ..."?
 - Do the system flows happen to stop around the same time (maybe at a scheduled backup time)?
 - Do you still see "Agent check ins"? (In IAM you can find the "Agent check in" screen in the advanced listbar.)
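The first two checks can be roughly automated. The sketch below counts "Error scheduling system flow" lines per flow and per hour of day, so a recurring time slot (such as a backup window) stands out. It assumes each log line starts with an ISO 8601 timestamp followed by an `[err]` level marker. The function name and the file name in the usage part are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line shape:
# <ISO timestamp>  [err] Error scheduling system flow '"<flow id>"' for application <n>.
LINE_RE = re.compile(
    r"""^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\S*\s+\[err\]\s+"""
    r"""Error scheduling system flow '"?(?P<flow>[^'"]+)"?'"""
)


def scheduling_errors_by_hour(lines):
    """Count scheduling errors per (flow id, hour of day)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # stack trace lines and unrelated messages are skipped
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S")
        counts[(match.group("flow"), ts.hour)] += 1
    return counts


if __name__ == "__main__":
    # "indicium.log" is a placeholder path.
    with open("indicium.log", encoding="utf-8") as handle:
        for (flow, hour), n in sorted(scheduling_errors_by_hour(handle).items()):
            print(f"{flow} {hour:02d}:00 {n}")
```

If most errors land in the same hour across several days, that points towards something scheduled on the database side rather than towards the flow itself.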

 

 


I got this error:

2023-01-28T20:45:14.0558272+01:00  [err] Error scheduling system flow '"flow_export_message"' for application 12. (b95a9b29)
Microsoft.Data.SqlClient.SqlException (0x80131904): Connection Timeout Expired. The timeout period elapsed during the post-login phase. The connection could have timed out while waiting for server to complete the login process and respond; Or it could have timed out while attempting to create multiple active connections. The duration spent while attempting to connect to this server was - [Pre-Login] initialization=1; handshake=21; [Login] initialization=0; authentication=0; [Post-Login] complete=13996;
---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
at Microsoft.Data.ProviderBase.DbConnectionPool.CheckPoolBlockingPeriod(Exception e)
at Microsoft.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
at Microsoft.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at Microsoft.Data.ProviderBase.DbConnectionPool.WaitForPendingOpen()
--- End of stack trace from previous location ---
at Indicium.Data.SQLServer.Helpers.TSFSqlConnectionHelper.createOpenedSqlConnection(DbConnectionInfo connectionInfo, IConnectionCredentials credentials, TSFApplication application, TSFUserContext userContext, TSFMessageHandler messageHandler) in C:\azp\agent\_work\1\s\src\Data\Indicium.Data.SQLServer\Helpers\TSFSqlConnectionHelper.cs:line 104
at Indicium.Data.SQLServer.Connection.SQLConnectionProvider.CreateConnection(String originalUserID, String effectiveUserID, TSFMessageHandler messageHandler) in C:\azp\agent\_work\1\s\src\Data\Indicium.Data.SQLServer\Connection\SQLConnectionProvider.cs:line 29
at Indicium.BackgroundServices.SystemFlowSchedulerBase.RunSystemFlow(Int32 guiApplID, TSFApplication application, FullProcessFlow systemFlow, DateTime scheduledTime) in C:\azp\agent\_work\1\s\src\Indicium\BackgroundServices\SystemFlowSchedulerBase.cs:line 137
at Indicium.BackgroundServices.SystemFlowScheduler.RunSystemFlow(Int32 guiApplID, TSFApplication application, FullProcessFlow systemFlow, DateTime scheduledTime) in C:\azp\agent\_work\1\s\src\Indicium\BackgroundServices\SystemFlowScheduler.cs:line 90
at Indicium.BackgroundServices.SystemFlowSchedulerBase.ScheduleSystemFlow(Int32 guiApplID, String systemFlowID, DateTime scheduledTime) in C:\azp\agent\_work\1\s\src\Indicium\BackgroundServices\SystemFlowSchedulerBase.cs:line 109
ClientConnectionId:e63d1d9f-70f1-43cc-bf35-9799803bedd7
Error Number:-2,State:0,Class:11

 


From the stack trace it looks like this is indeed the case I mentioned earlier: Indicium cannot connect to the IAM database to update the process flow status.
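The exception message itself also shows where the connection attempt spent its time. The sketch below sums the per-phase durations from the `[Pre-Login] ... [Login] ... [Post-Login] ...` part of the message. The function name is hypothetical, and the values are presumably milliseconds.

```python
import re

# One "[Phase] key=value; key=value;" group from the SqlClient timeout message.
PHASE_RE = re.compile(r"\[(?P<phase>[^\]]+)\](?P<steps>(?:\s*\w+=\d+;)+)")


def connection_phase_timings(message: str) -> dict[str, int]:
    """Return the total reported duration per connection phase."""
    timings = {}
    for match in PHASE_RE.finditer(message):
        values = re.findall(r"=(\d+);", match.group("steps"))
        timings[match.group("phase")] = sum(int(v) for v in values)
    return timings


if __name__ == "__main__":
    msg = (
        "The duration spent while attempting to connect to this server was - "
        "[Pre-Login] initialization=1; handshake=21; "
        "[Login] initialization=0; authentication=0; "
        "[Post-Login] complete=13996;"
    )
    print(connection_phase_timings(msg))
```

For the error above, nearly all of the roughly 14 seconds falls in the post-login phase. That would be consistent with a server that was reachable but slow to finish setting up the session, though it does not by itself identify the cause.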

Do you happen to know if during that time something is happening on the database server?
What version of Indicium are you using? I had to compare the stack trace with an older version. We updated the SQL libraries recently, so a newer version might be more resilient to this type of failure.


This is version 2022.2, Indicium build 17. I am planning on upgrading to 2023.1 in 2 weeks, so if that solves any of these issues it would be awesome!

 

It would be nice though to be able to kill system flows that are running in IAM.


Hi Kasper,

A follow-up question, because I am curious to know why the timeout is occurring.

Are you running on an Azure DB?

If so, I am interested in some metrics of the Azure DB. If not, the rest of this message can be ignored.

 

The Azure databases have some nice metrics we can use to potentially investigate this problem.
The interesting ones are "Successful Connections", "Sessions percentage" and "Workers percentage".
The DTU percentage might also be useful in the case of a DTU database tier.
Can you provide some screenshots from the IAM database?
By default, the metrics view shows the last 24 hours. It might be more useful to zoom in on the time of the error, so that the granularity is finer.

Can you give us some information about the database (the IAM DB)?
 - Which service tier (vCore or DTU)?
 - How many vCores or DTUs?
 - In case of vCore General Purpose, are you using Provisioned or Serverless?
The screenshots can also be shared privately if desired.

 

 


We're using the DTU model with 200 DTU. I've sent you a message for the other information.


Hi @Kasper Reijnders,

Has this been resolved or do you need further assistance?😄


Yes and no. The original question has been answered. The second problem still persists, but I'll create a separate topic for that.