How to handle scenarios where Keycloak stops responding?

By default, MOSIP uses Keycloak as its IAM application. Keycloak keeps all its active sessions in memory for quick authentication and authorization. We have seen that when the number of sessions in Keycloak grows very high (more than 70K sessions), it stops responding and becomes unreachable. Hence, we recommend that a client create a new token or session in Keycloak only when its previous session has expired.
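
To illustrate the recommendation, here is a minimal Python sketch of a client-side token cache that requests a new token only when the cached one is about to expire. The class name CachedToken, the fetch callable, and the 30-second safety margin are our own assumptions for illustration and are not part of MOSIP or Keycloak. The fetch callable stands in for whatever code calls the Keycloak token endpoint and returns the token together with its lifetime in seconds.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical type: a function that calls the Keycloak token endpoint and
# returns (access_token, lifetime_in_seconds).
FetchToken = Callable[[], Tuple[str, float]]


class CachedToken:
    """Reuses one token until it is about to expire (illustrative sketch)."""

    def __init__(
        self,
        fetch: FetchToken,
        margin_seconds: float = 30.0,  # assumed safety margin, tune as needed
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._margin = margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one only when needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

A client would create one CachedToken at startup and call get() before every request, so repeated operations share a single Keycloak session instead of opening a new one each time. Setting margin_seconds to 0 refreshes strictly at expiry.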

Steps

Follow these steps to recover when this happens (that is, when the Keycloak UI is unreachable) and continue to use the application without disruption.

  1. Restart the Keycloak pods and wait for them to come back up.

  2. Restart the Datashare service pod once Keycloak has restarted.
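
The two steps above can be sketched in Python as a small wrapper around kubectl, assuming the pods are managed by a Kubernetes Deployment or StatefulSet and that kubectl is configured for the cluster. The workload kinds, names, and namespaces in the usage comment are hypothetical placeholders; substitute the ones from your own installation.

```python
import subprocess
from typing import Iterable, List, Tuple


def rollout_commands(
    kind: str, name: str, namespace: str, timeout: str = "10m"
) -> List[List[str]]:
    """Build the kubectl commands that restart a workload and wait for it."""
    target = f"{kind}/{name}"
    return [
        ["kubectl", "rollout", "restart", target, "-n", namespace],
        ["kubectl", "rollout", "status", target, "-n", namespace,
         f"--timeout={timeout}"],
    ]


def restart_in_order(workloads: Iterable[Tuple[str, str, str]]) -> None:
    """Restart each (kind, name, namespace) in turn, waiting for each one."""
    for kind, name, namespace in workloads:
        for cmd in rollout_commands(kind, name, namespace):
            # check=True stops the sequence if a restart or wait fails.
            subprocess.run(cmd, check=True)


# Example call (hypothetical names, Keycloak first, then Datashare):
# restart_in_order([
#     ("statefulset", "keycloak", "keycloak"),
#     ("deployment", "datashare", "datashare"),
# ])
```

Waiting on the rollout status of Keycloak before touching Datashare preserves the order given in the steps.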

We also recommend monitoring the Keycloak sessions periodically; if needed, the excess sessions can be cleared manually.
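
One possible way to monitor sessions is to poll the Keycloak Admin REST API for per-client session statistics, as in the Python sketch below. It assumes an admin access token is already available and uses the client-session-stats endpoint; the function names and the alert threshold are our own illustrative choices. Depending on the Keycloak version, base_url may need to end in /auth.

```python
import json
import urllib.request
from typing import Dict, List


def fetch_session_stats(base_url: str, realm: str, admin_token: str) -> List[dict]:
    """Fetch per-client session statistics for a realm from Keycloak."""
    url = f"{base_url}/admin/realms/{realm}/client-session-stats"
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {admin_token}"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def active_sessions_by_client(stats: List[dict]) -> Dict[str, int]:
    """Map each clientId to its active session count."""
    # Counts may arrive as strings, so convert them explicitly.
    return {entry["clientId"]: int(entry.get("active", 0)) for entry in stats}


def over_threshold(counts: Dict[str, int], threshold: int) -> bool:
    """Return True when the total number of active sessions exceeds threshold."""
    return sum(counts.values()) > threshold
```

A scheduled job could call fetch_session_stats, pass the result through active_sessions_by_client, and raise an alert when over_threshold returns True for a limit chosen well below the 70K sessions mentioned above. A timeout on the request is itself a sign that Keycloak may already be unresponsive.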

We observed this issue when one of our ABIS partners was requesting a new token for every insert operation while fetching data from Datashare. Hence, we have created a script to remove active sessions for the client ID mosip-abis-client. You can find the script details at the link below.
https://mosip.atlassian.net/wiki/spaces/MSD/pages/865468419/Monitor+keycloak+sessions+and+clear+on+timely+and+need+basis+for+ABIS
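
For readers who cannot open the link, the following Python sketch shows one way such a cleanup could be done through the Keycloak Admin REST API. It is not the script referenced above. It assumes an admin access token, looks up the internal ID of the client, lists that client's user sessions page by page, and deletes each one. The function names and the injected request callable are our own illustrative choices; the realm name must be supplied by the caller.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable

# Hypothetical type: request(method, url) returns parsed JSON or None.
Request = Callable[[str, str], Any]


def make_request(admin_token: str) -> Request:
    """Build a request function that sends the admin bearer token."""
    def request(method: str, url: str) -> Any:
        req = urllib.request.Request(
            url, method=method,
            headers={"Authorization": f"Bearer {admin_token}"},
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            body = resp.read()
        return json.loads(body) if body else None
    return request


def clear_client_sessions(
    request: Request, base_url: str, realm: str, client_id: str,
    page_size: int = 100,
) -> int:
    """Delete all user sessions of one client; return how many were removed."""
    admin = f"{base_url}/admin/realms/{realm}"
    query = urllib.parse.quote(client_id, safe="")
    clients = request("GET", f"{admin}/clients?clientId={query}")
    if not clients:
        return 0  # no such client in this realm
    client_uuid = clients[0]["id"]
    removed = 0
    while True:
        # Always read the first page: deleted sessions drop out of the list.
        sessions = request(
            "GET",
            f"{admin}/clients/{client_uuid}/user-sessions?first=0&max={page_size}",
        )
        if not sessions:
            return removed
        for session in sessions:
            request("DELETE", f"{admin}/sessions/{session['id']}")
            removed += 1
```

A call such as clear_client_sessions(make_request(token), base_url, realm, "mosip-abis-client") would remove the sessions of the ABIS client only. Passing the request function as a parameter keeps the deletion logic easy to dry-run against a fake before pointing it at a live server.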