What is Heartbeat Monitoring (HBM)?
Hubject’s Heartbeat Monitor has been developed as a tool to quickly inform Partners about Hubject's system status, in particular about the current state of a selection of our microservices (MS).
Heartbeat Monitoring helps ensure the reliability, availability, and performance of systems by constantly monitoring the runtimes of server communications. The intention is to be transparent about situations in which our system is likely to cause high runtimes, so that Partners do not have to wonder whether the problem lies in their own backend.
Which MS do we monitor?
* This information is updated every 30 minutes, on the hour and half hour (e.g., 11:00 AM, 11:30 AM, 12:00 PM, etc.)
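To make the update schedule concrete, here is a minimal sketch that works out the most recent and the next update time for a given moment. The function names are hypothetical and only illustrate the half-hour schedule described above; they are not part of any Hubject interface.

```python
from datetime import datetime, timedelta


def last_update_time(now: datetime) -> datetime:
    """Return the most recent half-hour boundary at or before `now`.

    Assumes updates land exactly on :00 and :30, as described above.
    """
    minute = 0 if now.minute < 30 else 30
    return now.replace(minute=minute, second=0, microsecond=0)


def next_update_time(now: datetime) -> datetime:
    """Return the half-hour boundary that follows the last update."""
    return last_update_time(now) + timedelta(minutes=30)


if __name__ == "__main__":
    moment = datetime(2024, 1, 1, 11, 47, 12)
    print(last_update_time(moment))
    print(next_update_time(moment))
```

A request made at 11:47 would therefore see data from the 11:30 update and could expect fresh data at 12:00.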
What does it look like?
When you open the HBS Process Monitor section, you land in the HBM section.
On the left side of this page, you will find the performance overview.
The performance is based on the average runtime (ART), that is, the average time the system takes to process a request. Depending on the range in which the ART falls, we can tell whether the system is performing optimally, is having some issues, or is down.
On the right side of this page, you will find the incident logs.
Whenever the ART reaches a level that is considered slow performance or a performance issue, an entry appears in the incident logs dashboard, so that users can see when the HBS had performance issues during the last 7 days.
How can I get more detailed information?
Click the 'see more' link for more detailed information on each MS.
You will be redirected to another page that shows the average runtime and the targets for the MS. Any performance events from the last 7 days are listed below them.
What is the mechanism behind it?
For each microservice, Hubject’s Heartbeat Monitor shows the current ‘average eRoaming Runtime’, which is updated every 30 minutes. It is calculated by summing the runtimes of all requests in that period and dividing the sum by the number of requests. The eRoaming Runtime is one part of the total runtime of a request: specifically, the time spent exclusively on Hubject’s server systems. It excludes the runtime on the Partner’s backend systems and the time spent on communication between servers, which allows us to make a statement about Hubject’s server health.
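The calculation described above can be sketched as follows. This is an illustration only: the `RequestTiming` record, its field names, and the idea of deriving the eRoaming Runtime by subtraction are assumptions made for the example, not a description of how Hubject measures it internally.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RequestTiming:
    """Hypothetical timing record for one request, in milliseconds."""
    total_ms: float            # total runtime of the request
    partner_backend_ms: float  # time spent on the Partner's backend
    network_ms: float          # time spent on communication between servers


def eroaming_runtime_ms(timing: RequestTiming) -> float:
    """Time spent exclusively on Hubject's side (assumed: total minus the rest)."""
    return timing.total_ms - timing.partner_backend_ms - timing.network_ms


def average_eroaming_runtime_ms(timings: Iterable[RequestTiming]) -> Optional[float]:
    """Sum the eRoaming Runtimes of one 30-minute period and divide by the count.

    Returns None when the period contains no requests.
    """
    runtimes = [eroaming_runtime_ms(t) for t in timings]
    if not runtimes:
        return None
    return sum(runtimes) / len(runtimes)


if __name__ == "__main__":
    period = [
        RequestTiming(total_ms=500.0, partner_backend_ms=300.0, network_ms=100.0),
        RequestTiming(total_ms=900.0, partner_backend_ms=500.0, network_ms=100.0),
    ]
    print(average_eroaming_runtime_ms(period))
```

In this example the two requests spend 100 ms and 300 ms on the Hubject side, so the average for the period is 200 ms, even though their total runtimes are much higher.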
A ‘normal’ runtime is shown in ‘green’, indicating that Hubject’s system is ‘healthy’ and that server processes on our end are working as intended. In that case, Partners that face errors or high runtimes might have a problem on their own end. ‘Unusual’ runtimes are marked as ‘yellow’ or even ‘red’. In those cases, Hubject’s system might be facing problems and might be causing the high runtimes that Partners experience.
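As a rough illustration of how an average runtime could be mapped to a color, consider the sketch below. The threshold values are purely hypothetical placeholders; the actual limits Hubject uses for each microservice are not stated in this article.

```python
# Hypothetical thresholds in milliseconds, chosen only for this example.
YELLOW_THRESHOLD_MS = 500.0
RED_THRESHOLD_MS = 2000.0


def status_color(
    average_runtime_ms: float,
    yellow_ms: float = YELLOW_THRESHOLD_MS,
    red_ms: float = RED_THRESHOLD_MS,
) -> str:
    """Map an average eRoaming Runtime to 'green', 'yellow', or 'red'."""
    if average_runtime_ms >= red_ms:
        return "red"
    if average_runtime_ms >= yellow_ms:
        return "yellow"
    return "green"


if __name__ == "__main__":
    for runtime in (120.0, 750.0, 2500.0):
        print(runtime, status_color(runtime))
```

With these placeholder limits, an average of 120 ms would be shown as green, 750 ms as yellow, and 2500 ms as red.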
At this point, you should already have enough insight into this feature. If you would like to know how we calculate the color categories, please feel free to contact us.