System monitoring / Health Check


Introduction

This article describes how the CoreOne services can be monitored.

Usage

This information can be used for regular system monitoring as well as for load balancing and high availability functions, and it enables customers to monitor the application themselves. If the application is deployed in a high availability scenario, remember to set up the monitoring on all nodes. These recommendations do not cover basic server monitoring such as CPU, RAM, or disk usage, which is of course also recommended.

Components you should monitor

Windows Services

We recommend monitoring the state of the following Windows services:

  • World Wide Web Publishing Service (IIS)

  • CoreOne Suite

  • CoreOne Suite System Connector

  • MySQL
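One way to automate this check is to parse the output of the built-in Windows command `sc query state= all`. The following is a minimal Python sketch under several assumptions: the display names are taken from the list above and may differ on your installation (MySQL in particular is often registered under a versioned name), and the labels in the `sc` output are assumed to be the English ones.

```python
import re
import subprocess

# Display names from the list above (assumption: adjust to your installation).
EXPECTED_SERVICES = [
    "World Wide Web Publishing Service",
    "CoreOne Suite",
    "CoreOne Suite System Connector",
    "MySQL",
]


def parse_sc_query(output: str) -> dict:
    """Map each DISPLAY_NAME found in `sc query` output to its STATE text."""
    states = {}
    current = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("DISPLAY_NAME:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("STATE") and current is not None:
            # Example line: "STATE              : 4  RUNNING"
            match = re.search(r":\s*\d+\s+(\w+)", line)
            if match:
                states[current] = match.group(1)
            current = None
    return states


def find_problems(states: dict, expected: list) -> list:
    """Return one message per expected service that is missing or not RUNNING."""
    problems = []
    for name in expected:
        state = states.get(name)
        if state is None:
            problems.append(f"{name}: not found")
        elif state != "RUNNING":
            problems.append(f"{name}: {state}")
    return problems


def query_services() -> str:
    """Run `sc query state= all` (Windows only) and return its text output."""
    result = subprocess.run(
        ["sc.exe", "query", "state=", "all"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a Windows node, `find_problems(parse_sc_query(query_services()), EXPECTED_SERVICES)` returns an empty list when all four services are running.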

App Pools and Sites

Please monitor the state of all App Pools and Sites whose names start with CoreOne.
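As an illustration, the state of App Pools and Sites can be read from the output of the IIS command-line tool (`appcmd list apppool` and `appcmd list site`). The Python sketch below only parses that text. The line format in the comments reflects typical `appcmd` output and should be verified on your system, and the object names in the usage example are hypothetical.

```python
import re

# Matches lines such as:
#   APPPOOL "CoreOneWeb" (MgdVersion:v4.0,MgdMode:Integrated,state:Started)
#   SITE "CoreOneWeb" (id:2,bindings:https/*:443:,state:Stopped)
LINE_PATTERN = re.compile(r'^(APPPOOL|SITE) "([^"]+)" \(.*state:(\w+)\)$')


def stopped_coreone_objects(appcmd_output: str, prefix: str = "CoreOne") -> list:
    """Return App Pools and Sites whose name starts with prefix and that are not Started."""
    problems = []
    for line in appcmd_output.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if not match:
            continue  # ignore lines that are not apppool/site entries
        kind, name, state = match.groups()
        if name.startswith(prefix) and state != "Started":
            problems.append(f"{kind} {name}: {state}")
    return problems
```

Feeding the combined output of both `appcmd` calls into `stopped_coreone_objects` yields an empty list when every CoreOne App Pool and Site is started.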

Health Check functions

The following functions work like normal HTTP GET calls. Each entry below lists the service, the URL, the answer when the check passes (Answer OK), the possible answers when it fails (Answer ERROR), what is being tested, and, where applicable, a description.

CoreOne Authentication Services
URL: https://${authenticationUrl}/health
Answer OK: HTTP 200
Answer ERROR: HTTP 500, HTTP 404
What is being tested?
  • Database Connection
  • Application Service API
  • Discovery Document



CoreOne Authentication Services
URL: https://${authenticationUrl}/health/details
Answer OK: HTTP 200
Answer ERROR: HTTP 200, HTTP 404, HTTP 500
What is being tested?
  • Database Connection
  • Application Service API
  • Internal Services
Description: Returns a 200 answer with a detailed list of which subsystems work and which do not. The 200 answer is also returned when subsystems are not available, so the status code alone does not indicate an error. The content is intended for visual assessment by system administrators. A 404 or 500 answer is returned if, for example, .NET Core hosting is missing on the system.

CoreOne Authentication Services
URL: https://${authenticationUrl}/health/authenticationservice
Answer OK: HTTP 200
Answer ERROR: HTTP 500, HTTP 404
What is being tested?
  • Database Connection
  • Internal Services
Description: Checks the same connections as /health/details, but without the API connection. It is used to check the connection from the backend to the authentication server.

CoreOne Web Service
URL: https://${webUrl}/health
Answer OK: HTTP 200
Answer ERROR: HTTP 500, HTTP 404
What is being tested?
  • Application Service Health
  • Application Service API
  • Authentication Service Health



CoreOne Web Service
URL: https://${webUrl}/health/details
Answer OK: HTTP 200
Answer ERROR: HTTP 200, HTTP 404, HTTP 500
What is being tested?
  • Application Service Health
  • Application Service API
  • Authentication Service Health
Description: Returns a 200 answer with a detailed list of which subsystems work and which do not. The 200 answer is also returned when subsystems are not available, so the status code alone does not indicate an error. The content is intended for visual assessment by system administrators. A 404 or 500 answer is returned if, for example, .NET Core hosting is missing on the system.

CoreOne Self-Service Portal
URL: https://${portalUrl}/health
Answer OK: HTTP 200
Answer ERROR: HTTP 500, HTTP 404
What is being tested?
  • Authentication Service API
  • Application Service Health

CoreOne Self-Service Portal
URL: https://${portalUrl}/health/details
Answer OK: HTTP 200
Answer ERROR: HTTP 200, HTTP 404, HTTP 500
What is being tested?
  • Authentication Service API
  • Application Service Health

CoreOne Application Services
URL: http://${applicationUrl}:7000/health
Answer OK: HTTP 200
Answer ERROR: HTTP 500, HTTP 404
What is being tested?
  • Service is up and running (implies database connectivity)
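The health check URLs above can be polled by any HTTP client. The following is a minimal Python sketch using only the standard library. It treats HTTP 200 as OK and every other status as ERROR, which matches the /health endpoints. It is not suitable for the /health/details endpoints, because those also answer 200 when subsystems are unavailable. The function names and the extra UNREACHABLE state (no HTTP answer at all) are assumptions made for this example.

```python
import urllib.error
import urllib.request


def classify(status):
    """Translate an HTTP status (or None when there was no answer) into a check result."""
    if status == 200:
        return "OK"
    if status is None:
        return "UNREACHABLE"
    return "ERROR"


def fetch_status(url, timeout=5.0):
    """Send an HTTP GET and return the status code, or None if no answer was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # for example 404 or 500
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, TLS error


def check_all(urls, fetch=fetch_status):
    """Return a mapping of URL to 'OK', 'ERROR' or 'UNREACHABLE'."""
    return {url: classify(fetch(url)) for url in urls}
```

To use it, replace the `${...}` placeholders with the host names of your installation and pass the resulting /health URLs to `check_all`.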



CoreOne Database Services

We recommend monitoring the database instance you are using for the CoreOne Database Services. The parameters to monitor may vary depending on the database you use. Here is an example list of metrics to track for a MySQL database:

  • Status
  • Version
  • Uptime
  • Aborted clients per second
  • Aborted connections per second
  • Connection errors accept per second
  • Connection errors internal per second
  • Connection errors max connections per second
  • Connection errors peer address per second
  • Connection errors select per second
  • Connection errors tcpwrap per second
  • Connections per second
  • Max used connections
  • Threads cached
  • Threads connected
  • Threads created per second
  • Threads running
  • Buffer pool efficiency
  • Buffer pool utilization
  • Created tmp files on disk per second
  • Created tmp tables on disk per second
  • Created tmp tables on memory per second
  • InnoDB buffer pool pages free
  • InnoDB buffer pool pages total
  • InnoDB buffer pool read requests per second
  • InnoDB buffer pool reads per second
  • InnoDB row lock time
  • InnoDB row lock time max
  • InnoDB row lock waits
  • Slow queries per second
  • Bytes received
  • Bytes sent
  • Command Delete per second
  • Command Insert per second
  • Command Select per second
  • Command Update per second
  • Queries per second
  • Questions per second
  • Binlog cache disk use
  • Innodb buffer pool wait free
  • Innodb number open files
  • Open table definitions
  • Open tables
  • Innodb log written
  • Calculated value of innodb_log_file_size
  • Size of database {#DBNAME}
  • Binlog commits
  • Binlog group commits
  • Master GTID wait count
  • Master GTID wait time
  • Master GTID wait timeouts
  • Get status variables
  • InnoDB buffer pool read requests
  • InnoDB buffer pool reads
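Many of these metrics are "per second" values. MySQL reports most of the underlying figures as ever-increasing counters (for example via `SHOW GLOBAL STATUS`), so a monitoring system typically derives the rate from two samples. The Python sketch below shows that calculation. The mapping of counter names such as `Questions` or `Aborted_clients` to the metric names above is an assumption, and the function name is hypothetical.

```python
def per_second_rates(previous, current, elapsed_seconds):
    """Compute per-second rates from two samples of increasing counters.

    previous and current map counter names to integer values;
    elapsed_seconds is the time between the two samples.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates = {}
    for name, new_value in current.items():
        old_value = previous.get(name)
        if old_value is None or new_value < old_value:
            # Counter is new or was reset (for example by a server restart):
            # skip it for this interval instead of reporting a wrong rate.
            continue
        rates[name] = (new_value - old_value) / elapsed_seconds
    return rates
```

For example, a `Questions` counter that grows from 1000 to 1600 within 60 seconds corresponds to 10 questions per second.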

If you have a Galera Cluster in place, you should also monitor the health of the cluster and each node.
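For a Galera Cluster, node and cluster health are commonly read from the `wsrep_*` status variables (`SHOW GLOBAL STATUS LIKE 'wsrep_%'`). The following Python sketch evaluates such a result set. The chosen variables and expected values are a common baseline, not a CoreOne-specific requirement, and a node may legitimately report a different state for a short time, for example while acting as a donor.

```python
# Expected values for a healthy, fully synchronised node (assumed baseline).
EXPECTED_WSREP = {
    "wsrep_cluster_status": "Primary",
    "wsrep_connected": "ON",
    "wsrep_ready": "ON",
    "wsrep_local_state_comment": "Synced",
}


def galera_problems(status, expected_cluster_size):
    """Return a list of findings for one node.

    status maps wsrep status variable names to their string values.
    """
    problems = []
    for name, wanted in EXPECTED_WSREP.items():
        actual = status.get(name)
        if actual != wanted:
            problems.append(f"{name} is {actual!r}, expected {wanted!r}")
    size = int(status.get("wsrep_cluster_size", 0))
    if size < expected_cluster_size:
        problems.append(
            f"wsrep_cluster_size is {size}, expected {expected_cluster_size}"
        )
    return problems
```

Running the check against every node, not just one, also reveals a split cluster in which each part still considers itself healthy.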

We also recommend monitoring the MySQL backup process, for example by checking the file size and the timestamp of the most recent backup.
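The file size and timestamp check can be sketched in Python as follows. The thresholds (26 hours for a daily backup, a minimum size of 1 byte) and the function name are assumptions that you should adapt to your backup schedule and expected dump size.

```python
import os
import time


def backup_problems(path, max_age_hours=26.0, min_size_bytes=1, now=None):
    """Return a list of findings for a backup file; an empty list means OK."""
    if not os.path.isfile(path):
        return [f"{path}: backup file not found"]
    now = time.time() if now is None else now
    info = os.stat(path)
    problems = []
    age_hours = (now - info.st_mtime) / 3600
    if age_hours > max_age_hours:
        problems.append(f"{path}: last modified {age_hours:.1f} hours ago")
    if info.st_size < min_size_bytes:
        problems.append(f"{path}: only {info.st_size} bytes")
    return problems
```

Setting `min_size_bytes` slightly below the size of a known good backup helps to catch dumps that were truncated or written empty.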

Errors in Logs & Tasks

We recommend monitoring the logs for errors. A corrupt import, or the deletion of objects in a target system, can generate a large number of errors and affect performance. An overview of the log files can be found here: Logs

You can and should also monitor tasks by checking the lastrun_message column. For further details, see: Task configuration

Please get in touch with us if you want to monitor the tasks and log_log tables with a read-only user.

Appendix

Some monitoring systems cannot check all of these components directly. As a workaround, you can use PowerShell and the Windows Task Scheduler to write the check results to a text file and let the monitoring system evaluate its content. Make sure the monitoring system also checks the age of the text file, so that you know the values are up to date. If you need assistance with this, we are happy to help.
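The text file workaround can be sketched as follows. The example uses Python instead of PowerShell purely for illustration; the file format (one `name=state` line per check), the 10-minute age limit, and the function names are assumptions. A scheduled task would call `write_status_file`, and the monitoring side would call `read_status_file`.

```python
import os
import time


def write_status_file(path, results):
    """Write one 'name=state' line per check and replace the old file atomically."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        for name, state in sorted(results.items()):
            handle.write(f"{name}={state}\n")
    os.replace(tmp_path, path)  # readers never see a half-written file


def read_status_file(path, max_age_seconds=600.0, now=None):
    """Return the recorded results, or raise RuntimeError if the file is too old."""
    now = time.time() if now is None else now
    age = now - os.path.getmtime(path)
    if age > max_age_seconds:
        raise RuntimeError(f"{path} is stale ({age:.0f} seconds old)")
    results = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Split on the last '=' so that names containing '=' stay intact.
            name, _, state = line.rstrip("\n").rpartition("=")
            if name:
                results[name] = state
    return results
```

The age check matters because a scheduled task that silently stops running would otherwise leave the last good values in place indefinitely.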

Please also keep documentation of what to do when a monitoring check is not in an acceptable state.

Detailed monitoring

The health checks listed above only indicate whether the relevant services are available, started, and operational. They do not provide any information about internal processes, pending processes, or the like.

This detailed information can also be monitored using managed services. Please get in touch with your contact person.

© ITSENSE AG. All rights reserved. ITSENSE and CoreOne are registered trademarks of ITSENSE AG.