Monitoring
Output Organizer provides a comprehensive set of metrics for monitoring and analyzing the performance and health of your application. You can use this information to identify bottlenecks, optimize resource usage, and ensure the overall reliability of your system.
Metrics are exposed through Spring Boot Actuator. In addition to the application-specific metrics, this includes standard metrics such as startup time, resource utilization, JVM metrics, and Tomcat sessions.
Configuration
To enable metrics, use the appropriate settings in the Helm values:
values.yml
    management:
      endpoints:
        web.exposure.include: health, prometheus, metrics
      prometheus:
        metrics:
          export:
            enabled: true
      jmx:
        metrics:
          export:
            enabled: true
This example exposes the health, prometheus, and metrics endpoints and enables metrics export to Prometheus and JMX.
Access Metrics
Prometheus
If Prometheus export is enabled, the metrics are published at the http://localhost:8080/actuator/prometheus endpoint, where the Prometheus server can scrape them.
The metrics can then be visualized, for example in Grafana by adding Prometheus as a data source.
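To check what the endpoint returns before wiring up Prometheus, the output can be read directly. The following is a minimal Python sketch, not part of the product: it fetches the endpoint named above and parses the Prometheus text exposition format in a simplified way. The function names and the sample values are hypothetical and only serve as illustration.

```python
import urllib.request

# Assumption: the URL from the text above; adjust host and port for your deployment.
PROMETHEUS_URL = "http://localhost:8080/actuator/prometheus"


def parse_prometheus_text(text):
    """Parse Prometheus text exposition lines into (name, labels, value) tuples.

    Simplified: assumes label values contain no commas or escaped quotes
    and that lines carry no trailing timestamp.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        series, _, value = line.rpartition(" ")  # the value follows the last space
        name, _, label_part = series.partition("{")
        labels = {}
        if label_part:
            for pair in label_part.rstrip("}").split(","):
                if pair:  # tolerate a trailing comma inside the braces
                    key, _, raw = pair.partition("=")
                    labels[key.strip()] = raw.strip().strip('"')
        samples.append((name.strip(), labels, float(value)))
    return samples


def fetch_samples(url=PROMETHEUS_URL):
    """Download the endpoint output and parse it (requires a running application)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_prometheus_text(response.read().decode("utf-8"))


# Illustrative sample only; real output contains many more series.
SAMPLE = """# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Eden Space"} 1.048576E7
process_cpu_usage 0.02
"""

if __name__ == "__main__":
    for name, labels, value in parse_prometheus_text(SAMPLE):
        print(name, labels, value)
```

Calling `fetch_samples()` against a running instance returns the same tuple structure for the live metrics.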
JMX
With JMX export enabled, the metrics are registered with the JVM's default MBean server, which handles the JMX requests. Depending on the JVM vendor, vendor-specific derivatives of Java Mission Control can be used to connect to the JVM and read the metrics.
Most JDKs ship with the JConsole tool, which provides basic access to the metrics.
Application Metrics
Basic metrics such as CPU usage and JVM memory are provided by the default Actuator setup. In addition, application-specific metrics provide measurements specific to Output Organizer.
The metrics are available at the http://localhost:8080/actuator/metrics endpoint.
Metrics recommended for monitoring and alerting are shown in bold.
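A single metric can be queried by appending its name to the metrics endpoint. The sketch below, written in Python purely for illustration, reads such a response and derives a mean duration. It assumes that the duration metrics are recorded as Micrometer timers, so that the Actuator response contains COUNT, TOTAL_TIME, and MAX measurements; the helper names and the sample payload values are hypothetical.

```python
import json
import urllib.request

# Assumption: the base URL from the text above; adjust host and port for your deployment.
BASE_URL = "http://localhost:8080/actuator/metrics"


def summarize_timer(payload):
    """Return count, total, mean, and max from an Actuator metric response.

    Assumes a timer-style metric; missing statistics default to 0.0.
    """
    stats = {m["statistic"]: m["value"] for m in payload.get("measurements", [])}
    count = stats.get("COUNT", 0.0)
    total = stats.get("TOTAL_TIME", 0.0)
    return {
        "count": count,
        "total": total,
        "mean": total / count if count else 0.0,  # avoid division by zero
        "max": stats.get("MAX", 0.0),
        "unit": payload.get("baseUnit"),
    }


def fetch_metric(name, base_url=BASE_URL):
    """Fetch one metric by name (requires a running application)."""
    with urllib.request.urlopen(f"{base_url}/{name}", timeout=5) as response:
        return json.load(response)


# Illustrative payload only; the values are made up.
SAMPLE = {
    "name": "fusion.collection.get",
    "baseUnit": "seconds",
    "measurements": [
        {"statistic": "COUNT", "value": 4.0},
        {"statistic": "TOTAL_TIME", "value": 1.0},
        {"statistic": "MAX", "value": 0.5},
    ],
}

if __name__ == "__main__":
    print(summarize_timer(SAMPLE))
```

Against a live instance, `summarize_timer(fetch_metric("fusion.collection.get"))` would produce the same summary from real measurements. Note that the values are cumulative since application start, so the mean is a lifetime average rather than a recent one.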
Key Performance Metrics
Organizer ("fusion")
The most important metrics are response times for retrieving and modifying collections and their elements. The following table lists the key metrics you should monitor:
Metric Name | Description |
---|---|
fusion.collection.get | Duration to retrieve a collection (current version) from the database in JSON format. Note: This does not include external API calls. |
fusion.collection.create | Duration to create (POST) or update (PUT) a collection. Currently, both operations are combined under this metric. |
fusion.content.get | Time required to load the data stream of a single source document. A composite document may require multiple calls to display the whole document. Depending on the storage location this includes a call to an external archive or storage. |
fusion.lock.get | Duration to get lock status. |
fusion.lock.acquire | Duration to acquire a lock. |
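
When these metrics are scraped by Prometheus, they appear under different names than in the table. The Python sketch below derives the likely series names and a PromQL expression for the mean duration. It assumes the metrics are Micrometer timers and that the default Prometheus naming convention applies (dots become underscores, a `_seconds` suffix is added, and each timer yields `_count`, `_sum`, and `_max` series). The function names are hypothetical; verify the resulting names against the actual output of the prometheus endpoint.

```python
# Metric names taken from the table above.
ORGANIZER_METRICS = [
    "fusion.collection.get",
    "fusion.collection.create",
    "fusion.content.get",
    "fusion.lock.get",
    "fusion.lock.acquire",
]


def prometheus_timer_series(metric_name):
    """Map a dotted timer name to its assumed Prometheus series names."""
    base = metric_name.replace(".", "_").replace("-", "_") + "_seconds"
    return [f"{base}_count", f"{base}_sum", f"{base}_max"]


def mean_latency_query(metric_name, window="5m"):
    """Build a PromQL expression for the mean duration over the given window."""
    count, total, _ = prometheus_timer_series(metric_name)
    return f"rate({total}[{window}]) / rate({count}[{window}])"


if __name__ == "__main__":
    for metric in ORGANIZER_METRICS:
        print(metric, "->", mean_latency_query(metric))
```

The generated expressions can be pasted into a Grafana panel or an alert rule; the 5-minute window is only a starting point.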
Viewer
The viewer is responsible for serving and rendering documents. The following metrics are most relevant for performance monitoring:
Metric Name | Description |
---|---|
jwt.fusion.document.load | Time required to load the data stream of a single source document and process modifications. A composite document may require multiple calls to display the whole document. Data typically comes from the Organizer pipeline before it is made available to the viewer, so this metric correlates strongly with the Organizer metric fusion.content.get. |
jwt.fusion.document.load.firstpage | Time to load a document up to its first page. |
jwt.render.tile | Time the Viewer Pod needs to render a single page (or tile) of a document into a PNG image. Rendering is performed in the Viewer, not in the browser. |
jwt.fusion.document.load.failed | Number of failed document loads. |
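
For alerting, the number of failed loads is usually more meaningful in relation to the total number of loads. The Python sketch below computes a failure ratio between two readings of the cumulative counts. It assumes that the count of jwt.fusion.document.load covers all load attempts, which the table does not state; if it counts only successful loads, divide by loads plus failures instead. The function name, the dictionary keys, and the 5% threshold are hypothetical.

```python
# Hypothetical alert threshold; choose a value that fits your environment.
FAILURE_RATIO_THRESHOLD = 0.05


def failure_ratio(prev, curr):
    """Return failed loads divided by total loads between two readings.

    prev and curr are dicts with cumulative "loads" and "failed" counts.
    """
    loads = curr["loads"] - prev["loads"]
    failed = curr["failed"] - prev["failed"]
    if loads < 0 or failed < 0:
        # Counters went backwards, e.g. after a pod restart: use current values.
        loads, failed = curr["loads"], curr["failed"]
    if loads <= 0:
        return 0.0  # no loads in the interval, nothing to report
    return failed / loads


if __name__ == "__main__":
    # Illustrative readings only; the values are made up.
    earlier = {"loads": 100, "failed": 2}
    later = {"loads": 150, "failed": 7}
    ratio = failure_ratio(earlier, later)
    print("alert" if ratio > FAILURE_RATIO_THRESHOLD else "ok", ratio)
```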
Additional Viewer Metrics
For comprehensive viewer monitoring, refer to the jadice webtoolkit documentation:
- Tile Rendering Metrics: jadice webtoolkit monitoring - tile-rendering
- Cache Metrics: jadice webtoolkit monitoring - cache
- Document Loading Metrics: jadice webtoolkit monitoring - loading-documents