Monitoring
This guide describes monitoring capabilities and health checks in VibeMQ.
Health Checks
VibeMQ provides HTTP endpoints for health checks and metrics.
Enabling Health Checks
.ConfigureHealthChecks(options => {
    options.Enabled = true;
    options.Port = 8081;
})
Endpoints
GET /health/
Returns server health status.
Request:
curl http://localhost:8081/health/
Response (200 OK):
{
  "status": "healthy",
  "active_connections": 15,
  "queue_count": 5,
  "memory_usage_mb": 256
}
Response (503 Service Unavailable):
{
  "status": "unhealthy",
  "active_connections": 0,
  "queue_count": 0,
  "memory_usage_mb": 512
}
Status codes:
200 OK: the server is healthy
503 Service Unavailable: the server is unhealthy (critical memory usage)
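A monitoring client can combine the status code and the `status` field into one boolean. The following is a minimal sketch rather than part of VibeMQ: the `HealthProbe` class and its method names are hypothetical, and the URL in the usage note is taken from the curl example above.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper (not a VibeMQ API) that interprets /health/ responses.
public static class HealthProbe {
    // Healthy only when the status code is 200 and the body reports "healthy".
    public static bool IsHealthy(HttpStatusCode statusCode, string body) {
        if (statusCode != HttpStatusCode.OK) {
            return false; // 503 (or anything else) counts as unhealthy
        }
        using var doc = JsonDocument.Parse(body);
        return doc.RootElement.TryGetProperty("status", out var status)
            && status.GetString() == "healthy";
    }

    // Sends GET to the health endpoint and evaluates the response.
    public static async Task<bool> CheckAsync(HttpClient client, Uri healthUri) {
        using var response = await client.GetAsync(healthUri);
        var body = await response.Content.ReadAsStringAsync();
        return IsHealthy(response.StatusCode, body);
    }
}
```

Usage could look like `await HealthProbe.CheckAsync(httpClient, new Uri("http://localhost:8081/health/"))`, called on a timer or from an orchestrator probe.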
GET /metrics/
Returns detailed server metrics.
Request:
curl http://localhost:8081/metrics/
Response:
{
  "total_messages_published": 125000,
  "total_messages_delivered": 124850,
  "total_messages_acknowledged": 124800,
  "total_retries": 150,
  "total_dead_lettered": 50,
  "total_errors": 5,
  "total_connections_accepted": 500,
  "total_connections_rejected": 10,
  "active_connections": 15,
  "active_queues": 5,
  "in_flight_messages": 42,
  "memory_usage_bytes": 268435456,
  "average_delivery_latency_ms": 2.5,
  "timestamp": "2026-02-18T10:30:00Z",
  "uptime": "02:15:30.5000000"
}
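To consume this payload from .NET code, the snake_case fields can be mapped onto a small class with System.Text.Json. This is a sketch under assumptions: `MetricsResponse` is a hypothetical type, it maps only a subset of the fields, and it treats `uptime` as a string in .NET's constant TimeSpan format because that is what the sample value looks like.

```csharp
using System;
using System.Globalization;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical DTO for part of the /metrics/ payload; unmapped fields are ignored.
public sealed class MetricsResponse {
    [JsonPropertyName("total_messages_published")]
    public long TotalMessagesPublished { get; set; }

    [JsonPropertyName("total_messages_delivered")]
    public long TotalMessagesDelivered { get; set; }

    [JsonPropertyName("active_connections")]
    public int ActiveConnections { get; set; }

    [JsonPropertyName("average_delivery_latency_ms")]
    public double AverageDeliveryLatencyMs { get; set; }

    [JsonPropertyName("timestamp")]
    public DateTimeOffset Timestamp { get; set; }

    [JsonPropertyName("uptime")]
    public string Uptime { get; set; } = "";

    // Assumes values such as "02:15:30.5000000" (hh:mm:ss.fffffff).
    public TimeSpan UptimeValue => TimeSpan.Parse(Uptime, CultureInfo.InvariantCulture);

    public static MetricsResponse Parse(string json) =>
        JsonSerializer.Deserialize<MetricsResponse>(json)
        ?? throw new JsonException("Empty metrics payload");
}
```

Keeping `Uptime` as a string and parsing it on demand avoids depending on the serializer's built-in TimeSpan support, which varies between .NET versions.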
Metrics
Counters
Counters only increase and show accumulated values.
TotalMessagesPublished
Type: long
Description: Total number of published messages.
Usage:
Track server load
Calculate throughput (messages/sec)
TotalMessagesDelivered
Type: long
Description: Total number of delivered messages.
Usage:
Monitor delivery success rate
Calculate percentage of successful deliveries
TotalMessagesAcknowledged
Type: long
Description: Total number of acknowledged messages.
Usage:
Verify ACK mechanism
Identify acknowledgment issues
TotalRetries
Type: long
Description: Total number of delivery retries.
Usage:
Identify delivery issues
Optimize timeouts
TotalDeadLettered
Type: long
Description: Total number of messages moved to the Dead Letter Queue.
Usage:
Monitor failed deliveries
Requires attention when growing
TotalErrors
Type: long
Description: Total number of errors.
Usage:
Overall system health indicator
Requires investigation when growing
TotalConnectionsAccepted
Type: long
Description: Number of accepted connections.
Usage:
Monitor server load
Resource planning
TotalConnectionsRejected
Type: long
Description: Number of rejected connections.
Usage:
Identify capacity issues
Configure rate limiting
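Because counters are cumulative, the throughput and success-rate figures mentioned above come from comparing samples rather than reading a single value. The sketch below shows one way to do that; `CounterMath` and its methods are hypothetical helpers, and treating a decreasing counter as a restart is an assumption, not documented VibeMQ behavior.

```csharp
using System;

// Hypothetical helpers for deriving rates and ratios from cumulative counters.
public static class CounterMath {
    // Messages per second between two samples of the same counter.
    public static double RatePerSecond(long previous, long current, TimeSpan elapsed) {
        if (elapsed <= TimeSpan.Zero) {
            throw new ArgumentOutOfRangeException(nameof(elapsed));
        }
        // Assumption: a counter that went backwards was reset by a restart,
        // so everything counted since the reset is the delta.
        long delta = current >= previous ? current - previous : current;
        return delta / elapsed.TotalSeconds;
    }

    // Fraction of published messages that were delivered (0 when nothing was published yet).
    public static double DeliveryRatio(long delivered, long published) {
        return published == 0 ? 0.0 : (double)delivered / published;
    }
}
```

With the sample values from the /metrics/ response, `DeliveryRatio(124850, 125000)` gives 0.9988, that is, 99.88% of published messages were delivered.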
Gauge Metrics
Gauge metrics can increase and decrease, showing current state.
ActiveConnections
Type: int
Description: Current number of active connections.
Normal values: Depends on load
Alert: Approaching MaxConnections
ActiveQueues
Type: int
Description: Current number of active queues.
Usage:
Monitor queue usage
Identify unused queues
InFlightMessages
Type: int
Description: Number of messages in processing (waiting for ACK).
Normal values: Depends on load
Alert: High values may indicate:
Handler issues
Slow subscribers
Network problems
MemoryUsageBytes
Type: long
Description: Current memory usage in bytes.
Normal values: < 80% of available memory
Alert: > 90% — possible backpressure
Latency
AverageDeliveryLatencyMs
Type: double
Description: Average message delivery latency (ms).
Normal values: < 10 ms
Alert: > 50 ms — requires optimization
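The normal and alert thresholds listed for memory usage and delivery latency can be encoded as a small classifier. This is only a sketch: the `Level` enum and `Thresholds` class are hypothetical, and the intermediate `Elevated` level for values between the normal and alert bounds is an assumption, since the text above does not name that range.

```csharp
using System;

// Hypothetical severity levels; "Elevated" covers the gap between normal and alert.
public enum Level { Normal, Elevated, Alert }

public static class Thresholds {
    // Memory: normal below 80% of available memory, alert above 90%.
    public static Level ClassifyMemory(long usedBytes, long availableBytes) {
        if (availableBytes <= 0) {
            throw new ArgumentOutOfRangeException(nameof(availableBytes));
        }
        double fraction = (double)usedBytes / availableBytes;
        if (fraction > 0.90) return Level.Alert;
        if (fraction >= 0.80) return Level.Elevated;
        return Level.Normal;
    }

    // Latency: normal below 10 ms, alert above 50 ms.
    public static Level ClassifyLatency(double averageMs) {
        if (averageMs > 50.0) return Level.Alert;
        if (averageMs >= 10.0) return Level.Elevated;
        return Level.Normal;
    }
}
```

The amount of available memory is passed in explicitly because the metrics only report bytes in use; how the limit is obtained depends on the deployment.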
Monitoring with Prometheus
Example metrics exporter:
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Prometheus;

// Polls the broker metrics every 5 seconds and mirrors them into prometheus-net collectors.
public class VibeMQMetricsExporter : BackgroundService {
    private readonly IBrokerMetrics _metrics;
    private readonly ILogger<VibeMQMetricsExporter> _logger;

    // Counters
    private readonly Counter _messagesPublished;
    private readonly Counter _messagesDelivered;
    private readonly Counter _messagesAcknowledged;
    private readonly Counter _errors;

    // Gauges
    private readonly Gauge _activeConnections;
    private readonly Gauge _activeQueues;
    private readonly Gauge _inFlightMessages;
    private readonly Gauge _memoryUsage;

    // Histogram
    private readonly Histogram _deliveryLatency;

    public VibeMQMetricsExporter(
        IBrokerMetrics metrics,
        ILogger<VibeMQMetricsExporter> logger) {
        _metrics = metrics;
        _logger = logger;

        _messagesPublished = Metrics.CreateCounter(
            "vibemq_messages_published_total",
            "Total number of messages published");
        _messagesDelivered = Metrics.CreateCounter(
            "vibemq_messages_delivered_total",
            "Total number of messages delivered");
        _messagesAcknowledged = Metrics.CreateCounter(
            "vibemq_messages_acknowledged_total",
            "Total number of messages acknowledged");
        _errors = Metrics.CreateCounter(
            "vibemq_errors_total",
            "Total number of errors");

        _activeConnections = Metrics.CreateGauge(
            "vibemq_active_connections",
            "Number of active connections");
        _activeQueues = Metrics.CreateGauge(
            "vibemq_active_queues",
            "Number of active queues");
        _inFlightMessages = Metrics.CreateGauge(
            "vibemq_in_flight_messages",
            "Number of messages in flight");
        _memoryUsage = Metrics.CreateGauge(
            "vibemq_memory_usage_bytes",
            "Memory usage in bytes");

        // Buckets are in seconds: 1 ms to 1 s.
        _deliveryLatency = Metrics.CreateHistogram(
            "vibemq_delivery_latency_seconds",
            "Delivery latency histogram",
            new HistogramConfiguration {
                Buckets = new[] { 0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0 }
            });
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken) {
        // Last values seen, used to turn cumulative totals into increments.
        long lastPublished = 0;
        long lastDelivered = 0;
        long lastAcknowledged = 0;
        long lastErrors = 0;

        while (!stoppingToken.IsCancellationRequested) {
            await Task.Delay(TimeSpan.FromSeconds(5), stoppingToken);
            var snapshot = _metrics.GetSnapshot();

            // Update counters (only increments)
            if (snapshot.TotalMessagesPublished > lastPublished) {
                _messagesPublished.Inc(snapshot.TotalMessagesPublished - lastPublished);
                lastPublished = snapshot.TotalMessagesPublished;
            }
            if (snapshot.TotalMessagesDelivered > lastDelivered) {
                _messagesDelivered.Inc(snapshot.TotalMessagesDelivered - lastDelivered);
                lastDelivered = snapshot.TotalMessagesDelivered;
            }
            if (snapshot.TotalMessagesAcknowledged > lastAcknowledged) {
                _messagesAcknowledged.Inc(snapshot.TotalMessagesAcknowledged - lastAcknowledged);
                lastAcknowledged = snapshot.TotalMessagesAcknowledged;
            }
            if (snapshot.TotalErrors > lastErrors) {
                _errors.Inc(snapshot.TotalErrors - lastErrors);
                lastErrors = snapshot.TotalErrors;
            }

            // Update gauges
            _activeConnections.Set(snapshot.ActiveConnections);
            _activeQueues.Set(snapshot.ActiveQueues);
            _inFlightMessages.Set(snapshot.InFlightMessages);
            _memoryUsage.Set(snapshot.MemoryUsageBytes);

            // The snapshot exposes only an average, so each poll records one sample of
            // that average (converted from ms to seconds). Quantiles computed from this
            // histogram describe the polled averages, not individual deliveries.
            _deliveryLatency.Observe(snapshot.AverageDeliveryLatencyMs / 1000.0);
        }
    }
}
Registration:
services.AddHostedService<VibeMQMetricsExporter>();
// Expose the collected metrics over HTTP so that Prometheus can scrape them
var server = new MetricServer(port: 9090);
server.Start();
Grafana Dashboard
Example JSON dashboard for Grafana:
{
  "dashboard": {
    "title": "VibeMQ Monitor",
    "panels": [
      {
        "title": "Messages Published/Delivered",
        "targets": [
          {
            "expr": "rate(vibemq_messages_published_total[5m])",
            "legendFormat": "Published"
          },
          {
            "expr": "rate(vibemq_messages_delivered_total[5m])",
            "legendFormat": "Delivered"
          }
        ]
      },
      {
        "title": "Active Connections",
        "targets": [
          {
            "expr": "vibemq_active_connections",
            "legendFormat": "Connections"
          }
        ]
      },
      {
        "title": "Delivery Latency",
        "targets": [
          {
            "expr": "histogram_quantile(0.5, rate(vibemq_delivery_latency_seconds_bucket[5m]))",
            "legendFormat": "Latency (p50)"
          }
        ]
      },
      {
        "title": "Memory Usage",
        "targets": [
          {
            "expr": "vibemq_memory_usage_bytes",
            "legendFormat": "Memory"
          }
        ]
      }
    ]
  }
}
Logging
Logging Configuration
using Microsoft.Extensions.Logging;

using var loggerFactory = LoggerFactory.Create(builder => {
    builder
        .SetMinimumLevel(LogLevel.Information)
        .AddConsole()
        .AddDebug()
        // AddFile is not part of Microsoft.Extensions.Logging itself;
        // it requires a separate file logging provider package.
        .AddFile("logs/vibemq-.log");
});

var broker = BrokerBuilder.Create()
    .UsePort(8080)
    .UseLoggerFactory(loggerFactory)
    .Build();
Log Levels
Trace:
Detailed information about each message
For protocol debugging
Debug:
Connection information
Operation status
Information:
Server start/stop
Queue creation/deletion
Authentication errors
Warning:
Limit exceeded
Delivery issues
High memory usage
Error:
Message processing errors
Connection errors
Handler exceptions
Critical:
Critical server errors
Unable to start
Data loss
Log Examples
Server startup:
[10:30:00 INF] VibeMQ server starting on port 8080
[10:30:00 INF] Health check server started on port 8081
[10:30:00 INF] Server ready to accept connections
Client connection:
[10:31:00 INF] Client connected from 192.168.1.100:54321
[10:31:00 INF] Client authenticated successfully
[10:31:00 DBG] Connection ID: srv_100
Message publishing:
[10:32:00 DBG] Message msg_123 published to queue notifications
[10:32:00 DBG] Message delivered to subscriber sub_456
[10:32:01 DBG] Message msg_123 acknowledged
Warnings:
[10:33:00 WRN] Memory usage high: 85%
[10:33:00 WRN] Backpressure applied for queue notifications
[10:33:00 WRN] Rate limit exceeded for client srv_100
Errors:
[10:34:00 ERR] Failed to deliver message msg_123: timeout
[10:34:00 ERR] Authentication failed for client 192.168.1.100
[10:34:00 ERR] Queue notifications not found
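Log lines in this shape are easy to aggregate by level, for example to spot a burst of errors in a captured log file. The sketch below assumes every line follows the `[HH:mm:ss LVL] message` layout shown in the examples; `LogLevelCounter` is a hypothetical helper and lines in any other layout are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper that counts log lines per three-letter level code.
public static class LogLevelCounter {
    // Matches lines such as "[10:34:00 ERR] Failed to deliver message msg_123: timeout".
    private static readonly Regex LinePattern =
        new Regex(@"^\[(\d{2}:\d{2}:\d{2}) ([A-Z]{3})\] (.*)$", RegexOptions.Compiled);

    public static Dictionary<string, int> CountByLevel(IEnumerable<string> lines) {
        var counts = new Dictionary<string, int>();
        foreach (var line in lines) {
            var match = LinePattern.Match(line);
            if (!match.Success) {
                continue; // not in the expected layout
            }
            var level = match.Groups[2].Value;
            counts[level] = counts.TryGetValue(level, out var seen) ? seen + 1 : 1;
        }
        return counts;
    }
}
```

Feeding it the output of `File.ReadLines(path)` keeps memory use flat even for large log files, because lines are processed one at a time.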
Alerting
Example Prometheus alerting rules (Prometheus evaluates them and sends firing alerts to Alertmanager):
groups:
  - name: vibemq
    rules:
      - alert: VibeMQHighMemory
        # 1073741824 bytes = 1 GiB; replace it with the memory available to the broker.
        expr: vibemq_memory_usage_bytes / 1073741824 > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "VibeMQ high memory usage"
          description: "Memory usage is above 90% for more than 5 minutes"
      - alert: VibeMQHighLatency
        # histogram_quantile expects per-second bucket rates, not raw bucket counters.
        expr: histogram_quantile(0.95, rate(vibemq_delivery_latency_seconds_bucket[5m])) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "VibeMQ high delivery latency"
          description: "95th percentile latency is above 50ms"
      - alert: VibeMQHighErrorRate
        expr: rate(vibemq_errors_total[5m]) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "VibeMQ high error rate"
          description: "Error rate is above 0.1 per second"
      - alert: VibeMQDeadLetterQueueGrowing
        # Requires a vibemq_messages_dead_lettered_total counter, which the
        # exporter example above does not register.
        expr: rate(vibemq_messages_dead_lettered_total[5m]) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "VibeMQ DLQ growing"
          description: "Messages are being dead lettered"
Next Steps
Server Setup
Troubleshooting
Health Checks — health checks for orchestrators