Commit 21950c0

brettplarson, Brett Larson, buraizu, and IainMethe authored
Hazelcast integration - Add queues, topics and reltopics and name tag. (#17367)
* Update metrics.yaml to show name of object(s) and include topics, reltopic and queue metrics.

  The hazelcast integration is not showing important details such as the map name that's being monitored. It would be great to include that, as well as include metrics for other things such as queues and topics.
* more changes
* change log.
* update order.
* changelog changes
* more change log changes
* sort metadata.csv
* sort
* sort instead
* resort this.
* Update hazelcast/metadata.csv
  Co-authored-by: Bryce Eadie <bryce.eadie@datadoghq.com>
* Update hazelcast/metadata.csv
  Co-authored-by: Bryce Eadie <bryce.eadie@datadoghq.com>
* Update hazelcast/metadata.csv
  Co-authored-by: Bryce Eadie <bryce.eadie@datadoghq.com>
* Update hazelcast/metadata.csv
  Co-authored-by: Bryce Eadie <bryce.eadie@datadoghq.com>
* Update hazelcast/metadata.csv
  Co-authored-by: Bryce Eadie <bryce.eadie@datadoghq.com>
* update test version to LTS hazelcast version and create queue, topic and reltopics to test out of the box.
* resolve typo
* decouple versions of mancenter / hz
* adjust dockerfile to reflect that log4j2 is included in hz 5.x
* WIP - trying to figure out how this USED to work.
* various changes to update mancenter check to be more correct.
* update metrics reported in e2e test
* add hazelcast dep
* freeze the hazelcast dep
* validate and sync licenses
* validate and sync licenses again
* updates licenses manually
* whitespace
* sort and validate metadata.csv
* add back support for hazelcast 4

---------

Co-authored-by: Brett Larson <brett.larson@chicagotrading.com>
Co-authored-by: Bryce Eadie <bryce.eadie@datadoghq.com>
Co-authored-by: IainMethe <20765808+IainMethe@users.noreply.github.com>
1 parent 8beb732 commit 21950c0

19 files changed

+549
-366
lines changed

LICENSE-3rdparty.csv

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ flup-py3,Vendor,BSD-3-Clause,"Copyright (c) 2005, 2006 Allan Saddi <allan@saddi.
 foundationdb,PyPI,Apache-2.0,Copyright 2017 FoundationDB
 futures,PyPI,PSF,Copyright (c) 2015 Brian Quinlan
 gearman,PyPI,Apache-2.0,Copyright 2010 Yelp
+hazelcast-python-client,PyPI,Apache-2.0,"Copyright (c) 2008-2023, Hazelcast, Inc. All Rights Reserved."
 importlib-metadata,PyPI,Apache-2.0,"Copyright 2017-2019 Jason R. Coombs, Barry Warsaw"
 in-toto,PyPI,Apache-2.0,Copyright 2018 New York University
 ipaddress,PyPI,PSF,Copyright (c) 2013 Philipp Hagemeister

agent_requirements.in

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ enum34==1.1.10; python_version < '3.0'
 foundationdb==6.3.24; python_version > '3.0'
 futures==3.4.0; python_version < '3.0'
 gearman==2.0.2; sys_platform != 'win32' and python_version < '3.0'
+hazelcast-python-client==5.3.0; python_version > '3.0'
 importlib-metadata==2.1.3; python_version < '3.8'
 in-toto==2.0.0; python_version > '3.0'
 ipaddress==1.0.23; python_version < '3.0'

hazelcast/assets/configuration/spec.yaml

Lines changed: 2 additions & 5 deletions
@@ -23,11 +23,8 @@ files:
       value:
         type: object
         example:
-          Active: OK
-          No Migration: WARNING
-          Frozen: CRITICAL
-          Passive: CRITICAL
-          In Transition: WARNING
+          UP: OK
+          ACTIVE: OK
     - template: instances/jmx
       overrides:
         host.required: false

hazelcast/changelog.d/17367.added

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Added metrics for queues, topics and reliable topics.

hazelcast/changelog.d/17367.changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Drop name from excluded tags.
+Updated Hazelcast and Management Center to use version 5.x.
+Updated the status to reflect health of Management Center (Not Hazelcast cluster state).

hazelcast/datadog_checks/hazelcast/check.py

Lines changed: 22 additions & 6 deletions
@@ -36,15 +36,31 @@ def process_mc_health_check(self):
         tags.extend(self._tags)

         try:
-            response = self.http.get(self._mc_health_check_endpoint)
-            response.raise_for_status()
-            status = response.json()
+            response_wrapper = self.http.get(self._mc_health_check_endpoint)
+            response_wrapper.raise_for_status()
+            response = response_wrapper.json()
         except Exception:
             self.service_check(self.SERVICE_CHECK_CONNECT, AgentCheck.CRITICAL, tags=tags)
             raise
         else:
             self.service_check(self.SERVICE_CHECK_CONNECT, AgentCheck.OK, tags=tags)

-        self.service_check(
-            self.SERVICE_CHECK_MC_CLUSTER_STATE, self._mc_cluster_states.get(status['managementCenterState']), tags=tags
-        )
+        status = None
+        # Hazelcast 4 and 5 have different responses to this healthcheck endpoint
+        if "status" in response:
+            # hazelcast 5:
+            # I cannot find documentation on this endpoint for management center 5.3 but it
+            # does not represent "cluster status" because that is documented and available at a
+            # different route:
+            # https://docs.hazelcast.com/management-center/5.3/integrate/cluster-metrics#operation/getAllClustersStatus
+            # It is probably the same as in 4.0 and just represents whether the management center
+            # is available.
+            status = response["status"]
+        elif "managementCenterState" in response:
+            # hazelcast 4:
+            # https://docs.hazelcast.org/docs/management-center/4.0.1/manual/html/index.html#enabling-health-check-endpoint
+            # "This endpoint responds with 200 OK HTTP status code once the Management Center web
+            # application has started"
+            status = response["managementCenterState"]
+
+        self.service_check(self.SERVICE_CHECK_MC_CLUSTER_STATE, self._mc_cluster_states.get(status), tags=tags)
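The key-based branching in this hunk can be isolated into a small pure function, which makes the two response shapes easier to test. The following is a minimal sketch, not the integration's code: `extract_mc_state` is a hypothetical name, and the sample payloads are assumptions built only from the keys and values that appear in this diff (`status` / `UP` for Management Center 5.x, `managementCenterState` / `ACTIVE` for 4.x).

```python
from typing import Any, Mapping, Optional


def extract_mc_state(response: Mapping[str, Any]) -> Optional[str]:
    """Return the Management Center state string from a health-check payload.

    Checks the 5.x key first, then the 4.x key, mirroring the order in the diff.
    Returns None when neither key is present.
    """
    if "status" in response:  # Management Center 5.x shape (assumed payload)
        return response["status"]
    if "managementCenterState" in response:  # Management Center 4.x shape
        return response["managementCenterState"]
    return None


if __name__ == "__main__":
    print(extract_mc_state({"status": "UP"}))
    print(extract_mc_state({"managementCenterState": "ACTIVE"}))
    print(extract_mc_state({}))
```

When neither key is present the function returns `None`. In the committed code that value is passed to `self._mc_cluster_states.get(status)`, which also yields `None`, so a caller may want to handle the "unrecognized payload" case explicitly.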

hazelcast/datadog_checks/hazelcast/data/conf.yaml.example

Lines changed: 2 additions & 5 deletions
@@ -106,11 +106,8 @@ instances:
     ## Override cluster state to `hazelcast.mc_cluster_state` service check status mapping.
     #
     # mc_cluster_states:
-    #   Active: OK
-    #   No Migration: WARNING
-    #   Frozen: CRITICAL
-    #   Passive: CRITICAL
-    #   In Transition: WARNING
+    #   UP: OK
+    #   ACTIVE: OK

     ## @param user - string - optional
     ## User to use when connecting to JMX.

hazelcast/datadog_checks/hazelcast/data/metrics.yaml

Lines changed: 131 additions & 35 deletions
@@ -4,7 +4,6 @@ jmx_metrics:
   - include:
       domain: ManagementCenter
       exclude_tags:
-        - name
         - type
       attribute:
         LicenseExpirationTime:
@@ -18,7 +17,6 @@ jmx_metrics:
       type: HazelcastInstance
       exclude_tags:
         - instance
-        - name
         - type
       tags:
         hazelcast_instance: $instance
@@ -32,12 +30,14 @@ jmx_metrics:
         memberCount:
           alias: hazelcast.instance.member_count
           metric_type: gauge
+        clusterTime:
+          alias: hazelcast.instance.cluster_time
+          metric_type: gauge
   - include:
       domain: com.hazelcast
       type: HazelcastInstance.PartitionServiceMBean
       exclude_tags:
         - instance
-        - name
         - type
       tags:
         hazelcast_instance: $instance
@@ -59,7 +59,6 @@ jmx_metrics:
       type: HazelcastInstance.ManagedExecutorService
       exclude_tags:
         - instance
-        - name
         - type
       tags:
         hazelcast_instance: $instance
@@ -92,7 +91,6 @@ jmx_metrics:
       type: IMap
       exclude_tags:
         - instance
-        - name
         - type
       tags:
         hazelcast_instance: $instance
@@ -179,7 +177,6 @@ jmx_metrics:
       type: MultiMap
       exclude_tags:
         - instance
-        - name
         - type
       tags:
         hazelcast_instance: $instance
@@ -260,7 +257,6 @@ jmx_metrics:
       type: ReplicatedMap
       exclude_tags:
         - instance
-        - name
         - type
       tags:
         hazelcast_instance: $instance
@@ -326,7 +322,6 @@ jmx_metrics:
       type: Metrics
       exclude_tags:
         - instance
-        - name
         - type
         - tag.*
       tags:
@@ -437,9 +432,6 @@ jmx_metrics:
         connectionListenerCount:
           alias: hazelcast.member.connection_listener_count
           metric_type: gauge
-        connectionType:
-          alias: hazelcast.member.connection_type
-          metric_type: gauge
         count:
           alias: hazelcast.member.count
           metric_type: gauge
@@ -548,15 +540,6 @@ jmx_metrics:
         invocationTimeoutMillis:
           alias: hazelcast.member.invocation_timeout_millis
           metric_type: gauge
-        invocations.lastCallId:
-          alias: hazelcast.member.invocations.last_call_id
-          metric_type: gauge
-        invocations.pending:
-          alias: hazelcast.member.invocations.pending
-          metric_type: gauge
-        invocations.usedPercentage:
-          alias: hazelcast.member.invocations.used_percentage
-          metric_type: gauge
         ioThreadId:
           alias: hazelcast.member.io_thread_id
           metric_type: gauge
@@ -629,9 +612,6 @@ jmx_metrics:
         missingMembers:
           alias: hazelcast.member.missing_members
           metric_type: gauge
-        monitorCount:
-          alias: hazelcast.member.monitor_count
-          metric_type: gauge
         nodes:
           alias: hazelcast.member.nodes
           metric_type: gauge
@@ -659,12 +639,6 @@ jmx_metrics:
         ownerId:
           alias: hazelcast.member.owner_id
           metric_type: gauge
-        packetsReceived:
-          alias: hazelcast.member.packets_received
-          metric_type: gauge
-        packetsSend:
-          alias: hazelcast.member.packets_send
-          metric_type: gauge
         parkQueueCount:
           alias: hazelcast.member.park_queue_count
           metric_type: gauge
@@ -785,9 +759,6 @@ jmx_metrics:
         startedMigrations:
           alias: hazelcast.member.started_migrations
           metric_type: gauge
-        stateVersion:
-          alias: hazelcast.member.state_version
-          metric_type: gauge
         syncDeliveryFailureCount:
           alias: hazelcast.member.sync_delivery_failure_count
           metric_type: gauge
@@ -854,9 +825,6 @@ jmx_metrics:
         unknownTime:
           alias: hazelcast.member.unknown_time
           metric_type: gauge
-        unknownCount:
-          alias: hazelcast.member.unknown_count
-          metric_type: gauge
         unloadedClassesCount:
           alias: hazelcast.member.unloaded_classes_count
           metric_type: gauge
@@ -881,3 +849,131 @@ jmx_metrics:
         writeQueueSize:
           alias: hazelcast.member.write_queue_size
           metric_type: gauge
+
+        # metrics in 4 and 5 but at different locations
+        # hazelcast 5
+        lastCallId:
+          alias: hazelcast.member.invocations.last_call_id
+          metric_type: gauge
+        pending:
+          alias: hazelcast.member.invocations.pending
+          metric_type: gauge
+        usedPercentage:
+          alias: hazelcast.member.invocations.used_percentage
+          metric_type: gauge
+        unknownCount:
+          alias: hazelcast.member.unknown_count
+          metric_type: gauge
+        # hazelcast 4
+        invocations.lastCallId:
+          alias: hazelcast.member.invocations.last_call_id
+          metric_type: gauge
+        invocations.pending:
+          alias: hazelcast.member.invocations.pending
+          metric_type: gauge
+        invocations.usedPercentage:
+          alias: hazelcast.member.invocations.used_percentage
+          metric_type: gauge
+        unknowsnCount: # typo on hazelcast's side
+          alias: hazelcast.member.unknown_count
+          metric_type: gauge
+
+        # hazelcast 4 only
+        connectionType:
+          alias: hazelcast.member.connection_type
+          metric_type: gauge
+        monitorCount:
+          alias: hazelcast.member.monitor_count
+          metric_type: gauge
+        packetsReceived:
+          alias: hazelcast.member.packets_received
+          metric_type: gauge
+        packetsSend:
+          alias: hazelcast.member.packets_send
+          metric_type: gauge
+        stateVersion:
+          alias: hazelcast.member.state_version
+          metric_type: gauge
+  # Queue
+  - include:
+      domain: com.hazelcast
+      type: IQueue
+      exclude_tags:
+        - instance
+        - type
+      tags:
+        hazelcast_instance: $instance
+      attribute:
+        localMinAge:
+          alias: hazelcast.iqueue.minimum_age
+          metric_type: gauge
+        localMaxAge:
+          alias: hazelcast.iqueue.maximum_age
+          metric_type: gauge
+        localAverageAge:
+          alias: hazelcast.iqueue.average_age
+          metric_type: gauge
+        localOwnedItemCount:
+          alias: hazelcast.iqueue.owned_item_count
+          metric_type: gauge
+        localBackupItemCount:
+          alias: hazelcast.iqueue.backup_item_count
+          metric_type: gauge
+        localOfferOperationCount:
+          alias: hazelcast.iqueue.offer_operation_count
+          metric_type: gauge
+        localRejectedOfferOperationCount:
+          alias: hazelcast.iqueue.rejected_offer_operation_count
+          metric_type: gauge
+        localPollOperationCount:
+          alias: hazelcast.iqueue.poll_operation_count
+          metric_type: gauge
+        localEmptyPollOperationCount:
+          alias: hazelcast.iqueue.empty_poll_operation_count
+          metric_type: gauge
+        localOtherOperationsCount:
+          alias: hazelcast.iqueue.other_operation_count
+          metric_type: gauge
+        localEventOperationCount:
+          alias: hazelcast.iqueue.event_operation_count
+          metric_type: gauge
+
+  # ReliableTopic
+  - include:
+      domain: com.hazelcast
+      type: ReliableTopic
+      exclude_tags:
+        - instance
+        - type
+      tags:
+        hazelcast_instance: $instance
+      attribute:
+        localCreationTime:
+          alias: hazelcast.reliabletopic.creation_time
+          metric_type: gauge
+        localPublishOperationCount:
+          alias: hazelcast.reliabletopic.publish_operation_count
+          metric_type: gauge
+        localReceiveOperationCount:
+          alias: hazelcast.reliabletopic.receive_operation_count
+          metric_type: gauge
+
+  # Topic
+  - include:
+      domain: com.hazelcast
+      type: ITopic
+      exclude_tags:
+        - instance
+        - type
+      tags:
+        hazelcast_instance: $instance
+      attribute:
+        localCreationTime:
+          alias: hazelcast.topic.creation_time
+          metric_type: gauge
+        localPublishOperationCount:
+          alias: hazelcast.topic.publish_operation_count
+          metric_type: gauge
+        localReceiveOperationCount:
+          alias: hazelcast.topic.receive_operation_count
+          metric_type: gauge
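The "metrics in 4 and 5 but at different locations" block keeps metric names stable across versions by pointing two attribute names at one alias. As a rough illustration only (this is not part of the integration, and `attributes_by_alias` is a hypothetical helper), the sketch below groups the attribute names from that block by alias so the intentional duplicates are easy to review.

```python
from collections import defaultdict
from typing import Dict, List

# Attribute -> alias pairs copied from the "hazelcast 5" / "hazelcast 4" block in this diff.
ATTRIBUTE_ALIASES: Dict[str, str] = {
    # hazelcast 5
    "lastCallId": "hazelcast.member.invocations.last_call_id",
    "pending": "hazelcast.member.invocations.pending",
    "usedPercentage": "hazelcast.member.invocations.used_percentage",
    "unknownCount": "hazelcast.member.unknown_count",
    # hazelcast 4
    "invocations.lastCallId": "hazelcast.member.invocations.last_call_id",
    "invocations.pending": "hazelcast.member.invocations.pending",
    "invocations.usedPercentage": "hazelcast.member.invocations.used_percentage",
    "unknowsnCount": "hazelcast.member.unknown_count",  # spelling as in the diff
}


def attributes_by_alias(mapping: Dict[str, str]) -> Dict[str, List[str]]:
    """Group attribute names by the alias they report under, preserving input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for attribute, alias in mapping.items():
        grouped[alias].append(attribute)
    return dict(grouped)


if __name__ == "__main__":
    for alias, attributes in attributes_by_alias(ATTRIBUTE_ALIASES).items():
        print(alias, "<-", attributes)
```

Each alias in this block should list exactly two attributes, one per Hazelcast major version; an alias with a single source would suggest a version is missing its counterpart.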

hazelcast/datadog_checks/hazelcast/utils.py

Lines changed: 5 additions & 6 deletions
@@ -3,13 +3,12 @@
 # Licensed under a 3-clause BSD style license (see LICENSE)
 from datadog_checks.base.constants import ServiceCheck

-# https://docs.hazelcast.org/docs/management-center/latest/manual/html/index.html#cluster-state
+# Cluster state is not reflected here, this check is purely for Management Center.
 MC_CLUSTER_STATES = {
-    'Active': ServiceCheck.OK,
-    'No Migration': ServiceCheck.WARNING,
-    'Frozen': ServiceCheck.CRITICAL,
-    'Passive': ServiceCheck.CRITICAL,
-    'In Transition': ServiceCheck.WARNING,
+    # hazelcast 5:
+    'UP': ServiceCheck.OK,
+    # hazelcast 4:
+    'ACTIVE': ServiceCheck.OK,
 }

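With only `UP` and `ACTIVE` in the default table, any other state string has no mapping. The sketch below shows one way a lookup could merge user overrides (the `mc_cluster_states` option) and fall back to an explicit status for unmapped states. It is an assumption-laden illustration: `Status` is a local stand-in for the agent's service check constants, `resolve_status` is a hypothetical helper, and how the integration actually merges overrides is not shown in this diff.

```python
from enum import IntEnum
from typing import Dict, Mapping, Optional


class Status(IntEnum):
    """Local stand-in for service check statuses (assumed values, for this sketch only)."""

    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


# Defaults mirroring MC_CLUSTER_STATES after this commit.
DEFAULT_STATES: Dict[str, Status] = {
    "UP": Status.OK,      # hazelcast 5
    "ACTIVE": Status.OK,  # hazelcast 4
}


def resolve_status(
    state: Optional[str],
    overrides: Optional[Mapping[str, str]] = None,
) -> Status:
    """Map a Management Center state to a status, applying user overrides first.

    Overrides use status names as strings (for example {"DOWN": "CRITICAL"}).
    Unmapped or missing states resolve to Status.UNKNOWN instead of None.
    """
    states = dict(DEFAULT_STATES)
    for name, level in (overrides or {}).items():
        states[name] = Status[level.upper()]
    if state is None:
        return Status.UNKNOWN
    return states.get(state, Status.UNKNOWN)


if __name__ == "__main__":
    print(resolve_status("UP").name)
    print(resolve_status("DOWN").name)
    print(resolve_status("DOWN", {"DOWN": "critical"}).name)
```

Falling back to `UNKNOWN` is a design choice for the sketch; the committed code passes the result of `dict.get` straight through, so an unmapped state yields `None` there.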

hazelcast/hatch.toml

Lines changed: 6 additions & 1 deletion
@@ -2,9 +2,14 @@

 [[envs.default.matrix]]
 python = ["3.11"]
-version = ["4.0"]
+version = ["4.0", "5.0"]

 [envs.default.overrides]
 matrix.version.env-vars = [
   { key = "HAZELCAST_VERSION", value = "4.0.1", if = ["4.0"] },
+  { key = "HAZELCAST_VERSION", value = "5.3.7", if = ["5.0"] },
+  { key = "HAZELCAST_MANCENTER_VERSION", value = "4.0.1", if = ["4.0"] },
+  { key = "HAZELCAST_MANCENTER_VERSION", value = "5.3.3", if = ["5.0"] },
+  { key = "HAZELCAST_MC_INIT_CMD", value = "./mc-conf.sh", if = ["4.0"] },
+  { key = "HAZELCAST_MC_INIT_CMD", value = "./bin/mc-conf.sh", if = ["5.0"] },
 ]
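To see which environment variables each matrix entry ends up with, the override table can be mirrored in plain Python. This is only an illustration of the table's intent, not Hatch's implementation: `OVERRIDES` copies the entries from this hunk, and `env_for` is a hypothetical helper.

```python
from typing import Dict, List, TypedDict

# TypedDict functional syntax is needed because "if" is a Python keyword.
Override = TypedDict("Override", {"key": str, "value": str, "if": List[str]})

# Entries copied from matrix.version.env-vars in hatch.toml after this commit.
OVERRIDES: List[Override] = [
    {"key": "HAZELCAST_VERSION", "value": "4.0.1", "if": ["4.0"]},
    {"key": "HAZELCAST_VERSION", "value": "5.3.7", "if": ["5.0"]},
    {"key": "HAZELCAST_MANCENTER_VERSION", "value": "4.0.1", "if": ["4.0"]},
    {"key": "HAZELCAST_MANCENTER_VERSION", "value": "5.3.3", "if": ["5.0"]},
    {"key": "HAZELCAST_MC_INIT_CMD", "value": "./mc-conf.sh", "if": ["4.0"]},
    {"key": "HAZELCAST_MC_INIT_CMD", "value": "./bin/mc-conf.sh", "if": ["5.0"]},
]


def env_for(version: str) -> Dict[str, str]:
    """Collect the env vars whose `if` list contains the given matrix version."""
    return {o["key"]: o["value"] for o in OVERRIDES if version in o["if"]}


if __name__ == "__main__":
    for version in ("4.0", "5.0"):
        print(version, env_for(version))
```

The table shows the point of the "decouple versions of mancenter / hz" commit: for the `5.0` entry the member version (`5.3.7`) and the Management Center version (`5.3.3`) differ, and the init script path changes as well.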
