Commit 2b3e48c

Adding schema collection to sqlserver (#17258)
* Adding schema collection to sqlserver
* rather use sys tables
* snapshot collect data
* remove unused function
* small improvements
* improving code
* fixed errors
* refactored code
* Introduced a function that iterates between databases
* minor changes
* put in a separate class
* some clean up
* Corrected column query
* added partitions count
* Added foreign count
* Added stop
* fixed errors
* fixed errors
* removed old code
* Fixed some bugs in chunk schema collection
* removed some comments
* some diffs
* working version send data in chunks
* introduced collection per tables
* Introduced a class for data submit
* pretending to be postgres for testing
* Adopted to Postgres
* adopted payload to the backend
* remove breakpoints
* adding a test
* Put back resolved hostname
* fixed resolved host name
* Fixed more resolved host name
* Improved unit test
* trying to add deepdiff pkg
* Fixed test to combine payloads
* added deepdiff to the sqlserver hatch
* Tried to add deepdiff differently
* Enabled test
* Added a total limit of columns
* Improved exception treatment
* fixed hostname
* Added Foreign key columns
* Added Foreign key columns
* Sorted tables
* add time log for individual query
* removed other jobs
* Added timestamps
* Add more logs
* increase to 500
* removing postgres simulation
* fix errors
* added collection interval
* Added arrays to indexes and partitions
* added error logs
* formatted queries
* format queries
* refactored queries execution
* improved formatting
* applied linter
* Updated test expectations
* Removed pdb
* Adding a changelog
* removed pdb
* Formatted
* removed populate
* Clean up empty lines
* put back the driver
* put remove check
* put back space
* remove space
* reapplied linter
* Improved changelog
* Improved docs
* improved example
* corrected comment
* added submitter unit test
* xchanged config
* Added param query
* improved logging
* changelog changed
* Added stop iteration
* put back odb
* Inherited from async job
* Added conf parameters
* Fixed unit test
* removed pdb
* Formatted comments
* Added a change to dbmasync
* Update spec
* Removed pdb
* put back driver
* fixed changelogs
* applied linter
* minor improvements
* fixed typo
* removed base change
* Moved do for db in schemas
* Improved const
* Applied linter
* Improved specs
* added more tests
* Improved doc
* improve variable names
* Applied linter
* linter
* Added test for truncation
* Add db to the message
* Fixed unit test
* Applied linter
* Changed truncation msg
* applied linter
* Require base package version
* Removed deepdiff from ddev hatch
* resolved errors after merge
* remove modification from base
* removed white space
* removed white space again
* synced example
* Added a license
* Put correct date in license
* applied model sync
* create a dedicated test db for schemas
* applied linter
* added test schema db to all envs
* lint test
* normalized ids
* convert to bool windows value
* fix convert function
* fixed put back index row
* Make test agnostic to order of index columns
* updated with latest ddev
1 parent 7832d0b commit 2b3e48c

20 files changed, with 1388 additions and 5 deletions

sqlserver/assets/configuration/spec.yaml

Lines changed: 24 additions & 0 deletions
@@ -713,6 +713,30 @@ files:
               type: number
               example: 1800
               display_default: false
+          - name: schemas_collection
+            description: |
+              Configure collection of schemas. If `database_autodiscovery` is not enabled, data is collected
+              only for the database configured with `database` parameter.
+            options:
+              - name: enabled
+                description: |
+                  Enable schema collection. Requires `dbm: true`. Defaults to false.
+                value:
+                  type: boolean
+                  example: false
+              - name: collection_interval
+                description: |
+                  Set the database schema collection interval (in seconds). Defaults to 600 seconds.
+                value:
+                  type: number
+                  example: 600
+              - name: max_execution_time
+                description: |
+                  Set the maximum time for schema collection (in seconds). Defaults to 10 seconds.
+                  Capped by `schemas_collection.collection_interval`
+                value:
+                  type: number
+                  example: 10
           - template: instances/default
       - template: logs
         example:
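
The option descriptions above state three defaults (false, 600 seconds, 10 seconds), a dependency on `dbm: true`, and a cap of `max_execution_time` by `collection_interval`. A minimal sketch of how a consumer of this configuration might resolve those rules is shown here. `resolve_schema_settings` and the two constant names are hypothetical and do not appear in this commit; the check's own resolution logic is not part of this excerpt.

```python
# Defaults as stated in the spec descriptions (seconds).
DEFAULT_COLLECTION_INTERVAL = 600
DEFAULT_MAX_EXECUTION_TIME = 10


def resolve_schema_settings(instance):
    """Hypothetical helper: resolve the `schemas_collection` options of an instance dict."""
    cfg = instance.get('schemas_collection', {}) or {}
    # The spec says schema collection requires `dbm: true`.
    dbm = bool(instance.get('dbm', False))
    enabled = dbm and bool(cfg.get('enabled', False))
    interval = float(cfg.get('collection_interval', DEFAULT_COLLECTION_INTERVAL))
    max_time = float(cfg.get('max_execution_time', DEFAULT_MAX_EXECUTION_TIME))
    # The spec says max_execution_time is capped by collection_interval.
    return enabled, interval, min(max_time, interval)


instance = {
    'dbm': True,
    'schemas_collection': {'enabled': True, 'collection_interval': 5, 'max_execution_time': 10},
}
enabled, interval, max_time = resolve_schema_settings(instance)
```

With the sample instance, the 10-second execution limit is reduced to the 5-second interval.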

sqlserver/changelog.d/17258.added

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+Adding schema collection to sqlserver
+Schema data includes information about the tables, their columns, indexes, foreign keys, and partitions.
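
The changelog entry describes schema data as tables with their columns, indexes, foreign keys, and partitions, and the queries added in `queries.py` return an `id` column holding the table's object ID. One way such rows could be folded into per-table records is sketched here. `attach_by_table_id`, the `indexes` key, and the sample rows are illustrative assumptions; the payload shape the check actually submits is not shown in this excerpt.

```python
from collections import defaultdict


def attach_by_table_id(tables, rows, key):
    """Hypothetical helper: group `rows` by their `id` field and attach them to each table under `key`."""
    grouped = defaultdict(list)
    for row in rows:
        row = dict(row)  # copy so the caller's rows are left untouched
        grouped[row.pop('id')].append(row)
    for table in tables:
        table[key] = grouped.get(table['id'], [])
    return tables


# Made-up sample data in the shape of TABLES_IN_SCHEMA_QUERY and INDEX_QUERY results.
tables = [{'id': 1, 'name': 'orders'}, {'id': 2, 'name': 'customers'}]
index_rows = [
    {'id': 1, 'name': 'pk_orders', 'is_primary_key': True, 'column_names': 'order_id'},
    {'id': 1, 'name': 'ix_orders_date', 'is_primary_key': False, 'column_names': 'created_at'},
]
attach_by_table_id(tables, index_rows, 'indexes')
```

Tables without matching rows get an empty list, so every record carries the same keys.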

sqlserver/datadog_checks/sqlserver/config.py

Lines changed: 5 additions & 1 deletion
@@ -7,7 +7,10 @@

 from datadog_checks.base.config import is_affirmative
 from datadog_checks.base.utils.common import to_native_string
-from datadog_checks.sqlserver.const import DEFAULT_AUTODISCOVERY_INTERVAL, PROC_CHAR_LIMIT
+from datadog_checks.sqlserver.const import (
+    DEFAULT_AUTODISCOVERY_INTERVAL,
+    PROC_CHAR_LIMIT,
+)


 class SQLServerConfig:
@@ -45,6 +48,7 @@ def __init__(self, init_config, instance, log):
         self.procedure_metrics_config: dict = instance.get('procedure_metrics', {}) or {}
         self.settings_config: dict = instance.get('collect_settings', {}) or {}
         self.activity_config: dict = instance.get('query_activity', {}) or {}
+        self.schema_config: dict = instance.get('schemas_collection', {}) or {}
         self.cloud_metadata: dict = {}
         aws: dict = instance.get('aws', {}) or {}
         gcp: dict = instance.get('gcp', {}) or {}
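
The new `schema_config` line follows the same `instance.get(key, {}) or {}` pattern as its neighbours. The trailing `or {}` matters because a YAML key written with no value (for example a bare `schemas_collection:`) parses to `None`, and `dict.get` only falls back to its default when the key is absent. A small sketch of the three cases, using a hypothetical `section` helper:

```python
def section(instance, key):
    """Hypothetical helper: return a config section as a dict, even when the key is present but null."""
    return instance.get(key, {}) or {}


missing = section({}, 'schemas_collection')                                  # key absent
null = section({'schemas_collection': None}, 'schemas_collection')           # key present, value null
present = section({'schemas_collection': {'enabled': True}}, 'schemas_collection')

# Without the `or {}`, the null case would hand None to later `.get(...)` calls.
without_guard = {'schemas_collection': None}.get('schemas_collection', {})
```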

sqlserver/datadog_checks/sqlserver/config_models/instance.py

Lines changed: 11 additions & 0 deletions
@@ -139,6 +139,16 @@ class QueryMetrics(BaseModel):
     samples_per_hour_per_query: Optional[int] = None


+class SchemasCollection(BaseModel):
+    model_config = ConfigDict(
+        arbitrary_types_allowed=True,
+        frozen=True,
+    )
+    collection_interval: Optional[float] = None
+    enabled: Optional[bool] = None
+    max_execution_time: Optional[float] = None
+
+
 class InstanceConfig(BaseModel):
     model_config = ConfigDict(
         validate_default=True,
@@ -199,6 +209,7 @@ class InstanceConfig(BaseModel):
     query_activity: Optional[QueryActivity] = None
     query_metrics: Optional[QueryMetrics] = None
     reported_hostname: Optional[str] = None
+    schemas_collection: Optional[SchemasCollection] = None
     server_version: Optional[str] = None
     service: Optional[str] = None
     stored_procedure: Optional[str] = None

sqlserver/datadog_checks/sqlserver/const.py

Lines changed: 2 additions & 0 deletions
@@ -268,3 +268,5 @@
 ]

 PROC_CHAR_LIMIT = 500
+
+DEFAULT_SCHEMAS_COLLECTION_INTERVAL = 600

sqlserver/datadog_checks/sqlserver/data/conf.yaml.example

Lines changed: 21 additions & 0 deletions
@@ -659,6 +659,27 @@ instances:
     #
     # ignore_missing_database: false

+    ## Configure collection of schemas. If `database_autodiscovery` is not enabled, data is collected
+    ## only for the database configured with `database` parameter.
+    #
+    # schemas_collection:
+
+        ## @param enabled - boolean - optional - default: false
+        ## Enable schema collection. Requires `dbm: true`. Defaults to false.
+        #
+        #   enabled: false
+
+        ## @param collection_interval - number - optional - default: 600
+        ## Set the database schema collection interval (in seconds). Defaults to 600 seconds.
+        #
+        #   collection_interval: 600
+
+        ## @param max_execution_time - number - optional - default: 10
+        ## Set the maximum time for schema collection (in seconds). Defaults to 10 seconds.
+        ## Capped by `schemas_collection.collection_interval`
+        #
+        #   max_execution_time: 10
+
     ## @param tags - list of strings - optional
     ## A list of tags to attach to every metric and service check emitted by this instance.
     ##

sqlserver/datadog_checks/sqlserver/queries.py

Lines changed: 67 additions & 0 deletions
@@ -143,6 +143,73 @@
     ],
 }

+DB_QUERY = """
+SELECT
+    db.database_id AS id, db.name AS name, db.collation_name AS collation, dp.name AS owner
+FROM
+    sys.databases db LEFT JOIN sys.database_principals dp ON db.owner_sid = dp.sid
+WHERE db.name IN ({});
+"""
+
+SCHEMA_QUERY = """
+SELECT
+    s.name AS name, s.schema_id AS id, dp.name AS owner_name
+FROM
+    sys.schemas AS s JOIN sys.database_principals dp ON s.principal_id = dp.principal_id
+WHERE s.name NOT IN ('sys', 'information_schema')
+"""
+
+TABLES_IN_SCHEMA_QUERY = """
+SELECT
+    object_id AS id, name
+FROM
+    sys.tables
+WHERE schema_id=?
+"""
+
+COLUMN_QUERY = """
+SELECT
+    column_name AS name, data_type, column_default, is_nullable AS nullable , table_name, ordinal_position
+FROM
+    information_schema.columns
+WHERE
+    table_name IN ({}) and table_schema='{}';
+"""
+
+PARTITIONS_QUERY = """
+SELECT
+    object_id AS id, COUNT(*) AS partition_count
+FROM
+    sys.partitions
+WHERE
+    object_id IN ({}) GROUP BY object_id;
+"""
+
+INDEX_QUERY = """
+SELECT
+    i.object_id AS id, i.name, i.type, i.is_unique, i.is_primary_key, i.is_unique_constraint,
+    i.is_disabled, STRING_AGG(c.name, ',') AS column_names
+FROM
+    sys.indexes i JOIN sys.index_columns ic ON i.object_id = ic.object_id
+    AND i.index_id = ic.index_id JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
+WHERE
+    i.object_id IN ({}) GROUP BY i.object_id, i.name, i.type,
+    i.is_unique, i.is_primary_key, i.is_unique_constraint, i.is_disabled;
+"""
+
+FOREIGN_KEY_QUERY = """
+SELECT
+    FK.referenced_object_id AS id, FK.name AS foreign_key_name,
+    OBJECT_NAME(FK.parent_object_id) AS referencing_table,
+    STRING_AGG(COL_NAME(FKC.parent_object_id, FKC.parent_column_id),',') AS referencing_column,
+    OBJECT_NAME(FK.referenced_object_id) AS referenced_table,
+    STRING_AGG(COL_NAME(FKC.referenced_object_id, FKC.referenced_column_id),',') AS referenced_column
+FROM
+    sys.foreign_keys AS FK JOIN sys.foreign_key_columns AS FKC ON FK.object_id = FKC.constraint_object_id
+WHERE
+    FK.referenced_object_id IN ({}) GROUP BY FK.name, FK.parent_object_id, FK.referenced_object_id;
+"""
+

 def get_query_ao_availability_groups(sqlserver_major_version):
     """

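Several of the query templates in `queries.py` end in `IN ({})`, which leaves the value list to be filled in before execution. One way to do that without interpolating values into the SQL text is sketched here: format in one `?` per value and pass the values as parameters, assuming a DB-API driver with `?` (qmark) placeholders, the style `TABLES_IN_SCHEMA_QUERY` already uses. `build_in_query` is a hypothetical helper; how the check itself fills these templates is not shown in this excerpt.

```python
# Copied from the diff so the sketch is self-contained.
PARTITIONS_QUERY = """
SELECT
    object_id AS id, COUNT(*) AS partition_count
FROM
    sys.partitions
WHERE
    object_id IN ({}) GROUP BY object_id;
"""


def build_in_query(template, values):
    """Hypothetical helper: expand `IN ({})` into one `?` placeholder per value."""
    values = list(values)
    if not values:
        # `IN ()` is not valid SQL, so refuse an empty list up front.
        raise ValueError("at least one value is required")
    placeholders = ", ".join("?" for _ in values)
    return template.format(placeholders), values


query, params = build_in_query(PARTITIONS_QUERY, [101, 102, 103])
# With a DB-API cursor (assumed, not created here): cursor.execute(query, params)
```

Batching the table IDs this way keeps each statement to a single round trip per chunk of tables while leaving quoting to the driver.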