Commit 65120fb

VaggelisD and treysp authored
Chore: Add Linter documentation (#3950)
Co-authored-by: Trey Spiller <1831878+treysp@users.noreply.github.com>
1 parent e100fbf commit 65120fb

5 files changed

+256
-2
lines changed

docs/concepts/linter.md

+243
@@ -0,0 +1,243 @@
# Linter guide

Linting is a powerful tool for improving code quality and consistency. It enables you to automatically validate model definitions, ensuring they adhere to your team's best practices.

When a SQLMesh command is executed and the project is loaded, each model's code is checked for compliance with a set of rules you choose.

SQLMesh provides built-in rules, and you can define custom rules. This improves code quality and helps detect issues early in the development cycle when they are simpler to debug.

## Rules

Each linting rule is responsible for identifying a pattern in a model's code.

Some rules validate that a pattern is *not* present, such as not allowing `SELECT *` in a model's outermost query. Other rules validate that a pattern *is* present, like ensuring that every model's `owner` field is specified. We refer to both of these below as "validating a pattern".

Rules are defined in Python. Each rule is an individual Python class that inherits from SQLMesh's `Rule` base class and defines the logic for validating a pattern.

We display a portion of the `Rule` base class's code below ([full source code](https://github.com/TobikoData/sqlmesh/blob/main/sqlmesh/core/linter/rule.py)). Its methods and properties illustrate the most important components of the subclassed rules you define.

Each rule class you create has four vital components:

1. Name: the class's name is used as the rule's name.
2. Description: the class should define a docstring that provides a short explanation of the rule's purpose.
3. Pattern validation logic: the class should define a `check_model()` method containing the core logic that validates the rule's pattern. The method can access any `Model` attribute.
4. Rule violation logic: if a rule's pattern is not validated, the rule is "violated" and the class should return a `RuleViolation` object. The `RuleViolation` object should include the contextual information a user needs to understand and fix the problem.

``` python linenums="1"
# Class name used as rule's name
class Rule:
    # Docstring provides rule's description
    """The base class for a rule."""

    # Pattern validation logic goes in `check_model()` method
    @abc.abstractmethod
    def check_model(self, model: Model) -> t.Optional[RuleViolation]:
        """The evaluation function that checks for a violation of this rule."""

    # Rule violation object returned by `violation()` method
    def violation(self, violation_msg: t.Optional[str] = None) -> RuleViolation:
        """Return a RuleViolation instance if this rule is violated"""
        return RuleViolation(rule=self, violation_msg=violation_msg or self.summary)
```

### Built-in rules

SQLMesh includes a set of predefined rules that check for potential SQL errors or enforce code style.

An example of the latter is the `NoSelectStar` rule, which prohibits a model from using `SELECT *` in its query's outermost select statement.

Here is the code for the built-in `NoSelectStar` rule class, with the different components annotated:

``` python linenums="1"
# Rule's name is the class name `NoSelectStar`
class NoSelectStar(Rule):
    # Docstring explaining rule
    """Query should not contain SELECT * on its outer most projections, even if it can be expanded."""

    def check_model(self, model: Model) -> t.Optional[RuleViolation]:
        # If this model does not contain a SQL query, there is nothing to validate
        if not isinstance(model, SqlModel):
            return None

        # Use the query's `is_star` property to detect the `SELECT *` pattern.
        # If present, call the `violation()` method to return a `RuleViolation` object.
        return self.violation() if model.query.is_star else None
```

Here are all of SQLMesh's built-in linting rules:

| Name                       | Check type  | Explanation                                                                                                              |
| -------------------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------ |
| ambiguousorinvalidcolumn   | Correctness | SQLMesh found duplicate columns or was unable to determine whether a column is duplicated or not                         |
| invalidselectstarexpansion | Correctness | The query's top-level selection may be `SELECT *`, but only if SQLMesh can expand the `SELECT *` into individual columns |
| noselectstar               | Stylistic   | The query's top-level selection may not be `SELECT *`, even if SQLMesh can expand the `SELECT *` into individual columns |

### User-defined rules

You may define custom rules to implement your team's best practices.

For instance, you could ensure all models have an `owner` by defining the following linting rule:

``` python linenums="1" title="linter/user.py"
import typing as t

from sqlmesh.core.linter.rule import Rule, RuleViolation
from sqlmesh.core.model import Model

class NoMissingOwner(Rule):
    """Model owner should always be specified."""

    def check_model(self, model: Model) -> t.Optional[RuleViolation]:
        # Rule violated if the model's owner field (`model.owner`) is not specified
        return self.violation() if not model.owner else None
```

Place a rule's code in the project's `linter/` directory. SQLMesh will load all subclasses of `Rule` from that directory.

If the rule is specified in the project's [configuration file](#applying-linting-rules), SQLMesh will run it when the project is loaded. All SQLMesh commands will load the project, except for `create_external_models`, `migrate`, `rollback`, `run`, `environments`, and `invalidate`.

SQLMesh will error if a model violates the rule, informing you which model(s) violated the rule. In this example, `full_model.sql` violated the `NoMissingOwner` rule:

``` bash
$ sqlmesh plan

Linter errors for .../models/full_model.sql:
- nomissingowner: Model owner should always be specified.

Error: Linter detected errors in the code. Please fix them before proceeding.
```

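The `violation()` method in the `Rule` excerpt accepts an optional `violation_msg`, so a rule can report model-specific context instead of falling back to its docstring. The sketch below illustrates that idea in a self-contained form: `Rule`, `RuleViolation`, and `FakeModel` are minimal stand-ins written for this example (not SQLMesh's real classes), `NoMissingDescription` is a hypothetical rule, and the lowercased rule name is an assumption based on the `nomissingowner` output shown above. In a real project you would import `Rule` and `RuleViolation` from `sqlmesh.core.linter.rule`, as in the `NoMissingOwner` example.

```python
import abc
import typing as t
from dataclasses import dataclass


@dataclass
class FakeModel:
    """Stand-in for a SQLMesh model; holds only the fields this sketch reads."""

    name: str
    description: t.Optional[str] = None


@dataclass
class RuleViolation:
    """Stand-in mirroring the `rule` and `violation_msg` arguments in the guide."""

    rule: "Rule"
    violation_msg: str


class Rule(abc.ABC):
    """Stand-in base class mirroring the excerpt in the guide."""

    @property
    def name(self) -> str:
        # Assumption: the rule name is the lowercased class name
        return type(self).__name__.lower()

    @property
    def summary(self) -> str:
        # The docstring doubles as the default violation message
        return self.__doc__ or ""

    @abc.abstractmethod
    def check_model(self, model: FakeModel) -> t.Optional[RuleViolation]:
        """Return a violation if the pattern is not validated, else None."""

    def violation(self, violation_msg: t.Optional[str] = None) -> RuleViolation:
        return RuleViolation(rule=self, violation_msg=violation_msg or self.summary)


class NoMissingDescription(Rule):
    """Model description should always be specified."""

    def check_model(self, model: FakeModel) -> t.Optional[RuleViolation]:
        if model.description:
            return None
        # Pass a custom message so the user sees which model to fix
        return self.violation(f"Model '{model.name}' has no description.")


rule = NoMissingDescription()
result = rule.check_model(FakeModel(name="docs_example.full_model"))
print(f"{rule.name}: {result.violation_msg}")
```

Calling `self.violation()` with no argument would report the docstring instead, which is what the `NoMissingOwner` example relies on.
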
## Applying linting rules

Specify which linting rules a project should apply in the project's [configuration file](../guides/configuration.md).

Rules are specified as lists of rule names under the `linter` key. Globally enable or disable linting with the `enabled` key, which is `false` by default.

NOTE: you **must** set the `enabled` key to `true` to apply the project's linting rules.

### Specific linting rules

This example specifies that the `"ambiguousorinvalidcolumn"` and `"invalidselectstarexpansion"` linting rules should be enforced:

=== "YAML"

    ```yaml linenums="1"
    linter:
      enabled: true
      rules: ["ambiguousorinvalidcolumn", "invalidselectstarexpansion"]
    ```

=== "Python"

    ```python linenums="1"
    from sqlmesh.core.config import Config, LinterConfig

    config = Config(
        linter=LinterConfig(
            enabled=True,
            rules=["ambiguousorinvalidcolumn", "invalidselectstarexpansion"]
        )
    )
    ```

### All linting rules

Apply every built-in and user-defined rule by specifying `"ALL"` instead of a list of rules:

=== "YAML"

    ```yaml linenums="1"
    linter:
      enabled: true
      rules: "ALL"
    ```

=== "Python"

    ```python linenums="1"
    from sqlmesh.core.config import Config, LinterConfig

    config = Config(
        linter=LinterConfig(
            enabled=True,
            rules="all",
        )
    )
    ```

If you want to apply all rules except for a few, you can specify `"ALL"` and list the rules to ignore in the `ignored_rules` key:

=== "YAML"

    ```yaml linenums="1"
    linter:
      enabled: true
      rules: "ALL" # apply all built-in and user-defined rules and error if violated
      ignored_rules: ["noselectstar"] # but don't run the `noselectstar` rule
    ```

=== "Python"

    ```python linenums="1"
    from sqlmesh.core.config import Config, LinterConfig

    config = Config(
        linter=LinterConfig(
            enabled=True,
            # apply all built-in and user-defined linting rules and error if violated
            rules="all",
            # but don't run the `noselectstar` rule
            ignored_rules=["noselectstar"]
        )
    )
    ```

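To make the selection semantics concrete, here is a minimal sketch of how `rules` and `ignored_rules` could combine. `select_rules` and the `available` set are hypothetical names invented for this illustration; this is not SQLMesh's implementation. Because the examples above spell the keyword both `"ALL"` and `"all"`, the sketch compares it case-insensitively, which is an assumption.

```python
import typing as t


def select_rules(
    available: t.Set[str],
    rules: t.Union[str, t.Iterable[str]],
    ignored_rules: t.Iterable[str] = (),
) -> t.Set[str]:
    """Return the rule names to run (hypothetical helper, not a SQLMesh API)."""
    if isinstance(rules, str) and rules.lower() == "all":
        # "ALL" expands to every built-in and user-defined rule
        requested = set(available)
    else:
        names = [rules] if isinstance(rules, str) else rules
        requested = {name.lower() for name in names}

    # Ignored rules are removed last, so they win over "ALL"
    return requested - {name.lower() for name in ignored_rules}


# Built-in rule names from the table above plus the user-defined example rule
available = {
    "ambiguousorinvalidcolumn",
    "invalidselectstarexpansion",
    "noselectstar",
    "nomissingowner",
}

selected = select_rules(available, rules="ALL", ignored_rules=["noselectstar"])
print(sorted(selected))
```

With these inputs, every available rule except `noselectstar` is selected, matching the intent of the configuration above.
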
### Exclude a model from linting

You can make a specific *model* ignore a linting rule by specifying `ignored_rules` in its `MODEL` block.

This example specifies that the model `docs_example.full_model` should not run the `invalidselectstarexpansion` rule:

```sql linenums="1"
MODEL(
  name docs_example.full_model,
  ignored_rules ["invalidselectstarexpansion"] -- or "ALL" to turn off linting completely
);
```

### Rule violation behavior

Linting rule violations raise an error by default, preventing the project from running until the violation is addressed.

You may specify that a rule's violation should not error and only log a warning by specifying it in the `warning_rules` key instead of the `rules` key.

=== "YAML"

    ```yaml linenums="1"
    linter:
      enabled: true
      # error if `ambiguousorinvalidcolumn` rule violated
      rules: ["ambiguousorinvalidcolumn"]
      # but only warn if "invalidselectstarexpansion" is violated
      warning_rules: ["invalidselectstarexpansion"]
    ```

=== "Python"

    ```python linenums="1"
    from sqlmesh.core.config import Config, LinterConfig

    config = Config(
        linter=LinterConfig(
            enabled=True,
            # error if `ambiguousorinvalidcolumn` rule violated
            rules=["ambiguousorinvalidcolumn"],
            # but only warn if "invalidselectstarexpansion" is violated
            warning_rules=["invalidselectstarexpansion"],
        )
    )
    ```

SQLMesh will raise an error if the same rule is included in more than one of the `rules`, `warning_rules`, and `ignored_rules` keys since they should be mutually exclusive.
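
The mutual-exclusivity requirement can be checked before loading a project. The following is a minimal sketch of such a check, assuming the three keys are available as plain lists of rule names; `find_overlapping_rules` is a hypothetical helper written for this guide, not part of SQLMesh.

```python
import typing as t
from itertools import combinations


def find_overlapping_rules(
    rules: t.Iterable[str],
    warning_rules: t.Iterable[str],
    ignored_rules: t.Iterable[str],
) -> t.Dict[t.Tuple[str, str], t.Set[str]]:
    """Map each pair of config keys to the rule names they share (hypothetical helper)."""
    groups = {
        "rules": set(rules),
        "warning_rules": set(warning_rules),
        "ignored_rules": set(ignored_rules),
    }

    overlaps: t.Dict[t.Tuple[str, str], t.Set[str]] = {}
    # Compare every pair of keys once and record any shared rule names
    for (key_a, names_a), (key_b, names_b) in combinations(groups.items(), 2):
        shared = names_a & names_b
        if shared:
            overlaps[(key_a, key_b)] = shared
    return overlaps


# Valid: each rule appears under exactly one key
ok = find_overlapping_rules(
    rules=["ambiguousorinvalidcolumn"],
    warning_rules=["invalidselectstarexpansion"],
    ignored_rules=[],
)

# Invalid: `noselectstar` is both enforced and ignored
bad = find_overlapping_rules(
    rules=["noselectstar"],
    warning_rules=[],
    ignored_rules=["noselectstar"],
)
print(ok, bad)
```

An empty result means the configuration satisfies the constraint; any entry names the two keys that conflict and the rules they share.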

docs/concepts/models/overview.md

+6
Original file line numberDiff line numberDiff line change
@@ -446,6 +446,12 @@ to `false` causes SQLMesh to disable query canonicalization & simplification. Th
446446
### validate_query
447447
: Whether the model's query will be validated at compile time. This attribute is `false` by default. Setting it to `true` causes SQLMesh to raise an error instead of emitting warnings. This will display invalid columns in your SQL statements along with models containing `SELECT *` that cannot be automatically expanded to list out all columns. This ensures SQL is verified locally before time and money are spent running the SQL in your data warehouse.
448448

449+
!!! warning
450+
This flag is deprecated as of v0.159.7 in favor of the [linter](../linter.md). To preserve validation during compilation, the [built-in rules](../linter.md#built-in-rules) that check for correctness should be [configured](../../guides/configuration.md#linting) to error severity.
451+
452+
### ignored_rules
453+
: Specifies which linter rules should be ignored/excluded for this model.
454+
449455
## Incremental Model Properties
450456

451457
These properties can be specified in an incremental model's `kind` definition.

docs/guides/configuration.md

+5
Original file line numberDiff line numberDiff line change
@@ -1108,6 +1108,11 @@ def grant_schema_usage(evaluator):
11081108

11091109
As demonstrated in these examples, the `environment_naming_info` is available within the macro evaluator for macros invoked within the `before_all` and `after_all` statements. Additionally, the macro `this_env` provides access to the current environment name, which can be helpful for more advanced use cases that require fine-grained control over their behaviour.
11101110

1111+
### Linting
1112+
1113+
SQLMesh provides a linter that checks for potential issues in your models' code. Enable it and specify which linting rules to apply in the configuration file's `linter` key.
1114+
1115+
Learn more about linting configuration on the [linting concepts page](../concepts/linter.md).
11111116

11121117
### Debug mode
11131118

docs/reference/model_configuration.md

+1-2
Original file line numberDiff line numberDiff line change
@@ -39,7 +39,7 @@ Configuration options for SQLMesh model properties. Supported by all model kinds
3939
| `enabled` | Whether the model is enabled. This attribute is `true` by default. Setting it to `false` causes SQLMesh to ignore this model when loading the project. | bool | N |
4040
| `gateway` | Specifies the gateway to use for the execution of this model. When not specified, the default gateway is used. | str | N |
4141
| `optimize_query` | Whether the model's query should be optimized. This attribute is `true` by default. Setting it to `false` causes SQLMesh to disable query canonicalization & simplification. This should be turned off only if the optimized query leads to errors such as surpassing text limit. | bool | N |
42-
| `validate_query` | Whether the model's query will be strictly validated at compile time. This attribute is `false` by default. Setting it to `true` causes SQLMesh to raise an error instead of emitting warnings. This will display invalid columns in your SQL statements along with models containing `SELECT *` that cannot be automatically expanded to list out all columns. | bool | N |
42+
| `ignored_rules` | A list of linter rule names (or "ALL") to be ignored/excluded for this model | str \| array[str] | N |
4343

4444
### Model defaults
4545

@@ -123,7 +123,6 @@ The SQLMesh project-level `model_defaults` key supports the following options, d
123123
- on_destructive_change (described [below](#incremental-models))
124124
- audits (described [here](../concepts/audits.md#generic-audits))
125125
- optimize_query
126-
- validate_query
127126
- allow_partials
128127
- enabled
129128
- interval_unit

mkdocs.yml

+1
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@ nav:
3232
- SQLMesh tools:
3333
- guides/ui.md
3434
- guides/tablediff.md
35+
- concepts/linter.md
3536
- guides/observer.md
3637
- Concepts:
3738
- concepts/overview.md
