* feat: add support for defining struct types via namedtuples
* test: add unit tests for checking namedtuples' support
* docs: add corresponding instructions for using NamedTuple to define a struct
* docs: make immutability requirements for struct keys clearer
* refactor: merge dataclass and namedtuple as one uniform `struct_type`
* feat: handle default values for namedtuple params in decoder
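
The refactor and decoder commits above treat dataclasses and NamedTuples as one kind of struct type. As a rough illustration of what that involves (this is not the PR's actual code; `is_struct_type`, `field_specs`, and `PersonWithDefault` are hypothetical names), the standard library exposes field names, annotations, and defaults for both kinds of class:

```python
import dataclasses
import datetime
from typing import Any, NamedTuple, get_type_hints


@dataclasses.dataclass
class Person:
    first_name: str
    last_name: str
    dob: datetime.date


class PersonWithDefault(NamedTuple):
    # Hypothetical variant with a default value, for illustration only.
    first_name: str
    last_name: str
    dob: datetime.date = datetime.date(1970, 1, 1)


def is_namedtuple_type(t: Any) -> bool:
    """NamedTuple classes are tuple subclasses that carry a `_fields` attribute."""
    return isinstance(t, type) and issubclass(t, tuple) and hasattr(t, "_fields")


def is_struct_type(t: Any) -> bool:
    """True for dataclass types and NamedTuple types (hypothetical helper)."""
    return isinstance(t, type) and (dataclasses.is_dataclass(t) or is_namedtuple_type(t))


def field_specs(t: type) -> dict[str, tuple[Any, Any]]:
    """Map each field name to (annotated type, default or dataclasses.MISSING)."""
    hints = get_type_hints(t)
    if dataclasses.is_dataclass(t):
        return {f.name: (hints[f.name], f.default) for f in dataclasses.fields(t)}
    defaults = t._field_defaults  # only fields that declare a default appear here
    return {
        name: (hints[name], defaults.get(name, dataclasses.MISSING))
        for name in t._fields
    }
```

Both classes yield the same field names and types from `field_specs`; only the NamedTuple variant reports a default for `dob`, which a decoder could fall back to when a value is absent.
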
docs/docs/core/data_types.mdx: 31 additions & 9 deletions
@@ -53,23 +53,41 @@ The native Python type is always more permissive and can represent a superset of
 you can choose whatever to use.
 The native Python type is usually simpler.

-### Struct Type
+### Struct Types

 A Struct has a bunch of fields, each with a name and a type.

-In Python, a Struct type is represented by a [dataclass](https://docs.python.org/3/library/dataclasses.html),
-and all fields must be annotated with a specific type. For example:
+In Python, a Struct type is represented by either a [dataclass](https://docs.python.org/3/library/dataclasses.html)
+or a [NamedTuple](https://docs.python.org/3/library/typing.html#typing.NamedTuple), with all fields annotated with a specific type.
+Both options define a structured type with named fields, but they differ slightly:
+
+- **Dataclass**: A flexible class-based structure, mutable by default, defined using the `@dataclass` decorator.
+- **NamedTuple**: An immutable tuple-based structure, defined using `typing.NamedTuple`.
+
+For example:

 ```python
 from dataclasses import dataclass
+from typing import NamedTuple
+import datetime

+# Using dataclass
 @dataclass
 class Person:
     first_name: str
-    last_name
+    last_name: str
+    dob: datetime.date
+
+# Using NamedTuple
+class PersonTuple(NamedTuple):
+    first_name: str
+    last_name: str
     dob: datetime.date
 ```

+Both `Person` and `PersonTuple` are valid Struct types in CocoIndex, with identical schemas (three fields: `first_name` (Str), `last_name` (Str), `dob` (Date)).
+Choose `dataclass` for mutable objects or when you need additional methods, and `NamedTuple` for immutable, lightweight structures.
+
 ### Table Types

 A Table type models a collection of rows, each with multiple columns.
@@ -84,20 +102,24 @@ The row order of a KTable is not preserved.
 Type of the first column (key column) must be a [key type](#key-types).

 In Python, a KTable type is represented by `dict[K, V]`.
-The `V` should be a dataclass, representing the value fields of each row.
-For example, you can use `dict[str, Person]` to represent a KTable, with 4 columns: key (Str), `first_name` (Str), `last_name` (Str), `dob` (Date).
+The `V` should be a struct type, either a `dataclass` or `NamedTuple`, representing the value fields of each row.
+For example, you can use `dict[str, Person]` or `dict[str, PersonTuple]` to represent a KTable, with 4 columns: key (Str), `first_name` (Str), `last_name` (Str), `dob` (Date).

-Note that if you want to use a struct as the key, you need to annotate the struct with `@dataclass(frozen=True)`, so the values are immutable.
+Note that if you want to use a struct as the key, you need to ensure the struct is immutable. For `dataclass`, annotate it with `@dataclass(frozen=True)`. For `NamedTuple`, immutability is built-in.
 For example:

 ```python
 @dataclass(frozen=True)
 class PersonKey:
     id_kind: str
     id: str
+
+class PersonKeyTuple(NamedTuple):
+    id_kind: str
+    id: str
 ```

-Then you can use `dict[PersonKey, Person]` to represent a KTable keyed by `PersonKey`.
+Then you can use `dict[PersonKey, Person]` or `dict[PersonKeyTuple, PersonTuple]` to represent a KTable keyed by `PersonKey` or `PersonKeyTuple`.


 #### LTable
@@ -118,4 +140,4 @@ Currently, the following types are key types
 - Range
 - Uuid
 - Date
-- Struct with all fields being key types
+- Struct with all fields being key types (using `@dataclass(frozen=True)` or `NamedTuple`)
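
The immutability requirement for struct keys in the diff above follows from how Python itself treats `dict` keys. The sketch below shows only that standard-library behavior, not CocoIndex internals; `MutableKey`, `is_hashable`, and `is_immutable` are hypothetical names added for the comparison:

```python
import dataclasses
from typing import NamedTuple


@dataclasses.dataclass(frozen=True)
class PersonKey:
    id_kind: str
    id: str


class PersonKeyTuple(NamedTuple):
    id_kind: str
    id: str


@dataclasses.dataclass  # hypothetical counter-example: not frozen
class MutableKey:
    id_kind: str
    id: str


def is_hashable(value: object) -> bool:
    """Return True if `value` can be used as a dict key."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


def is_immutable(value: object) -> bool:
    """Return True if assigning to the `id` field is rejected."""
    try:
        value.id = "changed"
    except AttributeError:  # FrozenInstanceError is a subclass of AttributeError
        return True
    return False


# Keys with equal field values compare and hash equally,
# so a freshly built key finds the stored row.
by_key = {PersonKey("passport", "X123"): "Ada"}
by_tuple = {PersonKeyTuple("passport", "X123"): "Ada"}
```

A plain `@dataclass` defines `__eq__` without `__hash__`, so `MutableKey` instances are unhashable and fail as `dict` keys with a `TypeError`; `frozen=True` restores hashing and blocks field assignment, which a NamedTuple provides by construction.
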