|
2 | 2 | "cells": [
|
3 | 3 | {
|
4 | 4 | "cell_type": "markdown",
|
5 |
| - "id": "f60c52ce", |
| 5 | + "id": "5d5afc66", |
6 | 6 | "metadata": {
|
7 | 7 | "tags": []
|
8 | 8 | },
|
|
27 | 27 | },
|
28 | 28 | {
|
29 | 29 | "cell_type": "markdown",
|
30 |
| - "id": "7ec7724a", |
| 30 | + "id": "d20eee58", |
31 | 31 | "metadata": {},
|
32 | 32 | "source": [
|
33 | 33 | "## <span style='color:#ff5f27'> 📝 Imports"
|
|
36 | 36 | {
|
37 | 37 | "cell_type": "code",
|
38 | 38 | "execution_count": null,
|
39 |
| - "id": "42efa933", |
| 39 | + "id": "802d6425", |
40 | 40 | "metadata": {},
|
41 | 41 | "outputs": [],
|
42 | 42 | "source": [
|
|
46 | 46 | {
|
47 | 47 | "cell_type": "code",
|
48 | 48 | "execution_count": null,
|
49 |
| - "id": "a4120a95", |
| 49 | + "id": "bd55a691", |
50 | 50 | "metadata": {},
|
51 | 51 | "outputs": [],
|
52 | 52 | "source": [
|
|
64 | 64 | },
|
65 | 65 | {
|
66 | 66 | "cell_type": "markdown",
|
67 |
| - "id": "010286f1", |
| 67 | + "id": "879a539d", |
68 | 68 | "metadata": {},
|
69 | 69 | "source": [
|
70 | 70 | "## <span style=\"color:#ff5f27;\"> 💽 Loading the Data </span>\n",
|
|
84 | 84 | {
|
85 | 85 | "cell_type": "code",
|
86 | 86 | "execution_count": null,
|
87 |
| - "id": "491a37b4", |
| 87 | + "id": "586d4111", |
88 | 88 | "metadata": {},
|
89 | 89 | "outputs": [],
|
90 | 90 | "source": [
|
|
100 | 100 | {
|
101 | 101 | "cell_type": "code",
|
102 | 102 | "execution_count": null,
|
103 |
| - "id": "9ed8c4f0", |
| 103 | + "id": "0cdb9c3d", |
104 | 104 | "metadata": {},
|
105 | 105 | "outputs": [],
|
106 | 106 | "source": [
|
|
118 | 118 | {
|
119 | 119 | "cell_type": "code",
|
120 | 120 | "execution_count": null,
|
121 |
| - "id": "2543c511", |
| 121 | + "id": "5a6fb4f4", |
122 | 122 | "metadata": {},
|
123 | 123 | "outputs": [],
|
124 | 124 | "source": [
|
|
135 | 135 | },
|
136 | 136 | {
|
137 | 137 | "cell_type": "markdown",
|
138 |
| - "id": "717f9102", |
| 138 | + "id": "fbb301ce", |
139 | 139 | "metadata": {},
|
140 | 140 | "source": [
|
141 | 141 | "---"
|
142 | 142 | ]
|
143 | 143 | },
|
144 | 144 | {
|
145 | 145 | "cell_type": "markdown",
|
146 |
| - "id": "16357302", |
| 146 | + "id": "bf6f9299", |
147 | 147 | "metadata": {},
|
148 | 148 | "source": [
|
149 | 149 | "## <span style=\"color:#ff5f27;\"> 🛠️ Feature Engineering </span>\n",
|
|
158 | 158 | {
|
159 | 159 | "cell_type": "code",
|
160 | 160 | "execution_count": null,
|
161 |
| - "id": "0b919bf8", |
| 161 | + "id": "d1af85e4", |
162 | 162 | "metadata": {},
|
163 | 163 | "outputs": [],
|
164 | 164 | "source": [
|
|
181 | 181 | {
|
182 | 182 | "cell_type": "code",
|
183 | 183 | "execution_count": null,
|
184 |
| - "id": "5d6fd9ae", |
| 184 | + "id": "4956cb52", |
185 | 185 | "metadata": {},
|
186 | 186 | "outputs": [],
|
187 | 187 | "source": [
|
|
191 | 191 | },
|
192 | 192 | {
|
193 | 193 | "cell_type": "markdown",
|
194 |
| - "id": "92820fab", |
| 194 | + "id": "bc4e505f", |
195 | 195 | "metadata": {},
|
196 | 196 | "source": [
|
197 | 197 | "Next, you will create features that aggregate data from multiple time steps for each credit card.\n",
|
|
203 | 203 | {
|
204 | 204 | "cell_type": "code",
|
205 | 205 | "execution_count": null,
|
206 |
| - "id": "47116f13", |
| 206 | + "id": "ee689954", |
207 | 207 | "metadata": {},
|
208 | 208 | "outputs": [],
|
209 | 209 | "source": [
|
|
223 | 223 | },
|
224 | 224 | {
|
225 | 225 | "cell_type": "markdown",
|
226 |
| - "id": "0b3abcfc", |
| 226 | + "id": "ed7039bb", |
227 | 227 | "metadata": {},
|
228 | 228 | "source": [
|
229 | 229 | "Next, let's compute windowed aggregates. Here you will use 4-hour windows, but feel free to experiment with different window lengths by setting `window_len` below to a value of your choice."
|
|
232 | 232 | {
|
233 | 233 | "cell_type": "code",
|
234 | 234 | "execution_count": null,
|
235 |
| - "id": "f9bf9dd7", |
| 235 | + "id": "791708c4", |
236 | 236 | "metadata": {},
|
237 | 237 | "outputs": [],
|
238 | 238 | "source": [
|
|
248 | 248 | },
|
249 | 249 | {
|
250 | 250 | "cell_type": "markdown",
|
251 |
| - "id": "b3021fa4", |
| 251 | + "id": "391e478f", |
252 | 252 | "metadata": {},
|
253 | 253 | "source": [
|
254 | 254 | "### <span style=\"color:#ff5f27;\">⚙️ Convert datetime object to Unix epoch in milliseconds </span>"
|
|
257 | 257 | {
|
258 | 258 | "cell_type": "code",
|
259 | 259 | "execution_count": null,
|
260 |
| - "id": "837bafd5", |
| 260 | + "id": "bc6bcae3", |
261 | 261 | "metadata": {},
|
262 | 262 | "outputs": [],
|
263 | 263 | "source": [
|
|
270 | 270 | },
|
271 | 271 | {
|
272 | 272 | "cell_type": "markdown",
|
273 |
| - "id": "a3df26a9", |
| 273 | + "id": "bc3fec23", |
274 | 274 | "metadata": {},
|
275 | 275 | "source": [
|
276 | 276 | "## <span style=\"color:#ff5f27;\">👮🏻‍♂️ Great Expectations </span>"
|
|
279 | 279 | {
|
280 | 280 | "cell_type": "code",
|
281 | 281 | "execution_count": null,
|
282 |
| - "id": "c9391579", |
| 282 | + "id": "cbc3ca3c", |
283 | 283 | "metadata": {},
|
284 | 284 | "outputs": [],
|
285 | 285 | "source": [
|
|
299 | 299 | {
|
300 | 300 | "cell_type": "code",
|
301 | 301 | "execution_count": null,
|
302 |
| - "id": "8969ff27", |
| 302 | + "id": "b3891fe5", |
303 | 303 | "metadata": {},
|
304 | 304 | "outputs": [],
|
305 | 305 | "source": [
|
|
340 | 340 | },
|
341 | 341 | {
|
342 | 342 | "cell_type": "markdown",
|
343 |
| - "id": "adf03efd", |
| 343 | + "id": "da2f174f", |
344 | 344 | "metadata": {},
|
345 | 345 | "source": [
|
346 | 346 | "---"
|
347 | 347 | ]
|
348 | 348 | },
|
349 | 349 | {
|
350 | 350 | "cell_type": "markdown",
|
351 |
| - "id": "7c069f5a", |
| 351 | + "id": "c48bbff9", |
352 | 352 | "metadata": {},
|
353 | 353 | "source": [
|
354 | 354 | "## <span style=\"color:#ff5f27;\"> 📡 Connecting to Hopsworks Feature Store </span>"
|
355 | 355 | ]
|
356 | 356 | },
|
357 | 357 | {
|
358 | 358 | "cell_type": "markdown",
|
359 |
| - "id": "ac32437d", |
| 359 | + "id": "4d68f207", |
360 | 360 | "metadata": {
|
361 | 361 | "tags": []
|
362 | 362 | },
|
|
373 | 373 | {
|
374 | 374 | "cell_type": "code",
|
375 | 375 | "execution_count": null,
|
376 |
| - "id": "287491fa", |
| 376 | + "id": "4c259c35", |
377 | 377 | "metadata": {},
|
378 | 378 | "outputs": [],
|
379 | 379 | "source": [
|
|
386 | 386 | },
|
387 | 387 | {
|
388 | 388 | "cell_type": "markdown",
|
389 |
| - "id": "23b61089", |
| 389 | + "id": "0f80ccc5", |
390 | 390 | "metadata": {},
|
391 | 391 | "source": [
|
392 | 392 | "To create a feature group, you need to give it a name and specify a primary key. It is also good practice to provide a description of the feature group's contents and a version number; if no version is defined, it will automatically be incremented to `1`."
|
|
395 | 395 | {
|
396 | 396 | "cell_type": "code",
|
397 | 397 | "execution_count": null,
|
398 |
| - "id": "58353322", |
| 398 | + "id": "c78c614a", |
399 | 399 | "metadata": {},
|
400 | 400 | "outputs": [],
|
401 | 401 | "source": [
|
|
412 | 412 | },
|
413 | 413 | {
|
414 | 414 | "cell_type": "markdown",
|
415 |
| - "id": "832333fa", |
| 415 | + "id": "45fcf5c3", |
416 | 416 | "metadata": {},
|
417 | 417 | "source": [
|
418 | 418 | "A full list of arguments can be found in the [documentation](https://docs.hopsworks.ai/feature-store-api/latest/generated/api/feature_store_api/#create_feature_group).\n",
|
|
423 | 423 | {
|
424 | 424 | "cell_type": "code",
|
425 | 425 | "execution_count": null,
|
426 |
| - "id": "7f82e4a2", |
| 426 | + "id": "e97b5aab", |
427 | 427 | "metadata": {},
|
428 | 428 | "outputs": [],
|
429 | 429 | "source": [
|
|
435 | 435 | {
|
436 | 436 | "cell_type": "code",
|
437 | 437 | "execution_count": null,
|
438 |
| - "id": "0c6832a9", |
| 438 | + "id": "2f6cc0c0", |
439 | 439 | "metadata": {},
|
440 | 440 | "outputs": [],
|
441 | 441 | "source": [
|
|
462 | 462 | },
|
463 | 463 | {
|
464 | 464 | "cell_type": "markdown",
|
465 |
| - "id": "36f7e30a", |
| 465 | + "id": "a57846b4", |
466 | 466 | "metadata": {},
|
467 | 467 | "source": [
|
468 | 468 | "When the feature group is created, you will be shown a URL that links directly to it; there you can explore some aspects of your newly created feature group.\n",
|
|
472 | 472 | },
|
473 | 473 | {
|
474 | 474 | "cell_type": "markdown",
|
475 |
| - "id": "1250c3ce", |
| 475 | + "id": "777e6d4e", |
476 | 476 | "metadata": {},
|
477 | 477 | "source": [
|
478 | 478 | "You can now do the same for the feature group with the window aggregations."
|
|
481 | 481 | {
|
482 | 482 | "cell_type": "code",
|
483 | 483 | "execution_count": null,
|
484 |
| - "id": "6b8ef9f6", |
| 484 | + "id": "35ca5bbb", |
485 | 485 | "metadata": {},
|
486 | 486 | "outputs": [],
|
487 | 487 | "source": [
|
|
498 | 498 | {
|
499 | 499 | "cell_type": "code",
|
500 | 500 | "execution_count": null,
|
501 |
| - "id": "1970657b", |
| 501 | + "id": "e16aa93e", |
502 | 502 | "metadata": {},
|
503 | 503 | "outputs": [],
|
504 | 504 | "source": [
|
|
510 | 510 | {
|
511 | 511 | "cell_type": "code",
|
512 | 512 | "execution_count": null,
|
513 |
| - "id": "68b0c84f", |
| 513 | + "id": "2d80f5e2", |
514 | 514 | "metadata": {},
|
515 | 515 | "outputs": [],
|
516 | 516 | "source": [
|
|
530 | 530 | },
|
531 | 531 | {
|
532 | 532 | "cell_type": "markdown",
|
533 |
| - "id": "e22dce87", |
| 533 | + "id": "eb2254b9", |
534 | 534 | "metadata": {},
|
535 | 535 | "source": [
|
536 | 536 | "Both feature groups are now accessible and searchable in the UI\n",
|
|
540 | 540 | },
|
541 | 541 | {
|
542 | 542 | "cell_type": "markdown",
|
543 |
| - "id": "4f8b5d8a", |
| 543 | + "id": "c6e783a2", |
544 | 544 | "metadata": {},
|
545 | 545 | "source": [
|
546 | 546 | "## <span style=\"color:#ff5f27;\">⏭️ **Next:** Part 02: Training Pipeline\n",
|
|
555 | 555 | "hash": "e1ddeae6eefc765c17da80d38ea59b893ab18c0c0904077a035ef84cfe367f83"
|
556 | 556 | },
|
557 | 557 | "kernelspec": {
|
558 |
| - "display_name": "Python 3", |
| 558 | + "display_name": "Python 3 (ipykernel)", |
559 | 559 | "language": "python",
|
560 | 560 | "name": "python3"
|
561 | 561 | },
|
|
569 | 569 | "name": "python",
|
570 | 570 | "nbconvert_exporter": "python",
|
571 | 571 | "pygments_lexer": "ipython3",
|
572 |
| - "version": "3.10.11" |
| 572 | + "version": "3.10.13" |
573 | 573 | }
|
574 | 574 | },
|
575 | 575 | "nbformat": 4,
|
|