Releases: Eventual-Inc/Daft
v0.3.13
Changes
✨ New Features
- [FEAT] Pre Shuffle Merge Strategy @colin-ho (#3191)
- [FEAT] Minimal indices dtype for FixedShapeSparseTensors @sagiahrac (#3149)
- [FEAT] implement range operation and data streaming @andrewgazelka (#3267)
- [FEAT] Support intersect as a DataFrame API @advancedxy (#3134)
- [FEAT] Adds a read_generator method that reads tables from a generator @colin-ho (#3258)
- [FEAT] Add initial Spark Connect support @andrewgazelka (#3261)
👾 Bug Fixes
- [BUG] Always run telemetry codepath @jaychia (#3275)
- [BUG] Cleanup context side-effects @jaychia (#3270)
- [BUG]: bad merge from intersect PR @universalmind303 (#3273)
📖 Documentation
- [FEAT] Minimal indices dtype for FixedShapeSparseTensors @sagiahrac (#3149)
- [DOCS] Use an absolute path for the canonical link @desmondcheongzx (#3271)
- [DOCS] Set canonical link @desmondcheongzx (#3269)
🧰 Maintenance
- [CHORE] Allow manual launch of release-drafter.yml @jaychia (#3283)
- [CHORE] add daftrunner env to install and test step @colin-ho (#3279)
- [CHORE] Remove the concept of runner configs @jaychia (#3276)
- [CHORE] Expose read_sql partition bound strategy and default to min-max @colin-ho (#3246)
- [CHORE]: remove daft-table dependency from daft-logical-plan @universalmind303 (#3265)
- [CHORE] Remove daft-scan dependency from planning crates @kevinzwang (#3250)
v0.3.12
Changes
✨ New Features
- [FEAT]: Sql joins with duplicate cols @universalmind303 (#3241)
- [FEAT] Add tracing for runner @jaychia (#3113)
- [FEAT] add spark-connect protocol @andrewgazelka (#3189)
🚀 Performance Improvements
- [PERF] Harden GCP Retries @samster25 (#3253)
👾 Bug Fixes
- [BUG]: orderby with aggs @universalmind303 (#3190)
📖 Documentation
- [DOCS] Fix typo in write_parquet's parameters @desmondcheongzx (#3252)
- [DOCS] Changing docs for UDF @jaychia (#2880)
🧰 Maintenance
- [CHORE] implement mean and stddev for decimal @samster25 (#3159)
v0.3.11
Changes
✨ New Features
- [FEAT] Native Runner @colin-ho (#3178)
- [FEAT]: sql "extract" temporal function @universalmind303 (#3188)
🚀 Performance Improvements
- [PERF] Remove upfront buffer allocations for local CSV reader @desmondcheongzx (#3242)
📖 Documentation
- [FEAT] Native Runner @colin-ho (#3178)
- [DOCS] Update Iceberg roadmap on docs @kevinzwang (#3240)
- [FEAT]: sql "extract" temporal function @universalmind303 (#3188)
🧰 Maintenance
- [CHORE] Fix flaky test in test_decimal_to_decimal_cast @advancedxy (#3243)
- [CHORE] Split logical and physical plans into separate crates @kevinzwang (#3239)
v0.3.10
Changes
✨ New Features
- [FEAT] Overwrite mode for write parquet/csv @colin-ho (#3108)
- [FEAT] Support null equal safe join in SQL @advancedxy (#3166)
- [FEAT] Streaming Catalog Writes @colin-ho (#3160)
- [FEAT] Infer Azure storage account from uri @kevinzwang (#3165)
- [FEAT] Support null safe equal in joins @advancedxy (#3161)
- [FEAT] Support hive partitioned reads @desmondcheongzx (#3029)
- [FEAT] Add better detection of Ray Job environment @jaychia (#3148)
- [FEAT] Streaming physical writes for native executor @colin-ho (#2992)
- [FEAT]: Throw error for invalid ** usage outside folder segments (e.g. /tmp/**.csv) @conradsoon (#3100)
- [FEAT]: sql concat and stddev @universalmind303 (#3153)
- [FEAT]: Sql common table expressions (CTE's) @universalmind303 (#3137)
- [FEAT] enable decimal between @samster25 (#3154)
- [FEAT] dec128 math @samster25 (#3143)
- [FEAT] Support SQL INTERVAL @austin362667 (#3146)
- [FEAT] Swordfish Stateful UDF support @kevinzwang (#3127)
- [FEAT]: sql cross join @universalmind303 (#3110)
- [FEAT] Add floor division @ConeyLiu (#3064)
- [FEAT] Compute pool for native executor @colin-ho (#2986)
🚀 Performance Improvements
- [PERF] Add a parallel local CSV reader @desmondcheongzx (#3055)
👾 Bug Fixes
- [BUG]: Sql groupby and orderby with aliases and projections @universalmind303 (#3177)
- [BUG] Separate PartitionTask done from results @jaychia (#3155)
- [BUG]: between panic on unsupported types @universalmind303 (#3150)
- [BUG] fix type widening for rem @samster25 (#3131)
📖 Documentation
- Temporal docs added to expressions.rst @sunaysanghani (#2487)
- [DOCS] Update banner on README.rst @ccmao1130 (#3130)
- [DOCS] Update Daft logo @ccmao1130 (#3129)
🧰 Maintenance
- [CHORE] Add tests for decimal casting @desmondcheongzx (#3179)
- [CHORE] Refactor RayRunner so that we can add tracing @jaychia (#3163)
- [CHORE] Swordfish specific test fixtures @colin-ho (#3164)
- [CHORE]: tpc-ds datagen @universalmind303 (#3103)
- [CHORE] Cancel tasks spawned on compute runtime @colin-ho (#3128)
- [CHORE] Enable debug in test profile @advancedxy (#3135)
- [FEATURE] add min_hash alternate hashers @andrewgazelka (#3052)
- [CHORE] (Revert:) Add rust cache to s3 build artifacts action @jaychia (#3147)
- [CHORE] Add rust cache to s3 build artifacts action @jaychia (#3144)
- [CHORE] Refactor shuffles to use a unified ShuffleExchange PhysicalPlan variant @jaychia (#3083)
⬆️ Dependencies
- Bump orjson from 3.9.5 to 3.10.11 @dependabot (#3176)
- Bump adlfs from 2023.10.0 to 2024.7.0 @dependabot (#2547)
- Bump image from 0.24.9 to 0.25.4 @dependabot (#3088)
- Bump slackapi/slack-github-action from 1.26.0 to 1.27.0 @dependabot (#2776)
v0.3.9
Changes
✨ New Features
- [FEAT]: sql IN operator @universalmind303 (#3086)
- [FEAT] Enable explode for swordfish @colin-ho (#3077)
- [FEAT]: add sql DISTINCT @universalmind303 (#3087)
- [FEAT] Enable concat for swordfish @colin-ho (#2976)
- [FEAT] Enable unpivot for swordfish @colin-ho (#3078)
- [FEAT] Outer joins for native executor @colin-ho (#2860)
- [FEAT] Enable pivot for swordfish @colin-ho (#3081)
- [FEAT] Enable sample for swordfish @colin-ho (#3079)
- [FEAT] Add stateful actor context and set CUDA_VISIBLE_DEVICES @kevinzwang (#3002)
- [FEAT]: sql tbl alias, and compound ident for joins @universalmind303 (#3066)
- [FEAT]: sql between @universalmind303 (#3062)
- [FEAT]: Interval dtype @universalmind303 (#3018)
- [FEAT] Enable to_json_string for physical plan @colin-ho (#3023)
- [FEAT]: Daft support for Azure storage for Unity Catalog daft.read_deltalake @anilmenon14 (#3025)
- [FEAT] Iceberg MOR for streaming parquet @colin-ho (#2975)
- [FEAT] Include file paths as column from read_parquet/csv/json @colin-ho (#2953)
🚀 Performance Improvements
- [PERF] Remove stateful actor child materialization limit @kevinzwang (#3099)
👾 Bug Fixes
- [BUG] Bump up max_header_size @raunakab (#3068)
- [BUG] Autodetect AWS region during deltalake scan @kevinzwang (#3104)
- [BUG] Add over clause in read_sql percentile reads @colin-ho (#3094)
- [BUG] Disable Linux SSL CERT override @samster25 (#3098)
- [BUG] Fix into_partitions to use a more naive approach without materialization @jaychia (#3080)
- [BUG] Fix actor pool initialization in ray client mode @kevinzwang (#3028)
- [BUG]: joins with duplicate column names and qualified table expansion @universalmind303 (#3074)
- [BUG]: sql functions case sensitivity @universalmind303 (#3063)
- [BUG] Fix write_deltalake add action file path prefix @kevinzwang (#3053)
- [BUG] Fix intersection checking when unioning schemas @desmondcheongzx (#3039)
- [BUG] Sampling without replacement not working @colin-ho (#3035)
🧰 Maintenance
- [CHORE]: replace the .venv value with global variable VENV @mohamedrezk122 (#3084)
- [CHORE] Enable lancedb reads for native executor @colin-ho (#2925)
- [CHORE] Auto attach LLDB debugger to python #2940 @sagiahrac (#3020)
- [CHORE] Rename config.yaml to config.yml @samster25 (#3045)
- [CHORE] add config.yaml for issues @samster25 (#3044)
- [CHORE] validation on dropdown @samster25 (#3043)
- [CHORE] preserve quotes in yaml @samster25 (#3042)
- [CHORE] Checkbox for contribution @samster25 (#3041)
- [CHORE] update feature request @samster25 (#3040)
v0.3.8
Changes
👾 Bug Fixes
📖 Documentation
- [DOCUMENTATION] add value counts to rst @andrewgazelka (#3032)
🧰 Maintenance
- [DOCUMENTATION] add value counts to rst @andrewgazelka (#3032)
v0.3.7
v0.3.6
Changes
✨ New Features
- [FEAT] Implement standard deviation @raunakab (#3005)
- [FEAT] Add time travel to read_deltalake @kevinzwang (#3022)
- [FEAT] agg_list support for list and struct types @kevinzwang (#3019)
- [FEAT] Cast SparseTensor and FixedShapeSparseTensor to Python @sagiahrac (#3010)
- [FEAT] add list.value_counts() @andrewgazelka (#2902)
- [FEAT] Infer timedelta literal as duration @colin-ho (#3011)
- [DOCS] Naming consistency of length functions @vicky1999 (#2942)
👾 Bug Fixes
- [BUG] Pass parquet2 io errors correctly into arrow2 @desmondcheongzx (#3012)
- [BUG] Fix actor pool project splitting when column is not renamed @kevinzwang (#2998)
- [BUG] Add resources to Ray stateful UDF actor @kevinzwang (#2987)
- [BUG] Fix join errors with same key name joins (resolves #2649) @anmolsingh20 (#2877)
- [BUG]: error messages for add @universalmind303 (#2990)
📖 Documentation
- [FEAT] Implement standard deviation @raunakab (#3005)
- [DOC] fix link in doc @amitschang (#2944)
- [DOCS] Update readme to use python syntax highlighting @jaychia (#3006)
- [DOCS] Naming consistency of length functions @vicky1999 (#2942)
- [DOCS] Update readme to correctly reflect new messaging @jaychia (#3001)
🧰 Maintenance
- [CHORE] add/fix many clippy lints @andrewgazelka (#2978)
v0.3.5
Changes
✨ New Features
- [FEAT]: sql read_deltalake function @universalmind303 (#2974)
- [FEAT]: SQL add hash and minhash @universalmind303 (#2948)
- [FEAT] Enable init args for stateful UDFs @kevinzwang (#2956)
👾 Bug Fixes
- [BUG]: add count_matches and fix a bunch of str functions @universalmind303 (#2946)
- [BUG] Writes from empty partitions should return empty micropartitions with non-null schema @colin-ho (#2952)
- [CHORE] Enable test_creation and test_parquet for native executor @colin-ho (#2672)
- [BUG] improve error reporting for multistatement sql @amitschang (#2916)
- [BUG]: sql nested and wildcard @universalmind303 (#2937)
- [BUG] Enable groupby with alias for native executor @colin-ho (#2917)
- [BUG] Use dashes for machete dependency ignores @colin-ho (#2919)
📖 Documentation
- [DOCS] Fix docs to add SQL capabilities @jaychia (#2931)
- [DOCS] update arch png @samster25 (#2970)
- [DOCS] Add docs on to_arrow and as_arrow @samster25 (#2965)
- [DOCS]: add a helper function to list all sql functions @universalmind303 (#2943)
- [CHORE] Additional fixes for nightly tests @kevinzwang (#2936)
- [CHORE] Fix issues from nightly tests @kevinzwang (#2926)
🧰 Maintenance
- [CHORE] ignore 45e2944 @andrewgazelka (#2979)
- [CHORE] Enable test_creation and test_parquet for native executor @colin-ho (#2672)
- [CHORE] pin cargo machete to 0.7.0 @andrewgazelka (#2920)
- [CHORE] Refactor Binary Ops @samster25 (#2876)
- [CHORE] add pytest to vscode settings.json @andrewgazelka (#2930)
- [CHORE] Additional fixes for nightly tests @kevinzwang (#2936)
- [CHORE] update GH template name from md to yml @samster25 (#2934)
- [CHORE] update GH bug template @samster25 (#2932)
- [CHORE] Fix issues from nightly tests @kevinzwang (#2926)
- [CHORE] Enable sources to return empty tables @colin-ho (#2915)
v0.3.4
Changes
✨ New Features
- [FEAT] agg_concat doesn't work on strings @vicky1999 (#2847)
- [FEAT] Add ability for RayRunner to run actor pool projects (beta feature) @jaychia (#2881)
- [FEAT]: [SQL] struct subscript and json_query @universalmind303 (#2891)
- [FEAT] UTF8 to binary coercion flag @raunakab (#2893)
- [FEAT] Delta Lake partitioned writing @kevinzwang (#2884)
- [FEAT]: add partitioning_* functions to sql @universalmind303 (#2869)
- [FEAT]: add sql support for "DATE <date>" and "DATETIME <datetime>" @universalmind303 (#2870)
- [FEAT] Add Sparse Tensor logical type @michaelvay (#2722)
- [FEAT] [SQL] Enable SQL query to run on callers scoped variables @amitschang (#2864)
- Revert "[FEAT]: shuffle_join_default_partitions param" @jaychia (#2873)
- [FEAT] Iceberg partitioned writes @kevinzwang (#2842)
- [FEAT]: SQL temporal functions @universalmind303 (#2858)
- [FEAT]: sql list operations @universalmind303 (#2856)
- [FEAT]: shuffle_join_default_partitions param @universalmind303 (#2844)
- [FEAT] Add left/right/anti/semi joins to native executor @colin-ho (#2743)
🚀 Performance Improvements
- [PERF] Lazily import heavy modules to speed up import times @desmondcheongzx (#2826)
👾 Bug Fixes
- [BUG] Fix display for decimal types @raunakab (#2909)
- [BUG] Fix partitioning SQL scans on empty tables @desmondcheongzx (#2885)
- [BUG] Fix concat expression typing @colin-ho (#2868)
🧰 Maintenance
- [CHORE] Classify throttle and internal errors as Retryable in Python @samster25 (#2914)
- [CHORE] auto-fix prefer Self over explicit type @andrewgazelka (#2908)
- [CHORE]: bump sqlparser version @universalmind303 (#2886)
- [CHORE]: Move daft.sql.sql module to daft.sql @universalmind303 (#2907)
- [CHORE] ignore vendored crates for codecov @samster25 (#2895)
- [CHORE]: move numeric out of daft-dsl and into daft-functions @universalmind303 (#2857)
- [CHORE] Update documentation for config variables @jaychia (#2874)
- [CHORE] Move codspeed interactive tests to local files @samster25 (#2872)
- [CHORE]: move list functions from daft-dsl to daft-functions @universalmind303 (#2854)
- [CHORE] Change TPC-H q4 and q22 answers to use new join types @kevinzwang (#2756)
- [CHORE] Add native executor to CI @colin-ho (#2855)
⬆️ Dependencies
- Bump astral-sh/setup-uv from 2 to 3 @dependabot (#2888)
- Bump isbang/compose-action from 2.0.0 to 2.0.2 @dependabot (#2887)