Releases: Eventual-Inc/Daft
v0.2.17
Changes
✨ New Features
- [FEAT] Add str.reverse() function @nsalerni (#1957)
- [FEAT] Add str.lower() function @nsalerni (#1938)
- [FEAT] MapArray @colin-ho (#1959)
- [FEAT] any_value groupby aggregation @kevinzwang (#1941)
- [FEAT] adding floor function @chandbud5 (#1960)
- [FEAT] Expose coerce_int96_timestamp_unit flag on top level daft.read_parquet call @samster25 (#1936)
- [FEAT] Time Array @colin-ho (#1892)
- [FEAT] Add str.lstrip() and str.rstrip() functions @nsalerni (#1944)
- [FEAT] Add str.upper() function @nsalerni (#1942)
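As background for the coerce_int96_timestamp_unit entry above, here is a minimal sketch of what coercing a legacy Parquet INT96 timestamp to a coarser unit involves. It assumes the conventional INT96 layout (8 little-endian bytes of nanoseconds within the day, followed by a 4-byte Julian day number). The helper name `int96_to_timestamp` and the unit strings are hypothetical, and this is not Daft's implementation.

```python
import struct

# Julian day number of the Unix epoch (1970-01-01).
JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588
NANOS_PER_DAY = 86_400 * 10**9
# Nanoseconds per tick for each unit (unit names are an assumption).
NANOS_PER_UNIT = {"s": 10**9, "ms": 10**6, "us": 10**3, "ns": 1}


def int96_to_timestamp(raw: bytes, unit: str = "ns") -> int:
    """Decode a 12-byte INT96 value into ticks since the Unix epoch (hypothetical helper)."""
    if len(raw) != 12:
        raise ValueError("INT96 values are exactly 12 bytes")
    # Assumed layout: int64 nanoseconds within the day, then int32 Julian day.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    total_nanos = (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day
    # Floor division drops sub-unit precision when coarsening.
    return total_nanos // NANOS_PER_UNIT[unit]


one_day_after_epoch = struct.pack("<qi", 0, JULIAN_DAY_OF_UNIX_EPOCH + 1)
print(int96_to_timestamp(one_day_after_epoch, unit="ms"))
```

A coarser unit matters because a 64-bit count of nanoseconds only covers roughly the years 1677 to 2262, so dates outside that range need microseconds or milliseconds to fit.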
🚀 Performance Improvements
- [PERF] scan task in memory estimate @samster25 (#1901)
- [PERF] Spread scan tasks over Ray cluster. @clarkzinzow (#1950)
📖 Documentation
- [DOCS] [Delta Lake] Add user guide for Delta Lake reads. @clarkzinzow (#1969)
- [Catalogs] [Delta Lake] Add initial support for reading from Delta Lake. @clarkzinzow (#1879)
- [DOCS] Fix notebooks by falling back on null for URL downloads @jaychia (#1951)
- [DOCS] Add documentation for using and developing Daft on Ray @kevinzwang (#1896)
- [DOCS] Update schema hints documentation @jaychia (#1935)
🧰 Maintenance
v0.2.16
Changes
✨ New Features
- [FEAT] perform head operation instead of list when given a file without regex or / @samster25 (#1891)
🚀 Performance Improvements
- [PERF] Parallel glob @samster25 (#1897)
v0.2.15
Changes
👾 Bug Fixes
- [BUG] don't create dirs if non local fs @samster25 (#1888)
- [BUG] Fix Ray autoscaling from zero worker CPUs @kevinzwang (#1884)
- [BUG] Attempt to skip IMDS if region or credentials are provided @samster25 (#1886)
- [BUG] [Query Planner] Properly track ascending/descending sort order for range partitioning and sorting. @clarkzinzow (#1862)
- [BUG] Fix bug with merge tasks that allows for tasks larger than max size allowed @samster25 (#1882)
📖 Documentation
🧰 Maintenance
v0.2.14
Changes
✨ New Features
- [FEAT] Add ceil function @NormallyGaussian (#1867)
- [FEAT] show full schema on request @samster25 (#1868)
- [FEAT] Enable Requester Pay for S3 reads @colin-ho (#1856)
- [FEAT] _add_monotonically_increasing_id method for Dataframe @colin-ho (#1827)
🚀 Performance Improvements
👾 Bug Fixes
- [BUG] Protect Global Context With Mutex @samster25 (#1857)
- [BUG] Schema hints not working properly for json reads @colin-ho (#1845)
📖 Documentation
- [DOCS] Change show_optimized kwarg to show_all @jaychia (#1874)
- [DOCS] Drop use of "Complex Data" in favor of multimodal @samster25 (#1875)
- [DOCS] Add docs for AWS S3 IO @colin-ho (#1855)
🧰 Maintenance
v0.2.13
Changes
✨ New Features
- [FEAT] Add group_by.map_groups @colin-ho (#1825)
- [FEAT] [Join Optimizations] Add sort-merge join. @clarkzinzow (#1755)
- [FEAT] is_in expression @colin-ho (#1811)
- [FEAT] Dataframe __contains__ magic method @colin-ho (#1817)
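For readers unfamiliar with the sort-merge join named in the list above, the general technique can be sketched in plain Python. This illustrates the textbook algorithm on in-memory lists only. It is not Daft's implementation, and the function name `sort_merge_join` is hypothetical.

```python
from typing import Any, Iterator


def sort_merge_join(
    left: list[tuple[Any, Any]], right: list[tuple[Any, Any]]
) -> Iterator[tuple[Any, Any, Any]]:
    """Inner-join two lists of (key, value) pairs, yielding (key, left_value, right_value)."""
    # Sort phase: order both inputs by join key.
    left = sorted(left, key=lambda kv: kv[0])
    right = sorted(right, key=lambda kv: kv[0])
    i = j = 0
    # Merge phase: advance whichever side has the smaller key.
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][0], right[j][0]
        if lkey < rkey:
            i += 1
        elif lkey > rkey:
            j += 1
        else:
            # Find the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lkey:
                j_end += 1
            # Pair every left row in this key run with every right row in the run.
            while i < len(left) and left[i][0] == lkey:
                for k in range(j, j_end):
                    yield lkey, left[i][1], right[k][1]
                i += 1
            j = j_end


joined = list(sort_merge_join([(2, "b"), (1, "a"), (2, "c")], [(2, "x"), (3, "y"), (2, "z")]))
print(joined)
```

Once both sides are sorted, the merge is a single forward pass, which is why this strategy tends to suit inputs that are already sorted or range-partitioned on the join key.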
🚀 Performance Improvements
- [PERF] Split parquet scan tasks into individual row groups @kevinzwang (#1799)
👾 Bug Fixes
- [BUG] Scan Operator Fix + Physical Plan Scan Task Summary @samster25 (#1850)
- [BUG] [Parquet] Fix double-await on JoinHandles concurrency bug in Parquet reader. @clarkzinzow (#1841)
- [BUG] Incorrect expression naming for struct get @kevinzwang (#1832)
- [BUG] Fix empty struct fields @colin-ho (#1833)
- [BUG] Fix for Iceberg schema projection @jaychia (#1815)
📖 Documentation
- [DOCS] Add docs for Azure IO @jaychia (#1851)
- [Query Planner] Add physical plan visualization option to df.explain(); implement TreeVisitor for LogicalPlan and PhysicalPlan. @clarkzinzow (#1836)
- [DOCS] Add type conversions between iceberg and daft @jaychia (#1835)
- [DOCS] Add dedicated Iceberg page @jaychia (#1830)
- [DOCS] Refactor expressions docs layout @jaychia (#1816)
- [CHORE] Add is_in to docs @colin-ho (#1819)
🧰 Maintenance
v0.2.12
Changes
👾 Bug Fixes
- [BUG] bugfix for empty partitions when writing out empty partitions @samster25 (#1814)
v0.2.11
Changes
✨ New Features
- [FEAT] Support Hive-Style Partitioned Writes for Tabular Writes @samster25 (#1794)
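To illustrate what a Hive-style partitioned layout looks like, the sketch below groups rows into key=value directories. It shows the general convention only. The function name `hive_partition_paths`, the percent-encoding of values, and the dropping of partition columns from the stored rows are assumptions made for illustration, not a description of how Daft writes files.

```python
from collections import defaultdict
from urllib.parse import quote


def hive_partition_paths(
    rows: list[dict], partition_cols: list[str], root: str
) -> dict[str, list[dict]]:
    """Group rows by partition values and map each group to a key=value directory."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        # One directory level per partition column, in the order given.
        parts = [f"{col}={quote(str(row[col]), safe='')}" for col in partition_cols]
        directory = "/".join([root.rstrip("/"), *parts])
        # Partition values are encoded in the path, so drop them from the stored row.
        groups[directory].append({k: v for k, v in row.items() if k not in partition_cols})
    return dict(groups)


rows = [
    {"year": 2024, "country": "US", "v": 1},
    {"year": 2024, "country": "US", "v": 2},
    {"year": 2023, "country": "DE", "v": 3},
]
layout = hive_partition_paths(rows, ["year", "country"], "out")
for directory, members in layout.items():
    print(directory, members)
```

Readers that understand this layout can skip whole directories when a filter references a partition column.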
👾 Bug Fixes
- [BUG] Fix scheduler deadlock on concurrent broadcast joins. @clarkzinzow (#1812)
- [BUG] Fix type annotation on UDF @jaychia (#1807)
- [BUG] Materialize Dataframes created from file writes @colin-ho (#1785)
- [BUG] Materialize Dataframes created from in-memory data @colin-ho (#1780)
📖 Documentation
- [DOCS] Add warning during repartition to use into_partitions instead @jaychia (#1808)
- [BUG] Fix type annotation on UDF @jaychia (#1807)
- [DOCS] Update README.rst to remove beta disclaimer @jaychia (#1802)
- [CHORE] Update docs to reflect materialized Dataframes from writes and in-memory reads @colin-ho (#1795)
- [DOCS] Upgrade version of docs sphinx-book-theme dependency @jaychia (#1789)
- [DOCS] Fix notebooks to use new public parquet file @jaychia (#1788)
- [DOCS] Fix docs build for sphinxcontrib-applehelp versioning @jaychia (#1787)
- [DOCS] Update README.rst for broken links @jaychia (#1786)
- [CHORE] Update tutorials to use released version of Daft @jaychia (#1751)
🧰 Maintenance
- [CHORE] Update docs to reflect materialized Dataframes from writes and in-memory reads @colin-ho (#1795)
- [CHORE] Update tutorials to use released version of Daft @jaychia (#1751)
⬆️ Dependencies
- Bump actions/cache from 3 to 4 @dependabot (#1805)
v0.2.10
Changes
✨ New Features
- [FEAT] Add getter for Struct and List expressions @kevinzwang (#1775)
- [FEAT] Iceberg Murmur3 Hash function @samster25 (#1778)
- [FEAT] Not_Null Expression @colin-ho (#1777)
- [FEAT] Add sample function for Dataframe @colin-ho (#1770)
🚀 Performance Improvements
- [PERF] Iceberg Truncate Transform @samster25 (#1783)
- [PERF] Iceberg Hash Bucket Transform @samster25 (#1779)
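As a reference point for the Iceberg Truncate Transform entry above, the Iceberg table specification defines truncation of an integer `v` to width `W` as `v - (v % W)` with a non-negative remainder, and truncation of a string as keeping at most `W` leading characters. A minimal sketch of those two rules follows. The function name `iceberg_truncate` is hypothetical, other types covered by the spec are omitted, and this is not Daft's implementation.

```python
from typing import Union


def iceberg_truncate(value: Union[int, str], width: int) -> Union[int, str]:
    """Apply the truncate partition transform to an int or a string."""
    if width <= 0:
        raise ValueError("width must be positive")
    if isinstance(value, int):
        # Python's % is non-negative for a positive width, so negatives round down.
        return value - (value % width)
    # Strings keep at most `width` leading characters (code points).
    return value[:width]


print(iceberg_truncate(1, 10), iceberg_truncate(-1, 10), iceberg_truncate("iceberg", 3))
```

Because every value in the same width-sized range maps to the same result, a table partitioned by this transform lets a reader rule out partitions whose truncated value cannot match a filter.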
👾 Bug Fixes
- [BUG] Invalidate PartitionSpec when we run Explode on it @samster25 (#1772)
📖 Documentation
- [CHORE] Add sample to docs @colin-ho (#1781)
- [CHORE] Add not_null to docs @colin-ho (#1782)
- [FEAT] Add getter for Struct and List expressions @kevinzwang (#1775)
- [DOCS] Fix broken links on readme @jaychia (#1774)
- [DOCS] Add documentation for read_iceberg @jaychia (#1769)
- [DOCS] Documentation reorganization @jaychia (#1762)
🧰 Maintenance
v0.2.9
v0.2.8
Changes
✨ New Features
- [PERF] Iceberg Partition Pruning @samster25 (#1688)
- [FEAT] annotate ray tasks with name of instructions @samster25 (#1729)
🚀 Performance Improvements
- [PERF] Iceberg Partition Pruning @samster25 (#1688)
- [PERF] Speed up CSV Reader with SIMD and reduced allocations @samster25 (#1749)
- [PERF] Greatly speed up Variable Length Concat @samster25 (#1748)
- [PERF] Predicate Pushdown into Scan Operator @samster25 (#1730)
- [PERF] Json Predicate Pushdown while reading @samster25 (#1727)
- [PERF] Predicate Pushdown for CSV Reader @samster25 (#1724)
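The predicate pushdown entries above share one idea: apply the filter while reading instead of after loading everything. A minimal sketch of that idea using only Python's standard library follows. The function name `read_csv_with_pushdown` is hypothetical, and Daft's readers are implemented differently.

```python
import csv
import io
from typing import Callable, Iterator


def read_csv_with_pushdown(
    source: io.TextIOBase, predicate: Callable[[dict], bool]
) -> Iterator[dict]:
    """Yield only the rows that satisfy `predicate`, filtering during the read."""
    for row in csv.DictReader(source):
        # Rejected rows are dropped here and never materialized downstream.
        if predicate(row):
            yield row


data = io.StringIO("id,score\n1,10\n2,95\n3,70\n")
kept = list(read_csv_with_pushdown(data, lambda row: int(row["score"]) >= 70))
print(kept)
```

Filtering at the source keeps peak memory proportional to the rows that survive the predicate, not to the size of the file.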
👾 Bug Fixes
- [BUG] Concat Fix when Variable Length Array is sliced @samster25 (#1750)
- [BUG] bugfix when cluster has no workers and key error happens when fetching num cores @samster25 (#1745)
- [BUG] Fix comparing date and timestamps @samster25 (#1735)
- [BUG] Apply the default IOConfig in daft.from_glob_path @jaychia (#1731)
- [BUG] [Hotfix] Fix limit pushdown test. @clarkzinzow (#1728)
📖 Documentation
- Revert "[DOCS] Add proper robots.txt and sitemap.xml to index only latest and stable" @jaychia (#1753)
- [DOCS] Add proper robots.txt and sitemap.xml to index only latest and stable @jaychia (#1752)
- [DOCS] Add documentation on memory @jaychia (#1736)
- [DOCS] Add anonymous io_config for notebook @jaychia (#1721)
🧰 Maintenance
- [CHORE] kernel override for notebook checker @samster25 (#1746)
- [CHORE] Clean up Repr for GlobScanOperator and Explain @samster25 (#1734)
- [CHORE] Generate S3 manifests @samster25 (#1732)
- [CHORE] update dev version to 0.2.0 dev @samster25 (#1723)