Slow Loading of Larger Datasets in Dash AG Grid #375

Open
BalaNagendraReddy opened this issue May 8, 2025 · 1 comment
Labels
performance something is slow

Comments


BalaNagendraReddy commented May 8, 2025

Description: When our Dash application is deployed on an Apache server, loading a Parquet file with 3 million rows into Dash AG Grid takes 10-20 minutes. This severely impacts user experience and usability.

Observed Issue:
The initial page load takes an excessive amount of time.
Caching may not be a viable solution, since calculations need to be performed dynamically based on the user's selection.

Expected Behavior:
The page should load significantly faster even with large datasets.
Parquet file processing should be optimized for smoother interaction.

Steps to Reproduce:
Deploy the Dash application on an Apache server.
Load a Parquet file containing 3 million rows into Dash AG Grid.
Observe page loading time (~10-20 minutes).

from dash import Dash, html
import dash_ag_grid as dag
import polars as pl

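# Read the whole Parquet file into memory.
df = pl.read_parquet("green_tripdata_2019-01.parquet")
# print(df.describe())
# Convert every row to a Python dict; the full list is passed to the grid
# as rowData and therefore serialized and sent to the browser on page load.
rowData = df.to_dicts()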
defaultColDef = {
    "floatingFilter": True,
    "wrapHeaderText": True,
    "autoHeaderHeight": True,
    "flex": 1,
}


app = Dash(__name__)

app.layout = html.Div(
    [
        html.H6("Dash AG Grid"),
        dag.AgGrid(
            rowData=rowData,
            columnDefs=[
                {"field": "VendorID"},
                {"field": "lpep_pickup_datetime"},
                {"field": "lpep_dropoff_datetime"},
                {"field": "total_amount"},
                {"field": "payment_type"},
                {"field": "trip_type"},
            ],
            columnSize="sizeToFit",
            defaultColDef=defaultColDef,
            className="ag-theme-alpine",
            enableEnterpriseModules=True,
        ),
    ]
)

if __name__ == "__main__":
    app.run(debug=True)

Link to download parquet file:
https://d37ci6vzurychx.cloudfront.net/trip-data/green_tripdata_2019-01.parquet

Floating filter is also not working as expected.

Image

In the VendorID column, the filter does not show the VendorID values to filter on.

Any suggestions will be appreciated.

@gvwilson added the labels bug (something broken), P2 (considered for next cycle), and performance (something is slow) on May 8, 2025
BSd3v (Collaborator) commented May 9, 2025

Hello @BalaNagendraReddy,

With that many rows, it is recommended to use the infinite row model, or the server-side row model with AG Grid Enterprise, so that the grid only requests the rows it needs instead of receiving everything up front.
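
As a rough sketch of the infinite row model idea: the grid asks for one block of rows at a time, and the server answers with only that block plus the total row count. In Dash AG Grid this is typically wired up by setting `rowModelType="infinite"` on the grid and writing a callback from the grid's `getRowsRequest` property to its `getRowsResponse` property. The helper name `build_rows_response`, the `fetch` callable, and the demo data below are hypothetical and only illustrate the slicing step; sorting and filtering (which arrive in the same request) are not handled here.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def build_rows_response(
    request: Optional[Dict[str, Any]],
    total_rows: int,
    fetch: Callable[[int, int], List[Row]],
) -> Dict[str, Any]:
    """Answer one block request with only the rows that were asked for.

    `request` is assumed to carry `startRow` and `endRow`, as the grid's
    getRowsRequest does. `fetch(start, length)` is a hypothetical callable
    that returns `length` rows beginning at offset `start`.
    """
    if not request:
        # No block has been requested yet, so send no rows.
        return {"rowData": [], "rowCount": total_rows}

    start = max(0, int(request.get("startRow", 0)))
    end = min(total_rows, int(request.get("endRow", start + 100)))
    length = max(0, end - start)

    rows = fetch(start, length) if length else []
    return {"rowData": rows, "rowCount": total_rows}


# Stand-in data so the sketch runs without the Parquet file.
demo_rows = [{"VendorID": i % 2 + 1, "total_amount": float(i)} for i in range(1000)]


def fetch_demo(start: int, length: int) -> List[Row]:
    return demo_rows[start:start + length]


response = build_rows_response({"startRow": 200, "endRow": 300}, len(demo_rows), fetch_demo)
print(len(response["rowData"]), response["rowCount"])  # prints: 100 1000
```

With the Polars DataFrame from the original post, `fetch` could be something like `lambda start, length: df.slice(start, length).to_dicts()` and `total_rows` could be `df.height`, so that only one block is converted to dicts per request rather than all rows at once.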

Another option is to introduce pagination to help present the data.
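
A minimal sketch of the pagination settings, assuming they are passed through the grid's `dashGridOptions` property. The page size of 100 and the name `grid_kwargs` are arbitrary choices for illustration; the other keys are taken from the example in the original post.

```python
# Grid options that turn on AG Grid's built-in pagination.
pagination_options = {
    "pagination": True,          # show page controls instead of one long scroll
    "paginationPageSize": 100,   # rows per page (arbitrary value for this sketch)
}

# Keyword arguments for dag.AgGrid, reusing settings from the original example.
grid_kwargs = {
    "columnSize": "sizeToFit",
    "className": "ag-theme-alpine",
    "dashGridOptions": pagination_options,
}

# Intended use (requires dash_ag_grid and the objects from the original post):
# dag.AgGrid(rowData=rowData, columnDefs=columnDefs,
#            defaultColDef=defaultColDef, **grid_kwargs)
print(grid_kwargs["dashGridOptions"])
```

Note that with the default client-side row model, pagination only changes how rows are presented; the full `rowData` is still sent to the browser. To reduce the initial load time itself, pagination would likely need to be combined with the infinite or server-side row model.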

@AnnMarieW removed the labels bug (something broken) and P2 (considered for next cycle) on May 9, 2025