
fix(go/adbc/driver/snowflake): use one session for connection #2494

Merged
zeroshade merged 4 commits into apache:main from snowflake-sessions on Feb 18, 2025

Conversation

@zeroshade (Member) commented Jan 28, 2025

Instead of creating a separate connection for simple queries like SELECT COUNT and the metadata queries, we'll keep using the single connection to maintain the existing session throughout. This should avoid the problem mentioned in #2128 where a new session cancels the remaining COPY INTO queries. It also simplifies some of the work on metadata queries so that we don't need to explicitly propagate catalog, schema and database names between connections.

Fixes #2517
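In database/sql terms, the idea is roughly the sketch below (illustrative only; the DSN, table name, and function are placeholders, not the driver's actual code). A query issued through the pool can land on a brand-new connection, and therefore a new Snowflake session, while a query issued through a pinned *sql.Conn stays on the session the rest of the work is using.

package example

import (
	"context"
	"database/sql"

	_ "github.com/snowflakedb/gosnowflake" // registers the "snowflake" driver with database/sql
)

// Illustrative only: dsn and the query are placeholders.
func countRows(ctx context.Context, dsn string) (int64, error) {
	db, err := sql.Open("snowflake", dsn)
	if err != nil {
		return 0, err
	}
	defer db.Close()

	// Before: issuing the query through the pool (db.QueryRowContext) may run it
	// on a brand-new connection, i.e. a new Snowflake session with its own
	// current database/schema, and the new session can interfere with queries
	// still running on the original session.
	//
	// After: pin one connection and run everything on it, so all queries share
	// a single session and its catalog/schema state.
	conn, err := db.Conn(ctx)
	if err != nil {
		return 0, err
	}
	defer conn.Close()

	var n int64
	err = conn.QueryRowContext(ctx, "SELECT COUNT(*) FROM my_table").Scan(&n)
	return n, err
}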

github-actions bot modified the milestone: ADBC Libraries 17 (Jan 28, 2025)
@zeroshade (Member Author)

@davlee1972 @Zan-L @Niivii Could all three of you try testing using this PR and confirm that it eliminates the issue you were having with adbc_ingest?

@zeroshade (Member Author)

@davlee1972 you would probably need to pull down this PR and build it yourself to test it. If you need, I can build it and upload the shared lib and wheel.

@Niivii commented Jan 29, 2025

@davlee1972 you would probably need to pull down this PR and build it yourself to test it. If you need, I can build it and upload the shared lib and wheel.

This would really help me at least. I do not have enough knowledge in Go to build everything from scratch 🥲

I pulled the nightly 1.5 easily, but I saw it was 8 days old, so I guess it won't have this change!

@zeroshade (Member Author)

@Niivii here's a wheel I built using my branch. This should be installable and testable (I hope)

pr-wheel.zip

@davlee1972 commented Jan 29, 2025

I installed the 1.5.0dev wheel and COPY INTO operations are still getting truncated.

Three COPY INTO operations had been running for 30 seconds before they were killed by the SELECT COUNT(*) call.

Error messages:

time="2025-01-29T18:19:06-05:00" level=error msg="failed to get response. err: context canceled" func=gosnowflake.postRestfulQueryHelper file="restful.go:290"

time="2025-01-29T18:19:06-05:00" level=error msg="error: context canceled" func="gosnowflake.(*snowflakeConn).queryContextInternal" file="connection.go:413"

time="2025-01-29T18:19:06-05:00" level=error msg="error: context canceled" func="gosnowflake.(*snowflakeConn).queryContextInternal" file="connection.go:413"

time="2025-01-29T18:19:06-05:00" level=error msg="error: context canceled" func="gosnowflake.(*snowflakeConn).queryContextInternal" file="connection.go:413"


dbapi.py:937

return _blocking_call(self._stmt.execute_update, (), {}, self._stmt.cancel)

InternalError: INTERNAL: some files not loaded by COPY command, 96 files remain after 5 retries

@zeroshade (Member Author)

Well, "failed to get response" is at least a different error. Is there any way you can give me a reproducer with synthetic data?

I know that you're testing with internal stuff, but I haven't been able to recreate the truncated copies myself.

@Niivii commented Jan 30, 2025

Hello, this didn't solve the issue; I still get some remaining COPY commands.
We opened a ticket with Snowflake to get more details on the failed query.

This seems to happen more often with large amounts of data. A couple million rows go through without issue.
Several GB in memory triggers it consistently, from what I've experienced.

@zeroshade (Member Author)

@Niivii would you be able to share your reproducer? If you can't share the data, can you generate synthetic data that I would be able to reproduce it with?

@Niivii commented Jan 30, 2025

I will make a small Python project to simulate what is done and generate some dummy data. Not sure I will have the time before the weekend, though. Will let you know.

@zeroshade (Member Author)

@davlee1972 @Niivii Checking in again to see if either of you can provide a reproducer that I can use to debug and test this.

@Niivii commented Feb 10, 2025

Hello @zeroshade,

This should do the trick:

import adbc_driver_snowflake.dbapi
import numpy as np
import polars as pl


def main() -> None:
    big_df = generate_big_dataframe(num_rows=5000000)
    with create_snowflake_connection() as snowflake_connection:
        big_df.write_database(table_name="my_table", connection=snowflake_connection,
                              if_table_exists="append", engine="adbc")


def generate_big_dataframe(num_rows: int) -> pl.DataFrame:
    rng = np.random.default_rng(12345)
    data = {
        "column_int": rng.integers(low=-10000, high=10000, size=num_rows),
        "column_float": rng.random(size=num_rows),
        "column_string": rng.choice(["A", "b", "cdefghijkl", "âozZEzaéXZez"], size=num_rows),
    }
    return pl.DataFrame(data=data)


def create_snowflake_connection() -> adbc_driver_snowflake.dbapi.Connection:
    return adbc_driver_snowflake.dbapi.connect(db_kwargs={
        "username": "myusername",
        "adbc.snowflake.sql.account": "myaccount",
        "adbc.snowflake.sql.auth_type": "auth_ext_browser",
        "adbc.snowflake.sql.warehouse": "mywarehouse",
        "adbc.snowflake.sql.role": "myrole",
        "adbc.snowflake.sql.db": "mydb",
        "adbc.snowflake.sql.schema": "myschema",
    })


if __name__ == "__main__":
    main()

python>=3.13
polars>=1.22.0
adbc-driver-snowflake>=1.4.0
pyarrow>=19.0.0
numpy>=2.2.2

Let me know if you can reproduce.
The problem might come from the 'append' mode with existing data.

@zeroshade (Member Author)

Thanks @Niivii, with the help of that script I was able to narrow down and determine the precise cause of the issue. While this change is needed, that issue will ultimately need another fix (I'm going to start working on it now).

In the meantime, this does provide a good fix for some issues, including #2517. (I've updated the PR description so that it marks the correct issue as fixed.)

@lidavidm Any further comments on this, or do you think it's good to merge as-is while I work on the actual solution for the COPY INTO problem?

@zeroshade (Member Author)

@Niivii @davlee1972 I think I figured it out without needing a larger refactor. Here's a built version that you can try; hopefully it fixes the issue on your end (it fixed my reproducer, at least). Please report back!

adbc_driver_snowflake-1.5.0.tar.gz

@Niivii commented Feb 12, 2025

This didn't solve the issue on my side, sadly 😭
Still getting remaining COPY INTO queries (screenshot attached):
adbc_driver_manager.InternalError: INTERNAL: some files not loaded by COPY command, 1 files remain after 5 retries
Tested with roughly 10M rows (6 columns: 2 numbers, 1 TIMESTAMP_NTZ, 3 doubles).

@Niivii commented Feb 12, 2025

Disclaimer: I am not a Go dev!
I had a quick look at the stopFn handler and there is a timing issue, IMHO.
If executeCopyQuery is slow, it could trigger an early cancel() after the retries.
Could a local, independent retry context help avoid premature cancellation of the main context?
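A rough Go sketch of that suggestion (hypothetical; copyRemainingFiles, the function name, and the timeout are made up, not the driver's real API):

package example

import (
	"context"
	"fmt"
	"time"
)

// Hypothetical sketch: the retry loop gets its own context instead of the main
// ingestion context, so giving up on (or cancelling) the retries cannot cancel
// COPY INTO queries still running under the main context.
func retryRemainingCopies(copyRemainingFiles func(context.Context) (int, error)) error {
	retryCtx, cancelRetry := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancelRetry()

	const maxRetries = 5
	for attempt := 0; attempt < maxRetries; attempt++ {
		remaining, err := copyRemainingFiles(retryCtx)
		if err != nil {
			return err
		}
		if remaining == 0 {
			return nil
		}
	}
	return fmt.Errorf("some files not loaded by COPY command after %d retries", maxRetries)
}

The trade-off in this sketch is that deriving retryCtx from context.Background() means cancelling the parent no longer stops the retries; it is only meant to illustrate isolating the retry cancellation from the main context.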

Also worth mentioning: it is systematically the first COPY INTO query that is slow.
Stats of the first failed query (screenshot attached):

@zeroshade (Member Author)

@Niivii Okay, I think I spotted where the other potential race condition was:

  1. stopFn gets called, leading to it closing the channels
  2. the goroutine that calls executeCopyQuery has already checked that the channel is still open, but hasn't called g.Go yet
  3. stopFn calls g.Wait, which returns immediately because there are no running goroutines yet, so stopFn continues on
  4. g.Go is finally called, kicking off executeCopyQuery, which slips in and runs after stopFn has already moved on

That's my theory as to how it still happened in your test. So I've updated stopFn to make sure it won't call g.Wait until after that goroutine finishes running, i.e. until after it has kicked off any calls to executeCopyQuery, so that g.Wait will definitely wait and we can't end up in between. I've attached a new build that should include this fix; hopefully it finally fixes the problem for you!
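A hypothetical sketch of that ordering, using golang.org/x/sync/errgroup (the copyScheduler type and the executeCopy parameter are invented for illustration; this is not the driver's actual code):

package example

import (
	"context"

	"golang.org/x/sync/errgroup"
)

type copyScheduler struct {
	work         chan string   // staged files waiting for a COPY INTO; placeholder type
	launcherDone chan struct{} // closed once the launcher goroutine exits
	g            *errgroup.Group
}

func newCopyScheduler(ctx context.Context, executeCopy func(context.Context, string) error) *copyScheduler {
	g, gctx := errgroup.WithContext(ctx)
	s := &copyScheduler{
		work:         make(chan string),
		launcherDone: make(chan struct{}),
		g:            g,
	}

	// Launcher goroutine: every executeCopy is started via g.Go before
	// launcherDone is closed, so stopFn can order itself after all g.Go calls.
	go func() {
		defer close(s.launcherDone)
		for f := range s.work {
			f := f
			g.Go(func() error { return executeCopy(gctx, f) })
		}
	}()
	return s
}

// stopFn stops accepting new work, waits for the launcher goroutine to finish
// issuing its g.Go calls, and only then waits on the group. Calling g.Wait
// before the launcher exits could return early while a g.Go call is still
// pending, letting a COPY INTO start after stopFn has already moved on.
func (s *copyScheduler) stopFn() error {
	close(s.work)    // no more files will be scheduled
	<-s.launcherDone // the launcher has issued every g.Go call it will make
	return s.g.Wait()
}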

If not, then I'm probably going to have to rework and refactor the entire process here, which I'm trying to avoid...

adbc_driver_snowflake-1.5.0.tar.gz

adbc_driver_snowflake_manylinux.whl.tar.gz

@lidavidm (Member)

Is there a cleaner architecture we can achieve by refactoring? At some point it feels a bit brittle 😅 (that said I'm happy to merge this for the time being if it does fix everything)

@Niivii commented Feb 13, 2025

(screenshot attached)
🎉 🎉 🎉

@davlee1972 commented Feb 13, 2025

Unfortunately this fix didn't work on my end after installing the manylinux whl file.

The first time, it aborted one active COPY INTO.
The second time, it aborted two active COPY INTOs.

Unfortunately I'm having a corporate issue between my personal GitHub and Copilot GitHub accounts, so here is a camera photo of the error.

PXL_20250213_213714782 NIGHT

The error I'm seeing is "64 files remain after 5 retries".

I'm wondering why we have all these COPY INTO operations with no rows loaded, and if there happen to be 5 of these in a row, will it abort prematurely?

@zeroshade (Member Author)

I'm wondering why we have all these COPY INTO operations with no rows loaded, and if there happen to be 5 of these in a row, will it abort prematurely?

It's the finishCopy part. Once all the uploading has completed, there's a loop that calls executeCopy if not all of the files have been loaded yet. It makes 5 attempts at performing the last copies and then aborts if there are still files that weren't copied. But with the latest changes, it should wait until the existing COPY INTO commands complete before moving on to the final attempts, so I don't know how you're getting the abort in this scenario. I can't see any way in the code that it could end up in the retry loop until the other COPY INTO commands finish...
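Roughly, the flow described above looks like the following sketch (under assumptions; countRemainingFiles and executeCopy are placeholder names, not the driver's actual functions):

package example

import (
	"context"
	"fmt"

	"golang.org/x/sync/errgroup"
)

// Sketch of the described flow: wait for the in-flight COPY INTO queries, then
// make a bounded number of retry attempts for anything still unloaded.
func finishCopy(
	ctx context.Context,
	g *errgroup.Group, // group running the in-flight COPY INTO queries
	countRemainingFiles func(context.Context) (int, error),
	executeCopy func(context.Context) error,
) error {
	// Wait for the in-flight COPY INTO queries first, so the retry loop only
	// sees files that genuinely were not picked up.
	if err := g.Wait(); err != nil {
		return err
	}

	const maxRetries = 5
	for attempt := 0; attempt < maxRetries; attempt++ {
		remaining, err := countRemainingFiles(ctx)
		if err != nil {
			return err
		}
		if remaining == 0 {
			return nil
		}
		// Issue another COPY INTO for whatever is staged but still unloaded.
		if err := executeCopy(ctx); err != nil {
			return err
		}
	}

	remaining, err := countRemainingFiles(ctx)
	if err != nil {
		return err
	}
	if remaining > 0 {
		return fmt.Errorf("some files not loaded by COPY command, %d files remain after %d retries", remaining, maxRetries)
	}
	return nil
}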

Can you give me any more information about the shape of your data? How many rows/columns? I want to see if I can reproduce the failure you're seeing so I can debug how it's happening.

@lidavidm (Member)

Should we merge this for the time being while we continue investigating? It seems it fixes at least a couple of issues, if not all of them.

@zeroshade (Member Author)

@lidavidm I agree, let's merge this to fix the issues it does fix, and I'll continue investigating.

zeroshade merged commit 7706ace into apache:main on Feb 18, 2025
38 of 40 checks passed
@zeroshade (Member Author)

@davlee1972 Please feel free to reply here if you can give me more information (number of rows / columns / batches, etc) about your dataset that can help me to try to reproduce your failure (also, please try to build it fresh locally with the updated changes and see if that helps). In the meantime, I have some ideas about reducing the extraneous copy calls that have 0 rows inserted.

zeroshade deleted the snowflake-sessions branch on February 18, 2025 at 23:29
@Niivii commented Feb 19, 2025

@davlee1972 Please feel free to reply here if you can give me more information (number of rows / columns / batches, etc) about your dataset that can help me to try to reproduce your failure (also, please try to build it fresh locally with the updated changes and see if that helps). In the meantime, I have some ideas about reducing the extraneous copy calls that have 0 rows inserted.

From his photo, you can see the number of inserted rows in Snowflake on the right.
That's roughly 5 times more than on my side, but of course it depends on the data types and number of columns too.
I guess you might be able to replicate it by tweaking my script and adding a 0 to the number of rows, if you have the memory for it on your computer (should be around ~5 GB?):
big_df = generate_big_dataframe(num_rows=50000000)

Otherwise, try to saturate your RAM with the biggest dataframe possible and see how it goes 🤷

@davlee1972

Finally got my dual-login issue with GitHub resolved (for the next 90 days)...

Here's my test code using the NYC Taxi Trip Data.

https://www.kaggle.com/datasets/elemento/nyc-yellow-taxi-trip-data?resource=download

It aborted after coming up with 5 empty COPY INTOs.

import pyarrow.dataset as ds
import adbc_driver_snowflake.dbapi as dbapi

csv_dataset = ds.dataset("archive", format="csv")

# db_kwargs (the Snowflake connection options/credentials) is assumed to be defined elsewhere
db_args = {}
db_args["db_kwargs"] = db_kwargs
db_args["autocommit"] = True

conn = dbapi.connect(**db_args)

cursor = conn.cursor()

cursor.adbc_ingest(
    table_name="taxi_data",
    data=csv_dataset,
    mode="create_append",
)

Output:

Exception ignored in: <function Connection.__del__ at 0x7ff8738f8790>
Traceback (most recent call last):
  File "/home/my_login/    miniconda3/lib/python3.9/site-packages/adbc_driver_manager/dbapi.py", line 352, in __del__
    self.close()
  File "/home/my_login/    miniconda3/lib/python3.9/site-packages/adbc_driver_manager/dbapi.py", line 332, in close
    self._conn.close()
  File "adbc_driver_manager/_lib.pyx", line 1046, in adbc_driver_manager._lib.AdbcConnection.close
  File "adbc_driver_manager/_lib.pyx", line 333, in adbc_driver_manager._lib._AdbcHandle._check_open_children
  File "adbc_driver_manager/_lib.pyx", line 335, in adbc_driver_manager._lib._AdbcHandle._check_open_children
RuntimeError: Cannot close AdbcConnection with open AdbcStatement
Exception ignored in: <function Connection.__del__ at 0x7ff8738f8790>
Traceback (most recent call last):
  File "/home/my_login/    miniconda3/lib/python3.9/site-packages/adbc_driver_manager/dbapi.py", line 352, in __del__
    self.close()
  File "/home/my_login/    miniconda3/lib/python3.9/site-packages/adbc_driver_manager/dbapi.py", line 332, in close
    self._conn.close()
  File "adbc_driver_manager/_lib.pyx", line 1046, in adbc_driver_manager._lib.AdbcConnection.close
  File "adbc_driver_manager/_lib.pyx", line 333, in adbc_driver_manager._lib._AdbcHandle._check_open_children
  File "adbc_driver_manager/_lib.pyx", line 335, in adbc_driver_manager._lib._AdbcHandle._check_open_children
RuntimeError: Cannot close AdbcConnection with open AdbcStatement
Exception ignored in: <function Connection.__del__ at 0x7ff8738f8790>
Traceback (most recent call last):
  File "/home/my_login/    miniconda3/lib/python3.9/site-packages/adbc_driver_manager/dbapi.py", line 352, in __del__
    self.close()
  File "/home/my_login/    miniconda3/lib/python3.9/site-packages/adbc_driver_manager/dbapi.py", line 332, in close
    self._conn.close()
  File "adbc_driver_manager/_lib.pyx", line 1046, in adbc_driver_manager._lib.AdbcConnection.close
  File "adbc_driver_manager/_lib.pyx", line 333, in adbc_driver_manager._lib._AdbcHandle._check_open_children
  File "adbc_driver_manager/_lib.pyx", line 335, in adbc_driver_manager._lib._AdbcHandle._check_open_children
RuntimeError: Cannot close AdbcConnection with open AdbcStatement
time="2025-02-19T15:42:23-05:00" level=error msg="error: context canceled" func="gosnowflake.(*snowflakeConn).queryContextInternal" file="connection.go:413"
Exception ignored in: <function Connection.__del__ at 0x7ff8738f8790>
Traceback (most recent call last):
  File "/home/my_login/    miniconda3/lib/python3.9/site-packages/adbc_driver_manager/dbapi.py", line 352, in __del__
    self.close()
  File "/home/my_login/    miniconda3/lib/python3.9/site-packages/adbc_driver_manager/dbapi.py", line 332, in close
    self._conn.close()
  File "adbc_driver_manager/_lib.pyx", line 1046, in adbc_driver_manager._lib.AdbcConnection.close
  File "adbc_driver_manager/_lib.pyx", line 333, in adbc_driver_manager._lib._AdbcHandle._check_open_children
  File "adbc_driver_manager/_lib.pyx", line 335, in adbc_driver_manager._lib._AdbcHandle._check_open_children
RuntimeError: Cannot close AdbcConnection with open AdbcStatement
---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
Cell In[19], line 14
     10 conn = dbapi.connect(**db_args)
     12 cursor = conn.cursor()
---> 14 cursor.adbc_ingest(
     15     table_name="taxi_data",
     16     data=csv_dataset,
     17     mode="create_append",
     18 )

File ~/miniconda3/lib/python3.9/site-packages/adbc_driver_manager/dbapi.py:937, in Cursor.adbc_ingest(self, table_name, data, mode, catalog_name, db_schema_name, temporary)
    934     self._stmt.bind_stream(handle)
    936 self._last_query = None
--> 937 return _blocking_call(self._stmt.execute_update, (), {}, self._stmt.cancel)

File ~/miniconda3/lib/python3.9/site-packages/adbc_driver_manager/_lib.pyx:1573, in adbc_driver_manager._lib._blocking_call_impl()

File ~/miniconda3/lib/python3.9/site-packages/adbc_driver_manager/_lib.pyx:1566, in adbc_driver_manager._lib._blocking_call_impl()

File ~/miniconda3/lib/python3.9/site-packages/adbc_driver_manager/_lib.pyx:1299, in adbc_driver_manager._lib.AdbcStatement.execute_update()

File ~/miniconda3/lib/python3.9/site-packages/adbc_driver_manager/_lib.pyx:260, in adbc_driver_manager._lib.check_error()

InternalError: INTERNAL: some files not loaded by COPY command, 44 files remain after 5 retries

(screenshots attached)

@zeroshade (Member Author)

Nice, I'll see if I can reproduce with the same data. Hopefully it'll be consistent and I can fix this. Thanks @davlee1972

Successfully merging this pull request may close these issues.

go/adbc/driver/snowflake: failing to get columns when current schema is not set