Profiling of labkey.post #36

Open
juyeongkim opened this issue Aug 7, 2019 · 0 comments

I've noticed that fetching a large table from LabKey via Rlabkey is significantly slower than with other API clients, such as the JavaScript client, so I profiled the labkey.selectRows call for a large table (182,779 rows) from DataSpace.

profvis::profvis(Rlabkey::labkey.selectRows(
  baseUrl = "https://dataspace.cavd.org",
  folderPath = "/CAVD",
  schemaName = "study",
  queryName = "ICS" # 182779 rows
))

[profvis flame graph of the labkey.selectRows call]

As we can see, the actual fetching of data via POST takes only a fraction of the time in the labkey.selectRows call; the majority of the time is spent processing the response (processResponse) and creating a data.frame object (makeDF).

We can break it down into 5 steps (a rough code sketch follows the list):

  1. fetch raw data via POST
  2. parse the JSON (simplifying to a data.frame) into a list to check the status
  3. convert the raw response to text
  4. parse the JSON again (without simplifying to a data.frame) into a list from the text
  5. make a data.frame from the list via C++ code
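
To make the breakdown concrete, here is a minimal sketch of those five steps using plain httr and jsonlite. It is an approximation, not Rlabkey's actual code: the url and body values are hypothetical stand-ins for whatever labkey.selectRows builds internally, and the final do.call(rbind, ...) only stands in for the listToMatrix C++ routine.

library(httr)
library(jsonlite)

url  <- "https://dataspace.cavd.org/query/CAVD/selectRows.api"  # hypothetical endpoint
body <- list(schemaName = "study", queryName = "ICS")           # hypothetical payload

# 1. fetch raw data via POST
resp <- POST(url, body = body, encode = "json")
raw  <- content(resp, as = "raw")

# 2. parse JSON (simplified to a data.frame) just to check the status
status_check <- fromJSON(rawToChar(raw), simplifyDataFrame = TRUE)

# 3. convert the raw response to text
txt <- rawToChar(raw)

# 4. parse the same JSON again, this time without simplification
parsed <- fromJSON(txt, simplifyVector = FALSE)

# 5. build a data.frame from the list of rows
#    (Rlabkey does this step in C++; do.call(rbind, ...) is only a stand-in)
df <- do.call(rbind, lapply(parsed$rows, as.data.frame))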

We can see that there are redundancies in this process (a single-parse alternative is sketched below):

  • We are parsing the JSON twice (steps 2 and 4)
  • We are creating a data.frame twice (steps 2 and 5)
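
Here is a minimal sketch of how the redundancy could be collapsed, assuming the response is parsed only once with simplifyDataFrame = TRUE and that single parse is reused for both the status check and the returned data.frame. fetch_rows is a hypothetical helper and the exception check is illustrative, not Rlabkey's actual error handling.

library(httr)
library(jsonlite)

fetch_rows <- function(url, body) {
  resp   <- POST(url, body = body, encode = "json")
  parsed <- fromJSON(content(resp, as = "text", encoding = "UTF-8"),
                     simplifyDataFrame = TRUE)

  # one parse serves both the status check and the result
  if (!is.null(parsed$exception)) {
    stop(parsed$exception)
  }
  parsed$rows
}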

Another thing to note is that jsonlite::fromJSON(simplifyDataFrame = TRUE) is more efficient at creating a data.frame than Rlabkey:::listToMatrix.
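
A quick, self-contained illustration of that claim on synthetic data; the do.call(rbind, ...) step is only a stand-in for Rlabkey:::listToMatrix, whose C++ interface is not shown here.

library(jsonlite)

n   <- 1e4
txt <- toJSON(list(rows = data.frame(id    = seq_len(n),
                                     value = rnorm(n),
                                     group = sample(letters, n, replace = TRUE))))

# single parse, simplified directly to a data.frame
system.time(
  df1 <- fromJSON(txt, simplifyDataFrame = TRUE)$rows
)

# parse to a plain list, then assemble a data.frame row by row
system.time({
  lst <- fromJSON(txt, simplifyVector = FALSE)$rows
  df2 <- do.call(rbind, lapply(lst, as.data.frame))
})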

Could you please look into this and make changes accordingly?
