Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix record history export #4

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

jasper-lyons
Copy link
Contributor

No description provided.

Previously we were erroneously testing to see if a value was a valid ruby key. This returned true even when the value was not a valid ruby key e.g. 

```
'A's Fake Name': {
```

```
Ghtc_Hyuik_1[1]_160x160: {
```
I've change this behavior so that all keys are wrapped in single quotes and, if any single quotes exist in the key string they are escaped e.g.

```
'A\'s Fake Name': {
```

```
'Ghtc_Hyuik_1[1]_160x160': {
```
def key_to_ruby(key)
valid_key?(key) ? key : "'#{key}'"
"'#{ key.gsub(/'/, "\\'") }'"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you receive an integer you will get an exception. so you can keep the validation and add your escape to those that are not valid:

def key_to_ruby(key)
  return key if valid_key?(key)

  "'#{ key to_s.gsub(/'/, "\\'") }'"
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't want to rely on any valid_key? method for this, everything we're being passed should be a valid key given that it comes from an existing ruby key: value pair afaik.

I would instead just coerce everything to string before applying the .gsub.

On some installations we can see upwards of 5GB of attachment data. When we try and load all of this into rails at once, depending on the available machine memory, we can see OOM exceptions which can bring everything down! Yay.

We're injecting a manual GC every 100 attachment loads which will drastically reduce the lifetime of the attachment objects.
coerce all potential keys into strings before processing them
def key_to_ruby(key)
valid_key?(key) ? key : "'#{key}'"
"'#{ key.to_s.gsub(/'/, "\\'") }'"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we want to keep valid_key? method call here, in order to avoid unnecesary conversion of a valid key

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -18,7 +18,7 @@ def exporters
timestamp = DateTime.now.strftime('%Y%m%d%H%M%S')
export_dir = "record-data-files-#{timestamp}"
exporters.each do |exporter|
data_exporter = Object.const_get(exporter).new(export_dir: export_dir, batch_size: 250)
data_exporter = Object.const_get(exporter).new(export_dir: export_dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep 250 batch_size and overwrite in attachment_exporter

@@ -18,7 +18,7 @@ def exporters
timestamp = DateTime.now.strftime('%Y%m%d%H%M%S')
export_dir = "record-data-files-#{timestamp}"
exporters.each do |exporter|
data_exporter = Object.const_get(exporter).new(export_dir: export_dir, batch_size: 250)
data_exporter = Object.const_get(exporter).new(export_dir: export_dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[this change is not related with record history export, please create a new pr]

@@ -1,6 +1,7 @@
# frozen_string_literal: true

require_relative('data_exporter.rb')
require 'erb'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[these changes is not related with record history export, please create a new pr]

initialize_script_for_attachment(folder_to_save, type, sufix)

attachments = []
data_object_names.each do |type|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a reduce/inject to avoid initialize array, please create a new method

end

@output.puts "# force ruby to gargabe collect here as attachmented files can build up in memory!"
@output.puts "GC.start"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC will be enable for every iteration. is not need to stop manually?

@pnabutovsky
Copy link
Contributor

We are holding off on merging this. The main performance motive for this PR has been addressed elsewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants