fix(storages): use ssec storage at correct times #798

Open
wants to merge 1 commit into
base: main
6 changes: 5 additions & 1 deletion alexandria/core/models.py
@@ -9,6 +9,7 @@
from django.contrib.postgres.search import SearchVectorField
from django.core.exceptions import ImproperlyConfigured, ObjectDoesNotExist
from django.core.files import File as DjangoFile
from django.core.files.storage import storages
from django.core.validators import RegexValidator
from django.db import models, transaction
from django.dispatch import receiver
@@ -172,7 +173,10 @@ def clone(self):
self.pk = None
self.save()

storage = File.content.field.storage
storage_backend = settings.ALEXANDRIA_FILE_STORAGE
if settings.ALEXANDRIA_ENABLE_AT_REST_ENCRYPTION:
storage_backend = "alexandria.storages.backends.s3.SsecGlobalS3Storage"
Comment on lines +177 to +178
Contributor

From what I understand, this is the only place in your changes where storage selection differs from before, right? This may cause a failure, however, because DynamicStorageFieldFile relies on File.encryption_status to determine which storage to use. The clone method should do the same. If it doesn't, settings.ALEXANDRIA_ENABLE_AT_REST_ENCRYPTION=True will make the copy fail for a File instance whose encryption_status is empty and whose content is unencrypted. Do you agree, or am I getting this wrong?
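
To make the concern concrete, here is a minimal sketch of per-file backend selection. It is not the project's code: the helper `select_backend` and the constant `SSEC_GLOBAL_KEY` (with its string value) are hypothetical stand-ins for the settings and for `File.EncryptionStatus.SSEC_GLOBAL_KEY`. The point is that the decision depends on the file's own status, not only on the global setting.

```python
# Dotted backend paths as they appear in the diff under review.
PLAIN_BACKEND = "alexandria.storages.backends.s3.S3Storage"
SSEC_BACKEND = "alexandria.storages.backends.s3.SsecGlobalS3Storage"

# Hypothetical stand-in for File.EncryptionStatus.SSEC_GLOBAL_KEY;
# the actual value in the project may differ.
SSEC_GLOBAL_KEY = "ssec-global-key"


def select_backend(default_backend, encryption_enabled, encryption_status):
    """Return the backend path for a single file.

    The SSE-C backend is used only when at-rest encryption is enabled
    AND this particular file is marked as encrypted with the global key.
    """
    if encryption_enabled and encryption_status == SSEC_GLOBAL_KEY:
        return SSEC_BACKEND
    return default_backend
```

With this rule, a file that was stored unencrypted keeps using the default backend even after encryption is switched on globally, which is the case the clone method would need to handle as well.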

storage = storages.create_storage({"BACKEND": storage_backend})
Comment on lines -175 to +179
Contributor

@fugal-dy Mar 28, 2025

The clone method appears to bypass the whole django-storages implementation. Shouldn't it create a new File, i.e. copy latest_original.content for the cloned Document?

Member Author

Cloning was done this way on purpose, to avoid downloading and re-uploading the same file content. We want to be able to use the S3 API in the storage class.
See #694
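
As an illustration of the server-side copy described above, here is a minimal sketch of the request parameters, assuming boto3's S3 `copy_object` call. The helper name `build_copy_args` and the `"AES256"` default are assumptions made for this example, not code from the project.

```python
def build_copy_args(bucket_name, source_key, target_key,
                    ssec_key=None, algorithm="AES256"):
    """Build keyword arguments for an S3 server-side copy.

    The object is copied inside the bucket by S3 itself, so its content
    is never downloaded to or re-uploaded from the application.
    """
    args = {
        "CopySource": {"Bucket": bucket_name, "Key": source_key},
        # Destination settings
        "Bucket": bucket_name,
        "Key": target_key,
    }
    if ssec_key is not None:
        # With SSE-C, the key must be supplied both to read the source
        # object and to encrypt the destination object.
        args["CopySourceSSECustomerKey"] = ssec_key
        args["CopySourceSSECustomerAlgorithm"] = algorithm
        args["SSECustomerKey"] = ssec_key
        args["SSECustomerAlgorithm"] = algorithm
    return args
```

The result would be passed to a boto3 S3 client as `client.copy_object(**build_copy_args(...))`.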

Contributor

@fugal-dy Apr 1, 2025

I'm aware of that and I like it. My comment was not very clear, tbh. What actually concerns me is that the clone method packs in quite a lot of logic that is storage-specific rather than clone-specific.

I think it would be much clearer and easier to read if we could see something like this here:

def clone:
  make a copy of the document
  add a copy of the latest original file
  done

To achieve this, I would suggest adding, e.g., a copy method to the DynamicStorageFieldFile class:

class DynamicStorageFieldFile(FieldFile):
    def __init__(self, instance, field, name):
        super().__init__(instance, field, name)
        storage_backend = settings.ALEXANDRIA_FILE_STORAGE
        if settings.ALEXANDRIA_ENABLE_AT_REST_ENCRYPTION:
            from alexandria.core.models import File

            if instance.encryption_status == File.EncryptionStatus.SSEC_GLOBAL_KEY:
                storage_backend = "alexandria.storages.backends.s3.SsecGlobalS3Storage"
        self.storage = storages.create_storage({"BACKEND": storage_backend})

    def copy(self, target_name):
        # S3-compatible storage: copy the object server-side, without
        # downloading and re-uploading it
        if isinstance(self.storage, S3Storage):
            copy_args = {
                "CopySource": {
                    # copy_object expects the bucket name, not the Bucket resource
                    "Bucket": self.storage.bucket_name,
                    "Key": self.name,
                },
                # Destination settings
                "Bucket": self.storage.bucket_name,
                "Key": target_name,
            }
            if isinstance(self.storage, SsecGlobalS3Storage):
                # SSE-C: the key is needed to read the source and to encrypt the copy
                copy_args["CopySourceSSECustomerKey"] = self.storage.ssec_secret
                copy_args["CopySourceSSECustomerAlgorithm"] = self.storage.customer_algorithm
                copy_args["SSECustomerKey"] = self.storage.ssec_secret
                copy_args["SSECustomerAlgorithm"] = self.storage.customer_algorithm

            self.storage.bucket.meta.client.copy_object(**copy_args)
            self.instance.content = target_name
            self.instance.save()
            return

        # Otherwise (e.g. filesystem storage): read the content and save it again
        with NamedTemporaryFile() as tmp:
            temp_file = Path(tmp.name)
            with temp_file.open("w+b") as file:
                file.write(self.read())
                self.instance.content = DjangoFile(file)
                self.instance.save()
        return

Edit: the FieldFile instance has access to the File model instance through self.instance.

And a note: the above assumes that SsecGlobalS3Storage sets self.customer_algorithm, so that would need to be added, too.
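
The separation of concerns proposed in this comment can be sketched with toy stand-ins. `ToyDocument`, `ToyFile`, and the dict used as a bucket are hypothetical and only illustrate the shape: clone stays short, and the storage-specific work lives in the file's copy method.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyFile:
    """Stand-in for a stored file; `bucket` plays the role of the storage."""

    name: str
    bucket: Dict[str, bytes]

    def copy(self, target_name: str) -> "ToyFile":
        # All storage-specific logic is kept here, out of clone().
        self.bucket[target_name] = self.bucket[self.name]
        return ToyFile(name=target_name, bucket=self.bucket)


@dataclass
class ToyDocument:
    """Stand-in for the Document model."""

    title: str
    latest_original: Optional[ToyFile] = None

    def clone(self, target_name: str) -> "ToyDocument":
        # 1. make a copy of the document
        cloned = ToyDocument(title=self.title)
        # 2. add a copy of the latest original file
        if self.latest_original is not None:
            cloned.latest_original = self.latest_original.copy(target_name)
        # 3. done
        return cloned
```

In the real models, the copy step would be the place where the S3 or filesystem branch is chosen, so clone would not need to know which storage is in use.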

latest_original.pk = None
latest_original.document = self
if isinstance(storage, S3Storage):
4 changes: 1 addition & 3 deletions alexandria/core/tests/test_models.py
@@ -54,9 +54,7 @@ def test_document_no_files(


def test_clone_document_s3(db, mocker, settings, file_factory):
settings.ALEXANDRIA_FILE_STORAGE = (
"alexandria.storages.backends.s3.SsecGlobalS3Storage"
)
settings.ALEXANDRIA_FILE_STORAGE = "alexandria.storages.backends.s3.S3Storage"
settings.ALEXANDRIA_ENABLE_AT_REST_ENCRYPTION = True
name = "name-of-the-file"
mocker.patch("storages.backends.s3.S3Storage.save", return_value=name)
14 changes: 6 additions & 8 deletions alexandria/storages/fields.py
@@ -4,20 +4,17 @@
from django.db.models.fields.files import FieldFile
from storages.backends.s3 import S3Storage

from alexandria.storages.backends.s3 import SsecGlobalS3Storage


class DynamicStorageFieldFile(FieldFile):
def __init__(self, instance, field, name):
super().__init__(instance, field, name)
self.storage = storages.create_storage(
{"BACKEND": settings.ALEXANDRIA_FILE_STORAGE}
)
storage_backend = settings.ALEXANDRIA_FILE_STORAGE
if settings.ALEXANDRIA_ENABLE_AT_REST_ENCRYPTION:
from alexandria.core.models import File

if instance.encryption_status == File.EncryptionStatus.SSEC_GLOBAL_KEY:
self.storage = SsecGlobalS3Storage()
storage_backend = "alexandria.storages.backends.s3.SsecGlobalS3Storage"
self.storage = storages.create_storage({"BACKEND": storage_backend})


class DynamicStorageFileField(models.FileField):
@@ -56,8 +53,9 @@ def pre_save(self, instance, add):
"Set `ALEXANDRIA_FILE_STORAGE` to `alexandria.storages.s3.S3Storage`."
)
raise ImproperlyConfigured(msg)
storage = SsecGlobalS3Storage()
if instance.encryption_status == File.EncryptionStatus.SSEC_GLOBAL_KEY:
self.storage = storage
self.storage = storages.create_storage(
{"BACKEND": "alexandria.storages.backends.s3.SsecGlobalS3Storage"}
)
_file = super().pre_save(instance, add)
return _file
2 changes: 1 addition & 1 deletion alexandria/storages/tests/test_dynamic_field.py
@@ -27,7 +27,7 @@ def test_dynamic_storage_select_global_ssec(
mocker.patch("alexandria.core.tasks.create_thumbnail.delay", side_effect=None)
if raises is not None:
with pytest.raises(raises):
file_factory()
file_factory(encryption_status=settings.ALEXANDRIA_ENCRYPTION_METHOD)
return
file_factory(encryption_status=settings.ALEXANDRIA_ENCRYPTION_METHOD)
assert SsecGlobalS3Storage.save.called_once()