-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Humanize byte size error for community image upload #1273
base: master
Are you sure you want to change the base?
Humanize byte size error for community image upload #1273
Conversation
run-tests.sh
Outdated
@@ -54,7 +54,7 @@ fi | |||
|
|||
python -m check_manifest | |||
python -m sphinx.cmd.build -qnNW docs docs/_build/html | |||
eval "$(docker-services-cli up --db ${DB:-postgresql} --search ${SEARCH:-opensearch} --mq ${MQ:-redis} --env)" | |||
eval "$(docker services up --db ${DB:-postgresql} --search ${SEARCH:-opensearch} --mq ${MQ:-redis} --env)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script run-tests.sh has a command that I don’t think is available on OSX. I tried making some changes to see if I could find a match for OSX, but that’s going nowhere. I think the move might be to spin up containers specifically for this, but I’m not sure where to find the files that define the containers for these tests. The --env flag seems to imply there is a Dockerfile or some other type of file that is supposed to be around. I was thinking it might make sense to just write my own and fill it in with the container definitions I need, but that seems wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Installing test dependencies should provide you with the docker-services-cli
CLI utility: pipenv run pip install -e .[tests,opensearch2]
. This is an invenio utility, not a docker provided thing and should work on MacOS. So these sections shouldn't be modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll give that another shot. I was pretty sure I had run setup tools correctly, but there's a good chance I did something wrong and then went this painful road of ignorance and confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Devil is in the details. Write up a test for the humanizing function and with suggestions/investigation we should have something that can be submitted 👍
invenio_communities/errors.py
Outdated
limit_hr, limit_unit = human_readable_unit() | ||
file_size_hr, file_size_unit = human_readable_unit() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
functions as verbs + passing argument
limit_hr, limit_unit = human_readable_unit() | |
file_size_hr, file_size_unit = human_readable_unit() | |
limit_value, limit_unit = humanize_byte_size(limit) | |
file_size_value, file_size_unit = humanize_byte_size(file_size) |
invenio_communities/errors.py
Outdated
"Logo size limit exceeded. Limit: {limit} {limit_unit} Given: {file_size} {file_size_unit} bytes".format( | ||
limit=ceil(limit_hr), limit_unit=limit_unit, file_size=ceil(file_size_hr), file_size_unit=file_size_unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Logo size limit exceeded. Limit: {limit} {limit_unit} Given: {file_size} {file_size_unit} bytes".format( | |
limit=ceil(limit_hr), limit_unit=limit_unit, file_size=ceil(file_size_hr), file_size_unit=file_size_unit | |
"Logo size limit exceeded. Limit: {limit_value} {limit_unit}. Given: {file_size_value} {file_size_unit}".format( | |
limit_value=limit_value, limit_unit=limit_unit, file_size_value=file_size_value, file_size_unit=file_size_unit |
invenio_communities/utils.py
Outdated
@@ -109,3 +109,13 @@ def on_datastore_post_commit(sender, session): | |||
users = role.users.all() | |||
for user in users: | |||
on_user_membership_change(Identity(user.id)) | |||
|
|||
|
|||
def human_readable_unit(size): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def human_readable_unit(size): | |
def humanize_byte_size(size): |
invenio_communities/utils.py
Outdated
for unit in ['B', 'KB', 'MB', 'GB', 'TB', 'PB']: | ||
if size < 1000.0: | ||
return size, unit | ||
size /= 1000.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# at top of file
from decimal import Decimal, ROUND_HALF_UP, getcontext
# in function
#
old_rounding = getcontext().rounding
getcontext().rounding = ROUND_HALF_UP # otherwise Python rounds differently than what we, as humans, are used to
# Do check however if we have set the rounding behaviour to ROUND_HALF_UP already
s = Decimal(size)
# do rest of operations with s (naming isn't a deal, just saying to do rest of ops with the Decimal version of the value.
# when ready to return round(s, 2):
s = round(s, 2)
getcontext().rounding = old_rounding
return s, unit
This is really to mimic what the frontend does.
- Maybe we can add a note about eventually supporting Kili, Mebi...
- Document the input and outputs
run-tests.sh
Outdated
@@ -54,7 +54,7 @@ fi | |||
|
|||
python -m check_manifest | |||
python -m sphinx.cmd.build -qnNW docs docs/_build/html | |||
eval "$(docker-services-cli up --db ${DB:-postgresql} --search ${SEARCH:-opensearch} --mq ${MQ:-redis} --env)" | |||
eval "$(docker services up --db ${DB:-postgresql} --search ${SEARCH:-opensearch} --mq ${MQ:-redis} --env)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Installing test dependencies should provide you with the docker-services-cli
CLI utility: pipenv run pip install -e .[tests,opensearch2]
. This is an invenio utility, not a docker provided thing and should work on MacOS. So these sections shouldn't be modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the feedback @fenekku! I'll add what I can to my ongoing notes and hopefully avoid some of this trouble next time.
run-tests.sh
Outdated
@@ -54,7 +54,7 @@ fi | |||
|
|||
python -m check_manifest | |||
python -m sphinx.cmd.build -qnNW docs docs/_build/html | |||
eval "$(docker-services-cli up --db ${DB:-postgresql} --search ${SEARCH:-opensearch} --mq ${MQ:-redis} --env)" | |||
eval "$(docker services up --db ${DB:-postgresql} --search ${SEARCH:-opensearch} --mq ${MQ:-redis} --env)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll give that another shot. I was pretty sure I had run setup tools correctly, but there's a good chance I did something wrong and then went this painful road of ignorance and confusion.
19404f3
to
5591c32
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated and (finally) passing all tests locally. I was wondering where I should put a test for the humanize_byte_size function or if it was even something worth including?
I searched for some other functions in utils.py
across the tests
directory and was able to match a few to tests, but it doesn't seem like everything is unit tested, so I'm not sure if my changes are considered test worthy.
@@ -19,7 +19,9 @@ | |||
SearchOptionsMixin, | |||
) | |||
from invenio_records_resources.services.files.links import FileLink | |||
from invenio_records_resources.services.records.config import RecordServiceConfig | |||
from invenio_records_resources.services.records.config import ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why this wasn't causing the test suite to fail, but it triggered when I was running the tests locally. I can leave this out if it doesn't make sense for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this was triggered locally for you: this line is 81 chars long, which is considered longer than the classic 80 of some python linting tools, but it is not too long for the black
linter which we use. If you share what you ran and what you got, it may help.
In any case, let's leave this line as it was before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think isort could caused the problem with this line like here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could try adding line_length = 88
to the isort setup.cfg section:
# setup.cfg
[isort]
profile=black
line_length = 88
I guess profile=black isn't quite enough anymore?
EDIT: And why are we getting this ridiculous amount of warnings in the test logs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw it is already fixed on the master branch.
the line_length = 88
seems not to work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for the followup @utnapischtim
invenio_communities/utils.py
Outdated
s = round(s, 2) | ||
getcontext().rounding = old_rounding | ||
return s, unit | ||
s /= 1000.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not include a note about using a different library as a comment because I didn't want to overstep as a first time contributor. I get the feeling that regular contributors who have more experience will have opinions on this, so it might make more senes to discuss this in the PR.
invenio_communities/errors.py
Outdated
super().__init__( | ||
_( | ||
"Logo size limit exceeded. Limit: {limit} bytes Given: {file_size} bytes".format( | ||
limit=ceil(limit), file_size=ceil(file_size) | ||
"Logo size limit exceeded. Limit: {limit_value} {limit_unit} Given: {file_size_value} {file_size_unit} bytes".format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Logo size limit exceeded. Limit: {limit_value} {limit_unit} Given: {file_size_value} {file_size_unit} bytes".format( | |
"Logo size limit exceeded. Limit: {limit_value} {limit_unit} Given: {file_size_value} {file_size_unit}".format( |
invenio_communities/errors.py
Outdated
"Logo size limit exceeded. Limit: {limit} bytes Given: {file_size} bytes".format( | ||
limit=ceil(limit), file_size=ceil(file_size) | ||
"Logo size limit exceeded. Limit: {limit_value} {limit_unit} Given: {file_size_value} {file_size_unit} bytes".format( | ||
limit_value=ceil(limit_value), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ceil
should be omitted actually. It's confusing.
5591c32
to
5068c24
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully close to being ready for official review.
def humanize_byte_size(size): | ||
"""Converts bytes to largest unit (e.g., KB, MB, GB).""" | ||
s = Decimal(size) | ||
for unit in ["B", "KB", "MB", "GB", "TB", "PB"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably unrealistic that anyone is going to upload anything that is even a gigabyte, let alone a terabyte or petabyte, but I don't think it adds a lot of overhead or anything, so it's likely fine.
invenio_communities/utils.py
Outdated
if s < BYTES_PER_UNIT: | ||
with localcontext() as ctx: | ||
ctx.rounding = ROUND_HALF_UP | ||
s = s.quantize(Decimal("0.00")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to have all calls to Decimal() use a string argument? It seems to be the best practice is to use a string, but from my perspective it doesn't make a difference for whole number values like 1000. I guess consistency would be a reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For small d decimal, a string is the way to go, but for whole numbers it doesn't matter too much like you are saying. It's not one of those consistency things that bothers me beyond that.
invenio_communities/utils.py
Outdated
@@ -16,6 +18,7 @@ | |||
from .generators import CommunityRoleNeed | |||
from .proxies import current_communities, current_identities_cache | |||
|
|||
BYTES_PER_UNIT = Decimal(1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be overkill or be better if it was renamed.
@@ -689,6 +689,9 @@ def test_logo_max_content_length( | |||
data=BytesIO(logo_data), | |||
) | |||
assert res.status_code == 400 | |||
assert ( | |||
res.json["message"] == "Logo size limit exceeded. Limit: 1.00 MB Given: 4.00 MB" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to look for other places that I could test this new function, but this seemed like a good spot to start. I'm not sure how thorough the tests needs to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good enough 👍
ctx.rounding = ROUND_HALF_UP | ||
s = s.quantize(Decimal("0.00")) | ||
return s, unit | ||
s /= BYTES_PER_UNIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple last things:
- If we are going the
quantize
route, we may as well pass the rounding to simplify things:s = s.quantize(q, rounding=ROUND_HALF_UP)
(q = Decimal("0.00")
once and outside the loop) and remove all the context stuff - Let's keep
BYTES_PER_UNIT
in the function. Was there a particular reason why out? (keeping it out exposes us to problems that we shouldn't have to care about)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Good call on the first point
- Honestly, the only reason I put it outside of the function was that I thought it was the standard way to place constants. I'm fine with scoping it to a function. For my own understanding, are we doing this to avoid naming collisions or were you thinking of something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming is module namespaced in python so it wouldn't really be a problem. Constants don't have to be global unless we want them to be part of an interface that other pieces of code use. And with such a large project, my philosophy is to tamp down on the coupling / global access it makes us vulnerable to. Especially here, where "bytes per unit" could be changed to 1024 or 1000 in the function in the future and still make sense as a concept. IOW, we want to keep coupled things close together and not introduce unintended 'dependencies'.
e3beac7
to
c58e322
Compare
c58e322
to
b059dfc
Compare
b059dfc
to
d211494
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
For context, @utnapischtim, @Meowcenary is my colleague at Northwestern who's getting into InvenioRDM.
❤️ Thank you for your contribution!
Description
Update logo size limit error to show humanized byte size error on community image upload. Add helper function to perform this conversion.
Checklist
Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:
Frontend
Reminder
By using GitHub, you have already agreed to the GitHub’s Terms of Service including that: