-
Notifications
You must be signed in to change notification settings - Fork 457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance binary file detection by integrating MIME type analysis, resolves #975 #1003
Enhance binary file detection by integrating MIME type analysis, resolves #975 #1003
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Add a changelog
- Squash the commits
@regisb can you squash the commits while merging the branch |
@regisb A reminder for reviewing and squash-merging this PR, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm mistaken, this PR was designed to address issue #975 right? If yes, then we need to implement the filter that I described in this comment. This filter would then make use of the is_binary
function to decide whether any given file should be rendered. Does that make sense? If not we should have a live discussion about this.
Also, please rebase and squash your changes on top of master.
@@ -0,0 +1 @@ | |||
- [Improvement] This is a non-breaking change. Enhanced is_binary_file function in env file for better binary file detection. (by @Abdul-Muqadim-Arbisoft) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changelog entry is not very informative. As a Tutor user or plugin developer, what change(s) should I expect? Will some files now be considered as text/binary, that previously were not? If there is no change, what's the purpose of this PR?
To clarify my thought above, here's what I have in mind:
To understand the benefit of the above approach, consider the following scenarios:
Neither scenario can be addressed with simple configuration settings; in the scenarios 2 and 3, users have to run manual |
Closing this in favor of #1062 |
This update significantly refines the is_binary_file function with a more nuanced detection mechanism. The function now starts by attempting to guess the MIME type of the file based on its path. If the MIME type explicitly indicates a non-text nature or belongs to a set of known binary MIME types, the file is considered binary. However, this version introduces a critical improvement: it recognizes specific MIME types and file extensions that, despite not starting with 'text/', are commonly associated with text content (e.g., 'application/xml', 'application/json'). These are explicitly checked and treated as text files.
In situations where the MIME type is inconclusive or suggests a binary nature — but the file extension is known to be associated with text content (such as '.html', '.xml', '.json', '.css', '.js') — the function overrides the MIME type indication and classifies the file as text. Conversely, if the file's extension is among those recognized as binary, it is deemed as such, making this approach highly effective for distinguishing between binary and text files with greater precision