Emoji in link title results in args-out-of-range error
#32
Comments
i just tried adding the "Comprehensive Rust" link ("https://google.github.io/comprehensive-rust/") to my own buku database, and then searching for it - both by searching for database entries for the "rust" tag, and by passing "rust" as an 'any' or 'all' argument. i got no errors, and the crab emoji is displayed. @edzhangsy: What is the value of the |
On my machine, this value is "UTF-8". |
Okay, thanks. Could you please:
|
I followed your instructions and got this trace.
The link causing trouble is this one https://coredumped.dev/2021/05/26/taking-org-roam-everywhere-with-logseq/ |
"Core dumped" is an unfortunate blog name in the context of a backtrace. :-) At first i thought Ebuku was somehow causing Emacs to dump core .... The code change in my previous comment was to take into consideration that the line ending on Windows is CR+LF / However, adding and searching for the "Core Dumped" bookmark resulted in no errors for me. But the difference between the two values in the "args-out-of-range" error is quite different to the one for the "Comprehensive Rust" link; in that case, the difference was 31, in this case it's 37. And the Unicode BULLET grapheme requires 3 bytes in UTF-8, whereas the CRAB grapheme requires 4 bytes, so i would have expected the difference between values in the "args-out-of-range" error value would be bigger in the CRAB case than the BULLET case. All that said, it occurs to me that, if i remember correctly, the default encoding used by Windows is UTF-16, not UTF-8. So i'm wondering if that's somehow being used to transfer data from the buku process to the Emacs process, regardless of the value of LANG and LC_ALL, and regardless of the encoding of the buku database itself? On my machine, file(1) reports the database as UTF-8:
Could you please share the value of the |
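The UTF-8 byte counts quoted in the comment above (3 bytes for BULLET, 4 for CRAB) can be checked with a short Python sketch. The helper name `encoded_sizes` is made up for illustration, and `utf-16-le` is only an assumed stand-in for "UTF-16 as a Windows program might emit it"; neither comes from buku or Ebuku.

```python
# Code points named in the discussion above.
BULLET = "\u2022"      # U+2022 BULLET
CRAB = "\U0001F980"    # U+1F980 CRAB


def encoded_sizes(ch):
    """Return how many bytes `ch` occupies in UTF-8 and in UTF-16 (little-endian, no BOM)."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
    }


bullet_sizes = encoded_sizes(BULLET)
crab_sizes = encoded_sizes(CRAB)

print("BULLET:", bullet_sizes)
print("CRAB:  ", crab_sizes)
```

In UTF-16 the ordering is the same as in UTF-8 (CRAB is still the larger of the two, because it needs a surrogate pair), so on its own this does not explain why the BULLET case shows the bigger difference.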
My The name of the bookmark is frightening indeed. Sorry for that :-)
I think PowerShell uses UTF-16 for encoding instead of UTF-8. |
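To illustrate why the encoding used between the two processes matters, here is a small Python sketch. It is not a diagnosis of this bug: the title string is taken from the error message in this thread, and the choice of `utf-16-le` is an assumption about what a UTF-16 producer would emit.

```python
# Title text as it appears in the reported error message, ending in U+1F980 CRAB.
title = "Welcome to Comprehensive Rust \U0001F980"

as_utf8 = title.encode("utf-8")
as_utf16 = title.encode("utf-16-le")  # assumption: UTF-16 little-endian, no BOM

# The same text has a different "length" depending on what is being counted,
# so offsets computed in one unit are not valid in another.
lengths = {
    "characters": len(title),
    "utf-8 bytes": len(as_utf8),
    "utf-16 bytes": len(as_utf16),
}
print(lengths)

# A reader that assumes UTF-8 does not get the original text back from UTF-16 bytes.
misread = as_utf16.decode("utf-8", errors="replace")
print("round-trips correctly:", misread == title)
```

If a consumer computes positions from one of these counts while the buffer it indexes into was built from another, the positions can land outside the text, which is the general shape of an out-of-range error.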
*nod* Well, at this point, as a non-Windows user who doesn't have access to a Windows machine, i don't know what things i can get you to try/test in order to diagnose the problem. So i'm going to send an email about this issue to the emacs-devel list. For that email, can you please tell me:
|
Thank you very much. |
Okay, so, i've got some helpful responses from Eli Zaretskii. If you're not already aware, Eli is not only an Emacs maintainer, but is also very familiar with text encoding in general and Unicode in particular. Additionally, if i remember correctly, he's a Windows user himself. The discussion so far is available online here. (Although please note that for some reason my initial email got its formatting mangled upon sending; a less-mangled version is available here.) The first thing to note is that Eli wrote:
So: please remove Secondly, Eli wrote:
Could you please describe how you set LC_ALL? Thirdly, Eli wrote:
i've mentioned that buku is a Python program, but we now need to check what buku itself does, without any interaction with Ebuku:
|
Hi, I found the encoding setting on the Emacs China website.
Right now, new files are saved as UTF-8 |
i wouldn't expect it to, as setting those variables won't influence the encoding of the data that Ebuku has to process. This is a very complex issue, so we need to control the various factors involved. This is why i wrote:
By that, i meant: When you're testing out things as we work on this problem, please don't have the Emacs language environment set to UTF-8, as it will complicate interactions with Windows in general and buku in particular. i understand that you don't want to use the GBK environment in general. Unfortunately, making the configuration changes you described in your previous comment here only adds more factors to consider, and makes it more difficult to understand what's happening on your system. So, when you test out things in Emacs, i'd like you to do so by starting Emacs with the
then do any necessary Ebuku-related setup (e.g. setting the path to the buku database), and then try things with Ebuku. i think i'm going to have to ask you to interact directly with Eli on the mailing list about this, as i'm finding it difficult to be the messenger going back and forth, and it will be much quicker if Eli can ask you questions directly, which you can respond to directly. Hopefully that process will make it clear what would need to be done by Ebuku in order to fix the problem, in a non-GBK environment. Could you please let me know an appropriate email address i can use to add you to the discussion on emacs-devel? |
There's no need to do the interaction through the mailing list, we can do it here. It will be easier and faster. |
The problem with ebuku is not solved yet. |
OK. Please try this: modify ebuku.el such that each time it calls
Then use the modified ebuku.el (byte-compile it before use), and see if the problem is resolved or not. |
I am new to Emacs, so I modified the code like this
Is it correct?
I eval the buffer with |
But that is not the only call to |
I edited the other two like this
And the error persists. |
OK. So now I need you to step with Edebug through
The above assumes that the error you see after all those changes is still the same, i.e.: Debugger entered--Lisp error: (args-out-of-range "1884. Welcome to Comprehensive Rust 🦀 - Comprehens..." 15862 15893) |
I tried to execute |
Unfortunately, you need to manually switch to the temp buffer each time after any Edebug command. |
As reported by @edzhangsy in #31: