This repository has been archived by the owner on Sep 9, 2022. It is now read-only.
I was recently writing a personal WordPress widget to display my most recent tweets. The widget queries the Tweetnest database tables directly. When I output the tweet data on my page, many special characters such as umlauts or emoji came out garbled (see attachment). It took me quite a while to figure out that the characters show up correctly if I convert the tweet from Windows-1252 encoding to UTF-8.
Is that expected behavior? Shouldn't the saved tweet text contain the actual Unicode characters, especially since the MySQL field is marked utf8_general_ci?
Could this have something to do with my individual server configuration (regular Apache, etc.), or is it something that goes wrong when Tweetnest saves the tweets to the database?
I realize if this gets fixed, the database entries all need to be converted, or past installations become incompatible with newer versions.
Maybe it could be a config option that defaults to the old Windows-1252 format, with the option to declare UTF-8 on installation or first launch. That way, we'd have the choice to manually convert the database and switch to an actual Unicode format.
Something in your server configuration is probably using a non-UTF-8 encoding, most likely your MySQL connection. Tweetnest, like any PHP script, then uses the default connection encoding, which sadly is still »latin1« on many systems, even if your database or table field is marked as »utf8_general_ci«. Because the connection encoding is consistently wrong, the text is converted both when it is inserted into the database and when it is selected from it. It therefore appears correct in your Tweetnest installation but is stored broken in the database.
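To make the mechanism concrete, here is a minimal Python sketch of what such a mis-declared connection plausibly does to a tweet, and how the garbling can be reversed. It is only an illustration, not Tweetnest code: the function names are made up, and it assumes the garbling is exactly "UTF-8 bytes read as Windows-1252 characters". MySQL's »latin1« is close to Windows-1252 but not identical, so a few byte values that Python's `cp1252` codec rejects may behave differently on a real server.

```python
def simulate_latin1_connection(text: str) -> str:
    """Garble text the way a mis-declared connection would (assumption):
    the UTF-8 bytes are each interpreted as one Windows-1252 character."""
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Reverse the garbling: turn the Windows-1252 characters back into
    bytes and read those bytes as the UTF-8 they originally were."""
    return garbled.encode("cp1252").decode("utf-8")


original = "Grüße \U0001F600"  # umlaut, sharp s, and one emoji
garbled = simulate_latin1_connection(original)

print(garbled)                      # mojibake: several characters per original one
print(repair(garbled) == original)  # the round trip restores the text
```

The repair only works on text that really was garbled this way. Applying it to text that is already correct will usually raise an error or corrupt it, so any bulk conversion should be tested on a copy of the database first.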
If anyone else is encountering this, bear in mind that for MySQL you may need the charset option "utf8mb4" in your Tweetnest DB connection config, along with correspondingly converted tables and data, to show the full panoply of smileys and other emoji. In the end, I found the best way was to clean out my whole database and refill it from my Twitter archive, after I'd set this connection option and made sure my columns were utf8mb4.
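The reason "utf8mb4" matters is that MySQL's plain "utf8" charset stores at most three bytes per character, while emoji and other characters outside the Basic Multilingual Plane take four bytes in UTF-8. The following Python sketch shows how one might check whether a given tweet needs utf8mb4. The helper name is hypothetical and is not part of Tweetnest or MySQL.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies outside the Basic Multilingual
    Plane, i.e. would take four bytes when encoded as UTF-8."""
    return any(ord(ch) > 0xFFFF for ch in text)


emoji_tweet = "Nice \U0001F600"  # contains one emoji
plain_tweet = "Grüße"            # umlaut and sharp s only

print(needs_utf8mb4(emoji_tweet))
print(needs_utf8mb4(plain_tweet))
print(len("\U0001F600".encode("utf-8")))  # byte length of the emoji in UTF-8
```

Umlauts and similar accented letters fit in two bytes, which is why they can survive in a three-byte "utf8" column while emoji cannot.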