Add language switcher #144

Open: wants to merge 8 commits into base: main

melissawm
Contributor

Addresses #111

Hello, folks!

This PR adds a language switcher and some logic to deal with multiple languages on the zarr website.

Currently, as part of the Scientific Python grant, we have the site 100% translated into Portuguese (Brazilian) and Spanish. Other languages may follow soon - having the infrastructure set up will hopefully help us recruit new translators 😄

You can see a live preview of this PR here: https://axequalsb.github.io (this is a throwaway GitHub account, never mind the URL; I just needed to deploy this somewhere at the top level because of the minimal-mistakes theme).

About the implementation: it may seem convoluted and, frankly, I'm not a JavaScript developer, so there is certainly room for improvement. My hope was to get this all done in jinja, but unfortunately I couldn't figure out how due to some limitations of the minimal-mistakes Jekyll theme. Most of the plugins for Jekyll work on the assumption that you are building a blog, and this can cause issues for the format we were expecting on the site (see, for example, mmistakes/minimal-mistakes#4618).

For now, only the top navbar and left sidebar are shown translated, because this is a proof of concept and these were the most problematic sections of the website. For the others, all pages under content/<language-code> will be synced automatically with Crowdin following the process our team has created (more details here: https://scientific-python-translations.github.io/docs/).

@goanpeca has been working on the automation process for the sync between Crowdin and the websites and can help answer any questions you might have.

This PR is a proof of concept and we are happy to adapt to any specific format or needs you might have here. Feedback is appreciated ❤️

Thanks!

@goanpeca

Thanks @melissawm for this awesome work 🚀


@joshmoore @sanketverma1704 this one is ready for review.

Do you think you could give it a try 😄 ?

Thanks a lot ! 🚀

@joshmoore
Member

Thanks for the ping, @goanpeca, and so much for the PR, @melissawm! ❤️

  • In general, I'd be up for larger changes within the webpage itself if that would prevent the need to duplicate images, etc.
  • I'm not immediately seeing how the process works down the line. Let's say someone wants to open a small PR against the webpage. Do we need to do anything on our side?
  • Regarding the URLs, if possible, I'd drop the /contents/ element.

Does anyone in the @zarr-developers/community have any thoughts and/or interest in getting involved?

@melissawm
Contributor Author

Hi all - @goanpeca and I are working together on this, so I will add him as a collaborator on this PR so he can help address any comments you have. Thanks!

@d-v-b

d-v-b commented May 13, 2025

thanks so much for this work! I confess I'm also confused about how we would keep these translations updated?

@goanpeca

goanpeca commented May 18, 2025

Hi @joshmoore, @d-v-b and team! And thanks @melissawm for the initial work 🚀 !

I updated the PR and restructured it more to be able to handle translations. Jekyll in general does not seem to provide a standard solution for handling translations. I looked at different options but none really worked with the workflow we are using for translations with Crowdin, an online collaborative platform for doing localization and internationalization. I understand the changes are big and some of the open PRs might need a rebase, but this structure really simplifies a lot of the handling.

Here is a list of the changes that were made:

  1. All the website source content was moved to the content folder.
  2. Inside it, an en folder was created to hold translatable content. es and pt folders were also added with updated translations.
  3. Updated some URLs in the markdown and custom.html files to handle relative_url correctly, which was needed to test the site on my fork.
  4. The content/_data folder now has one subfolder per language, so /_data/navigation.yml became /content/_data/en/navigation.yml, and I also added /content/_data/en/messages.yml to include other strings that needed to be translated. Any future changes to the sidebar and the main menu should be applied only to /content/_data/en/navigation.yml.
  5. I had to change a couple of files in _includes (now /content/_includes) to be able to handle translations for the sidebar and the main menu (nav_list and masthead.html). Due to some caching issues with the theme, I also had to modify the default layout (which is what single uses) and update the lang attribute of the website.
  6. A small custom plugin in /content/_plugins was created to copy the files inside the /content/_data/ folder to the /content/en/_data folder so they can be translated and uploaded to the service.
  7. The config was updated to include the language data.
  8. The Dockerfile, README and readthedocs config were updated to build the site from the content folder, as in bundle exec jekyll serve --source content/
  9. All the images were moved to the /content/assets/img folder so they are not duplicated.
  10. Because I created the custom plugin and wanted a bit more control over the deploy, a new workflow was created to deploy the site.
  11. I added files to handle redirects from old pages to the new ones that include the en prefix.
  12. Finally, if this PR is accepted, the GitHub Pages deploy setting needs to be updated at https://github.com/zarr-developers/zarr-developers.github.io/settings/pages (see image below).

You can see the final result over at https://goanpeca.github.io/zarr-developers.github.io/en/
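
Both the redirects in item 11 and the language switcher itself come down to adding or swapping a language prefix on a site-relative path. Here is a minimal Python sketch of that idea. It is not the code in this PR (the switcher here is JavaScript and the redirects are separate files); the function names and the `LANGUAGES` tuple are made up for illustration, and only the `en`, `es` and `pt` codes and the `en` prefix come from this thread.

```python
from typing import Optional

# Assumption: the language codes currently discussed in this PR.
LANGUAGES = ("en", "es", "pt")
DEFAULT_LANGUAGE = "en"


def split_language(path: str) -> tuple[Optional[str], str]:
    """Split a site-relative path such as '/es/about/' into (language, rest)."""
    stripped = path.lstrip("/")
    parts = stripped.split("/", 1)
    if parts[0] in LANGUAGES:
        rest = parts[1] if len(parts) > 1 else ""
        return parts[0], "/" + rest
    # No known language prefix: the whole path is the "rest".
    return None, "/" + stripped


def redirect_target(old_path: str) -> str:
    """Map a pre-translation path to its new location under the default language."""
    language, rest = split_language(old_path)
    if language is not None:
        return f"/{language}{rest}"  # already prefixed, nothing to redirect
    return f"/{DEFAULT_LANGUAGE}{rest}"


def switch_language(path: str, new_language: str) -> str:
    """Return the same page under another language prefix."""
    if new_language not in LANGUAGES:
        raise ValueError(f"unknown language code: {new_language}")
    _, rest = split_language(path)
    return f"/{new_language}{rest}"


if __name__ == "__main__":
    print(redirect_target("/about/"))         # /en/about/
    print(switch_language("/en/about/", "es"))  # /es/about/
```

A real deployment under a sub-path (as in the fork preview above) would also need to account for the base URL, which this sketch ignores.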

Screenshot 2025-05-17 at 7 05 20 PM

Now regarding how the translations workflow works

  1. We create a mirror of this repo for the translatable content located at https://github.com/Scientific-Python-Translations/zarr.dev-translations
  2. This repo is connected to crowdin over at https://scientific-python.crowdin.com/u/projects/18
  3. The configuration of the files that are to be translated can be found at https://github.com/Scientific-Python-Translations/zarr.dev-translations/blob/main/crowdin.yml
  4. Translations are then provided by contributors on Crowdin and, as new translations come in, the automation bot @scientificpythontranslations will open automatic PRs on this repo, one per language with changes.
  5. We set the minimum percentage of a translated language to be included at 90%. This is defined at https://github.com/Scientific-Python-Translations/zarr.dev-translations/blob/main/.github/workflows/sync_translations.yml#L21 and can be modified as needed.
  6. As new translations are included (let's say French now has over 90%), the _config.yml file needs to be manually updated here
  7. The plugin I wrote will copy the data files to the needed locations so you do not need to do any extra manual work.
  8. You can read more about the whole workflow at https://scientific-python-translations.github.io/docs/
  9. The status and percentage of translations for the project are updated weekly at https://scientific-python-translations.github.io/status/#zarrdev
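
To illustrate step 5, here is a rough Python sketch of a threshold check like the one the sync workflow applies. It is only an illustration, not the workflow's actual code: the function name, the shape of the `progress` mapping and the French figure are hypothetical, and treating exactly 90% as passing is an assumption.

```python
# Threshold from step 5 of the workflow description.
MIN_TRANSLATED_PERCENT = 90.0


def languages_to_include(progress: dict[str, float],
                         minimum: float = MIN_TRANSLATED_PERCENT) -> list[str]:
    """Return the language codes whose translation progress meets the minimum.

    Assumption: the comparison is inclusive (exactly 90% counts).
    """
    return sorted(code for code, percent in progress.items() if percent >= minimum)


if __name__ == "__main__":
    # es and pt are reported as 100% translated in this PR;
    # the fr value is a made-up example of a language still in progress.
    progress = {"es": 100.0, "pt": 100.0, "fr": 72.5}
    print(languages_to_include(progress))  # ['es', 'pt']
```

Once a language passes the check and its PR is merged, the language list in _config.yml still has to be updated by hand, as noted in step 6.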

Regarding

thanks so much for this work! I confess I'm also confused about how we would keep these translations updated?

Any changes to the site must go to the /content/en/ and /content/_data/en/... folders for content to be translatable.

The PRs will be created weekly and automatically by the bot if new translations are detected. An example of such a PR can be seen for scipy at scipy/scipy.org#644
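
For the data files, the copy step that the plugin performs can be pictured roughly as follows. The real plugin is a Jekyll plugin living in /content/_plugins; this is only a Python sketch of the described behaviour (copying /content/_data/en/ into /content/en/_data/). The function name `copy_data_files` and the restriction to `.yml` files are assumptions for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_data_files(content_dir: Path, lang: str = "en") -> list[Path]:
    """Copy content/_data/<lang>/*.yml into content/<lang>/_data/.

    Returns the paths of the copied files, sorted by name.
    """
    source = content_dir / "_data" / lang
    target = content_dir / lang / "_data"
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(source.glob("*.yml")):
        destination = target / path.name
        shutil.copy2(path, destination)  # keeps file metadata along with content
        copied.append(destination)
    return copied


if __name__ == "__main__":
    # Build a throwaway tree that mimics the layout described in this PR.
    with tempfile.TemporaryDirectory() as tmp:
        content = Path(tmp) / "content"
        data_en = content / "_data" / "en"
        data_en.mkdir(parents=True)
        (data_en / "navigation.yml").write_text("main: []\n")
        (data_en / "messages.yml").write_text("search: Search\n")

        for copy in copy_data_files(content):
            print(copy.relative_to(content).as_posix())
```

The point of the layout is that contributors only ever edit the files under /content/_data/en/, and the copies are produced automatically.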


Please let me know if you have any additional questions. I would be happy to reply :)

Cheers 🚀

@goanpeca goanpeca force-pushed the add-language-switcher branch from 18b3fb3 to 95be444 Compare May 18, 2025 00:14
@joshmoore
Member

Jekyll in general does not seem to provide a standard solution for handling translations.

Understood. Sorry to have caused so much trouble. Do you think what you've come up with here will work for others? i.e. overall value to try this direction or are we tilting at windmills?

Here are a list of changes that were made:

(points not listed are 👍)

  1. All the website source content was moved to the content folder.
    ...
  8. The dockerfile, README and readthedocs config were updated to build the site from the content folder as bundle exec jekyll serve --source content/

Seems fine since this no longer shows up in the URL, but can you say why it's needed?

  6. A custom small plugin /content/_plugins was created to be able to copy the files inside the /content/_data/ folder to the /content/en/_data folder so it can be translated and uploaded to the service.
  9. All the images were moved to the /content/assets/img folder so they are not duplicated

Nice! Definitely agree that this is a nicer solution than duplicating (triplicating, etc.) all assets.

  11. I added files to handle redirects from old pages to the new ones that include the en prefix

❤️ ❤️

Now regarding how the translations workflow works

This is great. Thank you!

@goanpeca

goanpeca commented May 19, 2025

Understood. Sorry to have caused so much trouble. Do you think what you've come up with here will work for others? i.e. overall value to try this direction or are we tilting at windmills?

It could eventually; most of the changes just come from the theme. Some static site generators provide a solution from the get-go; Jekyll has some plugins, but they are unmaintained. Also, since we use a platform to do the translations via volunteers, we need to make the process as easy as possible. This means not having to change configuration options as part of the translations to make things work, like adding a lang: es in the front matter. So this approach allows for just that :)

Seems fine since this no longer shows up in the URL, but can you say why it's needed?

Having any content that is or could be translatable at the root of the repo was making the automation harder, so on all the other projects (scipy, numpy, pandas, networkx, xarray) we opted for a dedicated folder that holds the content to be translated and the translations, separated by language. Since we have content that does not need to be translated living outside those folders (slides, redirects, etc.) and Jekyll does not provide a way to point to several source content folders, I thought it would be better to have all content inside.

@goanpeca goanpeca force-pushed the add-language-switcher branch from fa033a4 to 5778b8e Compare May 19, 2025 14:28
@goanpeca goanpeca force-pushed the add-language-switcher branch from 54b5d4f to 4165b58 Compare May 19, 2025 17:32
@goanpeca goanpeca force-pushed the add-language-switcher branch from 4165b58 to 3496128 Compare May 19, 2025 17:34
@goanpeca

goanpeca commented May 21, 2025

Hi @joshmoore and @d-v-b !

Wanted to do a gentle ping and ask if you had any more questions/suggestions :)

Thanks!

@joshmoore
Member

Thanks for all the updates, @goanpeca! If there are no further updates from your side, I'd suggest we give everyone a last chance to review and then move forward.

Other than merging and setting the github pages setting, is there anything you need from me/us?

4 participants