• DevInternals
  • Need Community Input on Non-English Character Support

Well, I found one small issue while translating from English to Serbian. Serbian distinguishes between masculine and feminine forms: some words take a different ending depending on gender. Here's an example:

For the masculine:

English: John started the discussion a few seconds ago.
Serbian: John je započeo diskusiju pre nekoliko sekundi.

For the feminine:

English: Barbara started the discussion a few seconds ago.
Serbian: Barbara je započela diskusiju pre nekoliko sekundi.

As you can see, the words "započeo" and "započela" differ. When translating from English to Serbian I had to use the catch-all form "započeo/la" so that the sentence reads correctly in either case. It would be great if users could choose their gender (male or female) during registration, so the translation could be adjusted accordingly. This would make the translation much more meaningful and the forum easier to read. I am not an expert in literature and grammar, but I think this applies not only to Serbian but to all the languages of the former Yugoslavia, and I think to Russian as well, though I do not want to speak on their behalf. This isn't a big deal, but it would be great to have.
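The request above amounts to picking a verb form from the user's gender and falling back to the combined form when the gender is unknown. Here is a minimal sketch of that selection. The function names `started_form` and `render_started` are hypothetical and exist only for illustration. This is not how Flarum's translator works.

```shell
# Hypothetical helper: choose the Serbian past-tense form of "started" by gender.
started_form() {
  case "$1" in
    male)   printf '%s\n' 'započeo' ;;
    female) printf '%s\n' 'započela' ;;
    *)      printf '%s\n' 'započeo/la' ;;  # fallback: the combined form used today
  esac
}

# Hypothetical renderer for the example sentence; $1 = name, $2 = gender.
render_started() {
  printf '%s je %s diskusiju pre nekoliko sekundi.\n' "$1" "$(started_form "$2")"
}

render_started John male
render_started Barbara female
render_started Alex unknown
```

With a gender available, the sentence uses the matching form. Without one, it degrades to the same "započeo/la" wording the translation uses now.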

    bryantmilan It's not strictly related to this thread's topic, as it has nothing to do with character encoding. We do have an existing issue for genders, but it's a very complex one: the translator library needs to support genders, and we also need some way to know the user's gender, which is quite delicate. Feel free to give input on the GitHub issue if you have experience with this kind of translation feature: flarum/core#528

    The search functionality doesn't work well on my Hebrew communities.

    It doesn't match parts of a title (for example, if you type in just one word from the title).

      danielunited This falls under

      tankerkiller125 UTF-8 search is awful (flarum/core#2003)

      I believe we have discussed some ways to resolve this by allowing search drivers for services like Elasticsearch, Algolia, etc., as MySQL/MariaDB is not good at UTF-8 search regardless of how you configure it.
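To make the expected behavior concrete: a search for one word from a title should return that title. Here is a minimal sketch using plain `grep -F` over two made-up Hebrew titles. The `search_titles` function and the titles are hypothetical. It illustrates the expectation only, not Flarum's search, MySQL full-text search, or any search driver.

```shell
# Hypothetical list of discussion titles, one per line.
titles='שלום עולם
ברוכים הבאים לפורום'

# Hypothetical helper: return every title containing the query as a substring.
# -F treats the query as a fixed string, so no regex characters are interpreted.
search_titles() {
  printf '%s\n' "$titles" | grep -F -- "$1"
}

# Searching for a single word from the first title should return that title.
search_titles 'עולם'
```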

      10 days later

      I also noticed an issue with SEO: search results display the date of each comment and its URL.

        danielunited I'm not sure if this is a UTF-8 related thing or a regular forum thing. I personally don't see that issue happening on Google with discuss. Maybe someone else with an international forum can confirm?

        Edit: After further testing, I put the URL of the forum you showed the issue on into Google and I do not see that happening. Mind sharing which search engine you see this on?

          6 days later

          Client language recognition

          wget -qO "vendor/flarum/core/src/Locale/LocaleServiceProvider.php" \
          "$GITHUB_ROOT/flarum/core/src/Locale/LocaleServiceProvider.php"

          Allow registration with Chinese usernames

          sed -i "s#a-z0-9-#-a-z0-9\\x7f-\\xff#" \
          vendor/flarum/core/src/User/UserValidator.php

          Support @mentions of Chinese usernames

          sed -i "s#a-z0-9-#-a-zA-Z0-9\\x7f-\\xff#" \
          vendor/flarum/mentions/src/ConfigureMentions.php

          Lower the minimum username length to 1

          sed -i 's#min:3#min:1#' \
          vendor/flarum/core/src/User/UserValidator.php

          Allow searching for IDs shorter than three characters

          sed -i 's#length>=3\&#length>=1\&#' \
          vendor/flarum/core/js/dist/forum.js

          Improve support for Chinese URLs (fixes user pages that fail to resolve for some Chinese usernames)

          sed -i 's#getIdForUsername($id)#getIdForUsername(urldecode($id))#' \
          vendor/flarum/core/src/Api/Controller/ShowUserController.php

          Make URLs show only the ID

          sed -i '/discussion->slug/d' \
          vendor/flarum/core/src/Api/Serializer/BasicDiscussionSerializer.php
          sed -i -r 's#(discussion->id).$#\1#' \
          vendor/flarum/core/views/frontend/content/index.blade.php
          sed -i '/idWithSlug =/s/..$/;/' \
          vendor/flarum/core/src/Forum/Content/Discussion.php
          sed -i 's#+(i.trim()?"-"+i:"")##' \
          vendor/flarum/core/js/dist/forum.js
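The commands above edit files under `vendor/` in place, so a cautious way to try one is on a scratch copy first. Here is a minimal sketch using the minimum-length substitution. The sample line written to the scratch file is a hypothetical stand-in, not the actual contents of `UserValidator.php`, and the variable names are made up for illustration.

```shell
# Work in a throwaway directory, not the real vendor tree.
scratch_dir=$(mktemp -d)
scratch_file="$scratch_dir/UserValidator.php"

# Hypothetical stand-in line; the real file's contents may differ.
printf '%s\n' "'username' => ['required', 'min:3', 'max:30']," > "$scratch_file"

# -i.bak edits in place and keeps the original as UserValidator.php.bak.
sed -i.bak 's#min:3#min:1#' "$scratch_file"

# Show what changed. diff exits non-zero when the files differ, so ignore its status.
diff "$scratch_file.bak" "$scratch_file" || true
```

Keeping the `.bak` copy makes it easy to compare or restore the file if the pattern matched something unintended. Note that any change made under `vendor/` is likely to be overwritten the next time dependencies are updated.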

            ceerker Although this does allow UTF-8, it is not easily extensible and does not give the forum owner complete control over how they want to slug things. I think our goal will be to make minor changes to core that allow extension developers to define slugging techniques, and then allow the admin to choose from those options in the dashboard. This is very similar to the way we handle email and "Front Page" settings.

            However, some of the things in that script do point out issues we'll have to work on resolving (such as the minimum search length).

              17 days later

              cccRaim Thanks for your suggestion; a similar discussion exists on this forum. The issue with the solution is that it requires modifying the MySQL default config. This is not something we can reasonably require of all users, and as such we won't be using this method. At some point we plan to have search drivers, which would greatly help with the situation.