• DevInternals
  • Need Community Input on Non-English Character Support

The URL skips the regional-language part (taken from the title?).

I have not tried the three points you mentioned, though. Content and titles work for me right now.

    meetdilip I believe this falls under point two, if you're referring to the fact that a title in, say, Russian Cyrillic won't be converted into a proper slug but instead becomes a bunch of dashes (if I remember correctly how that ends up), while any English in the title is slugged correctly.

      tankerkiller125 a title in, say, Russian Cyrillic won't be converted into a proper slug but instead becomes a bunch of dashes

      It does not create a bunch of dashes, but a simple, proper alpha-numeric URL like in WordPress.

        meetdilip Does it currently do that? Or is that what you're hoping it will do? I'm trying to figure out exactly what you mean by "regional language part" in your original reply to the discussion.

          I use a non-English title. Instead of the SEO-friendly URL that we get with an English title, it shows

          https://domain.cloud/d/55

          all by itself.

          tankerkiller125 Does it currently do that?

          Yes. It automatically skips the non-English part.

          tankerkiller125 Or is that what you're hoping it will do?

          It is working like that now.
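
To illustrate why a title with no ASCII letters can end up as a bare `/d/55` URL, here is a minimal sketch of an ASCII-only slug function. It is not Flarum's code: `slugify` is a hypothetical helper, and it assumes a slugger that lowercases the title, replaces every run of bytes outside `a-z0-9` with a dash, and trims a leading or trailing dash.

```shell
# Hypothetical helper, not Flarum's implementation: an ASCII-only slugger.
slugify() {
  # LC_ALL=C makes tr and sed work byte by byte, so every byte of a
  # multi-byte UTF-8 character counts as "not a-z0-9".
  printf '%s' "$1" \
    | LC_ALL=C tr 'A-Z' 'a-z' \
    | LC_ALL=C sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

printf '[%s]\n' "$(slugify 'Hello World 2020')"    # [hello-world-2020]
printf '[%s]\n' "$(slugify 'Привет мир')"          # []
printf '[%s]\n' "$(slugify 'Flarum на русском')"   # [flarum]
```

Under these assumptions a fully Cyrillic title yields an empty slug, so only the numeric ID is left in the URL, while any ASCII words in a mixed title survive.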

          Well, I found one small issue while translating from English to Serbian. Serbian distinguishes between masculine and feminine: some words take a different ending depending on gender. Here's an example to make this clearer:

          For the masculine:

          English: John started the discussion a few seconds ago.
          Serbian: John je započeo diskusiju pre nekoliko sekundi.

          For the feminine:

          English: Barbara started the discussion a few seconds ago.
          Serbian: Barbara je započela diskusiju pre nekoliko sekundi.

          As you can see, there is a difference between the words "započeo" and "započela". When translating from English to Serbian I had to use the universal form "započeo/la" so that the text reads correctly in either case. It would be great if users could choose their gender during registration, so the translation could be adjusted accordingly. This would make the translation much more meaningful and the forum easier to read and use. I am not an expert in literature and grammar, but I think this applies not only to Serbian but to all the languages of the former Yugoslavia, and I think it also applies to Russian, though I do not want to speak on their behalf. This isn't such a big deal, but it would be great to have.
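
As a rough illustration of what such gender-aware translation would involve, here is a small sketch that picks the verb form from a stored gender value. Everything in it is hypothetical: the function name `started_discussion_sr` and the `male`/`female` values are assumptions for the example, not an existing Flarum or translator feature.

```shell
# Hypothetical sketch: choose the Serbian verb form from a stored gender value.
started_discussion_sr() {
  name=$1
  gender=$2   # assumed values: "male", "female", or anything else if unknown
  case "$gender" in
    male)   verb='započeo' ;;
    female) verb='započela' ;;
    *)      verb='započeo/la' ;;   # neutral fallback used when gender is unknown
  esac
  printf '%s je %s diskusiju pre nekoliko sekundi.\n' "$name" "$verb"
}

started_discussion_sr John male        # John je započeo diskusiju pre nekoliko sekundi.
started_discussion_sr Barbara female   # Barbara je započela diskusiju pre nekoliko sekundi.
```

The fallback branch keeps today's "započeo/la" wording for users who do not state a gender.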

            bryantmilan This isn't strictly related to this thread's topic, as it has nothing to do with character encoding. We do have an existing issue for genders, but it's a very complex one: the translator library needs to support genders, and we also need some way to know the user's gender, which is quite delicate. Feel free to give input on the GitHub issue if you have experience with this kind of translation feature: flarum/core#528

            The search functionality doesn't work well on my Hebrew communities.

            It doesn't search parts of the title (for example if you type in just 1 word from the title).

              danielunited This falls under

              tankerkiller125 UTF-8 search is awful (flarum/core#2003)

              I believe we have discussed some ways to resolve this by allowing search drivers for services like Elasticsearch, Algolia, etc., as MySQL/MariaDB is not good at UTF-8 search regardless of how you configure it.

              10 days later

              I also noticed an issue with SEO: search results display the date of each comment and its URL.

                danielunited I'm not sure whether this is a UTF-8-related thing or a regular forum thing. I personally don't see that issue happening on Google with discuss; maybe someone else with an international forum can confirm?

                Edit: After further testing, I put the URL of the forum you showed the issue on into Google and I do not see that happening. Mind sharing which search engine you see this on?

                  6 days later

                  Client language recognition

                  wget -qO "vendor/flarum/core/src/Locale/LocaleServiceProvider.php" \
                  "$GITHUB_ROOT/flarum/core/src/Locale/LocaleServiceProvider.php"

                  Allow registration with Chinese usernames

                  sed -i "s#a-z0-9-#-a-z0-9\\x7f-\\xff#" \
                  vendor/flarum/core/src/User/UserValidator.php

                  Support @-mentions of Chinese usernames

                  sed -i "s#a-z0-9-#-a-zA-Z0-9\\x7f-\\xff#" \
                  vendor/flarum/mentions/src/ConfigureMentions.php
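
The character class that the two patches above substitute in can be tried out in isolation. This is a byte-level sketch, not Flarum's validator: `is_allowed_username` is a hypothetical name, and it assumes the pattern is applied byte by byte, so that every byte of a UTF-8 encoded Chinese character falls in the `\x7f-\xff` range.

```shell
# Hypothetical check mirroring the class [-a-z0-9\x7f-\xff] from the patch.
is_allowed_username() {
  # Delete every allowed byte (octal \177-\377 is hex 7f-ff); whatever is
  # left over is a byte the class would reject.
  leftover=$(printf '%s' "$1" | LC_ALL=C tr -d 'a-z0-9\177-\377-')
  [ -n "$1" ] && [ -z "$leftover" ]
}

for name in 'zhang-san' '张三' 'zhang san'; do
  if is_allowed_username "$name"; then
    echo "$name: allowed"
  else
    echo "$name: rejected"
  fi
done
```

Because the range covers every high byte, it admits any non-ASCII text, not only Chinese characters.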

                  Remove the minimum username length

                  sed -i 's#min:3#min:1#' \
                  vendor/flarum/core/src/User/UserValidator.php

                  Allow searching for IDs shorter than three characters

                  sed -i 's#length>=3\&#length>=1\&#' \
                  vendor/flarum/core/js/dist/forum.js

                  Improve support for Chinese URLs (fixes user-page lookup problems for some Chinese nicknames)

                  sed -i 's#getIdForUsername($id)#getIdForUsername(urldecode($id))#' \
                  vendor/flarum/core/src/Api/Controller/ShowUserController.php

                  Show only the ID in discussion URLs

                  sed -i '/discussion->slug/d' \
                  vendor/flarum/core/src/Api/Serializer/BasicDiscussionSerializer.php
                  sed -i -r 's#(discussion->id).*$#\1#' \
                  vendor/flarum/core/views/frontend/content/index.blade.php
                  sed -i '/idWithSlug =/s/\..*$/;/' \
                  vendor/flarum/core/src/Forum/Content/Discussion.php
                  sed -i 's#+(i.trim()?"-"+i:"")##' \
                  vendor/flarum/core/js/dist/forum.js

                    ceerker Although this does allow UTF-8, it is not easily extensible and does not give the forum owner complete control over how they want to slug things. I think our goal will be to make minor changes to core that allow extension developers to define slugging techniques, and then let the admin choose from those options in the dashboard, very similar to the way we handle email and "Front Page" settings.

                    However, some of the things in that script do point out issues we'll have to work on resolving (such as the minimum search length).

                      5 days later
                      17 days later

                      cccRaim Thanks for your suggestion. A similar discussion exists on this forum. The issue with that solution is that it requires modifying the default MySQL config. This is not something we can reasonably require of all users, so we won't be using this method. At some point we plan to have search drivers, which would greatly help with the situation.