Hi everyone,

To give this new tag some body without throwing myself in as dead weight 😂, here is an idea I had, based on the recent question from masihdindar.

  • Is it technically possible to offer a fully unicode compliant slug handler?
  • Do we want to offer this as a core, bundled or community feature?

I hope the team can answer the technical complications involved and the community will upvote if they like this proposal.

    luceos Is it technically possible to offer a fully unicode compliant slug handler?

    If we mean allowing raw UTF-8 characters (no transliteration), then the answer is yes. It's actually the whole reason we implemented slug drivers, and if I remember correctly, we had a UTF-8 slug driver that we played and tested with while developing the drivers extender.
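To make the difference between "raw" and transliterated slugs concrete, here is a minimal Python sketch. It is not Flarum's driver code; the function names `utf8_slug` and `ascii_slug` are hypothetical, and the rule "keep letters, marks, and digits, hyphenate everything else" is an assumption about what a Unicode-preserving slugger might do.

```python
import re
import unicodedata


def utf8_slug(title: str) -> str:
    """Hypothetical raw-Unicode slugger: keeps letters, marks, and digits of any script."""
    normalized = unicodedata.normalize("NFC", title).lower()
    # Keep characters whose Unicode category is Letter, Mark, or Number;
    # everything else (spaces, punctuation, symbols) becomes a hyphen.
    chars = [
        ch if unicodedata.category(ch)[0] in ("L", "M", "N") else "-"
        for ch in normalized
    ]
    # Collapse runs of hyphens and trim them from both ends.
    return re.sub(r"-+", "-", "".join(chars)).strip("-")


def ascii_slug(title: str) -> str:
    """Hypothetical ASCII-only slugger: drops everything that has no ASCII form."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii").lower()
    return re.sub(r"[^a-z0-9]+", "-", ascii_only).strip("-")


print(utf8_slug("Hello, World!"))   # hello-world
print(ascii_slug("Café au lait"))   # cafe-au-lait
print(repr(ascii_slug("हिंदी")))      # '' (nothing survives ASCII-only slugging)
```

The last line shows why an ASCII-only approach is a problem for titles written entirely in a non-Latin script: the slug ends up empty.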

    However, this driver system currently only works for users and discussions, not tags. Because of the way the driver extender works, though, I think we could maybe add tags.

    The driver system was designed to extend any model, so extensions that add pages can use it too so long as they register the model and the slugger they want to use in the extend.php file. I think good documentation on this feature would probably help a ton.
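The idea of registering a model together with the slugger it should use can be illustrated with a small, language-neutral sketch. This is not Flarum's extender API (which lives in PHP and is not reproduced here); `SlugDriverRegistry`, `register`, `select`, and `slug` are hypothetical names chosen only for this example.

```python
from typing import Callable, Dict

SlugDriver = Callable[[str], str]


class SlugDriverRegistry:
    """Hypothetical registry: maps a model name to its named slug drivers."""

    def __init__(self) -> None:
        self._drivers: Dict[str, Dict[str, SlugDriver]] = {}
        self._selected: Dict[str, str] = {}

    def register(self, model: str, name: str, driver: SlugDriver) -> None:
        self._drivers.setdefault(model, {})[name] = driver
        # The first driver registered for a model becomes its default.
        self._selected.setdefault(model, name)

    def select(self, model: str, name: str) -> None:
        if name not in self._drivers.get(model, {}):
            raise KeyError(f"no slug driver {name!r} registered for {model!r}")
        self._selected[model] = name

    def slug(self, model: str, title: str) -> str:
        return self._drivers[model][self._selected[model]](title)


# An imaginary "pages" extension registers its model with two drivers.
registry = SlugDriverRegistry()
registry.register("page", "default", lambda t: t.lower().replace(" ", "-"))
registry.register("page", "raw", lambda t: t.replace(" ", "-"))
registry.select("page", "raw")
print(registry.slug("page", "About Us"))  # About-Us
```

The point of the design is that the choice of slugger is made per model, so an admin could pick one driver for users and another for discussions.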

    luceos Do we want to offer this as a core, bundled or community feature?

    I have no idea on this one. I do think that maybe we should offer at least one alternative to our current slugging system in core, and let the community come up with anything else they want.

    UTF-8 slugs for discussions are trivial, since lookup can be done by the ID part.
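A short sketch of why this is trivial, assuming the `<id>-<slug>` shape seen in discussion URLs such as `/d/30832-...`. The helper name `discussion_id_from_slug` is hypothetical and this is not Flarum's actual routing code.

```python
def discussion_id_from_slug(slug: str) -> int:
    """Extract the numeric ID from '<id>-<anything>'; the text part is ignored."""
    id_part, _, _ = slug.partition("-")
    # Require plain ASCII digits so that int() cannot fail on exotic digit characters.
    if not (id_part.isascii() and id_part.isdigit()):
        raise ValueError(f"slug does not start with a numeric ID: {slug!r}")
    return int(id_part)


print(discussion_id_from_slug("30832-हिंदी-हमारा-देश"))  # 30832
print(discussion_id_from_slug("30832"))                # 30832
```

Because only the leading number is used for the lookup, whatever script the rest of the slug is written in has no effect on finding the discussion.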

    Tags do need to be converted to use the extensible slug system. I don't see any technical reason not to do so, except that we'd need to rename the slug attribute.

    That all being said, flarum/framework3429 is a bit concerning: we'll need to figure out why it's failing to query by a unicode slug. Perhaps we might need to add a URL decode step? Either way, if we convert it to our current slug system, that could be incorporated into the driver.
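The post above only raises a missing URL decode step as a possibility, so the following is a sketch of what such a step would look like, not a diagnosis of the linked issue. `TAGS_BY_SLUG` and `find_tag_id` are hypothetical stand-ins for a database lookup; `urllib.parse.unquote` is the standard-library percent-decoder.

```python
from typing import Optional
from urllib.parse import unquote

# Hypothetical stand-in for a tags table keyed by a raw Unicode slug.
TAGS_BY_SLUG = {"\u0939\u093f\u0902\u0926\u0940": 1}  # the key is "हिंदी"


def find_tag_id(raw_path_segment: str) -> Optional[int]:
    """Percent-decode the path segment (as UTF-8) before querying by slug."""
    slug = unquote(raw_path_segment)
    return TAGS_BY_SLUG.get(slug)


encoded = "%E0%A4%B9%E0%A4%BF%E0%A4%82%E0%A4%A6%E0%A5%80"
print(TAGS_BY_SLUG.get(encoded))  # None: querying with the still-encoded text misses
print(find_tag_id(encoded))       # 1: decoding first finds the tag
```

If the slug reaches the query still percent-encoded, it can never match a row stored in raw Unicode, which is the failure mode a decode step inside the driver would guard against.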

    I think it would be good to offer UTF-8 slug drivers out of the box for users, discussions, and tags. It's a widely requested feature, and has limited overhead. If we make our system compatible with it, no reason not to support it.

    I think this needs to be an extension, not part of Core. Unicode URLs tend to be problematic since many clients convert them to Latin characters, and it hurts SEO.

    But you won't get it. Special characters should be encoded, and this is what the browser will do. If you copy such a URL from the browser, you will get https://discuss.flarum.org/d/30832-%E0%A4%B9%E0%A4%BF%E0%A4%82%E0%A4%A6%E0%A5%80-%E0%A4%B9%E0%A4%AE%E0%A4%BE%E0%A4%B0%E0%A4%BE-%E0%A4%A6%E0%A5%87%E0%A4%B6-%E0%A4%AD%E0%A4%BE%E0%A4%B7%E0%A4%BE%E0%A4%8F-%E0%A4%B9%E0%A5%88, and Flarum cannot change this behavior. So shared URLs will look like gibberish.

      rob006 I am glad google understands the gibberish and brings the users that native language experience.

      Though technically the URL is https://en.wiktionary.org/wiki/%E0%A4%B9%E0%A4%BF%E0%A4%82%E0%A4%A6%E0%A5%80, when we copy between browsers we mostly get https://en.wiktionary.org/wiki/हिंदी
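The two forms quoted above are the same URL written two ways: the percent-encoded form is just the UTF-8 bytes of the readable one. A quick check with Python's standard `urllib.parse` (the variable names are only for illustration):

```python
from urllib.parse import quote, unquote

readable = "\u0939\u093f\u0902\u0926\u0940"  # "हिंदी"
encoded = "%E0%A4%B9%E0%A4%BF%E0%A4%82%E0%A4%A6%E0%A5%80"

# quote() percent-encodes the UTF-8 bytes; unquote() reverses it.
print(quote(readable) == encoded)    # True
print(unquote(encoded) == readable)  # True
```

Which of the two a person sees after copying depends on the browser or app doing the copying, not on the server that issued the URL.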

      Give it as an extra option and we will choose whichever we like 🙂

        Hari Can you share URLs to these pages? I tested https://ndtv.in/ and all I get are URLs with standard alphanumeric (latin-only) slugs.

          Hari Just copy-pasting that first link directly from the URL bar in Edge (Chrome) back to here results in https://ndtv.in/topic/%E0%A4%B9%E0%A4%BF%E0%A4%82%E0%A4%A6%E0%A5%80-%E0%A4%96%E0%A4%AC%E0%A4%B0

          I'm fairly certain that it's standard for all browsers to re-encode the text. HOWEVER, it may be because I'm in an English-speaking country and all my preferences are set to English. I do not know how they behave when the defaults are set to other languages.

          So the whole point is that the slug looks like gibberish, and you guys don't want to support this feature? At least as an option or as an extension 🙃

            Hari It's already possible as an extension for everything except tags. I think the discussion is whether to include it as an option in core.

              askvortsov If tags can't join the party, that is okay. The way we see it, if the feature is in core it will automatically be taken care of by the team. I know that in a way it's a burden on the team. If an extension is the easier, more dev-friendly solution, let us go with an extension. Both flarumates and devs will be happy.

              I don't see why this cannot be in core, because Flarum can support UTF-8 out of the box and, if needed, the admin can set it to Unicode in the settings panel (like we change the settings for usernames: ID/full name).

              Again, go with whichever is easier for devs and future-proof.

              3 months later

              This proposal is going to be closed:

              I think this solves the need for further slug handling. If not, feel free to flag the discussion.

              PS all of this is coming in v1.5 😉

              luceos locked the discussion.