Reorganization of translation keys
We can also use the same separators both for namespaces and context, but this would limit the actual keys to be flat, so we wouldn't be able to structure them anymore.
All locale keys would follow this pattern:
namespace.key.context
e.g. core.log_in.button
(the context would be optional)
Franz We can also use the same separators both for namespaces and context
I was thinking that should be possible...
Franz but this would limit the actual keys to be flat, so we wouldn't be able to structure them anymore.
I'm not sure what you mean by this. (Ah well, I guess my technical savvy only goes so far...)
Would doing this mean translators couldn't use context in other ways, such as for plurality or gender, for example?
@Franz With YML, is this possible?
namespace.key1: value1 base
namespace.key1.context1: value1 context1
namespace.key1.context2: value1 context2
So that if app.trans('namespace.key1') is called, you would get value1 base, but with a call of app.trans('namespace.key1.context1') you would be able to access the context1 value?
Added: So, devspeak - is the YML markup directly converted into an object, dropping any scalar value tied to an upper layer, or can you intercept this and add custom functionality to it?
DSitC No, you'd have to have another sub-key for the base value:
namespace:
  key1:
    default: value1 base
    context1: value1 context1
    context2: value1 context2
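A minimal sketch of how a lookup could honor that layout, assuming the YAML has already been parsed into a plain object. The `lookup` helper and the reserved `default` sub-key name are illustrative assumptions, not the actual translator implementation.

```javascript
// Hypothetical parsed form of the YAML above.
const nestedTranslations = {
  namespace: {
    key1: {
      default: 'value1 base',
      context1: 'value1 context1',
      context2: 'value1 context2',
    },
  },
};

// Walk a dotted key through the nested object. If the path ends on an
// object rather than a string, fall back to its `default` sub-key.
function lookup(tree, dottedKey) {
  let node = tree;
  for (const part of dottedKey.split('.')) {
    if (node === null || typeof node !== 'object' || !(part in node)) {
      return dottedKey; // unknown key: return the key itself
    }
    node = node[part];
  }
  if (typeof node === 'string') return node;
  return node && typeof node.default === 'string' ? node.default : dottedKey;
}

console.log(lookup(nestedTranslations, 'namespace.key1'));          // value1 base
console.log(lookup(nestedTranslations, 'namespace.key1.context1')); // value1 context1
```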
Dominion Would doing this mean translators couldn't use context in other ways, such as for plurality or gender, for example?
That's an issue I had in the back of my mind while I was writing my first post; while there might be some way to make it work, it would undoubtedly be more complex. For lack of a better example:
core:
  delete_post:
    one:
      title: Delete Post
      button: Delete
    other:
      title: Delete Posts
      button: Delete
# VS
core:
  delete_post:
    title:
      one: Delete Post
      other: Delete Posts
    button:
      one: Delete
      other: Delete
Come to think of it though, the logical option is the second example I gave. It's fundamentally the same as the underscore suffixes, just in a different format. The real issue is what happens when you mix fallbacks with plurals, like so:
core:
  delete_post:
    one: Delete Post
    other: Delete Posts
// JavaScript
app.trans('core.delete_post.button', {count: 3});
core.delete_post has sub-keys, but how will app.trans know that they're plural sub-keys rather than context ones? It's hard to think about without actually writing some code ... maybe there's a logical way to make it work, but I guess my point is this: by reducing duplication for translators, we probably increase the complexity of the system. Food for thought.
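One conceivable way to tell the two apart, sketched under the assumption that the CLDR plural category names (zero, one, two, few, many, other) are reserved and never used as context names. The `classifySubKeys` helper is hypothetical, and it only classifies sub-keys; it does not settle how fallbacks should behave.

```javascript
// CLDR plural category names, treated here as reserved words (an assumption).
const PLURAL_CATEGORIES = new Set(['zero', 'one', 'two', 'few', 'many', 'other']);

// Decide whether a node's sub-keys are plural forms, contexts, or a mix.
function classifySubKeys(node) {
  const keys = Object.keys(node);
  const pluralCount = keys.filter((k) => PLURAL_CATEGORIES.has(k)).length;
  if (keys.length > 0 && pluralCount === keys.length) return 'plural';
  if (pluralCount === 0) return 'context';
  return 'mixed'; // ambiguous: probably worth rejecting when the file is loaded
}

const pluralNode = { one: 'Delete Post', other: 'Delete Posts' };
const contextNode = { title: 'Delete Post', button: 'Delete' };

console.log(classifySubKeys(pluralNode));  // plural
console.log(classifySubKeys(contextNode)); // context
```

The cost is exactly the one mentioned above: the reserved words become one more rule that translators and developers have to know about.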
Mind you, I'm in a bit of a rush right now so probably not thinking very precisely. Hell, I haven't even read @Dominion's big post yet. I'll hopefully have time tonight to sit down and review this whole thing.
Toby The nightmare really gets started when you have a language that has a lot of different pluralization (zero, one, a few, many, a lot) and then also need to add gender specific values into the mix. Oh joy. ;-)
Ideally there should be a system that allows for such complex expressions in the language files, but also lets you just write simple key-value pairs if your language does not need them.
@Toby ... Funny you should bring up "Delete" as an example, as I was just thinking along the same lines.
It seems to me that the real value in putting context into the code (as opposed to the key names) lies in the ability to apply multiple contexts at once.
Let's say, for example, that we want to use the string core.delete as the title of a confirmation dialog, as in your example. And let's also imagine that a translator needs to use a different word when talking about deleting users (as opposed to posts or discussions). So we end up with two different types of context that can be combined for four variations, like so:
- core.delete +content +title
- core.delete +content +button
- core.delete +user +title
- core.delete +user +button
From this standpoint, namespace.key.context doesn't seem like much improvement over namespace.key_context.
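To make the "multiple contexts at once" idea concrete, here is a rough sketch. Everything in it is hypothetical: the `+context` key notation is borrowed from the list above, the `transWithContexts` name is made up, and the variant strings are invented purely for illustration.

```javascript
// Hypothetical flat table; the variant wording is invented for illustration.
const contextStrings = {
  'core.delete': 'Delete',
  'core.delete+user': 'Remove',
  'core.delete+user+title': 'Remove User',
};

// Try the key with all contexts applied, then drop contexts from the right
// until something matches. Falls back to the key itself if nothing does.
function transWithContexts(key, contexts = []) {
  for (let n = contexts.length; n >= 0; n--) {
    const candidate = [key, ...contexts.slice(0, n)].join('+');
    if (candidate in contextStrings) return contextStrings[candidate];
  }
  return key;
}

console.log(transWithContexts('core.delete', ['user', 'title']));    // Remove User
console.log(transWithContexts('core.delete', ['user', 'button']));   // Remove
console.log(transWithContexts('core.delete', ['content', 'title'])); // Delete
```

Note that this simple version depends on the order in which the contexts are passed, which is one more thing that would have to be standardized.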
Toby by reducing duplication for translators, we probably increase the complexity of the system. Food for thought.
As you said way back in your first reply to this thread ... and I don't think there's any way around that. Probably the best (and easiest to implement) solution we've seen so far is the one suggested by @DSitC involving key-to-key reference:
log_in_title: Log In
log_in_action: => log_in_title
That sort of thing would allow a translator to replace the key reference with a variant translation, but it's just putting the extra complexity in the YML instead of the code, and would probably complicate things like pluralization horribly.
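For reference, resolving such references is not much code on its own. A minimal sketch, assuming the file has been parsed into a flat object and that `=> other_key` is the reference syntax; the `resolveReference` helper is hypothetical.

```javascript
// Hypothetical parsed language file using the "=> other_key" convention.
const referencedStrings = {
  log_in_title: 'Log In',
  log_in_action: '=> log_in_title',
};

const REFERENCE_PATTERN = /^=>\s*(\S+)$/;

// Follow "=> key" references until a literal string is reached.
// The `seen` set guards against reference cycles.
function resolveReference(table, key, seen = new Set()) {
  if (seen.has(key)) throw new Error(`Circular reference at "${key}"`);
  seen.add(key);
  const value = table[key];
  if (typeof value !== 'string') return key; // unknown key: return it as-is
  const match = REFERENCE_PATTERN.exec(value);
  return match ? resolveReference(table, match[1], seen) : value;
}

console.log(resolveReference(referencedStrings, 'log_in_action')); // Log In
```

The sketch deliberately ignores plural forms, which is where the real complications would start.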
Please don't rush on my account, I'm happy using this time to mull the situation over. Learning a lot, too!
DSitC Ideally there should be a system that allows for such complex expressions in the language files, but also lets you just write simple key-value pairs if your language does not need them.
I agree!
Additional ideas, solutions and food-for-thought: https://slexaxton.github.io/Jed/
Hmm. I seem to have flip-flopped a bit regarding approach, and it's occurred to me to wonder why.
As Franz said, it was indeed @Toby who first mentioned the idea of putting context in the code (in his first reply above, which I find myself unable to mention for some reason). He also added a caution about the extra complexity this would involve, and I agreed that it didn't seem worth the trouble:
Dominion In fact, it sounds like the sort of thing that, if you're going to do it at all, it should be applied across the board according to some standardized scheme. And that would be a lot of work.
Yet when Franz brought the idea up again, I found myself thinking it might be worth the trouble:
Dominion It would be a rather big change to make, because it would be best to apply it everywhere, but definitely worth the trouble!
Why did I suddenly find the idea so appealing? Well, after thinking about how difficult it would be to provide translators the information they need while keeping the key naming scheme both consistent and efficient, I began to think that it might be easiest to manage the consistency angle in the code. It seems to me that it would be easier to devise a format for adding context there than it would be to define a consistent key name format.
(Implementation, however, would be an entirely different matter.)
But even if we're okay with the added complexity that Toby warned of, that's not really the end of it. We'd have to come up with some way of letting the translators know what their options are. Without that info, translators would be forced to peek at the code to see what context keys were available. So we'd have to provide them with documentation, and then we'd have to make sure the code adhered to the rules in the documentation.
... And that means the context would have to be supplied uniformly in the code, everywhere. Which means my instinct (that it's the sort of thing that needs to be applied across the board) was spot on.
So ... given the extra effort involved, is it worth it? Let's look at the numbers. Of over 100 strings, only seven are reused in a way that could pose an issue for translators, and only one of those (core.email) strikes me as truly urgent. In terms of instances (app.trans calls) it comes to about 18 out of 128, of which only four are urgent.
These numbers will change as we add strings for the admin interface, take extensions into account, etc. But assuming they don't change too much, that's a lot of work to cover only a few situations. Again, Toby's caution springs to mind.
From this perspective, it seems we were right to focus on the key names. So instead of looking for ways to bend YAML to our collective will, we should probably be thinking about how we can provide translators with descriptive key names that strike a good balance between consistency and efficiency.
I'm starting to get some ideas about that, but I need some more time to flesh them out. So I'll leave this here for now.
Concerning more complex pluralization rules, here's a pretty exhaustive overview: http://unicode.org/repos/cldr-tmp/trunk/diff/supplemental/language_plural_rules.html
Take a look at Arabic, that's really... well... special...
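If you want to poke at those rules without reading the whole table, the standard `Intl.PluralRules` API exposes the CLDR plural categories in any reasonably recent browser or Node.js build with full locale data. A small sketch; the sample numbers are arbitrary:

```javascript
// Intl.PluralRules maps a number to the plural category of a locale.
const englishRules = new Intl.PluralRules('en');
const arabicRules = new Intl.PluralRules('ar');

// Print the category each locale assigns to a few sample numbers.
for (const n of [0, 1, 2, 3, 11, 100]) {
  console.log(n, englishRules.select(n), arabicRules.select(n));
}

// The full set of categories a locale uses:
console.log(englishRules.resolvedOptions().pluralCategories);
console.log(arabicRules.resolvedOptions().pluralCategories);
```

English only distinguishes "one" from "other", while Arabic uses all six categories (zero, one, two, few, many, other).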
OK, time for me to review everything that's been said and offer a solid opinion on all of this!
Thanks! This is actually a huge source of pride for me. Whenever I see someone writing "login" as a verb, I die a little inside.
Dominion From this perspective, it seems we were right to focus on the key names. So instead of looking for ways to bend YAML to our collective will, we should probably be thinking about how we can provide translators with descriptive key names that strike a good balance between consistency and efficiency.
Completely agreed.
Dominion But if we know we'll have one [a translation UI] eventually, then for the time being we can settle for key names that are less than optimally descriptive and consistent.
But in the case of the duplicate strings used in different contexts, would we not need to split them and name them regardless? To me, the translation UI sounds like an amazing tool (and I will reply to that thread soon!), but I still think this problem should be solved to a large degree by a solid key naming scheme. I'm not expecting to come up with something where translators can just look through the YAML file and immediately interpret the meaning of certain prefixes/suffixes – that's where the UI would help. But from the coder's perspective, they should be confident that they can follow a set of rules and name a key correctly and consistently. Does that make sense?
So after considering all of this – but admittedly, without having been through all of the strings and their uses in excruciating detail like @Dominion has – I think my preference would be to enforce contextual suffixes, chosen from a list, for all keys. i.e. Greater consistency, at the expense of efficiency:
- Split up core.log_in into core.log_in_title and core.log_in_action? Yes.
- Rename core.thingamajig to core.thingamajig_action (even if there is no core.thingamajig_title)? Yes.
I understand this will result in more duplication than may be ideal, but on my brief look through Dominion's amazing matrix of the strings and their uses, I noticed that all of the duplicated strings are very short. Since we're proposing a suffix, they will all be listed together (alphabetically) in the YAML file; the translator will be able to translate them all at once, usually just by copy+pasting the first one onto the others. My point is: Are we really losing that much efficiency?
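Enforcing this could be as simple as a check run over the key names. A rough sketch, where the `findUnsuffixedKeys` helper is hypothetical and the suffix list is only a placeholder (just `title` and `action` have actually been proposed so far):

```javascript
// Placeholder list; the real set of suffixes is still to be agreed on.
const ALLOWED_SUFFIXES = ['title', 'action', 'button'];

// Return the keys whose final "_segment" is not an allowed suffix.
function findUnsuffixedKeys(keys, allowed = ALLOWED_SUFFIXES) {
  return keys.filter((key) => {
    const suffix = key.slice(key.lastIndexOf('_') + 1);
    return !allowed.includes(suffix);
  });
}

const sampleKeys = ['core.log_in_title', 'core.log_in_action', 'core.thingamajig'];
console.log(findUnsuffixedKeys(sampleKeys)); // [ 'core.thingamajig' ]
```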
Anyway, that's where my thoughts currently stand. What do y'all think?
Toby If YAML as a translation source is to stay, I'd cast my vote in favor of that, too. :-)
Personally, I'd prefer a format that would allow us to omit the context, both in the definition and in the translation method call, and make intelligent fallbacks depending on the data given. But with YAML this is not really possible.
Toby Whenever I see someone writing "login" as a verb, I die a little inside.
I'm the same way. Oh, I may slip up now and then when trying to get my thoughts down quickly, but I think it's important to get it right when using language in things intended for publication (like buttons in a software app).
Toby Are we really losing that much efficiency?
On the whole I tend to agree, the loss isn't all that huge. But before I answer in greater detail (which I'll do after I've had breakfast and coffee), let me ask a quick question to be sure I'm understanding you correctly. You're talking about:
- Creating discrete keys where the string is reused in a different way (e.g. button/link versus title), but
- Using the same key where the string is reused in a different place (e.g. login box versus signup box)
Have I got that right? I agree ... even in cases where translators may not need a variant, it's best to be proactive.
DSitC Personally, I'd prefer a format that would allow us to omit the context, both in the definition and in the translation method call, and make intelligent fallbacks depending on the data given. But with YAML this is not really possible.
Could you describe what you have in mind, with a concrete example or two perhaps?
Dominion Sure thing.
language file:
key1: value1
key1.context1: value1.1
key1.context2: value1.2
key2.context1: value2.1
key2.context2: value2.2
key3: value3
code:
app.trans('key1'); // ==> "value1"
app.trans('key1', 'context2'); // ==> "value1.2"
app.trans('key1', 'context5'); // ==> "value1"
app.trans('key2'); // ==> "value2.1"
app.trans('key2', 'context2'); // ==> "value2.2"
app.trans('key2', 'context5'); // ==> "value2.1"
app.trans('key3'); // ==> "value3"
app.trans('key3', 'context1'); // ==> "value3"
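A minimal sketch of a function with those semantics, assuming the language file is loaded as a flat object and that a key without a base value falls back to its first context variant in file order. The standalone `trans` function is a stand-in for illustration, not the real app.trans.

```javascript
// Flat table mirroring the language file above.
const flatTable = {
  'key1': 'value1',
  'key1.context1': 'value1.1',
  'key1.context2': 'value1.2',
  'key2.context1': 'value2.1',
  'key2.context2': 'value2.2',
  'key3': 'value3',
};

function trans(key, context) {
  // 1. Exact match for the requested context.
  if (context !== undefined && `${key}.${context}` in flatTable) {
    return flatTable[`${key}.${context}`];
  }
  // 2. Base value without any context.
  if (key in flatTable) return flatTable[key];
  // 3. No base value: use the first context variant defined for the key.
  const prefix = `${key}.`;
  const firstVariant = Object.keys(flatTable).find((k) => k.startsWith(prefix));
  return firstVariant !== undefined ? flatTable[firstVariant] : key;
}

console.log(trans('key1', 'context2')); // value1.2
console.log(trans('key2'));             // value2.1
console.log(trans('key3', 'context1')); // value3
```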
DSitC Sure thing.
Thanks! I'm following you now.
I agree, that would be the way to go if we want to condition by context in the code. But as I said above, I'm not sure we can justify the work it would take to put that system in place. Now if Flarum were a much bigger project (say, along the lines of Tiki Wiki or something) it might be a different story.
For our purposes, I think it may be best to limit the use of context to conditioning regular grammatical phenomena, such as plurality or gender. Anything more than that may well be more than translators want to deal with.
I'm also a bit concerned that the fallback key would be less descriptive than the variants. I'll explain in detail below. First I'd like to reply to the latest by Toby as promised (though it's been a while since eggs and baccy):
Toby But from the coder's perspective, they should be confident that they can follow a set of rules and name a key correctly and consistently. Does that make sense?
It does! Of course, even a very complicated system can be made consistent if the rules are detailed enough. But in order to make life easier for translators, it's a good idea to give the devs a consistent set of rules that will be easy to follow. So simpler is definitely better.
Toby I think my preference would be to enforce contextual suffixes, chosen from a list, for all keys.
I agree, although I wasn't thinking along those lines at first. For much the same reason that @DSitC would rather have fallbacks, my instinct would be to suffix only one of the key names. Why use two bits of information to make the distinction when one would do? It would be easier to implement, and it could be made consistent. But it would require more complex rules (e.g., "in a button vs. title situation, the latter gets the suffix") which in the long run would probably end up being less efficient.
What's worse, however, is that it would leave one of the keys less descriptive than the other. This makes life harder for translators. I'll have to put on my translator hat to explain why.
One change of hats later...
Okay. When I'm translating resources, the first thing that comes to mind isn't "What does this string mean?" That bit goes by so fast, I barely notice it. No, the first thing I want to know is: "Where can I see this?" I want to know how the string is used so I can check how much space I've got to work with, imagine my translation in context, and so on.
The suffix on core.thingamajig_title will get me most of the way there. Once I know it's a dialog title, I can probably figure out how to display the dialog box. But what of core.thingamajig? There's nothing there to say what sort of a thing I should be looking for. Button? Link? Table heading? I'm left guessing, with only the string itself as a clue.
So yes, we should put suffixes everywhere. But that still gets me only partway. Your idea of prefixes can also come in handy in some cases. For example, putting core.notification_method_alert and core.notification_method_email next to each other will help the translator recognize that this "Email" is different from the other four. (I see you were anticipating the core.email issue already!)
But ... say core.thingamajig_action (like core.log_in_action) is used in multiple locations. As a translator, I want to check all of them, because maybe there's one or two cases where my translation will be too long to fit comfortably in the space provided. So where should I be looking? For that matter, how do I know when I've found them all? At present, I'd have to run a global search on every file with an app.trans call to be sure. That were best avoided.
Unfortunately, adding this information to the keys would mean creating a discrete key for every string in the program, and that would definitely be going too far. As you said:
Toby that's where the UI would help.
As an alternative, we could provide some info as comments in the YML file. That wouldn't take care of the other two birds I mentioned, but it would serve as a stopgap. And since I'm going to be editing key names anyway, maybe this would be a good opportunity to put that information in as well.
And if we do that ... I've been wondering if it would be possible to leverage those comments for display by the GUI, if and when you get around to adding one. If so, then it might be a good idea to give a little thought now to how the comments may best be formatted. (Though to do that, I guess we'd have to give some thought to the design of the GUI. Hoo-boy, add one little idea and the work just starts piling up!)
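Just to show that the comments could be machine-readable later on, here is a rough sketch that pairs each comment with the key that follows it. It is a line-based scan for flat files, not a real YAML parser; the `extractComments` helper and the comment wording in the sample are invented for illustration.

```javascript
// Hypothetical language-file excerpt; the comment wording is invented.
const languageFileSource = [
  '# Title of the log-in modal',
  'log_in_title: Log In',
  '# Button that submits the log-in form',
  'log_in_action: Log In',
].join('\n');

// Collect "# ..." lines and attach them to the next "key: value" line.
function extractComments(text) {
  const notes = {};
  let pending = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line.startsWith('#')) {
      pending.push(line.replace(/^#\s*/, ''));
    } else {
      const match = /^([\w.]+):/.exec(line);
      if (match && pending.length > 0) notes[match[1]] = pending.join(' ');
      pending = []; // comments only apply to the line directly below them
    }
  }
  return notes;
}

console.log(extractComments(languageFileSource));
```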
... I'd best wait for your comments on that thread.
In the meantime, I think I've got a handle on what's needed now, so I'll start revising my matrix with the new names and draft up a set of rules to explain them. At some point I'll probably want to ask for your help in clarifying the list of suffixes available, and so on ... but that's probably a few days off.
Actually, I've already come up with a couple questions I thought I'd better ask sooner rather than later.
About suffixes:
I could probably come up with a list of names for things in a GUI (such as title, action, etc.), but what I may come up with may not match the technical terms you're already using. You wouldn't happen to have a handy list of names sitting around, would you?
If you don't have a list, I can just poke around the code and see what you're using for class names.
About prefixes:
My instinct is to use prefixes to group things by location. In many cases, that could double as a hint as to which files the string is used in. For example, strings used only in the "Change Email" modal would get the "change_email_" prefix, while the button that opens the dialog would be "settings_change_email_action".
The advantage to this is that it would allow translators to concentrate on and finish specific areas of the UI in a fairly efficient manner (the stumbling block there being any global strings involved). The downside is that this will scatter duplicate strings about, instead of clumping them together. I figure we've all got Search functions in our editors for that, but I thought I'd ask for your take on things.
From the string creation point of view, grouping by location might make it easier for devs to come up with consistent key names. But it will also make it harder to know when they're creating a string that already exists elsewhere. They might overlook an existing global string, or a string from another location that should be merged with the new string to form a global. We'd need a way to prevent that.
The best way might be to create some sort of string database that can be searched from either direction. I'm not sure how practical that would be, though.
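As a very small-scale version of that idea, the reverse direction (string to keys) can be built from the language file itself. A sketch, assuming a flat key/value table; the `indexByString` helper and the sample keys are made up for illustration.

```javascript
// Hypothetical excerpt of a flat language file.
const catalog = {
  'core.log_in_title': 'Log In',
  'core.log_in_action': 'Log In',
  'core.sign_up_action': 'Sign Up',
};

// Build the reverse direction: string -> list of keys that use it.
function indexByString(table) {
  const index = new Map();
  for (const [key, value] of Object.entries(table)) {
    const keys = index.get(value) || [];
    keys.push(key);
    index.set(value, keys);
  }
  return index;
}

const byString = indexByString(catalog);
console.log(byString.get('Log In')); // [ 'core.log_in_title', 'core.log_in_action' ]
```

A dev adding a new string could check the index first to see whether the same text already exists under another key.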