Reorganization of translation keys

Dominion

This thread is for discussion of issue #265.

Just so you know, my current plan of attack is as follows:

Create a matrix comparing en.yml with JS files in js/forum/src/components: Done! 🙂
Use the matrix to identify strings that are reused in more than one context: Almost done!
Spend some time thinking how the reuse of strings could impact key names.
Include that thinking when refining the key naming strategy.

I've run into a small snag while trying to identify strings that are reused, but before I get to that (it might be possible to ignore the snag) I thought I'd stop for a sanity check. Here's a couple questions for @Toby :

Is it possible for translators to create variant strings on their own?

You've said that Flarum's implementation of YAML will allow translators to create their own string variants to cover things like pluralization and gender. How far does this go? Could one, for example, create variant strings to cover a Stumbling block 4 situation, then condition the choice of string based on, say, which JS file is using the string? Or will translators need to ask the devs to add new strings in cases like that? (I'm guessing it'll be the latter.)

(I really need to learn more about YAML. Can anyone recommend a good, readable introduction?)

Are language resources for extensions self-contained?

That is, if an extension uses the "thingamajig" string that's also in the core, will it contain a duplicate of that resource? Or will it refer to core.thingamajig? (Again, I'm guessing it'll be the latter.)

Toby

Fantastic work! ⭐️

Dominion Is it possible for translators to create variant strings on their own?

Covering the stumbling block 4 situation with sub-keys is an interesting idea. Thinking aloud, we could potentially make it work like this:

# en.yml
change_password:
  default: Change Password
  button: Click here to change your password
  title: Here is where you can change your password

// JavaScript
app.trans('change_password', {context: 'button'}); // Click here to change your password
app.trans('change_password', {context: 'title'}); // Here is where you can change your password
app.trans('change_password', {context: 'doesnt_exist'}); // Change Password
app.trans('change_password'); // Change Password

# another_locale.yml
# Context variants aren't necessarily required
change_password: Change Password

// JavaScript
app.trans('change_password', {context: 'button'}); // Change Password

This could be powerful, because it allows the JavaScript code to be very verbose about what context a translation is being used in, without necessarily bloating the translations themselves with lots of duplicate strings. But if the translator has a need to take advantage of that context information to provide alternate translations, they can.
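To make the fallback behaviour concrete, here is a minimal sketch of how such a lookup might work. It is not Flarum's actual app.trans implementation: the function name transWithContext, the in-memory catalog object, and the log_in entry are all assumptions made for illustration.

```javascript
// Hypothetical catalog mirroring the en.yml example above.
const translations = {
  change_password: {
    default: 'Change Password',
    button: 'Click here to change your password',
    title: 'Here is where you can change your password'
  },
  // A key with no context variants at all (assumed for illustration).
  log_in: 'Log In'
};

// Hypothetical lookup: prefer the variant for the requested context,
// otherwise fall back to the default (or to the plain string).
function transWithContext(catalog, key, options = {}) {
  const entry = catalog[key];
  if (typeof entry === 'string') return entry;            // no variants defined
  if (entry === undefined || entry === null) return key;  // unknown key: echo it back
  const variant = entry[options.context];
  if (typeof variant === 'string') return variant;        // context-specific variant
  return typeof entry.default === 'string' ? entry.default : key;
}
```

With this sketch, a locale that defines only a plain string for a key keeps working no matter which context the code passes in.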

However, this does introduce some complexity, so we want to be careful. I think it really depends on the real-world application of this idea ... how often do we actually have duplicate strings in different contexts that are homogeneous in English but heterogeneous in other languages? Would it be a significant improvement over the context suffix system (change_password_button, etc.)?

Dominion Are language resources for extensions self-contained?

Extensions are free to reference core translations (like core.thingamajig) if they wish. It'd generally be recommended that they do, unless they're using a translation in a completely different sense/context/meaning, such that it might not make sense in other languages.

Dominion

Thanks for the answers! With these gaps in my understanding filled in, it's easier to see the parameters of the job.

Toby Covering the stumbling block 4 situation with sub-keys is an interesting idea. Thinking aloud, we could potentially make it work like this:

That's a very interesting idea! It's good to know this sort of thing is possible, and I'm sure it'll come in handy somewhere down the line. But as to using it to make translation keys more adaptable, like you, I would be a bit concerned about the added complexity.

In fact, it sounds like the sort of thing that, if you're going to do it at all, it should be applied across the board according to some standardized scheme. And that would be a lot of work. Given the minimalistic approach you've taken with regard to the use of language in the UI, the trouble would outweigh the possible benefits. As you say:

Toby how often do we actually have duplicate strings in different contexts that are homogeneous in English but heterogeneous in other languages?

Probably more often than you'd expect ... but on the whole, not often enough to justify such a sweeping change to the way things are done. Which leaves us with the possibility of applying this method on a piecemeal basis, as needed. But there too, it ends up being easier to just add a new key to the YML file and change the JS as needed.

At any rate, I was mostly wondering whether it would be possible for translators to add needed variants to the YML file without requesting changes to the code (or changing it themselves, if they're able). It sounds as if that is not possible, which is pretty much as expected; but now that I'm sure, I can keep it in mind as we go forward.

Toby Extensions are free to reference core translations (like core.thingamajig) if they wish. It'd generally be recommended that they do ...

That's also good to be aware of, since it means adding, changing, and renaming the keys in the core will impact any extensions that are using them. That's not a huge deal, since it will be an easy enough matter to change core.thingamajig to extension.thingamajig if the need arises. But it's another thing to keep in mind when thinking about naming schemes.

... I've got another question for you (about that snag I mentioned), but to stave off my tendency toward huge walls of text, I'll put it in a new post after a quick break. 😉

Toby

DSitC No, you'd have to have another sub-key for the base value:

namespace:
 key1:
  default: value1 base
  context1: value1 context1
  context2: value1 context2

Dominion Would doing this mean translators couldn't use context in other ways, such as for plurality or gender, for example?

That's an issue I had in the back of my mind while I was writing my first reply – while there might be some way to make it work, it would undoubtedly be more complex. For lack of a better example:

core:
 delete_post:
  one:
   title: Delete Post
   button: Delete
  other:
   title: Delete Posts
   button: Delete

# VS

core:
 delete_post:
  title:
   one: Delete Post
   other: Delete Posts
  button:
   one: Delete
   other: Delete
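
As a rough sketch of the extra complexity, here is what a lookup against the second layout (context outside, plural form inside) might look like. The names transPlural and pluralCategory are made up for this example, and the plural rule shown only covers English; other locales would need their own rules.

```javascript
// Hypothetical catalog using the second layout above:
// context first, then plural category.
const catalogByContext = {
  delete_post: {
    title: { one: 'Delete Post', other: 'Delete Posts' },
    button: { one: 'Delete', other: 'Delete' }
  }
};

// English-only plural rule, for illustration.
function pluralCategory(count) {
  return count === 1 ? 'one' : 'other';
}

// Resolve a key by context and count; fall back to the key itself
// when any level of the nesting is missing.
function transPlural(catalog, key, context, count) {
  const byContext = catalog[key] || {};
  const byCount = byContext[context] || {};
  const value = byCount[pluralCategory(count)];
  return typeof value === 'string' ? value : key;
}
```

Supporting the first layout instead would mean swapping the order of the two lookups, so the code and every locale file would have to agree on one nesting order.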

Dominion

Hmm. I seem to have flip-flopped a bit regarding approach, and it's occurred to me to wonder why.

As Franz said, it was indeed @Toby who first mentioned the idea of putting context in the code (in his first reply above, which I find myself unable to mention for some reason). He also added a caution about the extra complexity this would involve, and I agreed that it didn't seem worth the trouble:

Dominion In fact, it sounds like the sort of thing that, if you're going to do it at all, it should be applied across the board according to some standardized scheme. And that would be a lot of work.

Yet when Franz brought the idea up again, I found myself thinking it might be worth the trouble:

Dominion It would be a rather big change to make, because it would be best to apply it everywhere, but definitely worth the trouble!

Why did I suddenly find the idea so appealing? Well, after thinking about how difficult it would be to provide translators the information they need while keeping the key naming scheme both consistent and efficient, I began to think that it might be easiest to manage the consistency angle in the code. It seems to me that it would be easier to devise a format for adding context there than it would be to define a consistent key name format.

(Implementation, however, would be an entirely different matter.)

But even if we're okay with the added complexity that Toby warned of, that's not really the end of it. We'd have to come up with some way of letting the translators know what their options are. Without that info, translators would be forced to peek at the code to see what context keys were available. So we'd have to provide them with documentation, and then we'd have to make sure the code adhered to the rules in the documentation.

... And that means the context would have to be supplied uniformly in the code, everywhere. Which means my instinct (that it's the sort of thing that needs to be applied across the board) was spot on.

So ... given the extra effort involved, is it worth it? Let's look at the numbers. Of over 100 strings, only seven are reused in a way that could pose an issue for translators, and only one of those (core.email) strikes me as truly urgent. In terms of instances (app.trans calls) it comes to about 18 out of 128, of which only four are urgent.

These numbers will change as we add strings for the admin interface, take extensions into account, etc. But assuming they don't change too much, that's a lot of work to cover only a few situations. Again, Toby's caution springs to mind.

From this perspective, it seems we were right to focus on the key names. So instead of looking for ways to bend YAML to our collective will, we should probably be thinking about how we can provide translators with descriptive key names that strike a good balance between consistency and efficiency.

I'm starting to get some ideas about that, but I need some more time to flesh them out. So I'll leave this here for now.

Dominion

Okay, I started out by looking for instances of app.trans and comparing them to en.yml, as you suggested. That gave me a list that allows me to identify which strings are being reused, and where. (At the moment, as far as I can tell, there are 18 strings that are reused. Most are reused only once, but a few are used two or three times.)

But I've also come across some strings that don't correspond to app.trans instances in any of the JS files you indicated. This is a wee problem, because it's hard to say whether these are being reused or not. It also makes me wonder if the strings that do correspond to app.trans instances aren't also being reused in other files I haven't checked.

Here's a list of the strings in question. I'd be grateful if you could let me know where else in the code I should be looking for them (and potentially reuses of other strings as well).

activity -- related to Activity.js ??
cannot_reply
cannot_reply_help
confirm_delete_discussion
controls
delete
delete_forever
deleted
discussion_renamed_post -- related to DiscussionRenamedPost.js ??
group_admin -- see note
group_admins -- see note
group_guest -- see note
group_guests -- see note
group_member -- see note
group_members -- see note
group_mod -- see note
group_mods -- see note
log_in_to_reply
posted_a_reply
powered_by_flarum -- not a problem, we know where this goes!
prompt_rename_discussion
rename
reply
restore
started_a_discussion

(Note: Group name strings are database items, so we probably don't have to worry about finding them in the code.)
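
For anyone who wants to repeat this comparison, the matching can be partly automated. The sketch below is an assumption about how one might do it, not the method actually used to build the matrix: it scans source text for string literals passed to app.trans, counts them, and reports which keys are reused and which have no literal call at all. The function names and sample inputs are invented for illustration.

```javascript
// Count how often each key appears as a string literal in app.trans(...).
function countTransCalls(sources) {
  const pattern = /app\.trans\(\s*(['"])([^'"]+)\1/g;
  const counts = new Map();
  for (const source of sources) {
    for (const match of source.matchAll(pattern)) {
      counts.set(match[2], (counts.get(match[2]) || 0) + 1);
    }
  }
  return counts;
}

// Keys used in more than one app.trans call.
function reusedKeys(counts) {
  return [...counts].filter(([, n]) => n > 1).map(([key]) => key);
}

// Catalog keys that never show up as a literal in the scanned sources.
function keysWithoutCalls(catalogKeys, counts) {
  return catalogKeys.filter((key) => !counts.has(key));
}

// Sample inputs (invented): in practice these would be file contents.
const sampleSources = [
  "title() { return app.trans('core.log_in'); }",
  "children: app.trans('core.log_in')",
  'label: app.trans("core.sign_up")'
];
const sampleCounts = countTransCalls(sampleSources);
```

A scan like this only sees literal keys, so any key that is built or returned elsewhere in the code would still show up as "unused" and need a manual check.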

In case you're wondering, I'm concerned about how phrases are reused at this point for a couple reasons:

First, we need to consider whether we'll need some sort of "global" category to handle keys that are reused in a wide variety of different contexts. This could be represented by a "global_" prefix, or by the lack of a prefix. In either case, how we think about key names could be affected by the presence or absence of such a category.

Second, since I'm in the process of surveying key names anyway, it might be worth considering the possibility of adding new keys for cases where the string is clearly being reused with different meanings. We'd want to be careful about that, since our effort to remove a potential Stumbling block 4 situation could end up creating a Stumbling block 3. There's a balance to be considered there. But like I say, it may be worth thinking about.

Bonus snag:

EDIT: Never mind. Looking at EventPost.js after a full night's rest, it's pretty clear what this does. Sorry!

Toby

Dominion I'd be grateful if you could let me know where else in the code I should be looking for them (and potentially reuses of other strings as well).

Sorry about this! I forgot to mention a few places:

js/forum/src/utils/DiscussionControls.js
js/forum/src/utils/PostControls.js
js/forum/src/utils/UserControls.js
js/lib/components

That covers most of them, but there are a few left to explain:

activity is no longer used and can be safely disregarded.
this.descriptionKey() is a sort of "dynamic" key – the actual key comes from elsewhere. If you look in DiscussionRenamedPost.js, it has a matching descriptionKey() function which returns the actual key: core.discussion_renamed_post.
powered_by_flarum is not currently used, but it should remain because we'll use it again in the future.
started_a_discussion is no longer used and can safely be disregarded.
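
To illustrate the "dynamic" key pattern mentioned above, here is a simplified sketch. It is not the real Flarum source: the class names ending in Sketch, the demoTrans helper, the placeholder string, and the assumption that the key-supplying class extends a shared base class are all made up for this example.

```javascript
// Stand-in for the translation catalog; the text is a placeholder,
// not the real English string.
const demoCatalog = {
  'core.discussion_renamed_post': 'renamed the discussion'
};

// Stand-in for app.trans.
function demoTrans(key) {
  return Object.prototype.hasOwnProperty.call(demoCatalog, key) ? demoCatalog[key] : key;
}

// The base class never names a key itself; it asks descriptionKey().
class EventPostSketch {
  descriptionKey() {
    return '';
  }
  description() {
    return demoTrans(this.descriptionKey());
  }
}

// The subclass supplies the actual key.
class DiscussionRenamedPostSketch extends EventPostSketch {
  descriptionKey() {
    return 'core.discussion_renamed_post';
  }
}
```

This is why searching for the literal key next to an app.trans call comes up empty: the key lives in the function that returns it.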

I don't think a global_ prefix is necessary; the lack of a prefix is fine in my opinion.

Dominion it might be worth considering the possibility of adding new keys for cases where the string is clearly being reused with different meanings

Yes, absolutely. That's one of the reasons I wanted your help to do this review! 🙂

Dominion

Thank you for responding so quickly!

Toby Sorry about this! I forgot to mention a few places:

No worries!

Toby activity is no longer used and can be safely disregarded.

Such keys will eventually be deleted from the YML, I hope? Unused strings are generally a pain for translators.

Toby That's one of the reasons I wanted your help to do this review!

Ah, then you weren't wondering after all! 😉

Thanks, I think this gives me what I need to get going with this job. But ... I've thought of another discussion we might have while I'm working on the keys. It's occurred to me to wonder if this wouldn't be a good time to revisit the idea of adding in a translation UI.

I've got some ideas about why such a UI might be a good idea, and how it might work, that I'll put in another thread. I'm only bringing it up here because the decision to add in such a UI could also affect our thinking about the keys.

Dominion

Toby

Sorry to bug you again ... I checked the locations you indicated, but am still missing code files for these strings:

deleted: [deleted]
joined_the_forum: Joined the forum
posted_a_reply: Posted a reply

The first one wasn't in the files you listed above. The other two are ones I failed to include in my original list. Sorry!

Judging by their similarity to started_a_discussion, I guess the latter are also obsolete?

(I'm thinking we don't really need to worry about deleted because I have a pretty good idea where it's used. 😉 )

Toby

Dominion Such keys will eventually be deleted from the YML, I hope?

Yes, I did this just now actually!

Dominion I've got some ideas about why such a UI might be a good idea, and how it might work, that I'll put in another thread. I'm only bringing it up here because the decision to add in such a UI could also affect our thinking about the keys.

Great! Look forward to reading it 🙂

Dominion

Toby Great! Look forward to reading it

Okay, here it is: http://discuss.flarum.org/d/843-ideas-for-a-translation-ui

(I'm linking it here because it includes some thinking relevant to the present discussion.)

Qiaeru

Again, such a pleasure to read those messages. Keep up the good work!

Dominion

@Qiaeru Glad to be of service! I hope it results in something that'll make your job easier!

Toby

Correct. deleted is used when referring to a user that no longer exists. The other two are obsolete. All fixed. 🙂

Dominion

Thank you! And yay!

My matrix is pretty much complete. Now for analysis.

Dominion

I've finished my first pass at the translation keys, looking at cases where a string is being used in more than one place. There are 19 such cases, of which 13 will pose no problem (i.e., the string is used with the same sense everywhere).

The remaining six cases are described below. Please bear in mind that the goal at present isn't to decide what to do about key names in these cases, but to think about whether or not we should do something about splitting them, and what that decision will imply for translation key naming in general.

Button vs. Dialog Title

There are four cases where a string is used as both a button and the title of a dialog box:

core.change_email: Change Email
core.change_password: Change Password
core.log_in: Log In
core.sign_up: Sign Up

I think most translators will be able to handle all these without requesting separate strings, so there's probably no need for preemptive action. On the other hand, they are few enough and short enough that we could split them up now and not worry about the duplicate phrases being a burden on translators. I'll discuss this in more detail below.

Table Heading vs. Link

The string "Discussions" (core.discussions) is currently being used in two places:

As a heading in the dropdown box listing the results of a search
As a link on the user page (followed by a number indicating quantity)

The fact that it's a heading in one case and a link in the other doesn't make any difference to the meaning. But the fact that the latter is followed by a number (how many discussions the user has started) does add an additional sense to the latter. This may become a reason for handling it as a separate string.

For that matter, it may also be a good idea in the second case to include the number in the string. This is because some languages may need to do linguistic things to it. (For example, Japanese generally adds a character after any quantity to indicate the type of thing being counted.) Of course, this would make the second case a different string.

The same would be true of the "Posts" link (core.posts) that appears above the "Discussions" link on the user page.
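
Here is a minimal sketch of what including the number in the string could look like. Everything in it is an assumption for illustration: the key name core.discussions_link, the {count} placeholder syntax, the transWithCount function, and the Japanese string, which is only a plausible example and not a vetted translation.

```javascript
// Hypothetical per-locale catalogs. Each translator decides where the
// number goes and what surrounds it.
const countCatalogs = {
  en: { 'core.discussions_link': 'Discussions {count}' },
  ja: { 'core.discussions_link': 'ディスカッション {count}件' }
};

// Replace {name} placeholders with the matching parameter value;
// leave unknown placeholders untouched.
function transWithCount(locale, key, params = {}) {
  const template = (countCatalogs[locale] || {})[key];
  if (typeof template !== 'string') return key;
  return template.replace(/\{(\w+)\}/g, (placeholder, name) =>
    name in params ? String(params[name]) : placeholder
  );
}
```

Because the number is part of the string, this key could no longer be shared with the search results heading.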

Table Heading vs. Text Box Label

The string "Email" (core.email) is currently being used in four places:

As a label or placeholder for a text entry box (three instances)
As a header in the notification settings table (one instance)

This is a clear case of a word being used in two different senses. As a text box label, it's prompting the user to enter his or her email address. As a table header, it's talking about email as a notification method, and has nothing to do with the address. It's very likely that a translator will want to translate the word differently in each of these two contexts.

Moreover, there's a big difference between the two in the amount of space available to the translator. This is another factor that will determine how the phrase can/should be translated. So I think we may have ample reason to go ahead and separate the latter instance out as a separate string.

Discussion

With the above information in hand, we're now ready to take our first steps toward settling on a key naming scheme.

@Toby has said that key names should be descriptive and consistent. These are worthwhile goals, but there are times when it is hard to do both at once. The problematic cases of string reuse described above will help us see why. Let's begin by asking:

Are prefixes and suffixes really necessary?

After all, when we're talking about a one- or two-word string (which is what most of the reused strings are), there's really no better way to describe the string than to use the string itself as the string name! This is especially true when a string is used in several different places in the UI. In such cases, we can't add prefixes or suffixes to indicate where the string is used without creating a number of duplicate strings, which will increase the translator's workload.

Granted, it might be nice to add some information to help the translator figure out what sense a word is being used in, where it can be found in the UI, and so on. But that's only really necessary when the meaning of the string is inherently ambiguous, as I pointed out here. In most cases it's not all that hard for a translator to guess what, say, "Log In" means. (Especially if it's spelled correctly. I congratulate @Toby on being one of the virtuous few who get it right! 🙂 )

But let's say that, in order to remain consistent, we want to tag the string for every button name and dialog box title with a suffix that will let the translator know how the string is being used. That's reasonable, because it's also descriptive after another fashion. But in cases where strings are being reused, such as the four "Button vs. Dialog Title" cases described above, this will end up forcing our hand: we will have to split those four strings to accommodate the consistent naming. We will no longer be able to put that off until a translator requests a split.

That isn't a huge problem. After all, it's only four little strings. But as we add more strings, and take the extensions into account, the number of duplicate strings created for the sake of consistency will continue to grow. Eventually we could end up with a real Stumbling block 3 situation.

Key names should be descriptive and consistent, but there's a case to be made for efficiency too!

What happens when we decide to split a string?

Let's forget (temporarily) about the possibility of adding prefixes and suffixes to everything and talk about what happens when we decide to split a string. The "Button vs. Dialog Title" cases will come in handy for that, too.

Take, for example, the strings core.log_in and core.sign_up. Each is used in four places: three times as a button name and once as a dialog title. If we decide to split off the dialog titles as separate strings, using suffixes as described here to distinguish between them and the buttons, we can proceed in one of two ways:

We can add a suffix to the title string only (core.log_in_title), leaving the button string as-is (core.log_in).
We can add a suffix to both the title string (core.log_in_title) and the button string (core.log_in_action).

The former course would result in a pair of string names that are not consistent with each other. But at least one of the pair will remain consistent with any other dialog title or button strings that don't have suffixes yet. One benefit of this approach is that it is easy to implement, since you only have to change one string name, which is used in two places.

The latter course would result in a pair of string names that are consistent with each other, but inconsistent with any dialog title and button string names that don't have suffixes. (Of course we could fix that by going the "suffixes for all" route described above; but we're forgetting about that possibility now, remember?) The downside to this approach is that you end up having to make a lot more changes: two string names, used in a total of six places.

It seems it might be good to think about efficiency not only in terms of key name length and quantity of duplicate keys, but also ease of implementation.

Two approaches to key naming

I think we can boil down all the above (Yes! At last, a TL;DR! 😃 ) by saying that we can slant our key naming scheme in one of two directions:

Greater consistency, at the expense of efficiency
Greater efficiency, at the expense of consistency

I should add that while I've been looking to cases of string reuse for clues, single-use strings won't remain unaffected.

Let's say we're using the word "Thingamajig" as a button name, and that button is the only place it appears in the UI. Do we need to give it a suffix to indicate it's a button? The greater consistency approach would argue yes.

And let's not forget the possibility that we may eventually add a dialog box titled "Thingamajig". If we somehow arrive at a policy that requires suffixes on every string that's being used in more than one context, we'll not only have to add a new core.thingamajig_title string, we'll have to rename core.thingamajig to core.thingamajig_button at that point. Or something like that. I think.

My question for Toby (and anyone else who's interested)

Sorry for making you read all this stuff, but I wanted to get your informed opinion on how to go forward. Ultimately, it comes down to a rather simple choice: more consistency, or more efficiency?

I didn't want to make that decision by fiat ... in fact, the more I think about this stuff, the more I'm inclined to toss it all aside and go with the key names you've got. (By which I mean: limit myself to minor tweaks, slanted heavily in favor of efficiency.) I think there's a limit to what can be achieved using key names as the sole tool; which is why I think it might be a good idea to give some thought at this point to a translation UI capable of providing translators with more info about the strings than we can put into the key names.

I'm not saying we have to think about starting work on such a UI right away. But if we know we'll have one eventually, then for the time being we can settle for key names that are less than optimally descriptive and consistent.

Please let me know what you think!

PS: As for extensions, at this point I think we'll just have to pick a policy and hope it scales well. 😛

Toby

OK, time for me to review everything that's been said and offer a solid opinion on all of this!

Dominion In most cases it's not all that hard for a translator to guess what, say, "Log In" means. (Especially if it's spelled correctly. I congratulate @Toby on being one of the virtuous few who get it right! )

Thanks! This is actually a huge source of pride for me. Whenever I see someone writing "login" as a verb, I die a little inside. 😛

Dominion From this perspective, it seems we were right to focus on the key names. So instead of looking for ways to bend YAML to our collective will, we should probably be thinking about how we can provide translators with descriptive key names that strike a good balance between consistency and efficiency.

Completely agreed.

Dominion But if we know we'll have one [a translation UI] eventually, then for the time being we can settle for key names that are less than optimally descriptive and consistent.

But in the case of the duplicate strings used in different contexts, would we not need to split them and name them regardless? To me, the translation UI sounds like an amazing tool (and I will reply to that thread soon!), but I still think this problem should be solved to a large degree by a solid key naming scheme. I'm not expecting to come up with something where translators can just look through the YAML file and immediately interpret the meaning of certain prefixes/suffixes – that's where the UI would help. But from the coder's perspective, they should be confident that they can follow a set of rules and name a key correctly and consistently. Does that make sense?

So after considering all of this – but admittedly, without having been through all of the strings and their uses in excruciating detail like @Dominion has – I think my preference would be to enforce contextual suffixes, chosen from a list, for all keys. i.e. Greater consistency, at the expense of efficiency:

Split up core.log_in into core.log_in_title and core.log_in_action? Yes.
Rename core.thingamajig to core.thingamajig_action (even if there is no core.thingamajig_title)? Yes.

I understand this will result in more duplication than may be ideal, but on my brief look through Dominion's amazing matrix of the strings and their uses, I noticed that all of the duplicated strings are very short. Since we're proposing a suffix, they will all be listed together (alphabetically) in the YAML file; the translator will be able to translate them all at once, usually just by copy+pasting the first one onto the others. My point is: Are we really losing that much efficiency?
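
A rule like "every key ends in a suffix from a fixed list" is easy to check mechanically. The sketch below is one way that might be done. The suffix list contains only the two suffixes mentioned in this thread (action and title), so the real list would likely be longer, and the function names are invented for illustration.

```javascript
// Hypothetical suffix list: only these two have come up so far.
const ALLOWED_SUFFIXES = ['action', 'title'];

// True when the key ends in "_" plus one of the allowed suffixes.
function hasAllowedSuffix(key) {
  return ALLOWED_SUFFIXES.some((suffix) => key.endsWith('_' + suffix));
}

// Keys that break the naming rule.
function keysMissingSuffix(keys) {
  return keys.filter((key) => !hasAllowedSuffix(key));
}

const suffixedKeys = [
  'core.sign_up_title',
  'core.log_in_title',
  'core.thingamajig',
  'core.log_in_action',
  'core.sign_up_action'
];

// Sorting shows how the suffixed variants of one string end up adjacent.
const sortedSuffixedKeys = [...suffixedKeys].sort();
```

Here keysMissingSuffix(suffixedKeys) flags only core.thingamajig, and the sorted list places core.log_in_action directly before core.log_in_title.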

Anyway, that's where my thoughts currently stand. What do y'all think?

Franz

Hmm, I just cannot write such a long reply, but I only have a short idea anyway, so here it comes... 😉

Toby may even have suggested this somewhere, not sure whether it's really my idea. But can't we use descriptive keys like "sign_up.button" when translating a string in the code, but only force translators to define "sign_up"? The lookup will always fall back to "sign_up" unless the translator has specifically added a "sign_up.button" to their locale file.

We basically get the best of both worlds (less work for the translator, more specific translations where necessary). The only thing we'd have to figure out is how to distinguish the namespace separators (flarum.core) from the context separators (sign_up.button)... Any suggestions?

Dominion

Ooooh. That would certainly allow us to do things efficiently, while giving the translator both the necessary info and a good bit of flexibility. It would be a rather big change to make, because it would be best to apply it everywhere, but definitely worth the trouble!

I'll have a think about distinguishing the separators. In the meantime, here's an example from LoginModal.js:

  title() {
    return app.trans('core.log_in');
  }

versus

            {Button.component({
              className: 'Button Button--primary Button--block',
              type: 'submit',
              loading: this.loading,
              children: app.trans('core.log_in')

I don't suppose there's any way the translator could do something in the YML file that would leverage the information in the className or type? If that were only possible, we'd have the context already mostly in place.

LATER...

If not, how about using a hashtag between the key name and the context?

Also, would we want to apply context to only one situation, to make the distinction efficiently? Or everywhere, to give the translator maximum freedom in how they handle the situation? For example:

  title() {
    return app.trans('core.log_in#title');
  }

This would be enough to allow adequate handling of the situation. But putting a hashtag on the button as well would give the translator freedom to handle either the title or the button as the variant.
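
For what it's worth, here is a rough sketch of how a lookup with a hashtag separator and fallback might behave. It is only an illustration under assumptions: transWithHash is an invented name, the catalog is a flat in-memory object, and the title variant text is made up.

```javascript
// Hypothetical flat catalog. Only the base key is required;
// the "#title" variant is optional (its text is invented).
const hashCatalog = {
  'core.log_in': 'Log In',
  'core.log_in#title': 'Log In to Your Account'
};

function hasKey(catalog, key) {
  return Object.prototype.hasOwnProperty.call(catalog, key);
}

// Try the full "key#context" first, then fall back to the part
// before the "#". Unknown keys are echoed back unchanged.
function transWithHash(catalog, keyWithContext) {
  if (hasKey(catalog, keyWithContext)) return catalog[keyWithContext];
  const baseKey = keyWithContext.split('#')[0];
  return hasKey(catalog, baseKey) ? catalog[baseKey] : keyWithContext;
}
```

With a fallback like this, the code could tag both the title and the button with a context, and a locale that needs no distinction would still define just the one base key.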

DSitC

Dominion No need to hardwire it into the string in a special way. Just let the code writer assemble the translation key:

return app.trans('core.log_in.'+this.title);

... or something like this. 😉

Dominion

@DSitC Thanks! I still have no idea how the coding side of this works.

I would assume it's possible to add multiple contexts to a single string with this method. Is that the case?

Dominion

Hey! Here's a question I should have asked a long time ago:

Up till now the discussion has focussed on how we can use YAML to handle variation by means of one-to-many correspondences, i.e., one key name realised as multiple strings, conditioned by context from the code.

Would YAML also be able to handle many-to-one correspondences? For example, in Japanese we might need:

log_in_title: ログインしてください
log_in_action: ログイン

But in English, both keys could be realised as the same string:

log_in_title, log_in_action: Log In

If that sort of thing is possible, we could stuff as much context in the key names as we like, without worrying about creating extra work for the translator where it isn't necessary.

DSitC

I found nothing about multiple keys referencing a single value in the spec.

However, to avoid repetition, you could specify (and evaluate) a special value syntax for references to other keys:

log_in_title: Log In
log_in_action: => log_in_title

That could tell Flarum's i18n parser to determine the value of log_in_action by taking the contents of the log_in_title key.
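
A minimal sketch of how such references might be resolved, assuming the locale file has already been parsed into a flat object. The "=> " prefix follows the example above; the function name resolveReference and the cycle check are additions for illustration, not an existing Flarum feature.

```javascript
const REFERENCE_PREFIX = '=> ';

// Follow "=> other_key" values until a plain string is reached.
// Throws if the references loop back on themselves.
function resolveReference(catalog, key, seen = new Set()) {
  if (seen.has(key)) throw new Error('Circular reference at ' + key);
  seen.add(key);
  const value = catalog[key];
  if (typeof value !== 'string') return key;             // unknown key: echo it back
  if (!value.startsWith(REFERENCE_PREFIX)) return value; // plain string
  const target = value.slice(REFERENCE_PREFIX.length).trim();
  return resolveReference(catalog, target, seen);
}

const referenceCatalog = {
  log_in_title: 'Log In',
  log_in_action: '=> log_in_title'
};
```

It may also be worth comparing this with YAML's built-in anchors and aliases (&name and *name), which let one value be reused within a single file without any custom syntax.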

Dominion

DSitC That was going to be my next question. 😃

Thanks for responding, that gives me something to think about.

Dominion

Okay, my first move has been to go through and pull out the globals. Next I plan to start grouping the rest of the strings by location, and give some thought to prefixing. Then it'll just be a matter of finalizing the suffixes.

But before I get on to that, the process of organizing the globals has raised a couple questions.

Is it possible to combine strings?

I think I may have been a bit too optimistic about a couple of reuse instances. Cases in point:

The "Log In" link at the bottom of the signup modal
The "Sign Up" link at the bottom of the login modal

At first glance, these two links look like the other "Log In" and "Sign Up" links/buttons. But they're different in that they come with context, i.e. the core.before_log_in_link and core.before_sign_up_link strings, respectively. Some translators may need the freedom to embed the link in the context sentence, like so:

If you already have an account, please log in instead.

Even if there is no need for non-link text after the link, the hardcoded space separating the link from the context is bound to cause trouble for some translators. So each of these string pairs should be handled as one.
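
To illustrate what I mean by handling each pair as one string, here is a minimal sketch. It assumes the link is marked inside the translated sentence with an `<a>...</a>` pair; that marker syntax and the splitLink helper are hypothetical, just one way it could be done.

```javascript
// Hypothetical combined string; the <a>...</a> marker syntax is an assumption.
const logInPrompt = 'If you already have an account, please <a>log in</a> instead.';

// Split a translated sentence into the text before the link, the link label,
// and the text after it, so the code no longer hardcodes a separating space.
function splitLink(text) {
  const match = /^(.*?)<a>(.*?)<\/a>(.*)$/s.exec(text);
  if (!match) return { before: text, link: '', after: '' };
  return { before: match[1], link: match[2], after: match[3] };
}

console.log(splitLink(logInPrompt));
```

The translator decides where the link sits in the sentence and whether any spacing surrounds it; the code only wraps the link part in the actual link component.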

There's no need for you to act on these just yet, since there may be others. I'll compile a complete list of changes that need to be made when I'm ready to start editing key names. Or I can make the changes myself, with your approval, if you can help me out with the syntax. (I'm even less experienced with JS than I am with PHP.) For now, I'd merely like to confirm that making such changes won't create any problems.

How about unique key names?

After removing the above-mentioned pair of instances, we can summarize the globals situation thusly: we've got a total of 14 global strings, each used in only two or three places, for a total of just 35 app.trans calls.

That's not an awful lot. In fact, the numbers are so small that I've started to wonder whether it might be a good idea to use a unique key name for every string. Here's how we could do it:

The dev would start by prefixing every key name by location.
Each string would therefore be grouped with all other strings in the same location.
Each unique key that uses a global string would take a reference to that global as its value, as DSitC has suggested.
The globals would be grouped together for easy location.
Comments on globals would merely list the unique keys that reference them.

Please note that this doesn't mean we'd necessarily have to use a unique key name for every app.trans call. Cases such as core.bio_placeholder, which is used twice in the same location, could use the same key name. But it would mean adding 21 new keys, and about 35 lines to the YML file (not counting comments).

This approach would have advantages for both translators and devs:

From the translator's point of view, it would make it easier to locate a global string that's being used in the location he/she is concentrating on, and then quickly cross-check whether the translation will work in other locations where the string is used. And if for some reason the global string just isn't working out for a specific location, the translator would not need to ask for the string to be split: he/she could just replace that reference with a string value that fits.

Of course all the keys that we have decided to split (like the "button versus title" situations) would also reference the globals, so that would reduce the number of duplicate strings to be translated to zero. And in the rare case where a translator finds him/herself translating two different English strings into the exact same phrase, he/she can extract that phrase as a global and point both keys at it, again without bothering the devs.

From the developer's point of view, there is the obvious advantage of not having to handle as many requests for new strings. Beyond that, it will allow us to make the rules for naming keys simpler and easier to follow.

Of course, someone will have to check whether there's a global string to be referenced in each case, but this would no longer need to be done as part of the coding process. Adding strings to code would become a simple matter of (1) adding a new, unique key name (including a quick check to be sure that it is indeed unique) and then (2) adding that key and its string to the YML file. The extraction of duplicates as globals could be left for later cleanup, which is an easy task that doesn't need to be done by a programmer.
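
The cleanup step could be as simple as the following sketch, which lists every string value shared by more than one key so that someone can decide whether to extract it as a global. The key names and the findDuplicates function are hypothetical examples, not the actual contents of en.yml.

```javascript
// Hypothetical flat map of unique keys to English strings.
const keys = {
  'core.header.log_in_button': 'Log In',
  'core.log_in_modal.title': 'Log In',
  'core.log_in_modal.submit_button': 'Log In',
  'core.header.sign_up_button': 'Sign Up',
  'core.settings.change_password_button': 'Change Password',
};

// Group keys by their string value and keep only values used more than once.
function findDuplicates(map) {
  const byValue = new Map();
  for (const [key, value] of Object.entries(map)) {
    if (!byValue.has(value)) byValue.set(value, []);
    byValue.get(value).push(key);
  }
  return [...byValue].filter(([, users]) => users.length > 1);
}

for (const [value, users] of findDuplicates(keys)) {
  console.log(JSON.stringify(value) + ' is used by: ' + users.join(', '));
}
```

The output is only a list of candidates; whether two identical English strings really belong to the same global is still a judgment call for whoever does the cleanup.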

The downside to all this would be any performance issues that might arise from the referencing mechanism. Not to mention the effort involved in implementing such a mechanism, of course. 😉

Please let me know what you think of this idea!