Reorganization of translation keys

Dominion · Sep 10, 2015

Okay, my first move has been to go through and pull out the globals. Next I plan to start grouping the rest of the strings by location, and give some thought to prefixing. Then it'll just be a matter of finalizing the suffixes.

But before I get on to that, the process of organizing the globals has raised a couple questions.

Is it possible to combine strings?

I think I may have been a bit too optimistic about a couple of reuse instances. Cases in point:

The "Log In" link at the bottom of the signup modal
The "Sign Up" link at the bottom of the login modal

At first glance, these two links look like the other "Log In" and "Sign Up" links/buttons. But they're different in that they come with context, i.e. the core.before_log_in_link and core.before_sign_up_link strings, respectively. Some translators may need the freedom to embed the link in the context sentence, like so:

If you already have an account, please log in instead.

Even if there is no need for non-link text after the link, the hardcoded space separating the link from the context is bound to cause trouble for some translators. So each of these string pairs should be handled as one.
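To make the "handled as one" idea concrete, here is a minimal sketch of a single translatable sentence that carries its link inside it. The `{a}` / `{/a}` markers and the `render_prompt` helper are hypothetical, invented purely for illustration; nothing in this thread says the app does it this way.

```python
def render_prompt(template, href):
    """Replace the hypothetical {a}...{/a} markers with an HTML anchor."""
    return template.replace("{a}", f'<a href="{href}">').replace("{/a}", "</a>")


# The translator controls where the link sits inside the sentence.
link_in_the_middle = "If you already have an account, please {a}log in{/a} instead."
link_at_the_end = "Already have an account? {a}Log In{/a}"

print(render_prompt(link_in_the_middle, "/login"))
print(render_prompt(link_at_the_end, "/login"))
```

With one string per sentence, there is no hardcoded space between "context" and "link" for a translator to trip over.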

There's no need for you to act on these just yet, since there may be others. I'll compile a complete list of changes that need to be made when I'm ready to start editing key names. Or I can make the changes myself, with your approval, if you can help me out with the syntax. (I'm even less experienced with JS than I am with PHP.) For now, I'd merely like to confirm that making such changes won't create any problems.

How about unique key names?

After removing the above-mentioned pair of instances, we can summarize the globals situation thusly: we've got a total of 14 global strings, each used in only two or three places, for a total of just 35 app.trans calls.

That's not an awful lot. In fact, the numbers are so small that I've started to wonder whether it might be a good idea to use a unique key name for every string. Here's how we could do it:

The dev would start by prefixing every key name by location.
Each string would therefore be grouped with all other strings in the same location.
Each key whose string is a global would be followed by a reference to that global, as DSitC has suggested.
The globals would be grouped together for easy location.
Comments on globals would merely list the unique keys that reference them.

Please note that this doesn't mean we'd necessarily have to use a unique key name for every app.trans call. Cases such as core.bio_placeholder, which is used twice in the same location, could use the same key name. But it would mean adding 21 new keys, and about 35 lines to the YML file (not counting comments).

This approach would have advantages for both translators and devs:

From the translator's point of view, it would make it easier to locate a global string that's being used in the location he/she is concentrating on, and then quickly cross-check whether the translation will work in other locations where the string is used. And if for some reason the global string just isn't working out for a specific location, the translator would not need to ask for the string to be split: he/she could just replace that reference with a string value that fits.

Of course all the keys that we have decided to split (like the "button versus title" situations) would also reference the globals, so that would reduce the number of duplicate strings to be translated to zero. And in the rare case where a translator finds him/herself translating two different English strings into the exact same phrase, he/she can extract that phrase as a global and point both keys at it, again without bothering the devs.

From the developer's point of view, there is the obvious advantage of not having to handle as many requests for new strings. Beyond that, it will allow us to make the rules for naming keys simpler and easier to follow.

Of course, someone will have to check whether there's a global string to be referenced in each case, but this would no longer need to be done as part of the coding process. Adding strings to code would become a simple matter of (1) adding a new, unique key name (including a quick check to be sure that it is indeed unique) and then (2) adding that key and its string to the YML file. The extraction of duplicates as globals could be left for later cleanup, which is an easy task that doesn't need to be done by a programmer.
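The "later cleanup" step could be as simple as listing keys that share the same string. The sketch below assumes the YML has already been loaded into a flat dictionary of key to string; the key names are made up for the example.

```python
from collections import defaultdict


def find_duplicate_strings(catalog):
    """Group keys by their string and keep only strings used by 2+ keys."""
    keys_by_string = defaultdict(list)
    for key, string in catalog.items():
        keys_by_string[string].append(key)
    return {s: sorted(keys) for s, keys in keys_by_string.items() if len(keys) > 1}


# Hypothetical catalog of unique keys, before any globals are extracted.
catalog = {
    "header_log_in_link": "Log In",
    "log_in_submit_button": "Log In",
    "sign_up_dialog": "Sign Up",
}

duplicates = find_duplicate_strings(catalog)
print(duplicates)
```

Each entry in the result is a candidate for extraction as a global; whether the strings really are interchangeable is still a judgment call for a person.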

The downside to all this would be any performance issues that might arise from the referencing mechanism. Not to mention the effort involved in implementing such a mechanism, of course.

Please let me know what you think of this idea!

Dominion · Sep 10, 2015

For simplicity, I limited the above discussion of unique key names to the core. Things get slightly trickier if we take extensions into account. Here are some things we'd need to consider:

The proposal implies that core strings can't be used directly in any extension code. We'd want to have a line for each string in the extension YML. This is to preserve uniqueness; direct use of core strings would negate the advantages of the system.

So any resolution of extension keys to core strings would be handled by the YML referencing mechanism. Is this likely to cause any issues?
Would extensions be allowed to reference non-global core strings? (I would suggest that this be allowed only when the name of the extension key exactly matches that of the core key being referenced, i.e. when the string is used in the same manner and location as the referenced core string.)
Seen from this angle, namespacing could be handled as a simple fallback mechanism: if you don't find a string in the extension YML, look for it in the core YML!
When an extension wants to reference a non-global core string that isn't used in the same manner or location (assuming we choose to allow that), should that string be separated out as a global? (This seems a reasonable thing to do, but it would increase the number of references in the core YML, obviously.)
When a core key is referenced by an extension, should the core key be given a comment to indicate this?
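The fallback idea mentioned above (look in the extension YML first, then in the core YML) can be sketched with Python's standard `collections.ChainMap`. This only illustrates the lookup order; the catalogs and key names are hypothetical, and the real translator may work quite differently.

```python
from collections import ChainMap

# Hypothetical catalogs, already loaded from their YML files.
core_catalog = {"log_in": "Log In", "reply_button": "Reply"}
extension_catalog = {"reply_button": "Reply to Thread"}

# ChainMap searches its mappings in order: extension first, then core.
lookup = ChainMap(extension_catalog, core_catalog)

print(lookup["reply_button"])  # found in the extension catalog
print(lookup["log_in"])        # not in the extension, so core is used
```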

Regarding the last two points, it goes without saying that we'd only be able to do this for bundled extensions. Third party devs would need to track their string usage on their own and be ready to make adjustments if a core string that they've been referencing gets changed. (But the uniqueness factor would make it easier for them to respond to such a situation, since they could merely replace the reference with a string.)

There may be other things I'm not taking into account. My thinking re: extensions is still a bit woolly at this point.

Franz · Sep 10, 2015

Sounds good to me. Very solid.

Any negative performance impact of the referencing mechanism can be compensated for by simply compiling all locales into one PHP file (with references already resolved) whenever an extension is added / updated.
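As a rough illustration of that compile step, the sketch below merges several catalogs and replaces each reference with the literal string it points to, so nothing needs resolving at runtime. It is written in Python rather than PHP, assumes references are marked with a leading `=>` (one of the syntaxes floated in this thread), and only handles a single hop; the key names are invented.

```python
REFERENCE_PREFIX = "=>"  # assumed marker for a reference


def compile_locale(*catalogs):
    """Merge catalogs (later ones win) and resolve single-hop references."""
    merged = {}
    for catalog in catalogs:
        merged.update(catalog)

    compiled = {}
    for key, value in merged.items():
        if value.startswith(REFERENCE_PREFIX):
            target = value[len(REFERENCE_PREFIX):].strip()
            value = merged[target]  # raises KeyError if the target is missing
        compiled[key] = value
    return compiled


core = {"core.log_in": "Log In", "core.header_log_in_link": "=> core.log_in"}
extension = {"example_extension.log_in_link": "=> core.log_in"}  # hypothetical

compiled = compile_locale(core, extension)
print(compiled)
```

The compiled dictionary could then be written out once, whenever an extension is added or updated, and read back without any reference handling.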

Dominion · Sep 10, 2015

Franz Any negative performance impact of the referencing mechanism can be compensated for by simply compiling all locales into one PHP file (with references already resolved) whenever an extension is added / updated.

Ooh, nice!

It occurred to me that we'd need to put some sort of check on referencing within the same file, so that when Key A takes it to Key B and it finds another reference there, it throws an error. Otherwise a careless translator could easily send it into a loop. But we might want to make it possible for a key in an extension to reference a key in the core, and then take one further hop from there.
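One possible shape for such a check is sketched below. Note that it is a plain cycle check, which is looser than the "error on a second reference in the same file" rule described above: it lets a chain continue for as many hops as needed and only raises when a key comes up twice. The `=>` marker, the `ReferenceLoopError` name, and the keys are all assumptions for the sake of the example.

```python
class ReferenceLoopError(Exception):
    """Raised when following references leads back to a key already visited."""


def resolve(key, catalog):
    """Follow '=>' references from key until a literal string is reached."""
    visited = []
    value = catalog[key]
    while value.startswith("=>"):
        if key in visited:
            raise ReferenceLoopError(" -> ".join(visited + [key]))
        visited.append(key)
        key = value[2:].strip()
        value = catalog[key]
    return value


catalog = {
    "ext.log_in_link": "=> core.header_log_in_link",  # extension -> core
    "core.header_log_in_link": "=> core.log_in",      # one further hop
    "core.log_in": "Log In",
    "core.a": "=> core.b",                            # careless loop
    "core.b": "=> core.a",
}

print(resolve("ext.log_in_link", catalog))
```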

Toby · Sep 11, 2015

Agreed, everything you've outlined sounds good. Regarding extensions, referencing core translations will be fine. I don't think core should make accommodations for any extensions, even if they're bundled – so no, if there's no reason for a string to be a global in core, then it shouldn't be made a global.

Regardless, let's just focus on getting the basics of this system implemented first, and then we can tweak!

Dominion It occurred to me that we'd need to put some sort of check on referencing within the same file, so that when Key A takes it to Key B and it finds another reference there, it throws an error. Otherwise a careless translator could easily send it into a loop. But we might want to make it possible for a key in an extension to reference a key in the core, and then take one further hop from there.

Good thinking. We'll build in some kind of loop detection.

Can we quickly discuss the format that references should take? Possibilities:

core:
  # What @DSitC originally proposed
  log_in_action: => core.log_in_title 

  # Would it be safe to omit the prefix and assume anything
  # in the format of foo.bar is a reference? My thought is
  # probably not...
  log_in_action: core.log_in_title 

  # Other ideas...
  log_in_action: > core.log_in_title
  log_in_action: ~core.log_in_title
  log_in_action: @core.log_in_title

I think @DSitC's original syntax is probably the safest, but just wanted to open the discussion.
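To show why an explicit marker is easier to detect safely than a bare `foo.bar`, here is a small sketch that recognizes only the arrow form. It works on string values after the YAML has been parsed; `parse_reference` is a hypothetical helper, not part of any existing API.

```python
def parse_reference(value):
    """Return the referenced key if value uses the '=>' form, else None."""
    if not value.startswith("=>"):
        return None
    target = value[2:].strip()
    return target or None  # a bare '=>' with no target is not a reference


print(parse_reference("=> core.log_in_title"))  # treated as a reference
print(parse_reference("core.log_in_title"))     # treated as ordinary text
print(parse_reference("@someone replied"))      # treated as ordinary text
```

Ordinary strings that happen to contain a dot, or that start with `@`, are left alone because only the arrow prefix is treated as special.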

Dominion · Sep 11, 2015

Okay, since it seems we all agree, I'll get underway on the assumption we'll be doing it this way.

Toby I don't think core should make accommodations for any extensions, even if they're bundled – so no, if there's no reason for a string to be a global in core, then it shouldn't be made a global.

Yes, that makes sense. Noted.

Toby Can we quickly discuss the format that references should take? Possibilities:

I agree that a plain foo.bar is probably best avoided. Any of your "Other ideas" seem good, though there's a small outside chance that someone might want to begin a string with an "@". The syntax proposed by @DSitC would probably be safe, and it might be worth memorializing the fact that he suggested it.

So unless @Franz has any objections, I'm happy to go with that.

My next question is: How soon would it be possible to put the referencing mechanism/compiler in place?

There's no hurry on this, as it'll take me a while to get the final key name taxonomy figured out. But if it seems likely to take a while, I'd want to plan for it. I could do the following as I adjust the key names in the YML and code:

Add the reference lines to the YML file, but comment them out.
Add alternative lines with the unique key names in the code, and comment those out too.

Then when the compiler is ready, it would be a simple matter of uncommenting those things, and removing the old lines with the non-unique key names from the code.

Like I say, it'll be a while before I'm ready to start on the actual editing, so there's no need to set a schedule right now. I just thought it would be a good idea to bring it up here so we can coordinate our efforts.

EDIT: I suppose an alternative would be to do it as two branches, one with the non-unique keys and another with the unique ones. But since I'm new to Git, it's probably safest to do it as described above. Unless it's not necessary, of course.

Toby · Sep 11, 2015

Dominion Go ahead and make the changes as if the referencing system is in place. Implementation should be easy so we'll be able to get that done very quickly whenever the time comes.

Dominion · Sep 11, 2015

Toby Will do!

Another quick question: we agreed earlier that globals should be given no prefixes, but as we've since decided to organize everything by location, I think it might be good to give the globals a standardized prefix as well. Doing so would allow us to:

Keep them together in a clump when extracting data for one purpose or another.
Add general comments about globals (e.g. instructions on how to reference) should we desire.

I was thinking of using a simple "x" as the prefix, to indicate that the keys could be used in various locations. But on second thought, it might be better to do something like "aaa" or "zzz" to put them together at the top or bottom of the file when we alphabetize.

... Though come to think of it, it's not very likely that we'll have many locations beginning with x, y, or z.

Do you think such a prefix would be a good idea? Any preference as to which prefix we should use?

Franz · Sep 11, 2015

I'm fine with that arrow syntax.

Dominion I was thinking of using a simple "x" as the prefix, to indicate that the keys could be used in various locations. But on second thought, it might be better to do something like "aaa" or "zzz" to put them together at the top or bottom of the file when we alphabetize.

How about an underscore? Not very pretty, but it should work well...

Toby · Sep 11, 2015

I would prefer no prefixes for globals, I don't think distinguishing them is particularly important? We can still group them together in the YAML file if need be, we don't necessarily have to alphabetically order the whole thing.

But if you think a prefix is absolutely necessary, how about global_ ?

Dominion · Sep 11, 2015

Thanks for the responses. Equal-greater arrow it is!

Franz How about an underscore?

Nice idea, but thinking ahead to writing documentation, it would be best to reserve that for talking about suffixes. Similarly, anything that looks like a key name and ends with an underscore will be considered a prefix.

Toby We can still group them together in the YAML file if need be, we don't necessarily have to alphabetically order the whole thing.

That's true ... it was more the "extraction" part I was thinking about. For example, in order to get an overview of the keys for this reorganization I've been shuffling them between Word and Excel, sorting them this way and that, etc. Sorting by name would scatter them to the four winds.

But I suppose that on the rare occasion when I need to do that, I can just temporarily prefix them before I start any sorting. So I'll go ahead with no prefixes. As you say, that is the best way.

Dominion · Sep 11, 2015

Another oddity I noticed is that things like dialog titles will also have to be prefix-less.

Otherwise it would end up looking like: change_email_change_email_title.

Dominion · Sep 11, 2015

@Toby Just a heads up...

Dominion Of course all the keys that we have decided to split (like the "button versus title" situations) would also reference the globals, so that would reduce the number of duplicate strings to be translated to zero.

One thing I didn't take into account when writing this line is the suffixes on (potentially) problematic cases.

Since we're going to the bother of referencing, of course we'd want to set up globals for these too. It's very likely that many translators will be able to use the same string for both a title and a button! In such cases, extracting globals will mean removing the difference signaled by the suffix.

So it seems globals will be not only prefix-less, but suffix-less as well. In other words, your original key names will suffice in most of those cases.

Dominion · Sep 12, 2015

@Toby and @Franz ... just a couple quick updates to keep you apprised of my progress.

I've gone through about one-fifth of the strings, devising unique key names for each and making note of global references where needed. My thinking about key names has been evolving a bit during this process, so I thought you'd want to know how things are shaping up.

Regarding globals:

As I suspected, the global strings will all use their existing key names. Not only are they all short and descriptive enough, they will also be cross-referenced back to the unique keys that reference them, so there's no need to make them any more complicated.

Regarding prefixes:

For the most part, I'm trying to stick with prefixes that are similar to the JS filenames, which tend to describe the location where strings are used well enough. But there are some cases where a bit of rearranging seems in order. For example, there are four JS files related to the composer:

Composer
DiscussionComposer
EditPostComposer
ReplyComposer

It might be useful to translators to group these together, so I'm using a general-to-specific scheme for prefixes:

composer_
composer_discussion_
composer_edit_
composer_reply_
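The effect of the general-to-specific scheme is easy to see with a quick sort. In the sketch below the prefix table is taken from the lists above, while `make_key` and the key endings (`submit_button` and so on) are made up for illustration.

```python
# Prefixes from the list above, keyed by the JS component file name.
PREFIX_BY_COMPONENT = {
    "Composer": "composer_",
    "DiscussionComposer": "composer_discussion_",
    "EditPostComposer": "composer_edit_",
    "ReplyComposer": "composer_reply_",
}


def make_key(component, name):
    """Build a key name from a component's prefix and a hypothetical ending."""
    return PREFIX_BY_COMPONENT[component] + name


keys = [
    make_key("ReplyComposer", "submit_button"),
    "header_log_in_link",
    make_key("Composer", "close_tooltip"),
    make_key("EditPostComposer", "submit_button"),
    "settings_title",
    make_key("DiscussionComposer", "title_placeholder"),
]

# Alphabetical sorting keeps all composer-related keys next to each other.
sorted_keys = sorted(keys)
for key in sorted_keys:
    print(key)
```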

I was afraid this might become too cumbersome, but so far it seems to be working out well enough.

Regarding suffixes:

Since we're going with unique key names, it occurred to me that the suffixes could be a bit more detailed (and therefore descriptive). For example, you had suggested "_action" as a suffix for things like buttons, links, etc. That was fine when a single key had to cover multiple uses of the same string. But unique keys allow us to give translators more intuitive descriptions of how a string is used. If it's a button, we can call it a button.

This means the list of suffixes will get a bit longer, but I don't think it will be any less consistent. And ultimately developers may find it easier to use suffixes that are less abstract and closer to natural language.

Dominion · Sep 12, 2015

Quick question about:

core.controls in js/lib/components/Dropdown.js
core.notifications in js/forum/src/components/NotificationsDropdown.js
core.post_number in js/forum/src/components/PostMeta.js

Under what conditions would a user (or translator) be able to see these strings?

Toby · Sep 14, 2015

Dominion

core.controls: On the right side of a user profile page
core.notifications: Next to the notifications icon on mobile
core.post_number: In the post permalink dropdown (click on a post's time)

Dominion · Sep 14, 2015

Thanks, that'll allow me to wrap up my first draft.

In a bit I should have a list of terms I'm using for suffixes; there are a couple I'd like to get your opinion on.

Dominion · Sep 14, 2015

Toby core.controls: On the right side of a user profile page

(Facepalm) Of course. I couldn't find it because I was looking at my own profile page.

So this seems to be the only moderator-specific key in the YML file so far. Interesting! I think I'll leave it out of the reorganization for the time being, and come back to it when we're adding in some other moderator functions like Suspend and so on. (Unless it also has a non-moderator use?)

Toby core.notifications: Next to the notifications icon on mobile

... Now that you mention it, I haven't even been thinking about mobile while coming up with key names. I hope it won't be a problem if key names are based mainly on how strings are used in the desktop layout. (I don't suppose too many translators will be using the mobile UI for localization work.)

Fortunately, this seems to be the only key I've found that doesn't seem to be used in the desktop layout. And hey, that reminds me: I've been thinking it would be good to have tooltips for the bell and flag. Would it be possible to use core.notifications for such a tooltip? Then I could call it core.notifications_tooltip or something.

On a slightly related note...

I just thought I'd mention this wee inconsistency in labeling that I noticed while working on the keys.

The Settings page refers to notifications delivered via the bell dropdown as "alerts". But everything in the bell dropdown refers only to "Notifications". Should the word "Alert" be in there somewhere?

One possibility would be to change the string for core.notifications in NotificationsDropdown.js to something like "Alerts" or "Alert Center". That string could be used as a tooltip for the desktop layout too, as I said above.
Another way would be to change the "Alert" on the Settings page to "Header" and leave the dropdown as it is. "Notifications delivered via the header" also sort of makes sense.
Or ... you could just say that "a foolish consistency is the hobgoblin of little minds" and leave everything as it is. Also a very reasonable choice, and gives bonus points for quoting Emerson.

Please let me know which of these three options you like; it will decide how I prefix the keys for the dropdown.

Dominion · Sep 14, 2015

Next step: Here's all the suffixes I've got. I'm trying to find names that translators will recognize easily, but they have to be easy for developers to remember and use as well. Let me know if you think any of them should be changed.

User actions

_button
_command -- displayed as a dropdown menu item
- Would a suffix such as _dropdown be more appropriate?
_confirmation -- displayed in a confirmation dialog handled by the browser
_help -- displayed prior to a user action (e.g. click button to receive an email, etc.)
_link
_message -- displayed after a user action (e.g. confirmation mail has been sent, etc.)
_prompt -- combines text prompting a user action with a link

Structural

_column -- displayed as the heading above settings or information arranged in a column
_dialog -- displayed as the title of a modal
- I think this will be more recognizable for translators than _modal would be.
_field -- displayed as a label above or next to a data entry field
_heading -- displayed as the heading above information presented as a list
_placeholder -- displayed as the default content of an empty data entry field
_row -- displayed as the heading next to settings or information arranged in a row
_section -- displayed as the label for a group of settings (e.g. "Account" on the settings page)
- Perhaps _area would be a better name?
_tooltip
_title -- displayed in the window title (e.g. for the Settings page)
- This is the reason I'm not using _title for modal titles...

Information

_notification -- displayed as notification content (e.g. when a discussion was renamed, etc.)
_text -- any informative string that doesn't fit any of the above categories

That's what I've got now. I've come up with a few more than I was expecting, but not nearly as many as I had feared. If you can suggest a better name for any of the above, or any ideas for better ways of organizing strings according to how they're used, I'm all ears!
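For what it's worth, a fixed suffix list like the one above lends itself to a simple automated check. The sketch below copies the suffixes from this post as drafted; the `suffix_of` helper and the sample keys are hypothetical.

```python
# Suffixes as drafted in this post (subject to change after review).
KNOWN_SUFFIXES = (
    "_button", "_command", "_confirmation", "_help", "_link", "_message",
    "_prompt", "_column", "_dialog", "_field", "_heading", "_placeholder",
    "_row", "_section", "_tooltip", "_title", "_notification", "_text",
)


def suffix_of(key):
    """Return the known suffix that key ends with, or None if there is none."""
    for suffix in KNOWN_SUFFIXES:
        if key.endswith(suffix):
            return suffix
    return None


print(suffix_of("composer_reply_submit_button"))  # has a known suffix
print(suffix_of("log_in"))                        # suffix-less, like a global
```

A key that returns `None` is either a global (which are meant to be suffix-less) or a sign that the list needs another entry.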

PS: We'll probably have to add a few (such as _page maybe) when we get to the ACP. Hopefully not too many.

Toby · Sep 16, 2015

Dominion I think I'll leave it out of the reorganization for the time being, and come back to it when we're adding in some other moderator functions like Suspend and so on. (Unless it also has a non-moderator use?)

Previously it was used as the default label for a dropdown button (if not overridden), but on second thoughts I think that's a bad idea, so I've made it specific to the user profile dropdown.

Dominion I hope it won't be a problem if key names are based mainly on how strings are used in the desktop layout.

That's fine. We've tried to keep the mobile layout as semantically similar to the desktop layout as possible, so I don't think there will be any issues here.

Dominion I've been thinking it would be good to have tooltips for the bell and flag. Would it be possible to use core.notifications for such a tooltip? Then I could call it core.notifications_tooltip or something.

Sounds good!

Dominion The Settings page refers to notifications delivered via the bell dropdown as "alerts". But everything in the bell dropdown refers only to "Notifications". Should the word "Alert" be in there somewhere?

Let's change the "Alert" on the Settings page to "Web".

Thoughts on the Suffixes

I think we should combine _button and _command, because these elements can actually end up being the same thing. For example, take a look at the "Reply" button and the commands in the adjacent dropdown on the right... Now take a look on mobile by tapping the three dots in the header... The "Reply" button shows up as a command! Same code/element, just different appearance.

In terms of picking one, I think I prefer _command.
I prefer _section over _area.
Everything else looks good, though could I please ask to see some examples of how each of the "structural" suffixes is used?