Monday, July 15, 2024

Automatically sentence-case i18next translations – BigBinary Blog


We use i18next to handle our localization
requirements. We have written in great detail about how we use the
i18next and react-i18next libraries
in our applications.

As our translations grew, we realized that instead of adding every combination of the
texts as separate entries in the translation file, we can reuse most of them by
using the i18next interpolation feature.

Interpolation is
one of the most used functionalities in i18n. It allows integrating dynamic
values into our translations.

{
  "key": "{{what}} is {{how}}"
}

i18next.t("key", { what: "i18next", how: "nice" });
// -> "i18next is nice"
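
To make the mechanics concrete, here is a minimal sketch of what interpolation does conceptually. The `interpolate` helper is a hypothetical stand-in written for this illustration; it is not i18next's implementation or API, and it only handles plain `{{name}}` placeholders.

```javascript
// Hypothetical stand-in for interpolation, for illustration only.
// Replaces each {{name}} placeholder with the matching value, and leaves
// the placeholder untouched when no value is supplied.
const interpolate = (template, values) =>
  template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : match
  );

const translations = { key: "{{what}} is {{how}}" };

console.log(
  interpolate(translations.key, { what: "interpolation", how: "handy" })
);
// -> "interpolation is handy"
```

The important point for the rest of this post is that the dynamic values are inserted exactly as they are passed in, casing included.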

Problem

As we started to use interpolation more and more, we started seeing a lot of text
with irregular casing. For instance, in one of our apps, we have an Add button
on multiple pages.

{
  "addMember": "Add a member",
  "addWebsite": "Add a website"
}

Instead of adding each text as an entry in the translation file as shown above,
we took a more generic approach and started using interpolation. Our
translation files started to look like this.

{
  "add": "Add a {{entity}}",
  "entities": {
    "member": "Member",
    "website": "Website"
  }
}

This is great, but it has a slight problem. With the translations above, the
final text came out as "Add a Member".

We can see that Member is still capitalized. We wanted it to be properly sentence-cased, that is, "Add a member".

We first thought we could just add .toLocaleLowerCase() to the dynamic value.

t("add", { entity: t("entities.member").toLocaleLowerCase() });

It worked fine. But developers would often forget to add .toLocaleLowerCase()
in a lot of places. It also started to pollute our code with too many
.toLocaleLowerCase() calls.

As always, we decided to extract this problem to our
neeto-commons-frontend
package.

Solutions we looked at

At first, it looked like a very simple problem. We thought we could just use the
post-processor
feature. We just need to sentence-case the entire text on post-process like
this.

import i18next from "i18next";
import LanguageDetector from "i18next-browser-languagedetector";
import { initReactI18next } from "react-i18next";

const sentenceCaseProcessor = {
  type: "postProcessor",
  name: "sentenceCaseProcessor",
  process: text => {
    // Sentence-case the whole text: uppercase the first character,
    // lowercase everything else.
    return (
      text.charAt(0).toLocaleUpperCase() + text.slice(1).toLocaleLowerCase()
    );
  },
};

i18next
  .use(LanguageDetector)
  .use(initReactI18next)
  .use(sentenceCaseProcessor)
  .init({
    // `resources` is the translation resources object, defined elsewhere.
    resources: resources,
    fallbackLng: "en",
    interpolation: {
      escapeValue: false,
      skipOnVariables: false,
    },
    // Run the post-processor on every translation.
    postProcess: [sentenceCaseProcessor.name],
  });

Voila! From now on, all the texts will be properly sentence-cased, and we no longer
need to add .toLocaleLowerCase(). Great? Not really.

We soon realized that not every text should be sentence-cased. There are lots
of cases where we need to preserve the original casing. Here are some examples.

Your file is larger than 2MB.
Disconnect Google integration?
No results found with your search query "Oliver".
Your Api Key: AJg3c4TcXXXXXXXXX
No internet, neetoForm is offline.
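
To see the damage concretely, the sketch below runs a naive sentence-casing function, with the same logic as the post-processor above, over a few of these strings. The name `naiveSentenceCase` is made up for this illustration.

```javascript
// Same logic as the naive post-processor: uppercase the first character
// and lowercase everything after it.
const naiveSentenceCase = text =>
  text.charAt(0).toLocaleUpperCase() + text.slice(1).toLocaleLowerCase();

const examples = [
  "Your file is larger than 2MB.",
  "Disconnect Google integration?",
  "No internet, neetoForm is offline.",
];

examples.forEach(text => console.log(naiveSentenceCase(text)));
// -> "Your file is larger than 2mb."
// -> "Disconnect google integration?"
// -> "No internet, neetoform is offline."
```

Units, proper nouns and product names all lose their casing, because the function cannot tell which parts of the final string are meant to keep it.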

These examples clearly show why it isn't a simple problem. We require a more
targeted and nuanced solution. Upon revisiting the problem, we found that our
initial solution of adding .toLocaleLowerCase() does work, but it is a bit
verbose.

So we decided to try
custom formatters.
Instead of adding .toLocaleLowerCase(), we created a custom formatter
called lowercase.

i18next.services.formatter.add("lowercase", (value, lng, options) => {
  return value.toLocaleLowerCase();
});

{
  "add": "Add a {{entity, lowercase}}",
  "entities": {
    "member": "Member",
    "website": "Website"
  }
}

This works perfectly, but it doesn't solve the verbosity problem. Instead of
adding .toLocaleLowerCase() in JavaScript files, we're now adding the formatter in
translation JSON files, essentially just moving the problem to a different
place.

We needed a better solution that required minimal effort.

The idea here is to lowercase all dynamic values by default and create a
formatter to handle exceptions. To achieve this, we combined our earlier
post-processor and a new formatter. The new formatter, which we called anyCase,
can be used to flag any dynamic part of the text that needs to be excluded from
lowercasing. The post-processor will ignore those particular parts of the text
while sentence-casing.

const ANY_CASE_STR = "__ANY_CASE__";
// Wrap the value in markers so the post-processor can find it later.
i18next.services.formatter.add("anyCase", (value, lng, options) => {
  return ANY_CASE_STR + value + ANY_CASE_STR;
});

{
  "message": "Your file is larger than {{size, anyCase}}"
}

The post-processor we wrote tried to identify the parts of the text marked
by the anyCase formatter using pattern matching and to retain their original casing.
However, this approach failed when the text contained identical words in both
the dynamic and static parts of the text. It ended up lowercasing both words,
which is not the output we wanted.

Final solution

Before we discuss the final solution, note that i18next recently changed how a formatter
is added. The new way, which is what we have been using so far, looks like this.

i18next.services.formatter.add("underscore", (value, lng, options) => {
  // Replace each run of whitespace with an underscore.
  return value.replace(/\s+/g, "_");
});

Before this, i18next had a different syntax, which it now calls legacy formatting.
It looks like this.

i18next.use(initReactI18next).init({
  resources: resources,
  fallbackLng: "en",
  interpolation: {
    format: (value, format, lng, options) => {
      // All our formatters should go here.
    },
  },
});

Now back to our original problem.

We need to make sure that formatting is applied only to the dynamic parts. For
this, we found that the legacy version of formatting provides an
option called alwaysFormat: true. One thing to remember here is that if we choose
to use this flag, the newer style of formatting doesn't work. That means we
need to move all our custom formatters to the legacy format function.

i18next.use(initReactI18next).init({
  resources: resources,
  fallbackLng: "en",
  interpolation: {
    escapeValue: false,
    skipOnVariables: false,
    // Call `format` for every interpolated value.
    alwaysFormat: true,
    format: (value, format, lng, options) => {
      // All your formatters should go here.
    },
  },
});
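
Because the legacy format function is a single entry point, several formatters have to share it. One possible way to organize that, sketched below, is to dispatch on the format name that the function receives as its second argument. The `formatters` map and the `uppercase` formatter are hypothetical names for this illustration, and the sketch assumes the format name is undefined when a placeholder specifies no format.

```javascript
// Hypothetical registry of formatters, keyed by the name used in the
// translation string, e.g. "{{name, underscore}}".
const formatters = new Map([
  ["underscore", value => value.replace(/\s+/g, "_")],
  ["uppercase", value => value.toLocaleUpperCase()],
]);

// Same parameter order as the legacy interpolation.format option.
const format = (value, formatName, lng, options) => {
  const formatter = formatters.get(formatName);
  // Leave non-string values and unknown or missing format names untouched.
  if (typeof value !== "string" || !formatter) return value;
  return formatter(value);
};

console.log(format("hello big world", "underscore")); // -> "hello_big_world"
console.log(format("hello", undefined)); // -> "hello"
console.log(format(42, "uppercase")); // -> 42
```

A function like this could be passed as the format option in the init call above, with any default behavior for unformatted values added in the fall-through branch.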

This is not a problem for us, because we're already maintaining all our custom
formatters in one place (the neeto-commons-frontend package). Now the formatter is
applied to every dynamic text. This approach also overcame the "identical words
in the text" problem that we encountered with the previous version of the
formatter. Let's look at our updated formatter.

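const LOWERCASED = "__LOWERCASED__";
const ANY_CASE = "anyCase";

const lowerCaseFormatter = (value, format) => {
  // Skip empty values, non-strings and values flagged with anyCase.
  if (!value || format === ANY_CASE || typeof value !== "string") {
    return value;
  }
  // Lowercase the value and mark it so the post-processor can detect it.
  return LOWERCASED + value.toLocaleLowerCase();
};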

To elaborate on the code, the formatter lowercases all dynamic texts and
prefixes them with __LOWERCASED__. This prefixing is necessary because the
formatter has no information about where this particular piece of text
appears in the complete text. By adding this prefix, if the lowercased text
happens to be the first part of the output, we can capitalize it again during the
post-processing stage. And that is precisely what we do in the
post-processor.

const sentenceCaseProcessor = {
  type: "postProcessor",
  name: "sentenceCaseProcessor",
  process: value => {
    const shouldSentenceCase = value.startsWith(LOWERCASED); // Check if the text starts with a lowercased dynamic value.
    value = value.replaceAll(LOWERCASED, ""); // Remove all __LOWERCASED__ markers.

    // sentenceCase uppercases only the first character (see the full listing).
    return shouldSentenceCase ? sentenceCase(value) : value;
  },
};
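
To check how the formatter and the post-processor cooperate, here is a self-contained sketch that runs both without i18next. The `translate` function and its regular expression are hypothetical stand-ins for this illustration only: they assume, as alwaysFormat: true is described above, that the formatter is called for every placeholder, with the format name passed when one is given.

```javascript
const LOWERCASED = "__LOWERCASED__";
const ANY_CASE = "anyCase";

const sentenceCase = value =>
  value.charAt(0).toLocaleUpperCase() + value.slice(1);

const lowerCaseFormatter = (value, format) => {
  if (!value || format === ANY_CASE || typeof value !== "string") return value;
  return LOWERCASED + value.toLocaleLowerCase();
};

const postProcess = value => {
  const shouldSentenceCase = value.startsWith(LOWERCASED);
  const cleaned = value.replaceAll(LOWERCASED, "");
  return shouldSentenceCase ? sentenceCase(cleaned) : cleaned;
};

// Hypothetical stand-in for i18next: format every {{name}} or
// {{name, format}} placeholder, then post-process the final string.
const translate = (template, values) => {
  const interpolated = template.replace(
    /\{\{\s*(\w+)\s*(?:,\s*(\w+)\s*)?\}\}/g,
    (match, name, format) => String(lowerCaseFormatter(values[name], format))
  );
  return postProcess(interpolated);
};

console.log(translate("Add a {{entity}}", { entity: "Member" }));
// -> "Add a member"
console.log(translate("{{entity}} added to Google", { entity: "Member" }));
// -> "Member added to Google"
console.log(
  translate("Your file is larger than {{size, anyCase}}", { size: "2MB" })
);
// -> "Your file is larger than 2MB"
```

Static text such as "Google" is never touched, a dynamic value in the middle of a sentence is lowercased, and a dynamic value at the start is lowercased by the formatter and then capitalized again by the post-processor.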

Below is everything put together. If you're interested in a working example of
the same, check out this
gist.

import i18next from "i18next";
import LanguageDetector from "i18next-browser-languagedetector";
import { initReactI18next } from "react-i18next";

const LOWERCASED = "__LOWERCASED__";
const ANY_CASE = "anyCase";

// Uppercase only the first character; leave the rest untouched.
const sentenceCase = value =>
  value.charAt(0).toLocaleUpperCase() + value.slice(1);

// Lowercase every dynamic value unless it is flagged with anyCase.
const lowerCaseFormatter = (value, format) => {
  if (!value || format === ANY_CASE || typeof value !== "string") {
    return value;
  }
  return LOWERCASED + value.toLocaleLowerCase();
};

const sentenceCaseProcessor = {
  type: "postProcessor",
  name: "sentenceCaseProcessor",
  process: value => {
    const shouldSentenceCase = value.startsWith(LOWERCASED);
    value = value.replaceAll(LOWERCASED, "");

    return shouldSentenceCase ? sentenceCase(value) : value;
  },
};

i18next
  .use(LanguageDetector)
  .use(initReactI18next)
  .use(sentenceCaseProcessor)
  .init({
    // `resources` is the translation resources object, defined elsewhere.
    resources: resources,
    fallbackLng: "en",
    interpolation: {
      escapeValue: false,
      skipOnVariables: false,
      alwaysFormat: true,
      format: (value, format, lng, options) => {
        // other formatters
        return lowerCaseFormatter(value, format);
      },
    },
    postProcess: [sentenceCaseProcessor.name],
    detection: {
      order: ["querystring", "cookie", "navigator", "path"],
      caches: ["cookie"],
      lookupQuerystring: "lang",
      lookupCookie: "lang",
    },
  });
