Internationalising Plumi

The 3.0 release of Plumi will be internationalised, and include a few translations. There are a few guides on how to internationalise and translate Plone products (the good ones are linked below). I haven’t found one that sums it up from start to finish, so while I was going through the process of internationalising Plumi (which is now in the trunk of svn!), I wrote this guide.

Concepts

Before a product can be translated, it needs to be set up for translation: every string that appears to the end user must be marked so that it can be replaced with a translated version. So rather than saying ‘print “Hello World!”’ in the code, an internationalised program will say ‘print translate(“hello_world”)’ or something to that effect. The translate function works out (elsewhere) which language the user is working in, and looks the string up in a dictionary. Somewhere there will be a dictionary that maps “hello_world” to “Hello World!” if we’re working in English.

The first part of the process, finding strings in the code and templates and setting them up to be translated, is called internationalisation, or i18n for lazy typists (like most good coders); there are 18 letters between the i and the n. So i18n requires finding all of the user-facing strings in the project, marking them up in some way, and creating a dictionary for the initial language (English in this case). There is more complicated stuff in there too, such as allowing for right-to-left scripts, and dynamic content that might have a different grammar, but I won’t go into this here.

After the i18n is complete (or even before), localisation (l10n) starts to happen. L10n is like translation, but with the more complicated stuff (script direction, grammar, etc) thrown in. I’m only going to consider the translation side of things. L10n basically involves translating the dictionary described above. So the program will see the string “hello_world”. The English dictionary will map this string into the English phrase “Hello World!”, while the French dictionary might map it into “Bonjour le Monde!”
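
To make the dictionary idea concrete, here is a minimal sketch in plain Python. The CATALOGS structure and this translate function are made up for illustration; they are not how gettext or Plone store translations, only the shape of the lookup.

```python
# Hypothetical in-memory catalogs: language code -> {msgid: translated string}.
CATALOGS = {
    "en": {"hello_world": "Hello World!"},
    "fr": {"hello_world": "Bonjour le Monde!"},
}


def translate(msgid, language="en"):
    """Return the string for msgid in the given language.

    Falls back to the msgid itself when the language or the msgid is
    unknown, so an untranslated string still shows something.
    """
    return CATALOGS.get(language, {}).get(msgid, msgid)


print(translate("hello_world"))        # English dictionary
print(translate("hello_world", "fr"))  # French dictionary
print(translate("goodbye", "fr"))      # unknown msgid comes back unchanged
```

Translating the product then amounts to adding another entry to CATALOGS, without touching the code that calls translate.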

Underneath the Hood

Much of the free software world uses the GNU gettext library to handle i18n and l10n. To the programmer, this is basically one function: ‘translate’ as used above (the library itself calls it gettext). But since programmers are lazy, most of the time this is aliased to the function _ (that’s a single underscore). So if you see _(“hello_world”) in a program, then it is probably an i18n’ed program (depending on the language, of course…). The _ function will find (somehow) the context (i.e. the language in use), then translate appropriately if it can. Of course, it needs the dictionary files available in order to do this.

In a complex product like Plone there might be many things that need translating, and so possibly many different dictionaries (all for the same language), e.g. a dictionary for the main Plone product, a dictionary for the Comment product, etc. So we get i18n domains: basically different dictionaries that apply to specific areas of the code. A dictionary provides a particular domain, and a product (or piece of code) declares which i18n domain it wants to use. So when adding a patch to the Plone core you’d want to use the plone domain, whereas when creating your own product you might want to create your own domain and use that.

Those dictionary files are typically kept in a locales directory. There is usually a .pot file for each domain, and a directory for each translation, named by language code, e.g. fr/ for the French translation. Each of these contains a directory LC_MESSAGES, which holds the .po files (one for each domain) and .mo files. .po stands for Portable Object, and is where the translations for a particular language live. The .pot file is the template for the .po files and is what is given to translators: it contains the string definitions and the initial language. .mo files are compiled .po files for faster computer lookups. Recent versions of Plone automatically compile .mo files on startup, but it is probably better to distribute compiled .mo files with your product once a translation is complete.
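
The layout is easier to see when spelled out as paths. The two helper functions below are hypothetical names invented for this sketch; they only build the file names described above and touch nothing on disk.

```python
import os


def catalog_path(locales_dir, language, domain, extension="po"):
    """Build <locales_dir>/<language>/LC_MESSAGES/<domain>.<extension>."""
    filename = "%s.%s" % (domain, extension)
    return os.path.join(locales_dir, language, "LC_MESSAGES", filename)


def template_path(locales_dir, domain):
    """The .pot template sits directly inside the locales directory."""
    return os.path.join(locales_dir, domain + ".pot")


print(template_path("locales", "plumi"))
print(catalog_path("locales", "fr", "plumi"))
print(catalog_path("locales", "fr", "plumi", extension="mo"))
```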

The older system of i18n in plone uses the i18n directory. .po, .pot and .mo files still live there, but in a flat format. A product ‘collective.foo’ might have an i18n directory, in which would live the files foo.pot, foo-fr.po, foo-de.po and so on, each of these being part of the ‘foo’ i18n domain. In the same directory there could be the files plone.pot and plone-de.po and plone-fr.po, which would extend the already existing plone domain with strings specific to that product. However, the i18n folder for translations is part of the old way of doing things, and if possible, you should use the locales folder instead.

Plumi and Plone

Plumi is a collection of products – plumi.skin, plumi.content, plumi.mediahost, (and some others) and plumi.app which pulls them all together. Two approaches exist for i18n’ing plumi – create an i18n domain for each product and ship the translations with each product, or create a single i18n domain ‘plumi’ and ship the translations as a separate product ‘plumi.locales’ (or perhaps as part of plumi.app). There are advantages and disadvantages with both, but we will go with the latter.

Step 1: Create the product/place for the translations

First up we’ll create a product ‘plumi.locales’ that will hold the locales, and not a lot else.

$ paster create -t plone plumi.locales

You could, of course, skip this step and keep the locales in your own product’s directory, in which case the subsequent steps are relative to your package’s directory.

Going into plumi.locales/plumi/locales, configure.zcml should become something like:

<configure
    xmlns="http://namespaces.zope.org/zope"
    xmlns:five="http://namespaces.zope.org/five"
    xmlns:i18n="http://namespaces.zope.org/i18n"
    i18n_domain="plumi">
  <five:registerPackage package="." initialize=".initialize" />
  <i18n:registerTranslations directory="locales" />
</configure>

And you’ll need to create the locales directory (yes, plumi.locales/plumi/locales/locales). You can now add directories for the languages you will be translating to, e.g. id, then create an LC_MESSAGES directory inside each of them:

~/.../plumi.locales/plumi/locales$ mkdir locales; mkdir locales/id; mkdir locales/id/LC_MESSAGES

Step 2: Internationalise your code

For each product (this example will use plumi.content) that is going to use the plumi localisations, we’ll need to modify the configure.zcml file so it has:

        i18n_domain="plumi"

inside the first <configure> tag. We don’t need to register the translations directory inside any other product, as they are handled elsewhere.

In the __init__.py file in plumi.content/plumi/content, we need to add (near the top)

from zope.i18nmessageid import MessageFactory
plumiMessageFactory = MessageFactory('plumi')

Then in each source file that includes user-facing strings, we need to add (near the top)

from plumi.content import plumiMessageFactory as _

which will allow us to use the function _() to translate things, e.g. a translatable string would look like:

_(u'hello_world', default='Hello World!')

which will define a msgid hello_world and a default msgstr for that of ‘Hello World!’. Alternatively we could just use:

_(u'Hello World!')

which would result in the msgid being “Hello World!” with no separate default: the msgid itself is used as the msgstr for the default translation.

Finished putting every single user-facing string in the above format? Good, now we can create the .pot file. First, we’ll need to install i18ndude. This is on PyPI, so if you have setuptools installed, you can just run:

$ easy_install-2.4 i18ndude

Then we create the pot files from the locales directory (e.g. plumi.locales/plumi/locales):

$ i18ndude rebuild-pot --pot locales/plumi.pot --create plumi ../../../plumi.app/ ../../../plumi.content/ ../../../plumi.skin/

This creates the .pot file from all strings found in the plumi.app, plumi.content and plumi.skin directories. This command should be re-run after any changes to any strings in the code.

Next we need to create .po files, which will hold the actual translations. This can be done through the .po file editor, which can open a .pot file and generate the relevant .po (more on this later). But after the .pot file is updated, any existing .po files will also need to be updated. This can be done by the following line (which will also create any non-existent .po files):

$ i18ndude sync --pot locales/plumi.pot locales/id/LC_MESSAGES/plumi.po locales/ms/LC_MESSAGES/plumi.po
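
Conceptually, syncing brings each .po file in line with the template. The function below is a simplified model of that idea using plain dictionaries; it is not i18ndude's implementation, and the name sync_catalog is made up for this sketch.

```python
def sync_catalog(template_msgids, existing):
    """Return a new catalog containing exactly the msgids from the template.

    Existing translations are kept, new msgids get an empty msgstr, and
    msgids that are no longer in the template are dropped.
    """
    return dict((msgid, existing.get(msgid, "")) for msgid in template_msgids)


template = ["hello_world", "event_starts_on"]
old_catalog = {"hello_world": "Bonjour le Monde!", "removed_string": "..."}

print(sync_catalog(template, old_catalog))
```

Syncing against an empty dictionary gives a fresh catalog with every msgstr blank, which mirrors how a non-existent .po file gets created.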

Step 3: Internationalise your templates

Since plone templates hold a lot of user facing strings, they need to be i18n’ised as well. The first thing to do is to tell the template which domain to use. In the main tag for the template (<html> if your template is HTML), you need to specify the namespace, and define the domain to use:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"
      ...
      xmlns:i18n="http://xml.zope.org/namespaces/i18n"
      lang="en"
      i18n:domain="plumi">

If your template is meant for inclusion in another template and doesn’t have <html> tags, you can specify i18n:domain inside a <tal>, <div> or <span> tag (or any other tag) that encloses your content. You can even specify the domain for individual elements. Suppose you wanted most strings in a template to come from the plone domain, but had one or two strings specific to your product: you could put i18n:domain="plone" in the <html> tag, and for the product strings put something like:

<span i18n:domain="myproduct" i18n:translate="">My Product's String</span>

which also shows the basics of marking a string for translation in a template: add i18n:translate to the tag. If you have a string ID, you can put that in too:

<a href="http://foo.com" i18n:translate="linktext_foo">Foo Website</a>

Or leave it empty to use the string itself as the string ID:

<h1 i18n:translate="">Welcome to Plumi</h1>

The i18n:translate attribute can be put into any tag – if you don’t have a tag around your string, you can put a <span> around it, which will avoid messing up any formatting.

Tag attributes can also be translated – suppose you have image alt text you want translated. You specify which attributes should be translated like this:

<img src="foo.jpg" alt="Image of foo" title="Foo" i18n:attributes="alt title">

You can also specify msgids by using a semicolon-separated list instead:

<img src="foo.jpg" alt="Image of foo" title="Foo" i18n:attributes="alt foo-alt; title foo-title">

Things get a little more tricky with dynamic content. If you want to translate an event, you might have a string which includes a dynamic date:

<p>The event starts on <span tal:content="here/start_date">15 Dec 2012</span>.</p>

Adding i18n to the <p> tag would create:

<p i18n:translate="event_starts_on">The event starts on <span tal:content="here/start_date">15 Dec 2012</span>.</p>

but this isn’t quite enough, due to the dynamic span. So we give the dynamic content an i18n:name attribute, which will replace it in the .pot and .po files:

<p i18n:translate="event_starts_on">The event starts on <span tal:content="here/start_date" i18n:name="start-date">15 Dec 2012</span>.</p>

The generated .po/.pot string will look like:

msgid "event_starts_on"
msgstr "The event starts on ${start-date}."

This lets the translation team know there is dynamic content. The ${start-date} variable should be left as-is by the translators, but can be moved relative to the other words in the string. This mechanism lets translators see complete phrases to translate, rather than short fragments which don’t necessarily follow the same order in other languages.
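
How the placeholder survives translation can be sketched in a few lines of plain Python. This is an illustration of the substitution idea only, not the code Zope uses; the interpolate function and the reordered sentence are made up for the example.

```python
import re

# Matches ${name}, where name may contain letters, digits, '_', '.' and '-'.
_VARIABLE = re.compile(r"\$\{([A-Za-z0-9_.-]+)\}")


def interpolate(text, mapping):
    """Replace each ${name} with mapping[name]; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        if name in mapping:
            return str(mapping[name])
        return match.group(0)
    return _VARIABLE.sub(substitute, text)


values = {"start-date": "15 Dec 2012"}

# The original word order.
print(interpolate("The event starts on ${start-date}.", values))
# A translator is free to move the variable within the sentence.
print(interpolate("On ${start-date}, the event starts.", values))
```

Because the variable is substituted after the catalog lookup, the translated sentence can put the date wherever its grammar requires.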

Step 4: Internationalise your content

Plone content translations are handled by LinguaPlone. For user generated content, LinguaPlone adds a ‘translate to’ option to most pages, allowing users to translate content on the site. However, there are a couple of traps:

By design, LinguaPlone does not allow for the concept of fall-back languages. So if you click on the Indonesian flag to get the site in Indonesian, you will only see content that has been translated into Indonesian, not any English content. The exception is content that is language neutral: this will be displayed whichever language is selected. But by default all content is created with a language attribute, so we have to explicitly enable neutral creation for new content. In the ZMI for the Plone site, you could go to portal_languages and check ‘Create content initially as neutral language’ to do this, or set it in code with

from Products.CMFCore.utils import getToolByName

lang = getToolByName(self, 'portal_languages')
lang.start_neutral = 1

(you can also set other language related attributes here, such as the supported languages, and whether or not to display language flags on the site).
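
The visibility rule described above can be modelled in plain Python. This is only a sketch of the behaviour as described here, not LinguaPlone code; the item tuples and the visible_items function are invented for the example, with an empty language string standing for neutral content.

```python
def visible_items(items, selected_language):
    """Keep items in the selected language plus language-neutral ones.

    Each item is a (title, language) pair; an empty language string marks
    neutral content. There is deliberately no fall-back language.
    """
    return [title for title, language in items
            if language in ("", selected_language)]


site = [
    ("About us", "en"),
    ("Tentang kami", "id"),
    ("Logo", ""),          # language neutral: always shown
    ("News", "en"),        # never translated: hidden for Indonesian users
]

print(visible_items(site, "id"))
print(visible_items(site, "en"))
```

This is why untranslated English content vanishes for an Indonesian visitor unless it was created as neutral.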

There is also the issue of content that your product creates when it is installed. Automatically generating translations for this content, while keeping the translations in .po files, requires some Python work. In the case of Plumi, the main auto-generated content is vocabularies, folders and collections (smart folders), which pull in content based on its vocabulary.

Site content created using GenericSetup XML will have language attributes set already, and this is not changeable via GenericSetup, so I ended up setting this in code, using:

fldr.setLanguage('')

for each folder in the product install code. In the end this wasn’t needed, as I got rid of the GenericSetup folder creation and created the folders in code instead, but it may still be useful to some people.

Vocabularies

Vocabularies need translation. In our case the only direct contact the user has with vocabularies is when they are creating a video or callout. For that side of the equation, we need to tell the Archetypes schema what translation domain to use for displaying the strings, as well as i18n’ing the label and description as per above. So the ‘Categories’ field in plumivideo.py will look like:

    atapi.LinesField(
        'Categories',
        storage=atapi.AnnotationStorage(),
        widget=atapi.MultiSelectionWidget(
            label=_(u"Video Categories"),
            description=_(u"The video categories - select as many as applicable."),
            i18n_domain='plumi',
        ),
        vocabulary=NamedVocabulary("""video_categories"""),
        languageIndependent=True,
    ),

Any other fields that use vocabularies will need the i18n_domain set to whatever domain your translations are in (plumi in this case, or for ATCountryWidget, atcw).

Collections

The other side of the vocabs equation is in browsing content. This is handled by Collections (or Smart Folders as they were once known). Collections are definitely content, so each collection needs to have a translation created. Folders are also content, and any folders created via GenericSetup need translations added (since GenericSetup doesn’t seem to do this itself). So I decided to get rid of the GenericSetup structure for creating the initial folders, instead creating them (along with translations) in the installer code. Translation was done by creating two helper functions, called wherever a folder or collection was created or deleted:

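from Products.CMFCore.utils import getToolByName
from zope.component import getUtility
from zope.i18n import ITranslationDomain
from zope.i18nmessageid import MessageFactory
_ = MessageFactory("plumi")

# publishObject() is assumed to be a helper defined elsewhere in the install code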

def createTranslations(portal,canon):
    parent = canon.getParentNode()
    wftool = getToolByName(portal,'portal_workflow')
    plumiDomain = getUtility(ITranslationDomain, 'plumi')
    plumiLanguages = plumiDomain.getCatalogsInfo()
    langs = []
    for lang in plumiLanguages.keys():
        if str(lang) != 'test':
            langs.append(str(lang))
    for lang in langs:
        transId = '%s-%s' % (canon.id, lang)
        transTitle = plumiDomain.translate(canon.title,
                                           target_language=lang)
        transDesc = plumiDomain.translate(canon.description,
                                          target_language=lang)
        if not hasattr(parent, transId):
            if parent != portal and parent.hasTranslation(lang):
                #if parent folder has a translation, put the clone in that
                translation = parent.getTranslation(lang).manage_clone(canon,
                                                    transId)
            else:
                translation = parent.manage_clone(canon, transId)
            translation.setTitle(transTitle)
            translation.setDescription(transDesc)
            translation.setLanguage(lang)
            translation.addTranslationReference(canon)
            publishObject(wftool, translation)

def deleteTranslations(canon):
    for translation in canon.getBRefs():
        canon.getParentNode().manage_delObjects(translation.id)

For each language, the createTranslations function checks whether the parent node has a translation in that language and, if so, puts the new translation inside it. Using the translate() method of the ITranslationDomain utility allows the strings to be specified in .po files. The function tries to make a translation for each language found in the ‘plumi’ domain. If a particular string has not been translated for one of those languages, the translated object is still created, but with the canonical (untranslated) string(s).

The deleteTranslations function just goes through all of the linked translations for an object and deletes them.
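
The naming and fall-back behaviour of createTranslations can be tried out without a Plone site. The sketch below is a plain-Python model of the planning step only: plan_translations and the catalogs dictionary are invented for illustration, and the bracketed titles are placeholders rather than real translations.

```python
def plan_translations(canon_id, canon_title, languages, catalogs):
    """Return an (id, language, title) tuple for each translation to create.

    ids follow the '<canonical id>-<language>' pattern, the 'test'
    pseudo-language is skipped, and a missing translation falls back to
    the canonical title.
    """
    plans = []
    for lang in languages:
        if lang == "test":
            continue
        title = catalogs.get(lang, {}).get(canon_title, canon_title)
        plans.append(("%s-%s" % (canon_id, lang), lang, title))
    return plans


catalogs = {
    "id": {"Browse Content": "[id] Browse Content"},  # placeholder translation
    "ms": {},                                          # nothing translated yet
}

for plan in plan_translations("taxonomy", "Browse Content",
                              ["id", "ms", "test"], catalogs):
    print(plan)
```

The untranslated language still gets its own object, just with the canonical title, which matches the behaviour described above.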

Creating a folder and its translations (after first deleting it and translations if they exist) looks like:

    try:
        canon = getattr(self, 'taxonomy')
        deleteTranslations(canon)
        self.manage_delObjects(['taxonomy'])
    except:
        pass
    self.invokeFactory('Folder', id = TOPLEVEL_TAXONOMY_FOLDER,
                       title = _(u'Browse Content'))
    taxonomy_fldr = getattr(self,TOPLEVEL_TAXONOMY_FOLDER,None)
    publishObject(wftool,taxonomy_fldr)
    createTranslations(self,taxonomy_fldr)

Creating a collection (in this case as a child of ‘categ_fldr’) and its translations now looks like:

        categ_fldr.invokeFactory('Topic', id=new_smart_fldr_id,title=vocab[1])
        fldr = getattr(categ_fldr,new_smart_fldr_id)
        type_criterion = fldr.addCriterion('Type', 'ATPortalTypeCriterion' )
        type_criterion.setValue("Plumi Video")
        type_criterion = fldr.addCriterion('getCategories', 'ATListCriterion' )
        type_criterion.setValue(vocab[0])
        type_criterion.setOperator('or')
        state_crit = fldr.addCriterion('review_state', 'ATSimpleStringCriterion')
        state_crit.setValue('published')
        sort_crit = fldr.addCriterion('modified',"ATSortCriterion")
        sort_crit.setReversed(True)
        fldr.setLayout(layout_name)
        publishObject(wftool,fldr)
        createTranslations(self,fldr)

That’s all folks

And that just about wraps it up. This document was written over the course of a couple of weeks while actually i18n’ing Plumi. Sometimes I did something, then realised it was wrong and had to do it again, so I have possibly left fragments of the wrong way in the document. Work is already underway on the next-generation i18n system for Plone, and things will undoubtedly change, hopefully making some things easier.

References:

http://plone.org/documentation/how-to/i18n-for-developers – Basic reference from the plone site
http://plone.org/documentation/how-to/product-skin-localization – Basic reference for templates
http://grok.zope.org/documentation/how-to/how-to-internationalize-your-application – a slightly more detailed reference
http://maurits.vanrees.org/weblog/archive/2007/09/i18n-locales-and-plone-3.0 – Maurits’ site has heaps of useful info about i18n, and Plone in general
http://pypi.python.org/pypi/Products.LinguaPlone/ – info about LinguaPlone, from the source
http://www.mail-archive.com/ngo@lists.plone.org/msg00449.html – some examples of the LinguaPlone API
http://www.upfrontsystems.co.za/courses/plone/ch02s05.html – older document about translations in Plone, but some good examples.
http://plone.org/support/forums/core#nabble-td3142356 – discussion about the next generation i18n system for Plone
