Gentry is the generic file parser technology that KantanMT uses to parse XML-based file formats. Gentry is script-driven and uses rule files (*.rul) to tell your KantanMT engine what to translate. These rule files are easy to create in a simple text editor, and it’s not uncommon for a complete XML parser to be built in a matter of minutes.
For example, suppose you want to translate the following XML file using your KantanMT engine:-
<?xml version="1.0" encoding="UTF-8"?>
Introduction
Acme Technologies provide intelligence, visual analysis, and forecasting for data centers
We protect servers by allocating them to a suitable virtual recovery environment in case of a service outage.
In the scenario above, the elements <para> and <title> need to be translated by your KantanMT engine. We can use a Gentry rule file to define these as follows:-
Here we define para and title as roots. These are the fundamental parsing blocks for a Gentry rule file. Defining an element as a root means that all its content is extracted and translated by your KantanMT engine. Applying this rule to this scenario ensures that every <para> and <title> element is translated by your KantanMT engine.
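The effect of declaring para and title as roots can be sketched outside Gentry. The Python below is only an illustration under stated assumptions, not Gentry's implementation: the `<doc>` wrapper, the placement of the `<chapter>` elements, and the `source` values in the sample are a hypothetical reconstruction, and `extract_roots` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the sample file: the <doc> wrapper and the
# source attribute values are assumptions made for this sketch.
SAMPLE = """<doc>
  <chapter source="printed">
    <title>Introduction</title>
    <para>Acme Technologies provide intelligence, visual analysis, and forecasting for data centers</para>
  </chapter>
  <chapter source="online">
    <para>We protect servers by allocating them to a suitable virtual recovery environment in case of a service outage.</para>
  </chapter>
</doc>"""


def extract_roots(xml_text, roots):
    """Return (tag, text) pairs for every element whose tag is in roots, in document order."""
    tree = ET.fromstring(xml_text)
    return [(el.tag, el.text or "") for el in tree.iter() if el.tag in roots]


# Every <para> and <title> is selected, regardless of where it sits.
segments = extract_roots(SAMPLE, {"para", "title"})
for tag, text in segments:
    print(tag, "->", text)
```

With both names listed as roots, all three text segments are selected for translation; nothing else in the document is touched.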
However, suppose you only want to translate chapters that are to be printed - in other words, <chapter> elements that have a source attribute equal to ‘printed’. Adding this conditionality is simple with Gentry:-
In this rule, we only translate <para> elements if their <chapter> element has its source attribute equal to ‘printed’; all other <para> elements are ignored.
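The same condition can be expressed as a short Python sketch. This is not Gentry rule syntax; it only shows the selection logic. The sample markup (the `<doc>` wrapper and the "online" value) and the helper name `printed_paras` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one chapter marked for print, one not.
SAMPLE = """<doc>
  <chapter source="printed">
    <title>Introduction</title>
    <para>Acme Technologies provide intelligence, visual analysis, and forecasting for data centers</para>
  </chapter>
  <chapter source="online">
    <para>We protect servers by allocating them to a suitable virtual recovery environment in case of a service outage.</para>
  </chapter>
</doc>"""


def printed_paras(xml_text):
    """Return the text of <para> elements inside a <chapter> whose source is 'printed'."""
    tree = ET.fromstring(xml_text)
    # ElementTree's XPath subset supports attribute predicates like [@source='printed'].
    path = ".//chapter[@source='printed']//para"
    return [para.text or "" for para in tree.findall(path)]


selected = printed_paras(SAMPLE)
print(selected)
```

Only the paragraph in the printed chapter is returned; the paragraph in the other chapter is skipped.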
Now you can see the power of Gentry and its rule files!
A Gentry rule file defines all the root elements that are to be translated by your KantanMT engine. It looks like this:-
para
title
(.*)
$1
$1
This section defines all the Root elements in your XML document that you want your KantanMT engine to translate.
This section defines extraction, insertion, and output rules for your root elements. You must have at least one Regex defined in this section of the file.
|Element|Description|
|---|---|
|<gextractrule>|This element defines a matching regex for each root element. In the example above it matches everything from each root element.|
|<gextractOutputRule>|This element defines how each matching root element is presented to your KantanMT engine. In the example above, the matching element is passed straight through to KantanMT and no additional formatting is added to it.|
|<ginertRule>|This element defines how the matching translated root element is returned from your KantanMT engine and inserted back into your XML document. In the example above, the translation is inserted directly into the XML document with no additional formatting or text added.|
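The three rules describe a round trip: extract, send to the engine, insert the result. A minimal Python sketch of that flow follows. It is an analogy rather than Gentry's actual behaviour: Python writes the first capture group as `\1` where the rule file uses `$1`, the insert rule is assumed to be applied to the translated text, and `round_trip` and `fake_translate` are hypothetical names.

```python
import re

EXTRACT_RULE = r"(.*)"  # match everything inside the root element
OUTPUT_RULE = r"\1"     # Python spelling of $1: pass the match through unchanged
INSERT_RULE = r"\1"     # Python spelling of $1: insert the translation unchanged


def round_trip(root_text, translate, output_rule=OUTPUT_RULE, insert_rule=INSERT_RULE):
    """Extract root text, hand it to a translator, and format the result for reinsertion."""
    match = re.fullmatch(EXTRACT_RULE, root_text, flags=re.DOTALL)
    if match is None:
        return root_text  # nothing matched: leave the element as it was
    to_engine = match.expand(output_rule)
    translated = translate(to_engine)
    # Assumption: the insert rule's group 1 refers to the translated text.
    return re.fullmatch(r"(.*)", translated, flags=re.DOTALL).expand(insert_rule)


def fake_translate(text):
    """Stand-in for a translation engine: upper-cases the text."""
    return text.upper()


result = round_trip("Introduction", fake_translate)
print(result)

# An insert rule may also add formatting around the translation.
bracketed = round_trip("Introduction", fake_translate, insert_rule=r"[\1]")
print(bracketed)
```

With the pass-through rules, the translated text replaces the original exactly; changing the output or insert rule is where extra formatting would be introduced.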
KantanMT Regex (Regular Expressions) are used to build PEX and Gentry rule files. They are similar to standard regular expressions, with a few modifications to make them more powerful and flexible.