TM Search Options

There is one tab containing the TM search parameters. Its appearance is as follows:

The Search tab contains the following elements:

· Search for source segment automatically when a translation unit is opened:

If this checkbox is checked, MetaTexis will automatically search in the specified TMs when a TU is opened and the translation box is empty (no translation or TM segments present).

If the Untranslated units only option is checked, the search will only be executed if the translation box is empty.

If the All (also check for different translations) option is checked, the search will also be executed if there is already a translation in the document. If TM contains a translation for the current source text which differs from the translation in the document, a special dialog is displayed which lets you decide what to do with the existing translation in the document (see the Working with TM check results).

If the checkbox Update search in Main TM only is checked, the search for existing translation in the TM is restricted to the main TM.

This function is very important to ensure a consistent translation of your document, for, as a general rule, identical sentences should have the same translation. This is especially true for technical documents.

· Apply language classes:

If this checkbox is checked, MetaTexis will look for language classes rather than for the exact language defined. There are several languages that have variants. For example, there are many variants of the English, French and Spanish languages. If language classes are applied, MetaTexis treats the variants of a language as the same language, e.g. English (UK) and English (USA) are treated as one language. So, if a TM contains segments in different languages that belong to the same language class (e.g. "English (UK)" and "English (US)"), they are all included in the search. On the other hand, if this checkbox is unchecked, MetaTexis will only include the segments that are in the same language as the source language of the MetaTexis document (see Document options).

· Restrict search to these categories:

If you enter a category in this text box, the TM search is restricted to segments with this category. You have to be careful with this command: Make sure that the category entered actually exists in the TMs.

If you enter more than one category, they must be separated by a semicolon.

· Restrict search to these translators:

If you enter a translator name in this text box, the TM search is restricted to segments whose last editor is one of the specified translators. You have to be careful with this command: Make sure that the translators entered actually exist in the TMs.

If you enter more than one translator, they must be separated by a semicolon.

· Minimum similarity for selecting TM segments:

In this text box, you define the lower limit of similarity which a TM segment must reach to be presented to the translator. The percentage refers to the number of words which are identical. A TM segment in a database is only selected if at least X % of the words are identical with the words of the source segment searched for.

Example: You have defined 60% as the minimum level of similarity (default). And you want to translate the sentence: "He loves Alicia." You let MetaTexis search for TM segments in the main TM, which contains only three segments (and their translations): 1) "He hates Enrique.", 2) "He wants Alicia.", and 3) "He loves Shakira.". TM segments 2) and 3) are selected because 2 of 3 words are equal (66.6%), whereas TM segment 1) is not selected because only one word is equal (33.3%).

· Minimum similarity for selecting TM segments in case of identical sub-segments:

In some cases, it can make sense to select a segment from a TM even if the minimum similarity has not been reached, namely when a sub-segment is identical.

Example: In the two sentences "She loves Enrique desperately, but hopelessly." and "She loves Enrique." the sub-segment "She loves Enrique" is identical. The similarity value is 50%. Therefore "She loves Enrique" does not meet the normal minimum similarity criterion, based on the number of words. However, if the similarity criterion for sub-segments is less than 50%, the segment is selected from the TM, after all.

Note

Note: The settings made in this tab have a great impact on the speed of the search process when the TMs are big: The lower the values, the slower the search process. The higher the values, the faster the search process. If you experience bad search performance there are two ways to improve: Firstly, set the minimum similarity to higher values. Secondly, if the option "Use TM as TDB" is switched on, switch it off.

· Ignore index fields:

This checkbox is not shown for tagged documents. If this checkbox is active, index fields in TUs are ignored when MetaTexis executes TM searches, and TUs are saved without any index fields, if RTF saving active.

· Ignore internal tags:

This checkbox is only shown for tagged documents. If this option is checked, internal tags will be ignored and not be saved in the TM. This is relevant for tagged documents such as HTML or XML documents. You are advised to activate this option because internal tags usually only contain formatting information.

· Use TM also as TDB:

If this option is checked, the TM will not only be searched as TM, but also as TDB, that is, the TUs in the TM will be treated as terminology. This can further increase your translation efficiency, for example, when the text to be translated contains segments consisting of several smaller sentences already translated before.

Caution: In the case of big TMs this option can have a very negative impact on the search performance. So, if you experience bad search performance, make sure that this option is switched off.

· Language chain searching:

If this option is checked, the search will be extended to find more TUs if the TM contains multi-lingual content. For example, let’s assume that you are translating a text from English to French (EN->FR). If the TM contains TUs in the language combinations EN->IT and IT->FR, where one EN segment is very similar or identical to the segment currently searched, the TM search will usually not be successful because there is no EN->FR dataset in the TM. However, if the language chain searching is active, MetaTexis will look further. And if the IT segments are identical, MetaTexis will actually find the French translation of the Italian text and assign it to the English source text, and an EN->FR hit will be displayed. This search even works across TMs!

Moreover, if the inverse searching is active, the language chain search even works if the language direction are mixed, e.g. MetaTexis will find a match if the TM has the TUs IT->EN and FR->IT.

· Inverse search (the databases must be enabled for inverse searching):

If this option is checked, the TMs will also be searched for matches with the opposite language direction. This option only works if the database is activated for inverse searching and saving when it is created (see Local MetaTexis Databases). Combined with the language chain-searching feature, this opens up amazing possibilities (see above).

· Handling for prevalent words:

To improve the search speed in TMs MetaTexis uses a special technology for words used in very many segments (like "the" and "a" in the English language). If the search speed in TMs is slow you can try to improve by changing the method used for the handling of prevalent words. There are three options: "Off", "Special handling", and "Skip". In the case of option "Off" prevalent words are handled like all other words. In the case of option "Special handling" the prevalent words are treated in a special way. In the case of option "Skip" prevalent words are not included in the fuzzy search. The latter option might be useful for languages with words or signs that occur in virtually every segment or for languages with many prevalent words or signs.

· Threshold for prevalent words handling:

The threshold for prevalent words handlings determines which words are handled as prevalent words. The default value is 33%, that is if a word occurs in more than 33% of all TUs in a TM, the word is view as prevalent word by MetaTexis.