How are Trados fuzzy matches defined?
Thread poster: Harry Bornemann
Harry Bornemann
Harry Bornemann  Identity Verified
Mexico
Local time: 00:36
English to German
+ ...
Nov 2, 2004

Does anyone know the exact formula?

For example, if I have a first segment
"one two three" and a second segment
"two three four"
I thought that the second one would be a 66% match, because two thirds of it are contained in the first one.
But Trados tells me that this is only a 34% match, although it is a simple example without formatting, tags, etc.
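
The 66% expectation above corresponds to a plain word-overlap count, which ignores word order. A minimal sketch of that count, assuming whitespace tokenization (the function name is made up for illustration and this is not Trados's method):

```python
def word_overlap_percent(stored, new):
    """Share of the words in `new` that also occur anywhere in `stored`."""
    stored_words = set(stored.split())
    new_words = new.split()
    if not new_words:
        return 0.0
    shared = sum(1 for word in new_words if word in stored_words)
    return 100.0 * shared / len(new_words)


# "two" and "three" are found in the first segment, "four" is not: 2 of 3 words
print(round(word_overlap_percent("one two three", "two three four")))  # 67
```

Because the count never looks at where a word sits in the segment, it cannot explain the much lower figure Trados reports.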


 
Jerzy Czopik
Jerzy Czopik  Identity Verified
Germany
Local time: 08:36
Member (2003)
Polish to German
+ ...
I don't know the exact formula Nov 2, 2004

but in your example not only has 1/3 of the segment changed, the position of the words has changed too. Since Trados does not "understand" the text, it analyses it mathematically, taking words, abbreviations, declension, position in the sentence, formatting, tags and so on into consideration.
I could imagine that the match for "one two three" and "two two three" would be about 66% ("four" has one letter more than "one", so perhaps this matters too).

Regards
Jerzy


 
Harry Bornemann
Harry Bornemann  Identity Verified
Mexico
Local time: 00:36
English to German
+ ...
TOPIC STARTER
two two three Nov 2, 2004

Thanks Jerzy,

You guessed right: "two two three" gives 67%.

Nevertheless I would still need the definition.
I wonder whether it is patented or even a secret,
although it is the base of our calculations...
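
One position-sensitive measure that can be tried against the two figures in this thread is a word-level edit distance (Levenshtein distance over words), normalized by the length of the longer segment. This is only a sketch under that assumption. The function names are invented, and nothing here shows that Trados actually works this way.

```python
def word_edit_distance(a, b):
    """Levenshtein distance between two word lists (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        cur = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            cur.append(min(prev[j] + 1,          # delete word_a
                           cur[j - 1] + 1,       # insert word_b
                           prev[j - 1] + cost))  # keep or substitute
        prev = cur
    return prev[-1]


def edit_similarity_percent(stored, new):
    """100% minus the edit distance as a share of the longer segment."""
    a, b = stored.split(), new.split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - word_edit_distance(a, b) / longest)


print(round(edit_similarity_percent("one two three", "two three four")))  # 33
print(round(edit_similarity_percent("one two three", "two two three")))   # 67
```

The results, 33 and 67, land close to the 34% and 67% that Trados reported for these two pairs. That may well be a coincidence of such short examples, so it is no substitute for the real definition.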


 
Jerzy Czopik
Jerzy Czopik  Identity Verified
Germany
Local time: 08:36
Member (2003)
Polish to German
+ ...
Write an email to Nov 2, 2004

support-de // trados.com, maybe they can give you an answer.

Regards
Jerzy


 
Selçuk Budak
Selçuk Budak  Identity Verified
Local time: 09:36
English to Turkish
+ ...
Check "Penalties" Tab Nov 2, 2004

When determining the fuzzy match percentage, Trados takes into account the "penalties" you set under "Options/Translation Memory Options".

For example, if you set "formatting differences penalty" to 10, it subtracts this figure from the overall match figure. Assume that you have two identical strings, "one two three" and "one two three". If there is a formatting difference, you would get not a 100% match but a 90% match (assuming that the other parameters are set to 0).

So to increase the fuzzy match, lower the penalty values defined in that tab.

h.i.h.
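
The penalty arithmetic described above can be sketched as a simple subtraction. The function and the penalty key are hypothetical names chosen for illustration, not Trados settings identifiers, and flooring the result at 0 is an assumption.

```python
def apply_penalties(base_match, penalties, triggered):
    """Subtract the points of every penalty that applies to this match."""
    total = sum(points for name, points in penalties.items() if name in triggered)
    return max(0, base_match - total)  # assumption: never below 0%


penalties = {"formatting_differences": 10}

# Identical text, but the formatting differs: 100 - 10
print(apply_penalties(100, penalties, {"formatting_differences"}))  # 90

# Identical text and identical formatting: no penalty applies
print(apply_penalties(100, penalties, set()))  # 100
```

This only covers the deduction step. How the base figure is computed before penalties remains the open question of the thread.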


 
Harry Bornemann
Harry Bornemann  Identity Verified
Mexico
Local time: 00:36
English to German
+ ...
TOPIC STARTER
I need to know it exactly Nov 2, 2004

Thanks Selçuk,

but I need to know the complete definition (formula, algorithm),
so I wrote to Trados support, as Jerzy suggested.

I will post the results,

Harry


 
Harry Bornemann
Harry Bornemann  Identity Verified
Mexico
Local time: 00:36
English to German
+ ...
TOPIC STARTER
It is a Business Secret Nov 3, 2004

I received the answer from Trados:
They won't tell me, because it is one of their critical business secrets.
So we will never know what fuzzy match really means.


 
Brandis (X)
Brandis (X)
Local time: 08:36
English to German
+ ...
I think it goes on character basis Nov 3, 2004

Hi! Well, Trados won't tell us. I think they do it on a character basis: a = a is 100%, and ab gives 50% each, etc. But that is very mathematical.
Brandis


 
Harry Bornemann
Harry Bornemann  Identity Verified
Mexico
Local time: 00:36
English to German
+ ...
TOPIC STARTER
Different definitions of different vendors Nov 3, 2004

Trados:
It is a critical business secret.
Déjà Vu:
A fuzzy match is one in which the sentence retrieved from the Translation Memory is not identical but only similar to the one currently being translated. The percentage you see is calculated by taking into account how many words differ, how the embedded codes differ, and also the order words are in. The percentage is not something accurate (except from a very specific point of view) and should only be used as an indicator to the translator who must then evaluate these matches and accept or reject them on the basis of his knowledge and understanding.

The actual method for calculating these percentages, which is very closely related to the way the searches are done, I cannot reveal.
Wordfast:
A fuzzy match is a non-exact match for a TU source segment (present in the memory) as compared to a document's source segment (the segment we want to translate). The algorithm is very complex. On longer sentences, the program calculates the percentage of words that are found in the two segments.
Words that begin with the same letters (like "Frau" and "Frauen") but which are different will of course be counted as being present in both sentences, but of course, with some degree of penalties based on how many letters differ, and of course with some threshold limits.

Fuzzy-match algorithms are not really trade secrets, but it would take pages and pages to describe them. What matters, essentially, is that the similarity is rated in a percentage way, based on words (exact or similar) rather than letters, and most fuzzy-match algorithms are essentially the same with only very minor differences. In real life, translators care for anything above 80% and *really* care for 99 and 100%. Anything lower, practically, needs attention and re-translation anyway. Bickering over finer fuzziness points, which is very language-dependent, is an activity I would not spend too much time on.
Passolo:
It is a business secret only revealed to big clients who need it to optimize their work flow.

Still no accurate definition found...

[Edited at 2004-11-04 12:02]
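
The Wordfast description above (the share of words found in both segments, with partial credit for words that begin with the same letters) can be illustrated roughly as follows. The 3-letter prefix threshold and the way partial credit is scaled are invented for this sketch, since Wordfast gives no numbers, and word order is ignored here.

```python
def common_prefix_len(a, b):
    """Number of leading characters that two words share."""
    n = 0
    for char_a, char_b in zip(a, b):
        if char_a != char_b:
            break
        n += 1
    return n


def word_credit(word, candidates, min_prefix=3):
    """1.0 for an exact match, otherwise the best partial credit for a shared prefix."""
    best = 0.0
    for cand in candidates:
        if word == cand:
            return 1.0
        shared = common_prefix_len(word, cand)
        if shared >= min_prefix:  # hypothetical threshold
            best = max(best, shared / max(len(word), len(cand)))
    return best


def fuzzy_percent(stored, new):
    """Average credit over the words of the new segment, as a percentage."""
    stored_words, new_words = stored.split(), new.split()
    if not new_words:
        return 100.0 if not stored_words else 0.0
    total = sum(word_credit(word, stored_words) for word in new_words)
    return 100.0 * total / len(new_words)


# "die" matches exactly; "Frauen" shares 4 of its 6 letters with "Frau"
print(round(fuzzy_percent("die Frau", "die Frauen")))  # 83
```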


 
Klas Törnquist
Klas Törnquist
Local time: 08:36
English to Swedish
+ ...
Unacceptable Nov 3, 2004

Harry_B wrote:

I received the answer from Trados:
They won't tell me, because it is one of their critical business secrets.
So we will never know what fuzzy match really means.


I think this is quite unacceptable. Many agencies base their discount demands on Trados analyses. Some even use all the different Trados fuzzy percentages.
IIRC, Trados (or Translationzone) has even published "recommended rate reductions" for fuzzies.
Thus, Trados more or less decides "standard discounts" and still they won't tell us what the fuzziness really is.

Klas


 

