Is there a way to fine tune settings to get better fuzzy matches?
Thread poster: Brent Sørensen
Brent Sørensen Germany Local time: 23:21 Member (2016) German to English + ...
Apr 16, 2020
Let's say that I have the following segment in my TM:
Although this is still considered an acceptable alternative name, most botanists now use the name Lamiaceae in referring to this family.
My new translation has the following segments:
1. Most botanists now use the name Lamiaceae in referring to this family.
2. This is still considered an acceptable name, and most botanists now use it when referring to this family.
Both would give fuzzy matches with a score of around 70%. However, Segment 1 gives a much more useful fuzzy match: all I have to do is delete the first part of the sentence. For Segment 2, I essentially have to translate from scratch, so the match saves me little time.
Is there any way to adjust the settings (possibly the penalty settings) so that I get only fuzzy matches of Type 1?
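To make the difference between the two types concrete, here is a minimal Python sketch. It is not a setting in any CAT tool; it only measures how much of the new segment appears in the TM segment as one unbroken run of words. The function names and the word-level tokenisation are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

TM = ("Although this is still considered an acceptable alternative name, "
      "most botanists now use the name Lamiaceae in referring to this family.")
SEG1 = "Most botanists now use the name Lamiaceae in referring to this family."
SEG2 = ("This is still considered an acceptable name, and most botanists "
        "now use it when referring to this family.")

def tokens(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())

def contiguous_coverage(tm_segment, new_segment):
    """Fraction of the new segment covered by its longest unbroken word run
    that also occurs in the TM segment (hypothetical helper)."""
    a, b = tokens(tm_segment), tokens(new_segment)
    if not b:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return match.size / len(b)

print(contiguous_coverage(TM, SEG1))
print(contiguous_coverage(TM, SEG2))
```

Under these assumptions, Segment 1 is covered completely by a single run from the TM segment, while the longest shared run covers only a third of Segment 2.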
You could write a regex to do a "cleanup" and convert to a single name.
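A rough sketch of what such a regex cleanup might look like in Python, applied to segments before they go into the TM or are looked up. The variant name "Labiatae" and the mapping are assumptions for illustration only; the thread does not say which names are involved or which tool would run the cleanup.

```python
import re

# Hypothetical mapping of name variants to one canonical name (assumption).
CANONICAL = {"labiatae": "Lamiaceae"}

# One pattern that matches any listed variant as a whole word, ignoring case.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in CANONICAL) + r")\b",
    re.IGNORECASE,
)

def normalize_names(segment):
    """Replace every known variant with its canonical name."""
    return PATTERN.sub(lambda m: CANONICAL[m.group(1).lower()], segment)

print(normalize_names("Some authors still use Labiatae for this family."))
```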
Brent Sørensen Germany Local time: 23:21 Member (2016) German to English + ...
TOPIC STARTER
Could you elaborate
Apr 17, 2020
Hi Anthony, Thanks for your response. Could you elaborate on that a bit? Thanks, Brent
DZiW (X) Ukraine English to Russian + ...
A per-project fitted value
Apr 17, 2020
Brent, while certain language pair combinations may tend toward specific fuzziness levels, CAT tools use different, proprietary formulas to calculate fuzzy matches (generally modifications of Levenshtein/edit distance). To add to the confusion, texts in some fields and styles can fluctuate badly, so even between similar SVO languages, I believe it’s a per-project value. Raise the fuzzy match threshold empirically until it eliminates most of the false positives.
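As an illustration of the general idea only, here is a minimal Python sketch of a word-level Levenshtein score with a threshold filter. Real CAT tools use their own proprietary formulas, so the scoring function, the function names, and the 0.75 default threshold are all assumptions.

```python
import re

TM = ("Although this is still considered an acceptable alternative name, "
      "most botanists now use the name Lamiaceae in referring to this family.")
SEG1 = "Most botanists now use the name Lamiaceae in referring to this family."
SEG2 = ("This is still considered an acceptable name, and most botanists "
        "now use it when referring to this family.")

def tokens(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())

def word_edit_distance(a, b):
    """Levenshtein distance over word lists (insert, delete, substitute = 1)."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, 1):
        curr = [i]
        for j, word_b in enumerate(b, 1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

def fuzzy_score(tm_segment, new_segment):
    """Similarity in [0, 1]: 1 minus distance over the longer word count."""
    a, b = tokens(tm_segment), tokens(new_segment)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - word_edit_distance(a, b) / longest

def matches_above(tm_segments, new_segment, threshold=0.75):
    """Keep only TM segments whose score reaches the threshold."""
    scored = [(fuzzy_score(tm, new_segment), tm) for tm in tm_segments]
    return [(score, tm) for score, tm in scored if score >= threshold]

print(round(fuzzy_score(TM, SEG1), 2), round(fuzzy_score(TM, SEG2), 2))
```

With this simplified scoring, the two segments from the opening post come out at roughly 0.57 and 0.67, so a threshold filters by score alone and cannot tell the two types of match apart.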
On the other hand, I don’t think it’s funny when translators have to waste their time tweaking and struggling with hardware or software instead of doing the job. Why should one make wild guesses and wait for a prompt (fuzzy match) when one already knows the translation, I wonder? Unless it’s about mimicking somebody else’s ‘uniformity’ guidelines, lexicon, and style, of course.