Custom Machine translation, a new revolution on the way?
Thread poster: Philippe Locquet
Philippe Locquet
Philippe Locquet  Identity Verified
Portugal
Local time: 20:40
English to French
+ ...
TOPIC STARTER
Excellent description, thank you! Nov 8, 2018

Anton Konashenok wrote:
MT


@Anton:
I have to thank you for this. Your post means a lot since it reflects real experience (even if it wasn’t an enjoyable one) and presents some of the real-world problems that can be encountered. I’d like to address some of the problems that you came across:
_Not the intended use: Some may have high expectations, hoping that their engine will be able to replace human translation. The goal at first should be to get suggestions that include relevant terminology when a segment falls below the fuzzy threshold of a connected TM (i.e. anything below a 60 or 70% match).
_Quality of training data: Thank you for mentioning this. You can’t expect a good cook to deliver a tasty dish if you give him rotten eggs. The data has to be good. Upon upload, a good Custom MT should automatically discard data that is not relevant. This can cut the input data in half, depending on the data.
_Amount of data: Leaving glossaries aside and focusing on TMs, the nominal amount needed to build an engine from scratch should be between 15 and 20 million words. It’s a lot, and it has to be.
_Metrics: They can’t compare with an actual human reading the output. Metrics can be useful to the technician at the beginning, while working blind. I’ll give you an example: if I’m building an engine from Hindi to Russian for someone, I will have no idea at all of the quality, so I will have to go by metrics at first.
_Human review: This is where the real testing lies. A good Custom Machine Translation platform should provide you with tools for this. Here are a few example tests:
*Editing: the reviewer can be asked to edit the text. If he struggles (time + feedback), the engine is not ready for production and needs reworking.
*Comparing: the reviewer can be asked to compare two versions of a text produced by two different engines (useful for comparing versions of the same engine, or for comparing with a different tool, etc.)
*Evaluation: What you were asked to do: give feedback and qualify issues according to preset parameters.
So the way these tools are managed has a lot to do with how they will feel for the translator. It is true and logical that they improve with time (same as with TMs). However, bad data is as bad as a bad TM: you can’t really use it. The sector is new, so those handling these tools are fairly new at it too, and there is real room for improvement. BTW, my answers are based on the tool I know, mentioned above.
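
To make the "intended use" point concrete, here is a minimal Python sketch of the routing idea: offer the TM match when it is at or above the fuzzy threshold, and fall back to the custom MT engine otherwise. Everything in it is an assumption for illustration: the function names, the 0.70 cut-off, and the use of `difflib.SequenceMatcher` as a stand-in for a CAT tool's own fuzzy-match scoring, which will differ in practice.

```python
from difflib import SequenceMatcher

# Assumed cut-off; the post mentions 60 or 70% as typical fuzzy thresholds.
FUZZY_THRESHOLD = 0.70


def best_tm_match(source, tm):
    """Return (score, target) for the closest TM entry, or (0.0, None)."""
    best = (0.0, None)
    for tm_source, tm_target in tm.items():
        # SequenceMatcher is only a stand-in for a real fuzzy-match score.
        score = SequenceMatcher(None, source.lower(), tm_source.lower()).ratio()
        if score > best[0]:
            best = (score, tm_target)
    return best


def suggest(source, tm, mt_engine):
    """Offer the TM match above the threshold, otherwise an MT suggestion."""
    score, target = best_tm_match(source, tm)
    if score >= FUZZY_THRESHOLD:
        return ("TM", target)
    # Below the threshold, the engine output is only a suggestion to edit.
    return ("MT", mt_engine(source))
```

Here `tm` is a plain dictionary of source segments to target segments and `mt_engine` is any callable that takes a source string; both are hypothetical placeholders for a real TM and a real engine connector.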

Thanks again for your comment; I really enjoyed reading it.
My bests 😊


 
Robert Rietvelt
Robert Rietvelt  Identity Verified
Local time: 21:40
Member (2006)
Spanish to Dutch
+ ...
Saying Nov 8, 2018

In Holland we say 'a chain is just as strong as its weakest link'. This is what Anton is saying.

Thank you for your clear explanation.


 
Philippe Locquet
Philippe Locquet  Identity Verified
Portugal
Local time: 20:40
English to French
+ ...
TOPIC STARTER
Indeed! Nov 8, 2018

Robert Rietvelt wrote:

In Holland we say 'a chain is just as strong as its weakest link'. This is what Anton is saying.

Thank you for your clear explanation.


Thanks Robert
Couldn't agree more!

My bests


 
Kaspars Melkis
Kaspars Melkis  Identity Verified
United Kingdom
Local time: 20:40
English to Latvian
+ ...
strict glossary adherence is sometimes overrated Nov 8, 2018

Because translation quality is inherently difficult to measure, there is a tendency to overuse glossary adherence as a quality metric and that can lead to undesirable outcomes.

While in general I agree that a glossary is important, there are some pitfalls when evaluation of glossary adherence is used for QC.

One case is when a term can have several meanings. For example, “power” can have different translations depending on the subject (maths, statistics, optics, physics). Many technical texts will combine several aspects, but the glossary may not include all fields. It is still up to the translator to evaluate whether the given glossary term applies in the given sentence. I have often found that the term “statistical power” is mistranslated either because the translator was relying on the given glossary too strictly or was afraid to use a correct translation because the QC tool would signal it as an error.

Another is when glossaries get overpopulated with too many terms that can also have an ordinary meaning. For example, “a head” can be a specific part in some device that needs to be translated consistently, and sometimes not with the same word as the head of a human, so it gets included in the glossary. The glossary will be used for all projects from the same client even if this term is never used again in a technical sense. Again, it can be quite confusing to a translator who receives instructions “to follow the glossary without exception”.

Some terms in English text can easily change from a noun to a verb or an adjective but that may not be the case in the target language. One example is “screening, to screen, screened (patients)”. It is possible that the translation needs to use a different word in each case, or that the verb needs to be translated with a longer phrase. Compiled glossaries can rarely predict such cases.

It may even be the case that the applicability of a term varies between texts of different registers. This is especially important in pharmacy, where the language for product information and patient leaflets needs to be adjusted. Sometimes a professional term might not be appropriate for patients in English but be fine in other languages, or vice versa: some English terms might be fine for both doctors and patients, but in other languages they have to be changed.

With this I don't want to minimise the necessity of using correct terminology in translations. I have seen translations where even simple terms are translated incorrectly. The most common mistake I have seen is “median”, which for some reason often gets translated as “average”. Maybe this mistake happens because the translator was not really qualified for the given text. An LSP may try to address quality issues by introducing strict glossary adherence, which might work. But it can also mask a translator's incompetence and create texts that pass all QC metrics but are still quite incomprehensible.

After all, it is much easier to fix a translation that is good overall but uses non-standard terms than the one which has all the right glossary terms but poor readability.
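
To illustrate the “statistical power” pitfall, here is a minimal Python sketch of a naive glossary-adherence check. It is not any particular QC tool's logic; the function name and the German example terms ("Leistung" for physical power, "Teststärke" for statistical power) are assumptions chosen for illustration.

```python
import re


def glossary_violations(source, target, glossary):
    """Naive QC: list glossary source terms whose expected target is missing."""
    issues = []
    for src_term, tgt_term in glossary.items():
        # Whole-word, case-insensitive match of the term in the source.
        in_source = re.search(
            r"\b" + re.escape(src_term) + r"\b", source, re.IGNORECASE
        )
        # Plain substring check in the target, as a simple QC tool might do.
        if in_source and tgt_term.lower() not in target.lower():
            issues.append(src_term)
    return issues


# Hypothetical single-field glossary: "power" only in its physics sense.
glossary = {"power": "Leistung"}

flagged = glossary_violations(
    "The study had a statistical power of 80%.",
    "Die Studie hatte eine Teststärke von 80 %.",
    glossary,
)
```

With this glossary, `flagged` contains "power" even though the target uses an appropriate statistical term, which is exactly the kind of false alarm that can push a translator toward the wrong word.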
 
Kaspars Melkis
Kaspars Melkis  Identity Verified
United Kingdom
Local time: 20:40
English to Latvian
+ ...
another reason why glossaries may cause lazy translations Nov 8, 2018

If a translator finds an unfamiliar term, it forces them to do research and not only find the translation of the term but also get a better understanding of the subject. If all terms are already provided, the translation task may seem easy and the translator may not even realize that there is something more to it.

I remember a case where there was a mistake in translation of “formulation” which can have several meanings even in pharmaceutical context: 1) an actual drug for specific use, 2) a dosage form, 3) process of finding the composition of a drug, 4) production of a drug.

Those who can read Russian will certainly enjoy this article on how translators struggle with this term: http://provizor.trworkshop.net/2012/12/09/formulation/


 
Philippe Locquet
Philippe Locquet  Identity Verified
Portugal
Local time: 20:40
English to French
+ ...
TOPIC STARTER
Useful video Nov 9, 2018

Hi there!

Since many found this topic interesting but a bit technical, I put together a video trying to describe what's in it for translators and LSPs. I hope you'll find it useful and interesting.
If you did, leave a like and share!
Link below:

https://youtu.be/0YeyhA4Jnv4

My bests to all


 
Philippe Locquet
Philippe Locquet  Identity Verified
Portugal
Local time: 20:40
English to French
+ ...
TOPIC STARTER
Curiosity didn't kill the C.A.T. not even its Master! Nov 12, 2018

Hi to all,

Just a quick one to address the elephant in the room:
After looking at all the posts made here and if you watched the video I published, we can draw a few conclusions.

_Some may have thought that MT would replace human translation. That is unlikely to happen, at least not in the immediate future!
_Some may have thought that MT would replace CAT tools. That is very unlikely too.

But when used correctly, tools such as Custom Machine Translation, combined with a good CAT tool, can definitely make a huge improvement in workflows when working for clients with big volumes to translate.

So curiosity didn't kill the cat, not this time

My bests


 
Philippe Locquet
Philippe Locquet  Identity Verified
Portugal
Local time: 20:40
English to French
+ ...
TOPIC STARTER
Glossary making Nov 12, 2018

Kaspars Melkis wrote:

If a translator finds an unfamiliar term, it forces them to do research and not only find the translation of the term but get better understanding of the subject. If all terms are already provided, translation task may seem easy and translator may not even realize that there is something more to it.

I remember a case where there was a mistake in translation of “formulation” which can have several meanings even in pharmaceutical context: 1) an actual drug for specific use, 2) a dosage form, 3) process of finding the composition of a drug, 4) production of a drug.

Those who can read Russian, will certainly enjoy this article how translators struggle with this term: http://provizor.trworkshop.net/2012/12/09/formulation/



Hi Kaspars ,

Regarding this issue, there is a good way around it:
_Have a vast glossary that has no inconsistency in the target: 1 source term = only 1 target term. It should work fine in your CMT.
_In your CAT tool, it would be wise to have one reference glossary (not for fast insertion) that has one source term and a vast number of possible target terms separated by a comma and a space. This helps the translator select the adequate term and avoid errors.

I've seen that used in many cases and it has worked well.
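
As a rough illustration of the two glossaries, here is a short Python sketch. One helper checks the "1 source term = only 1 target term" rule for the CMT glossary; the other builds a reference-glossary line with the candidates separated by a comma and a space. The function names, the tab-separated layout and the example terms are assumptions, not the format of any specific tool.

```python
from collections import defaultdict


def find_inconsistencies(pairs):
    """Return source terms that map to more than one target term."""
    targets = defaultdict(set)
    for source, target in pairs:
        targets[source].add(target)
    # Only terms with several distinct targets break the 1:1 rule.
    return {s: sorted(t) for s, t in targets.items() if len(t) > 1}


def reference_entry(source, candidates):
    """Build one reference-glossary line: source, tab, 'a, b, c' candidates."""
    return f"{source}\t{', '.join(candidates)}"


# Hypothetical English-to-French term pairs for illustration only.
pairs = [("power", "puissance"), ("power", "pouvoir"), ("head", "tête")]
conflicts = find_inconsistencies(pairs)
line = reference_entry("power", ["puissance", "pouvoir"])
```

Terms reported in `conflicts` would need a single target chosen before going into the CMT glossary, while the full list of candidates can live in the reference glossary for the translator to pick from.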

My bests


 
Shouguang Cao
Shouguang Cao  Identity Verified
China
Local time: 04:40
English to Chinese
+ ...
a different search and replace Nov 28, 2018

GT4T's 'fix MT with glossary' is not simple search and replace.

If a word is CONSTANTLY translated wrong, simple replacing of the MT translation with your own translation will work.

But with GT4T it doesn't matter what MT will translate a word into as it actually replaces the source word.

GT4T glossary file contains 'source - target' pairs, not 'wrong translation - correct translation' pairs.

The post-translation of GT4T works like this.

1) Mark a word as untranslatable so the word will remain untranslated in the translation.
2) Replace that untranslated word in MT translation with your glossary.

This feature can thus be used to mark 'untranslatable' terms. If you add a term to the GT4T glossary and leave the translation the same as the source, that term will always remain untranslated in MT translations.
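
The two steps above can be sketched in Python as follows. This is only an illustration of the protect-then-replace idea as described in this post, not GT4T's actual implementation; the function name, the `__TERMn__` placeholder format and the `mt_engine` callable are assumptions, and a real MT engine may not always leave placeholders intact.

```python
import re


def translate_with_glossary(source, glossary, mt_engine):
    """Protect glossary terms in the source, run MT, then insert the targets."""
    placeholders = {}
    protected = source
    # Longer terms first so multi-word terms win over their sub-terms.
    for i, term in enumerate(sorted(glossary, key=len, reverse=True)):
        token = f"__TERM{i}__"
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        # Step 1: swap the source term for a token the engine should not touch.
        protected, count = pattern.subn(token, protected)
        if count:
            placeholders[token] = glossary[term]
    translated = mt_engine(protected)
    # Step 2: replace each surviving token with the glossary target.
    for token, target in placeholders.items():
        translated = translated.replace(token, target)
    return translated
```

Because the replacement is keyed on the source term, it does not matter what the engine would have produced for that word. An entry whose target equals its source simply keeps the term untranslated.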






[Edited at 2018-11-28 05:31 GMT]


 
Philippe Locquet
Philippe Locquet  Identity Verified
Portugal
Local time: 20:40
English to French
+ ...
TOPIC STARTER
Adapt your CMT as you go Feb 6, 2019

Hi,

We have been discussing Custom Machine Translation here, and I just wanted to let you all know that it is now possible to adapt your CMT while you are translating.

Wordfast has created an API connector for Kantan MT. This system allows you to adapt your MT directly from the CAT tool WFA in two ways: segment by segment, or in one go at the end of translating your file.

I made a video describing this, which may help clarify what it does, since it's something completely new.

Watch it here: https://youtu.be/0Ta2dXRTvU4

My bests to all


 