Tags diminishing machine translation results
Thread poster: Thijs Vissia
I was wondering about the results I’m getting from Machine Translation through OmegaT. Currently I’m using one service that is available free of charge, MyMemory (Machine). I’m noticing that a lot of idiom is not being recognised and translated accordingly when tags are in between several words that make up an idiom.

For example, I was translating the following sentence (in Dutch): “Ieder instituut gaat beschikken over meer geld en meer gebouwen.” With some tags/formatting strewn in, this became: “Ieder instituut gaat beschikken over meer geld en meer gebouwen.” In there, there is the widespread idiom “beschikken over” (meaning “have access to”, “have at one’s disposal”). I don’t know about the quality of the machine translation at MyMemory, but as a widespread idiom this should be recognised and translated properly.

When the tags were in there, this was returned as: “Everyone institute about more money and more buildings.” The connection between “ieder” and “instituut” was broken by the tag, so instead of “every institute” it rendered this as “everyone institute”. Similarly, the composite verb “beschikken over” was also interrupted by a tag, so the MT treated each piece separately and apparently left out the verb entirely.

However, after creating a new file without tags, it came back as: “Every institute will have more money and more buildings.” That may not be my phrasing of choice, but it is otherwise a fine translation.
So I was wondering why the tags get sent out to the machine translation services in the first place? Is it so that all the formatting doesn’t need to be put back in manually afterwards? Wouldn’t it be almost as easy to strip the strings of the tags before sending the query to the MT service? Considering that OmegaT already needs to recognise tags as such (to treat them differently in the editor pane), wouldn't it be possible to make sending them to the MT service optional? It seems to me that sending the tags along is seriously reducing the quality of the MT results.
[Edited at 2019-03-03 10:37 GMT]

Translator can remove tags before translation | Apr 1, 2019
Thijs Vissia wrote: It seems to me that sending the tags along is seriously reducing the quality of the MT results.

Translator can remove tags before pretranslation against TMX or using MT (http://www.condak.cz/nove/2019-03/31/en/00.html) and put them back after pretranslation.

Milan
Milan Condak wrote: Translator can remove tags before pretranslation against TMX or using MT, (...) and put them back after pretranslation. Milan

Hi Milan,

Ah, thank you for the clarification. I didn't realize you could put them back afterwards by toggling the option again, but of course the source file isn't changed. I somehow assumed this worked the same way as tagwipe, which does affect the source file. I think the documentation could be a bit clearer about this; even the name of the option in Preferences, 'Remove tags', seems rather definitive. But clearly this solves my problem: I can translate with MT and manually put the tags back after translating.

Cheers,
Thijs

Samuel Murray, Netherlands, Member (2006), English to Afrikaans + ...
Fixed post (your membership fee will never buy fixed forum software) | Apr 2, 2019
Thijs Vissia wrote:

I was wondering about the results I’m getting from Machine Translation through OmegaT. Currently I’m using one service that is available free of charge, MyMemory (Machine). I’m noticing that a lot of idiom is not being recognised and translated accordingly when tags are in between several words that make up an idiom.

For example, I was translating the following sentence (in Dutch): “Ieder instituut gaat beschikken over meer geld en meer gebouwen.” With some tags/formatting strewn in, this became: “Ieder <f0>instituut gaat beschikken</f0><f1> </f1><f2>over meer geld</f2> en meer gebouwen.” In there, there is the widespread idiom “beschikken over” (meaning “have access to”, “have at one’s disposal”). I don’t know about the quality of the machine translation at MyMemory, but as a widespread idiom this should be recognised and translated properly.

When the tags were in there, this was returned as: “Everyone <f0> institute </f0><f1></f1><f2> about more money </f2> and more buildings.” The connection between “ieder” and “instituut” was broken by the tag, so instead of “every institute” it rendered this as “everyone institute”. Similarly, the composite verb “beschikken over” was also interrupted by a tag, so the MT treated each piece separately and apparently left out the verb entirely.

However, after creating a new file without tags, it came back as: “Every institute will have more money and more buildings.” That may not be my phrasing of choice, but it is otherwise a fine translation.

So I was wondering why the tags get sent out to the machine translation services in the first place? Is it so that all the formatting doesn’t need to be put back in manually afterwards? Wouldn’t it be almost as easy to strip the strings of the tags before sending the query to the MT service? Considering that OmegaT already needs to recognise tags as such (to treat them differently in the editor pane), wouldn't it be possible to make sending them to the MT service optional?
It seems to me that sending the tags along is seriously reducing the quality of the MT results.
[Edited at 2019-04-02 05:54 GMT]
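
For anyone who wants to try the "strip the tags before querying the MT service" idea outside OmegaT, here is a minimal Python sketch. It is not how OmegaT or MyMemory handle tags internally. It only assumes that tags look like the `<f0>` ... `</f0>` placeholders in the fixed post above (letters followed by a number, optionally self-closing), and the names `TAG_PATTERN`, `strip_tags` and `tags_in_order` are made up for illustration. It also returns the removed tags in order, which may help when putting them back by hand afterwards.

```python
import re

# Assumption: tags look like OmegaT-style placeholders such as <f0>, </f0> or <br1/>.
# Other tag shapes would need a different pattern.
TAG_PATTERN = re.compile(r"</?[A-Za-z]+\d+/?>")


def strip_tags(segment: str) -> str:
    """Return the segment with placeholder tags removed and whitespace collapsed."""
    without_tags = TAG_PATTERN.sub("", segment)
    # Removing tags can leave doubled spaces; collapse them to single spaces.
    # Note: a tag sitting directly between two words with no space would glue
    # those words together, so real input may need extra care.
    return re.sub(r"\s+", " ", without_tags).strip()


def tags_in_order(segment: str) -> list[str]:
    """List the tags found in the segment, in order of appearance."""
    return TAG_PATTERN.findall(segment)


tagged = (
    "Ieder <f0>instituut gaat beschikken</f0><f1> </f1>"
    "<f2>over meer geld</f2> en meer gebouwen."
)

plain = strip_tags(tagged)      # this is what would be sent to the MT service
removed = tags_in_order(tagged)  # kept aside for manual reinsertion
print(plain)
print(removed)
```

The cleaned string is the same sentence that gave the better MT result in the tag-free file, so the sketch only covers the "before" half; reinserting the tags at the right places in the translation remains a manual step.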