Filter segments in a TMX file that contain more than e.g. 5 words
Thread poster: Sarah Jackowski
Sarah Jackowski
Germany
English to German
Sep 18, 2014

Hi there,

I have a TMX export of a software user interface and I would like to create a glossary or termbase from it for translators' reference.
However, I would like to reduce the number of segments by removing all segments that have more than 5 words in the source text.

I'm open to any tool, but I thought I could do this with Olifant or Trados Studio 2014...

Olifant:
When I choose "View > Filter Settings" there are a few example filters, and one of them already comes close to what I want:

"Source Text is longer than 255 characters"
The condition for it looks like this:
LEN(Text_DE_DE) > 255

It would be great if I could adapt this rule to "Source Text is longer than 5 words".
I also tried working with the "longer than 255 characters" filter, turning it down to "longer than 30 / 40 characters", but I was not satisfied with the result. I think filtering by the number of words is best for my needs.


In Trados Studio 2014 there is a similar feature in the Translation Memories tab.
I choose the TM, create a filter and a condition
"Source Segment" + "greater than", but I have no idea what kind of value to enter then... obviously, just writing "5 words" wouldn't work...

Thanks a lot in advance for your help and happy translating.


 
RWS Community
United Kingdom
English
If you use Studio... Sep 18, 2014

... and you seem to, then maybe try this approach, as it's fairly straightforward.

1. Convert the TMX to Excel using the Glossary Converter on the OpenExchange
2. Sort your rows in Excel by length, or use a formula to find the ones with 5 words or fewer
3. Remove the rows you don't want in Excel
4. Convert the Excel to a termbase using the Glossary Converter

Should be fairly simple I think.
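
If you would rather skip the Excel round trip, the same filter can be sketched in a few lines of Python working directly on the TMX. This is only a rough sketch under assumptions, not a tested tool: it treats a "word" as anything separated by whitespace, it assumes the `<tu>` elements sit directly under `<body>` without XML namespaces, and the names `word_count`, `filter_tmx` and `filter_file` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# xml:lang lives in the predefined XML namespace; very old TMX files use a plain "lang" attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def word_count(text):
    """Count whitespace-separated tokens (a simplification of 'words')."""
    return len(text.split())


def filter_tmx(root, source_lang, max_words=5):
    """Drop every <tu> whose source segment has more than max_words words.

    root is the parsed <tmx> element; returns the number of removed units.
    """
    body = root.find("body")
    removed = 0
    for tu in list(body.findall("tu")):  # copy the list, since we remove while looping
        for tuv in tu.findall("tuv"):
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            if not lang.lower().startswith(source_lang.lower()):
                continue
            seg = tuv.find("seg")
            # itertext() also returns the content of inline elements such as <bpt> or <ph>,
            # so segments with formatting codes may be counted slightly too high.
            text = "".join(seg.itertext()) if seg is not None else ""
            if word_count(text) > max_words:
                body.remove(tu)
                removed += 1
            break  # only the first matching source variant is checked
    return removed


def filter_file(in_path, out_path, source_lang, max_words=5):
    """Hypothetical convenience wrapper: read a TMX, filter it, write the result."""
    tree = ET.parse(in_path)
    removed = filter_tmx(tree.getroot(), source_lang, max_words)
    tree.write(out_path, encoding="utf-8", xml_declaration=True)
    return removed
```

A call could look like `filter_file("ui_export.tmx", "ui_short.tmx", "en")`, where both file names and the language prefix are placeholders. Note that ElementTree does not keep a DOCTYPE line when writing, so check that your tool still accepts the output before relying on it.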

Regards

Paul


 
Sarah Jackowski
Germany
English to German
TOPIC STARTER
Tried that before, not happy with Excel filtering options Sep 18, 2014

SDL Support wrote:

... and you seem to, then maybe try this approach, as it's fairly straightforward.

1. Convert the TMX to Excel using the Glossary Converter on the OpenExchange
2. Sort your rows in Excel by length, or use a formula to find the ones with 5 words or fewer
3. Remove the rows you don't want in Excel
4. Convert the Excel to a termbase using the Glossary Converter

Should be fairly simple I think.

Regards

Paul


Hi Paul,

Thanks, but I already tried #2 in Excel and could not work out what kind of formula I need to create. I'm not a pro when it comes to creating regular expressions and stuff like that...


 
Sarah Jackowski
Germany
English to German
TOPIC STARTER
Glossary Converter in the Open Exchange App Store Sep 18, 2014

By the way,

I see two versions of the Glossary Converter on the OpenExchange App Store, a Charity Edition and a Free Edition, but it seems I cannot access the Free Edition...

I wanted to have a look at the Free Edition, but its product page won't open. Instead, I am redirected to a general page with solutions for freelancers, translation agencies etc.

Not that I wouldn't want to pay for the Charity Edition, but I'd like to have a look at the app's reviews, which are not available on the Charity Edition's overview page...


 
RWS Community
United Kingdom
English
Excel and stuff... Sep 18, 2014

Hi Sarah,

I used this formula:

=IF(LEN(TRIM(A1))=0,0,LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1)

I attached a spreadsheet here that will show you how to use it: https://www.dropbox.com/s/i10maoi58q2pyfx/count.xlsx?dl=0
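
In case it helps to see why the formula works: Excel's TRIM removes leading and trailing spaces and collapses runs of spaces into one, so the trimmed length minus the length with all spaces removed is the number of gaps between words, and adding 1 gives the word count. Below is a rough Python equivalent, purely as an illustration (the function name `excel_word_count` is made up, and like the formula it only looks at plain space characters, not tabs or non-breaking spaces).

```python
import re


def excel_word_count(cell):
    """Mimic =IF(LEN(TRIM(A1))=0,0,LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1)."""
    # Excel's TRIM: strip leading/trailing spaces and collapse inner runs of spaces to one.
    trimmed = re.sub(r" +", " ", cell.strip(" "))
    if len(trimmed) == 0:
        return 0  # empty or spaces-only cell
    # Remaining single spaces = gaps between words; words = gaps + 1.
    return len(trimmed) - len(cell.replace(" ", "")) + 1
```

With a helper column holding this count, keeping only the rows where the value is 5 or lower gives the short segments.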

On the links... I'm not sure why you can't get to it. It might be related to having to tell the site what kind of user you are (I recall someone else having a problem like this): a freelancer, an LSP or a corporate user. The site can then display different content. Maybe look here, which is the direct link to the developer's site, where you can download it anyway: http://www.cerebus.de/glossaryconverter/

Regards

Paul


 
Sarah Jackowski
Germany
English to German
TOPIC STARTER
Finally... Worked out Sep 18, 2014

SDL Support wrote:

Hi Sarah,

I used this formula:

=IF(LEN(TRIM(A1))=0,0,LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1)

I attached a spreadsheet here that will show you how to use it: https://www.dropbox.com/s/i10maoi58q2pyfx/count.xlsx?dl=0

On the links... I'm not sure why you can't get to it. It might be related to you having to tell the site what kind of user you are (I recall seeing a problem for someone else like this), so a Freelancer, or an LSP or a Corporate... it can then display different content. Maybe look here which is the direct link to the developers site where you can download anyway: http://www.cerebus.de/glossaryconverter/

Regards

Paul


Hi Paul,

Thank you very much. I finally managed to get those things sorted in Excel.

I will check out the link for the Glossary Converter.
When I tell the site that I am a company, I am forwarded to the Trados Studio 2014 product information and not to the Glossary Converter in the App Store.

[Edited at 2014-09-18 12:42 GMT]

[Edited at 2014-09-18 12:54 GMT]


 
Michael Beijer
United Kingdom
Member (2009)
Dutch to English
PS: Jun 10, 2017

You can also easily sort columns in an Excel file by text length (or anything else you want), using http://www.asap-utilities.com/

See e.g.: http://www.asap-utilities.com/blog/index.php/2013/04/23/tip-sort-your-data-on-anything-you-can-think-of/

Michael


 

