Formats for term database exchange
Thread poster: langu2
Oct 30, 2013

Which file formats are usually used by translators for the exchange of term databases?

 
Heartsome Support
Heartsome Support
Local time: 19:43
txt-based Oct 31, 2013

File formats such as TBX, Excel, TXT etc. are usually used for this. It depends on which file formats your CAT tool supports for import and export.

 
langu2
langu2
TOPIC STARTER
Terminology exchange formats Oct 31, 2013

Thanks for your reply. I am aware there are dozens of possible exchange formats, but which are mostly used by translators working with a professional terminology tool (not Excel)?

 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:43
Member (2009)
Dutch to English
+ ...
CafeTran Oct 31, 2013

Depends on what you mean by 'professional terminology tool'. For me, CafeTran (my CAT tool) is a 'professional terminology tool', and it stores its terminology databases as tab-delimited UTF-8 text files, which, if you ask me, is the best format to store, edit, maintain and share term data.

Michael


 
Selcuk Akyuz
Selcuk Akyuz  Identity Verified
Türkiye
Local time: 14:43
English to Turkish
+ ...
xml Oct 31, 2013

TBX was created for this purpose but no, it is not used by translators.

Translators use plain text files, comma- or tab-separated text files, and Excel or Word files. Agencies or end clients sometimes send XML files (and all the other relevant files) if MultiTerm is used.

MultiTerm is not the ideal tool for simple, bilingual (source = target) term lists. But it is the best terminology management tool for storing all metadata. And XML is the only way to share these 'real' databases.

CafeTran? Michael, it was a good CAT tool for any translator but now I am lost in it. And it seems to me that the developer adds new features only for you.

Selcuk


 
FarkasAndras
FarkasAndras  Identity Verified
Local time: 13:43
English to Hungarian
+ ...
professional Oct 31, 2013

langu2 wrote:

Thanks for your reply. I am aware there are dozens of possible exchange formats, but which are mostly used by translators working with a professional terminology tool (not Excel)?

The term "professional terminology tool" makes me laugh. There is no such thing. MultiTerm is the most widely used terminology tool and 'professional' isn't the adjective that comes to mind.
Anyway, the TB exchange situation is a complete mess. There is no widespread standard shared by the overwhelming majority of tools. A lot of them can import xls so that's a reasonably good option. TBX was designed to be a universal standard (like TMX for TMs) but it never really took off. Tab separated txt is a decent option too. MultiTerm has an XML export format, but only MultiTerm and a handful of other tools can read it.

In short, decide what you want to use those TBs for and who you want to give them to, and choose a format based on that. The default generic option is, like it or not, xls.
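
For readers who have not seen TBX, here is a rough sketch of how a tab-separated term list maps onto a TBX-style entry structure. The element names (martif, termEntry, langSet, tig, term) follow the TBX-Basic layout; the function name and the sample terms are made up for illustration, the martifHeader is left out, and the output is not validated against any schema, so treat it as a picture of the structure and not as a finished exporter.

```python
import xml.etree.ElementTree as ET

# ElementTree serialises this qualified name as the attribute xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def rows_to_tbx_like(rows, src_lang, tgt_lang):
    """Build a TBX-style tree from (source term, target term) pairs.

    Hypothetical helper: the header block is omitted, so this is not
    a complete TBX file.
    """
    martif = ET.Element("martif", {"type": "TBX-Basic", XML_LANG: src_lang})
    body = ET.SubElement(ET.SubElement(martif, "text"), "body")
    for number, (src, tgt) in enumerate(rows, start=1):
        # One termEntry per concept, one langSet per language.
        entry = ET.SubElement(body, "termEntry", {"id": f"e{number}"})
        for lang, term in ((src_lang, src), (tgt_lang, tgt)):
            lang_set = ET.SubElement(entry, "langSet", {XML_LANG: lang})
            ET.SubElement(ET.SubElement(lang_set, "tig"), "term").text = term
    return martif


# Invented sample data: two Dutch/English term pairs, tab-separated.
tab_text = "huur\trent\nbout\tbolt\n"
rows = [line.split("\t") for line in tab_text.splitlines() if line]

root = rows_to_tbx_like(rows, "nl-NL", "en-GB")
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The comparison shows why the flat formats stay popular: one line of tab-separated text becomes a nested entry with several levels of markup.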


 
langu2
langu2
TOPIC STARTER
Tab delimited-text files Oct 31, 2013

It is interesting to learn that tab-delimited text files are so popular.


Is the information in each row (tab-delimited files) in a specific order or does this vary from file to file?


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:43
Member (2009)
Dutch to English
+ ...
MultiTerm is a hideous monster of a tool Oct 31, 2013

Hi Selcuk,

CafeTran was a good tool, and it is getting better every day.

Igor has added all kinds of great new features that benefit everyone. Not just me. It's a shame that you think only I benefit from them. If you had a careful look at the latest release, I think you would agree that there are all kinds of new things that anyone can benefit from, such as:

– source and target-side synonyms
– export CT segment notes straight to Word documents
– new Quick Term Editor
– term prioritising based on term fields (Subject/Client)
– selectable metadata lists in glossaries (in TXT file stored on your computer)
– regular expressions in source terms in TXT Glossaries
– export project as bilingual document for review purposes (like memoQ's RTFs + DVX's word tables)

(taken from: http://cafetran.wikidot.com/pre-release-version )

I also (strongly) disagree: MultiTerm is not the best tool for real termbases with metadata. It's clunky and very difficult to use. How many people do you know who own, use and love MultiTerm?

I have 8 different metadata fields (in addition to the two language columns) in my tab-delimited UTF-8 text file glossaries in CT, and can easily edit/maintain them in a good CSV editor, such as Ron's Editor.

Here is what my glossary header looks like:

#nl-NL #en-GB #Context #Subject #Client #Note #Definition #Usage example #Source #URL
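
A glossary with a header like this can be read with nothing more than a standard library. The following is a minimal sketch, assuming the columns are separated by tabs and the header names start with '#', as in the header shown above. The function name read_glossary and the sample terms are invented for illustration, and only a few of the columns are used to keep the example short.

```python
import csv
import io

# Invented sample: header plus two term rows, tab-separated.
SAMPLE = (
    "#nl-NL\t#en-GB\t#Subject\t#Client\t#Note\n"
    "koopovereenkomst\tpurchase agreement\tLegal\tAcme\tpreferred term\n"
    "vennootschap\tcompany\tLegal\t\t\n"
)


def read_glossary(stream):
    """Yield one dict per term row, keyed by header name without the '#'."""
    # QUOTE_NONE: quotation marks inside terms are treated as plain text.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = [name.lstrip("#").strip() for name in next(reader)]
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # skip empty lines
        # Pad short rows so every header name gets a value.
        row = row + [""] * (len(header) - len(row))
        yield dict(zip(header, row))


entries = list(read_glossary(io.StringIO(SAMPLE)))
for entry in entries:
    print(entry["nl-NL"], "->", entry["en-GB"])
```

For a real file, the same function can be given a file object opened with open(path, encoding="utf-8", newline="").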

XML is definitely not the only way to share these 'real' databases. Tab-delimited UTF-8 text files are the most transparent and interoperable format that exists.

Michael

[Edited at 2013-11-01 09:12 GMT]


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:43
Member (2009)
Dutch to English
+ ...
@langu2: Oct 31, 2013

langu2 wrote:

It is interesting to learn that tab-delimited text files are so popular.

Is the information in each row (tab-delimited files) in a specific order or does this vary from file to file?


It can be in any order you like, and there can be as many rows as you need. You can also open them in any UTF-8-aware text editor or CSV editor. Also, if you use a good CSV editor, you can filter on headers in order to work with your data. Kind of like what you can do in Excel, but without worrying about all of the problems Excel has with character corruption.

For example, my glossaries consist of these fields:

#nl-NL: Dutch
#en-GB: English
#Context: Contextual Priority: http://cafetran.wikidot.com/using-context-aware-auto-assembling
#Subject: can be used for auto-assembly
#Client: can be used for auto-assembly
#Note:
#Definition
#Usage example
#Source: where I found the term / author of the term
#URL: clickable in CafeTran
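
Because every column is named in the header, a script can look fields up by name instead of by position, so the column order really can differ from file to file. The sketch below illustrates that idea and the kind of header filtering a CSV editor offers. The helper names load and filter_terms and the sample terms are invented, and the two sample glossaries deliberately use different column orders.

```python
import csv
import io

# Two invented glossaries with the same fields in a different column order.
FILE_A = (
    "#nl-NL\t#en-GB\t#Subject\t#Client\n"
    "huur\trent\tLegal\tAcme\n"
    "bout\tbolt\tEngineering\tAcme\n"
)
FILE_B = (
    "#Client\t#Subject\t#en-GB\t#nl-NL\n"
    "Acme\tLegal\tlease\thuurovereenkomst\n"
)


def load(text):
    """Parse tab-delimited text into dicts keyed by header name."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    header = [name.lstrip("#") for name in next(reader)]
    return [dict(zip(header, row)) for row in reader]


def filter_terms(entries, criteria):
    """Keep entries whose fields match every name/value pair in criteria."""
    return [
        entry
        for entry in entries
        if all(entry.get(name, "") == value for name, value in criteria.items())
    ]


legal = filter_terms(load(FILE_A) + load(FILE_B), {"Subject": "Legal"})
pairs = [(entry["nl-NL"], entry["en-GB"]) for entry in legal]
print(pairs)
```

Both files end up in the same list of entries even though their columns are arranged differently.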


Michael


 
Meta Arkadia
Meta Arkadia
Local time: 18:43
English to Indonesian
+ ...
Features kill Oct 31, 2013

Selcuk Akyuz wrote:
CafeTran? Michael, it was a good CAT tool for any translator but now I am lost in it. And it seems to me that the developer adds new features only for you.

I agree with Selçuk, and not for the first time. Without an underlying philosophy, heaps of features will kill any CAT tool - any product actually - including CafeTran. And this comes from a CafeTran fanboy.

langu2 wrote: I am aware there are dozens of possible exchange formats, but which are mostly used by translators working with a professional terminology tool (not Excel)?


Impossible to answer, methinks. TMX is supposed to be the industry standard, CSV (often as an Excel file, sorry for that, langu2) is probably the most used format, and I think using a DBMS would be the most professional approach. Those answers don't agree with your criteria.

Cheers,

Hans


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:43
Member (2009)
Dutch to English
+ ...
TMX -> TBX Nov 1, 2013

Meta Arkadia wrote:

langu2 wrote: I am aware there are dozens of possible exchange formats, but which are mostly used by translators working with a professional terminology tool (not Excel)?


Impossible to answer, methinks. TMX is supposed to be the industry standard, CSV (often as an Excel file, sorry for that, langu2) is probably the most used format, and I think using a DBMS would be the most professional approach. Those answers don't agree with your criteria.

Cheers,

Hans


I suppose you meant TBX, right? TMX was never designed for terminology.

Michael


 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:43
Member (2009)
Dutch to English
+ ...
stop complaining about features in general, and start complaining about particular features Nov 1, 2013

Hi Hans,

Meta Arkadia wrote:

Selcuk Akyuz wrote:
CafeTran? Michael, it was a good CAT tool for any translator but now I am lost in it. And it seems to me that the developer adds new features only for you.

I agree with Selçuk, and not for the first time. Without an underlying philosophy, heaps of features will kill any CAT tool - any product actually - including CafeTran. And this comes from a CafeTran fanboy.

Cheers,

Hans


It's all very well complaining but perhaps you're not really being fair as I know that there are features among all the new features that you actually do like and use. That is, stop complaining about features in general, and start complaining about particular features. How's that for a 'philosophy'?

And anyway, Igor does have a philosophy, and constantly saying that he doesn't is actually an insult to his intelligence. Just because it differs from your philosophy (which is what, by the way?) doesn't mean it isn’t one.

Michael


 
langu2
langu2
TOPIC STARTER
Excel files Nov 1, 2013

And which tools can handle tab-delimited files + Excel files well?

 
Michael Beijer
Michael Beijer  Identity Verified
United Kingdom
Local time: 12:43
Member (2009)
Dutch to English
+ ...
@langu2: Nov 2, 2013

Before we continue, could you maybe explain exactly what it is you are trying to do? That is, are you looking for a terminology tool to actually use (and if so, for what), or are you doing some sort of a study, etc.? This might enable us to better answer your questions.

Also, do you mean a full-blown tool to create a dictionary, like the TLex Dictionary Production Software Suite (http://tshwanedje.com/ ) or Unilex (http://www.acolada.de/unilex.htm ), something to manage terminology, like tlTerm (http://tshwanedje.com/terminology/ ), or a CAT tool with a built in terminology system, or something entirely different?

Michael

[Edited at 2013-11-02 19:14 GMT]


 

