regular expression in Xbench
Thread poster: bourriquet
bourriquet
Poland
Local time: 16:56
English to Polish
Apr 25, 2015

Could someone help me create a regular expression for Xbench? I need this program to find the mistake where there's a redundant period between words in the target (i.e. "word.word") instead of a space (i.e. "word word"). I don't know much about regular expressions but I heard they could help find mistakes like these.

 
Mikhail Zavidin
Local time: 18:56
English to Russian
For example Apr 25, 2015

Hi!

bourriquet wrote:

Could someone help me create a regular expression for Xbench? I need this program to find the mistake where there's a redundant period between words in the target (i.e. "word.word") instead of a space (i.e. "word word"). I don't know much about regular expressions but I heard they could help find mistakes like these.


Something like this:

([a-zA-Z0-9]+)=1\.@1


Hope this helps.


 
Riccardo Schiaffino  Identity Verified
United States
Local time: 09:56
Member (2003)
English to Italian
Not sure that is what Bourriquet needs Apr 26, 2015

Mikhail Zavidin wrote:

Hi!

bourriquet wrote:

Could someone help me create a regular expression for Xbench? I need this program to find the mistake where there's a redundant period between words in the target (i.e. "word.word") instead of a space (i.e. "word word"). I don't know much about regular expressions but I heard they could help find mistakes like these.


Something like this:

([a-zA-Z0-9]+)=1\.@1


Hope this helps.


Mikhail,

I think that the regular expression you suggest would find not only periods between words (what Bourriquet needs), but also dots inside numbers.

I.e., your regular expression would not only flag

word.word,

but also

12.456,39

I would suggest instead a simple

[a-zA-Z]\.[a-zA-Z]
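
To illustrate the point about numbers, here is a minimal sketch in Python's `re` module, which is not the Xbench engine and does not use Xbench's own syntax. The two patterns and the sample strings are assumptions made up for the illustration: one pattern allows letters and digits around the dot, the other allows letters only.

```python
import re

# Illustrative patterns for Python's re module (not Xbench syntax).
# Letters or digits on both sides of the dot.
ALNUM_DOT = re.compile(r"[a-zA-Z0-9]\.[a-zA-Z0-9]")
# Letters only on both sides of the dot.
LETTER_DOT = re.compile(r"[a-zA-Z]\.[a-zA-Z]")

samples = ["word.word", "12.456,39", "End of sentence. Next one."]


def flagged(pattern, texts):
    """Return the texts in which the pattern matches somewhere."""
    return [t for t in texts if pattern.search(t)]


print(flagged(ALNUM_DOT, samples))   # ['word.word', '12.456,39']
print(flagged(LETTER_DOT, samples))  # ['word.word']
```

The letters-only pattern leaves the number alone, and neither pattern flags a normal sentence-ending period followed by a space.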


 
Rolf Keller
Germany
Local time: 16:56
English to German
"A to Z" restricts the search to plain English Apr 26, 2015

I propose
([:letter:]+)[:punctuation:]([:letter:]+)

(No XBench here to test it, though.)


 
bourriquet
Poland
Local time: 16:56
English to Polish
TOPIC STARTER
no results so far Apr 26, 2015

Thank you very much for fast replies.
However, none of these work in Xbench (checked against a Studio file with a "word.word" mistake in the target) after adding them to a checklist, but maybe I'm doing something wrong.
It would be great if someone could verify for themselves whether their regular expression works in Xbench, using a Studio file that contains this kind of mistake.

[Edited at 2015-04-26 10:22 GMT]


 
pep
Local time: 16:56
English to Spanish
Remember to set the Regular Expression search mode Apr 26, 2015

These expressions should both work:

[:letter:]+\.[:letter:]+
[a-z]+\.[a-z]+

but you should set the search mode as Regular Expression.






[Edited at 2015-04-26 13:10 GMT]


 
Mikhail Zavidin
Local time: 18:56
English to Russian
Seems to be a RegEx bug Apr 26, 2015

bourriquet wrote:

Could someone help me create a regular expression for Xbench? I need this program to find the mistake where there's a redundant period between words in the target (i.e. "word.word") instead of a space (i.e. "word word"). I don't know much about regular expressions but I heard they could help find mistakes like these.


The RegEx I suggested seems to work in plain text files (including UTF-8), and with Cyrillic characters too: I tried ([а-я]+)=1\.@1.

With Trados Studio 2011 files, Xbench 2.9 behaves weirdly. It doesn't work correctly even with Latin characters.

I suggest you report a bug to ApSIC. Or maybe it will work as expected in Xbench 3.0, since that version is supposed to support Unicode.


 
Riccardo Schiaffino  Identity Verified
United States
Local time: 09:56
Member (2003)
English to Italian
A correction to my suggestion (and to Pep's) Apr 26, 2015

pep wrote:

These expressions should both work:

[:letter:]+\.[:letter:]+
[a-z]+\.[a-z]+



While [:letter:]+\.[:letter:]+ (or even just [:letter:]\.[:letter:]) works correctly,

[a-z]+\.[a-z]+ (or, for that matter, the regular expression I had suggested: [a-zA-Z]\.[a-zA-Z])

only works for non-accented characters.

So the solution for Bourriquet in Xbench should be

[:letter:]+\.[:letter:]+
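
The difference between the ASCII ranges and a "any letter" class can be sketched outside Xbench as well. The snippet below uses Python's `re` module, which has no `[:letter:]` class, so `[^\W\d_]` stands in as a rough approximation (an assumption for this illustration, not Xbench behaviour). The Polish sample segment is made up.

```python
import re

# ASCII letters only, as in the [a-zA-Z] suggestion.
ASCII_ONLY = re.compile(r"[a-zA-Z]+\.[a-zA-Z]+")
# Rough stand-in for "any letter": a word character that is
# neither a decimal digit nor an underscore.
ANY_LETTER = re.compile(r"[^\W\d_]+\.[^\W\d_]+")

segments = ["word.word", "się.świat", "1.5 kg"]


def flagged(pattern, texts):
    """Return the texts in which the pattern matches somewhere."""
    return [t for t in texts if pattern.search(t)]


print(flagged(ASCII_ONLY, segments))  # ['word.word']
print(flagged(ANY_LETTER, segments))  # ['word.word', 'się.świat']
```

The ASCII pattern misses the segment where accented letters sit directly on both sides of the dot, while the Unicode-aware one catches it and still ignores the decimal number.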


 
bourriquet
Poland
Local time: 16:56
English to Polish
TOPIC STARTER
now it works Apr 27, 2015

Thank you all for your contributions. I created a new checklist item with one of the regular expressions in the target column and I get results after selecting "Check ongoing translation..." This is what I wanted, thank you!

 
Oscar Martin
Spain
Local time: 16:56
English to Spanish
RegEx Apr 27, 2015

Hi,

While [:letter:]+\.[:letter:]+ will find the segments you're looking for, there are some items that must be taken into account:

- When using regular expressions, Xbench evaluates the regex from left to right. That is, if the first element is [:letter:], it will evaluate every segment that contains a sequence of one or more [:letter:] characters. If the Xbench project contains a large number of segments, the search may take several minutes.
- Start the search with an element that must be present (or absent) and that is not regex syntax but a literal character, a word, etc. For instance, this search can be optimized as
"\." "[:letter:]+\.[:letter:]+"
as a PowerSearch.

First, it will discard all segments that do not contain a dot. Then it will evaluate 2 sequences of letters separated by a dot instead of a space.

Regards,

Oscar
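
The "discard first, then evaluate" order described in the post above can be sketched in general terms. This is a Python illustration of the idea only, not of how Xbench or PowerSearch works internally; the function name `find_suspects`, the pattern, and the sample segments are all hypothetical.

```python
import re

# Rough stand-in for [:letter:]+\.[:letter:]+ in Python's re module.
LETTER_DOT_LETTER = re.compile(r"[^\W\d_]+\.[^\W\d_]+")


def find_suspects(segments):
    """Yield segments that have a dot directly between two letters."""
    for seg in segments:
        # Cheap literal check first: segments without a dot are discarded
        # before the regular expression is evaluated at all.
        if "." not in seg:
            continue
        if LETTER_DOT_LETTER.search(seg):
            yield seg


segments = ["No dot here", "Fine sentence.", "word.word in target", "Cena: 12.50"]
print(list(find_suspects(segments)))  # ['word.word in target']
```

Only segments that pass the literal check reach the regex, which is the same ordering the two-part search expresses.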


 
Mikhail Zavidin
Local time: 18:56
English to Russian
Now it works for me too Apr 27, 2015

bourriquet wrote:

Thank you all for your contributions. I created a new checklist item with one of the regular expressions in the target column and I get results after selecting "Check ongoing translation..." This is what I wanted, thank you!


I have just figured it out: the segments in question must have Translated status in Trados Studio. Then everything works OK.


 
kirsty morgan  Identity Verified
United Kingdom
Local time: 15:56
Spanish to English
XBench regular expressions tutorial? Oct 13, 2016

Does anyone know of a tutorial or guide that teaches the basic language of regular expressions as used by XBench? I currently use them to check punctuation differences but would like to find an easy-to-follow tutorial or guide to learn how to write more expressions. Most of the material I have found on regular expressions makes my head spin a little!
Any ideas gratefully received.


 
Dan Lucas  Identity Verified
United Kingdom
Local time: 15:56
Member (2014)
Japanese to English
Try RegexBuddy Oct 13, 2016

kirsty morgan wrote:
Any ideas gratefully received.

XBench's regex flavor is POSIX extended (ERE), about which more here. That site also has many useful-looking tutorials. However, might I suggest - as an independent but satisfied customer - that you invest just under 30 euro in RegexBuddy.

I have been using regexes for nearly two decades, starting from when I was doing a lot of Perl in the late 1990s, so I'm not a stranger to this arcane branch of text processing. But I don't use them every day and it's easy to forget the details, so when I'm trying to make up a new regex, unless it's really simple I reach for RegexBuddy first.

I find the menus, little symbols and text explanations make it far easier to build up a regex, brick by brick. Then you can test it right there with some text from your document.

Support is top-notch. RegexBuddy includes access to an excellent forum where the developer helps out with regex questions for free and typically replies within 24 hours.

Regards
Dan


 

