Deleting spaces in Chinese word documents
Thread poster: Iris Kleinophorst
Iris Kleinophorst
Germany
Chinese to German
Oct 14, 2013

Hi,

Does anyone know of a tool, function, or regex to delete unnecessary spaces between Chinese characters, or between Chinese characters and Arabic numerals/Latin letters, e.g. in scanned PDF files? The document also contains numerous other expressions where the spaces have to be kept, so a plain search and replace of spaces in Word does not work.

TIA
Iris


 
lbone
China
Full member (since 2006)
English to Chinese
regex case by case Oct 14, 2013

I think this job needs a person's time and judgment rather than one or several expressions. Chinese characters are not easy to define with a simple regular expression (and Microsoft Word does not support standard regular expressions), and sometimes spaces are meaningful, as in this title:

  第十四回 林如海捐馆扬州城 贾宝玉路谒北静王

Besides Chinese characters, spaces, digits, English letters, and common English punctuation marks, there are also Korean/Japanese characters and non-standard/double-byte symbols. You will need to judge and handle the spaces involved case by case.
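
For what it is worth, here is a minimal sketch of the regex approach outside Word, assuming the text has first been saved as plain text. Everything in it is an assumption rather than a tested recipe: "Chinese character" is taken to mean only the CJK Unified Ideographs block plus Extension A, and the names `strip_cjk_spaces`, `CJK`, and `ALNUM` are made up for illustration. It deliberately ignores the Korean/Japanese characters and double-byte symbols mentioned above, and it cannot tell meaningful spaces from stray ones, so a title like the one above would lose its spaces too.

```python
import re

# Assumption: only these two Unicode ranges count as "Chinese characters".
CJK = r"\u3400-\u4dbf\u4e00-\u9fff"
# Assumption: ASCII digits and letters only.
ALNUM = r"0-9A-Za-z"
# Plain space, tab, and the full-width (ideographic) space.
GAP = r"[ \t\u3000]+"

# Spaces with a Chinese character on both sides.
CJK_CJK = re.compile(rf"(?<=[{CJK}]){GAP}(?=[{CJK}])")
# Spaces with a Chinese character on one side and a digit/letter on the other.
CJK_ALNUM = re.compile(
    rf"(?<=[{CJK}]){GAP}(?=[{ALNUM}])|(?<=[{ALNUM}]){GAP}(?=[{CJK}])"
)


def strip_cjk_spaces(text: str) -> str:
    """Remove spaces next to Chinese characters; leave Latin-only text alone."""
    text = CJK_CJK.sub("", text)
    return CJK_ALNUM.sub("", text)


if __name__ == "__main__":
    print(strip_cjk_spaces("第 3 章 的 内 容 and some English words"))
```

Spaces between two Latin words are untouched because neither pattern matches them, but the result still needs a manual check for cases where a space next to a Chinese character was intended.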



 
Lawrence Lam
China
English to Chinese
wildcards Nov 2, 2013


Finding and replacing characters using wildcards.


 

