Pages in topic:   [1 2] >
Translating a scanned pdf document
Thread poster: Sara Ferreira
Sara Ferreira
Portugal
Local time: 13:11
English to Portuguese
+ ...
May 22, 2014

Hi everyone.

I was asked to translate a paper document, which was scanned and sent to me in PDF format.
It has text parts, as well as images and even handwritten parts. I'm supposed to translate the text parts while keeping the page exactly as it is, with the other elements. Is this possible? This is my first time dealing with a situation like this. Can anyone help?

Thanks


 
Per Magnus  Identity Verified
Local time: 14:11
English to Norwegian
Translating pdf documents May 22, 2014

There are hundreds of threads discussing this problem. Why don't you try this Google search:

proz translating pdf


 
Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 15:11
Member (2008)
English to Russian
+ ...
/// May 22, 2014

To translate a PDF document, you have to extract the text and images first. Then you have to set up all the text and images anew, and only then do you start translating.
That is the way people work in the 21st century:

1) PDF
2) Optical Character Recognition
3) Formatting of the source
4) Translation
5) Proofreading
6) Final formatting of the target


 
J Gallagher (X)  Identity Verified
Finland
Local time: 15:11
English to Finnish
+ ...
Download a trial version of ABBYY PDF Transformer May 22, 2014

Select the correct language and use a setting to keep the images. If the quality of the scanned PDF is good, you can achieve fairly good results. You will likely need to fix a few things here and there. ABBYY does not add those nasty line breaks after every row.

I'm sure there are other good tools, too. This one, at least, is not too expensive and is a good investment.
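
For OCR output from other tools that does end up with a line break after every row, the cleanup can be scripted. The following is only a minimal sketch, not part of any ABBYY product: the function name `unwrap_ocr_lines` is made up for illustration, and it assumes the OCR text is already available as plain text with blank lines between paragraphs.

```python
import re


def unwrap_ocr_lines(text: str) -> str:
    """Join rows that OCR broke apart, keeping blank-line paragraph breaks.

    Assumption: paragraphs are separated by at least one blank line.
    Caveat: a genuine compound split at a row end ("well-\\nknown")
    also loses its hyphen, so the result still needs proofreading.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    cleaned = []
    for para in paragraphs:
        # Rejoin words hyphenated across a row end: "docu-\nment" -> "document".
        para = re.sub(r"(\w)-\n(\w)", r"\1\2", para)
        # Turn every remaining single line break into one space.
        para = re.sub(r"\s*\n\s*", " ", para)
        cleaned.append(para.strip())
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "This is a scanned docu-\nment with hard\nline breaks.\n\nSecond para-\ngraph."
    print(unwrap_ocr_lines(sample))
```

The result can then be pasted into a word processor or CAT tool, where sentences are no longer cut in the middle of a segment.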


 
Nikita Kobrin  Identity Verified
Lithuania
Local time: 15:11
Member (2010)
English to Russian
+ ...
Client should pay extra May 22, 2014

Sergei Leshchinsky wrote:

That is the way people work in the 21st century


Sergei has forgotten to mention that a client should pay extra for this preparation work. It may be quite difficult and tedious and take hours and sometimes even days. The cost of such preparation work may be higher than the cost of the translation itself.

Nikita Kobrin


 
Bernhard Sulzer  Identity Verified
United States
Local time: 08:11
English to German
+ ...
Don't forget May 22, 2014

Sara Ferreira wrote:

Hi everyone.

I was asked to translate a paper document, which was scanned and sent to me in PDF format.
It has text parts, as well as images and even handwritten parts. I'm supposed to translate the text parts while keeping the page exactly as it is, with the other elements. Is this possible? This is my first time dealing with a situation like this. Can anyone help?

Thanks


... to charge extra for all that editing. You can either roll this into your per-word rate or add a separate editing fee. Estimate how long it will take you to get everything put in the right place (it usually takes longer than you think) and then apply a fair hourly rate for editing.
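
To make the arithmetic concrete, here is a minimal sketch of both options (a separate editing fee, or the same amount rolled into the per-word rate). Everything in it is an assumption for illustration: the function name `estimate_quote`, the `padding` factor for "it usually takes longer than you think", and all the figures in the usage example.

```python
from decimal import Decimal


def estimate_quote(word_count, rate_per_word, editing_hours, hourly_rate, padding="1.0"):
    """Return the translation fee, editing fee, total, and rolled-in per-word rate.

    Monetary values are passed as strings and handled as Decimal
    to avoid binary floating-point rounding surprises.
    """
    translation_fee = Decimal(word_count) * Decimal(rate_per_word)
    # Pad the time estimate, since layout work tends to overrun.
    padded_hours = Decimal(editing_hours) * Decimal(padding)
    editing_fee = padded_hours * Decimal(hourly_rate)
    total = translation_fee + editing_fee
    return {
        "translation_fee": translation_fee,
        "editing_fee": editing_fee,
        "total": total,
        # Equivalent per-word rate if the editing is not itemized separately.
        "rolled_in_rate_per_word": total / Decimal(word_count),
    }


if __name__ == "__main__":
    # Hypothetical job: 2000 words at 0.10 per word, 3 hours of editing
    # estimated, padded by 50 %, at 30 per hour.
    quote = estimate_quote(2000, "0.10", "3", "30", padding="1.5")
    for name, value in quote.items():
        print(name, value)
```

Showing the client both the itemized total and the equivalent per-word rate makes it easier to explain why a scanned PDF costs more than an editable file.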

B

[Edited at 2014-05-22 15:20 GMT]


 
Ildiko Santana  Identity Verified
United States
Local time: 05:11
Member (2002)
Hungarian to English
+ ...

MODERATOR
ABBYY PDF Transformer+ May 22, 2014

In an ideal world, your client would provide you with nice, clean, editable source files. Until this becomes standard practice, I recommend investing in OCR software. I only wish I had done it sooner! It would have saved me years of struggle, as the majority of my work comes from legal documents that pass through many hands and copy machines before they are sent out for translation.

J Gallagher wrote:

Select the correct language and use a setting to keep the images. If the quality of the scanned PDF is good, you can achieve fairly good results. You will likely need to fix a few things here and there. ABBYY does not add those nasty line breaks after every row.

I'm sure there are other good tools, too. This one, at least, is not too expensive and is a good investment.


I have spent years searching for the right tool and I can vouch for the ABBYY PDF Transformer, which I purchased just recently. It works wonderfully, and only cost $80 (compared to ABBYY FineReader for about $300, which I *almost* chose). Careful proofreading is still an absolute must, but this software saves me hours of painful manual re-creation of the horrid scanned photocopies of the worst kind (which my 21st-century clients still prefer for some odd reason).
http://pdftransformer.abbyy.com/


 
Peter Linton (X)  Identity Verified
Local time: 13:11
Swedish to English
+ ...
Keeping the page exactly as it is May 22, 2014

Sara Ferreira wrote:
I'm supposed to translate the text parts, but keeping the page exactly as it is, with the other elements. Is this possible?

Yes, this is possible, but likely to be the most complicated and time-consuming part of the job. The simplest way, perhaps even the only way, is to create a Word template, using styles to replicate the appearance of the PDF. If you are not familiar with styles you will have to do it all manually. You will never achieve an exact result.

As others have pointed out, there is lots of information available. The good news is that this will happen to you only once, because next time you will be aware of the issues and will explain to your customer that keeping pages exactly as they are is complex and time-consuming, costs extra, and in any case you are a translator, not a desktop publishing expert.


 
Henry Hinds  Identity Verified
United States
Local time: 06:11
English to Spanish
+ ...
In memoriam
Agree May 22, 2014

I agree with Peter and wish to add that it is best to inform the client right from the start that he is not going to get a re-creation of the original format, but merely a translation in a simple format because you are not a desktop publisher. For legal documents that is normally quite sufficient, as legal documents do not tend to have a complicated format. They can be crowded, however, so for such purposes I often air out the document and put one page on two or even three in extreme cases. That also makes them much easier to read than the original. If there are graphics involved, any language on them will have to be translated separately and reference made to the original graphic. In my own case at least, I cannot reproduce graphics.

Anything beyond a simple format requires additional time and cost, and the client should be made aware of the cost factor, which will normally get him to say OK to a simple format. If the .pdf original is already in a simple format and good quality, I will not charge extra. I started working almost 43 years ago with paper documents so such material represents no special challenge to me.


 
Jeff Whittaker  Identity Verified
United States
Local time: 08:11
Member (2002)
Spanish to English
+ ...
Snapshot May 22, 2014

You can also use the PDF snapshot tool (located under the EDIT menu) to highlight and copy images into the Word document. If the images contain text, just provide a translation underneath the image (you can resize the image to make it smaller if necessary) in whatever format works best (straight text, glossary, text boxes for flowcharts, etc.).





[Edited at 2014-05-22 16:29 GMT]


 
Jorge Herran  Identity Verified
Peru
Local time: 07:11
Member (2014)
English to Spanish
A fast and usually easy way to do it. May 22, 2014

Hello, here is my personal workflow for translating a PDF when the client does not want the layout changed:

- Try a demo of the ABBYY online OCR and, if you like it, buy a number of pages from it. It is an OCR service that is always up to date and cost-effective, and the fee per page is acceptable (last time I bought 200 pages for $10).

- That will produce a DOCX file (other formats are available too) that is very accurate in most cases, without altering the layout.

- Proofread the document. It sometimes has minor issues with some characters, depending on the source document. When that happens, I usually highlight the mistake in one color first, then propagate the correction and highlight it in another color, to save time and make the changes easy to review.

- Once ready I translate it.
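
For anyone who prefers to script the "propagate the correction" step rather than do it with find-and-replace and highlighting, here is a minimal sketch. It is an illustration only, not a feature of the ABBYY service: the function name `propagate_corrections` and the sample misreadings are hypothetical, and it assumes the OCR text has been exported as plain text.

```python
import re


def propagate_corrections(text, corrections):
    """Apply each known OCR misreading fix everywhere and count the changes.

    `corrections` maps a misread word to its correct form. Only whole
    words are replaced, so a fix cannot alter part of a longer word.
    """
    counts = {}
    for wrong, right in corrections.items():
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b")
        # A function replacement keeps `right` literal (no backslash escapes).
        text, replaced = pattern.subn(lambda match, fixed=right: fixed, text)
        counts[wrong] = replaced
    return text, counts


if __name__ == "__main__":
    ocr_text = "The c1ient signed. The c1ient paid tbe fee."
    fixed, counts = propagate_corrections(ocr_text, {"c1ient": "client", "tbe": "the"})
    print(fixed)
    print(counts)
```

The per-word counts play the same role as the second highlight color: they show at a glance how many places each correction touched.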

If you prefer a free option, I suggest using Capture2text.exe (freeware); however, its OCR is not as good as ABBYY's.

Good luck.

Kind regards


 
Heinrich Pesch  Identity Verified
Finland
Local time: 15:11
Member (2003)
Finnish to German
+ ...
Why not work on the image May 23, 2014

If the customer is really sure they want a copy with only the text translated, I would convert the PDF into an image file, translate the texts separately using appropriate fonts, and convert them into images, which can then be copied onto the background using any image editing software.

 
Tom in London
United Kingdom
Local time: 13:11
Member (2008)
Italian to English
I agree May 23, 2014

Nikita Kobrin wrote:

Sergei Leshchinsky wrote:

That is the way people work in the 21st century


Sergei has forgotten to mention that a client should pay extra for this preparation work. It may be quite difficult and tedious and take hours and sometimes even days. The cost of such preparation work may be higher than the cost of the translation itself.

Nikita Kobrin


I agree with Nikita, Peter, and various others who have posted here; all of those extra tasks actually have nothing to do with translating and should not be included in your tariff for translating. Ideally you would not do them at all; your client should provide you with a Word file containing all the text to be translated. Over the years I have made good progress with my regular clients, and have persuaded them not to send me PDFs for translation.

[Edited at 2014-05-23 09:22 GMT]


 
neilmac
Spain
Local time: 14:11
Spanish to English
+ ...
Don't ask, don't get May 24, 2014

Ildiko Santana wrote:

In an ideal world, your client should provide you with nice and clean, editable source files. Until this becomes standard practice, ...


This is why workable source files are a prerequisite for me. I make this quite clear in my terms and conditions. Clients who expect me to format or otherwise fiddle about wasting time with OCR, images, spacing, coloured text, shading ... etc get short shrift. I'd have to double my rates if I included these value-added services.


 
Ildiko Santana  Identity Verified
United States
Local time: 05:11
Member (2002)
Hungarian to English
+ ...

MODERATOR
When asked, provide and get compensated May 24, 2014

neilmac wrote:
This is why workable source files are a prerequisite for me. I make this quite clear in my terms and conditions. Clients who expect me to format or otherwise fiddle about wasting time with OCR, images, spacing, coloured text, shading ... etc get short shrift. I'd have to double my rates if I included these value-added services.


This is true to an extent, and I typically ask if they can provide converted, editable copies - sometimes they can. However, I don't refuse to do it myself if I can and if it brings in profitable projects - which is often the case. I guess my clients who are not technically advanced enough for this kind of source file preparation, and who have me do it for them, are very happy to work with me not only because they receive quality translations that convey the meaning and accurate context of the source, but also because they get the layout and (visual) 'feel' of the original. I see nothing wrong with being a perfectionist, and nothing wrong with asking for more - or even doubling our rates - for the added services we provide. ; )


 