Extraction of text from Web site Thread poster: luka
| luka Spain Local time: 15:04 English to Spanish + ...
I have been asked to translate a big web site for a company. They do not have the source text for the site. I am looking for a tool which is capable of looking through all the pages and stripping out the code, leaving me with only the text to translate. I would also welcome any other suggestions on ways to do this. Thanks.
Rod Darby (X) Ghana Local time: 13:04 German to English + ... possible solution | Jan 16, 2007 |
luka, there's a shareware program called Trellian which I believe will download the code of a site for you - I haven't tried it, but you might have a look. Rod
[Edited at 2007-01-16 11:26]
Hello. I use WinHTTrack (http://www.httrack.com). It's free and it works wonders. It mirrors the website that you want to work with onto your hard drive. Good luck, Jerónimo
Marc P (X) Local time: 15:04 German to English + ... Extraction of text from Web site | Jan 16, 2007 |
Why strip out the code from the pages? The customer will then have the job of putting it all back in again. Tools are available with which you can download entire web sites, retaining the directory structure. wget is an example: www.gnu.org/software/wget Once you have downloaded the site, you can translate the pages in a CAT tool which is capable of handling HTML. OmegaT, for example, will present you with the text for translation whilst keeping the entire web site structure - directories, images, the works - intact. It is possible, however, that your customer's web pages are created dynamically with data from a database. In this case, you will probably have to get the customer to deliver the data to you. Marc
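For anyone who only needs a rough word count for a quote, a downloaded copy of a static site can be scanned with a short script. The following is a minimal Python sketch, not a tested tool: it assumes the site has already been mirrored to a local folder (for example with wget or HTTrack, as suggested in this thread), that the pages are UTF-8 encoded .html/.htm files, and that anything inside script and style elements can be ignored. The names VisibleTextParser, extract_text and count_words_in_mirror are made up for this illustration.

```python
from html.parser import HTMLParser
from pathlib import Path


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse whitespace and keep only non-empty text outside skipped tags.
        text = " ".join(data.split())
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the text chunks of one HTML document, in document order."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.chunks


def count_words_in_mirror(mirror_dir):
    """Return {relative path: word count} for each .html/.htm file under mirror_dir."""
    counts = {}
    for path in sorted(Path(mirror_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".html", ".htm"}:
            # Assumption: pages are UTF-8; undecodable bytes are replaced, not fatal.
            html = path.read_text(encoding="utf-8", errors="replace")
            words = sum(len(chunk.split()) for chunk in extract_text(html))
            counts[str(path.relative_to(mirror_dir))] = words
    return counts
```

Calling count_words_in_mirror on the mirror folder gives a per-page estimate. It ignores alt attributes, meta descriptions and any database-driven content the mirror did not capture, so it is a basis for quoting only; for the translation itself, a CAT tool that keeps the HTML intact is still the better route.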
If the site is that big, chances are it is database-driven. Using software to download the pages may leave you waiting ages for the download to complete. In any case, you should only use the downloaded pages for quoting purposes. The customer should send you the pages they would like to have translated, or an export of the database if the content is generated that way. My 2 cents
luka Spain Local time: 15:04 English to Spanish + ... TOPIC STARTER Thank you very much | Jan 19, 2007 |
I want to thank all of you for your help. In the end I gave up because the site is huge. I have told the client that I can't work out the number of words and that they should try to find the source files. Have a great weekend
Talent Success United States Local time: 08:04 Member (2006) English to Spanish + ...
Prior declarations or content management set-ups... | Apr 20, 2007 |
I have found that if a client is considering a relatively small static-HTML web site translation, they often prefer a complete price for the site (including graphical elements). My work has the attached condition that all web page URLs requiring translation are declared in advance, with the pages I have already seen listed by me in the quote. On much larger dynamic (database-driven) sites, the clients often have domestic-language web programmers/developers present in their team. Well, this is the ideal anyway. If this is the case, I find it makes sense to ask that their developer add a column to their database for the language that I offer them (as they will have to eventually) and create a simple content administration page which I can access with a password. This way I can see the text to be translated and, below it, a blank field in which to input, save and revise the equivalent new-language version. When all is done, it takes no time for the developer to change the content reference variable in their web page template to the new language variable. It's all rather simple really.
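To make the database idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is an illustration under stated assumptions, not a description of any client's actual system: the table name content, the columns text_en and text_es, and the helper names build_demo_db, save_translation and page_texts are all hypothetical, and the fallback to the source text for rows not yet translated is an addition for the example.

```python
import sqlite3

# Hypothetical mapping from a language code to its content column.
LANGUAGE_COLUMNS = {"en": "text_en", "es": "text_es"}


def build_demo_db():
    """Create an in-memory database with source-language content only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE content (id INTEGER PRIMARY KEY, text_en TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO content (text_en) VALUES (?)",
        [("Welcome",), ("Contact us",)],
    )
    # The developer's one-off step: add a column for the new language.
    conn.execute("ALTER TABLE content ADD COLUMN text_es TEXT")
    return conn


def save_translation(conn, row_id, translation):
    """What the administration page would do when the translator saves a field."""
    conn.execute(
        "UPDATE content SET text_es = ? WHERE id = ?", (translation, row_id)
    )


def page_texts(conn, language):
    """Return the page texts for one language, falling back to English
    where a translation has not been entered yet (an assumption)."""
    column = LANGUAGE_COLUMNS[language]  # only known column names reach the SQL
    rows = conn.execute(
        f"SELECT COALESCE({column}, text_en) FROM content ORDER BY id"
    )
    return [row[0] for row in rows]
```

In this sketch, switching the site to the new language amounts to passing "es" instead of "en" to page_texts, which plays the role of the content reference variable in the page template.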
[Edited at 2007-04-20 14:59]